My teacher in c++ told me that call by reference should only be used if I'm not going to change anything on the arrays inside the function.
I have some really big vectors that I'm passing around in my program. All the vectors will be modified inside the functions. My matrices are of sizes about [256*256][256][50]...
Is there some particular reason not to use call-by reference here?
AFAIK call by reference should be way faster and consume less memory?
Besides all common discussions on when and how to pass by possibly const reference for non-primitive types, arrays are quite special here.
For backwards compatibility with C (where the rule exists for exactly your reason: arrays can be huge), arrays are never really passed by value in either C or C++. The array decays into a pointer to its first element, so when you write:
void foo( type array[100] );
The compiler is actually processing:
void foo( type *array );
This holds regardless of the size of the array. Two common pitfalls follow: trusting that array is still an array inside foo, and believing that it is guaranteed to hold 100 elements.
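The adjustment described above can be checked at compile time. The following is a minimal sketch, assuming int as the element type; parameter_is_pointer and demo_decay are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <type_traits>

// The bound written in an array parameter is ignored: both spellings below
// name the same function type.
static_assert(std::is_same<void(int[100]), void(int*)>::value,
              "an array parameter is adjusted to a pointer parameter");

// Hypothetical helper: reports whether the parameter is really an int*.
bool parameter_is_pointer(int array[100]) {
    return std::is_same<decltype(array), int*>::value;
}

// Hypothetical usage: a 3-element array is accepted without complaint,
// even though the declaration says 100.
void demo_decay() {
    int small[3] = {1, 2, 3};
    assert(parameter_is_pointer(small));
}
```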
Now, in C++ you can actually pass arrays by reference, but the reference must be of the concrete type of the array, that includes the size:
void foo2( type (&array)[100] );
The funny syntax there is telling the compiler that the function will take an array of exactly 100 elements of type type. The advantage there is that the compiler can perform size checking for you:
// assuming 'type' is defined
int main() {
type array0[99];
type array1[100];
foo( array0 ); // compiles, but if size=100 is assumed it will probably break
// equivalent to: foo( &array0[0] )
// foo2( array0 ); // will not compile, size is not 100
foo2( array1 ); // compiles, size is guaranteed to be 100
}
Now, the problem is that your function will only work for an array of exactly 100 elements, and in some cases you might want to perform the same operation on arrays of different sizes. There are two solutions. You can template the function on the size of the array, which gives a size-safe implementation for each size used, at the cost of longer compile times and a larger binary, since the template is instantiated for every distinct size. Or you can use the pass-by-value syntax and let the array decay, which is not size-safe (the size must be passed as an extra argument) but keeps compile time and binary size down. A third option is combining both:
void foo( type *array, int size );
template <size_t N>
void foo( type (&array)[N] ) {
foo( array, N );
}
In this case, while there will be one templated foo for each size, the compiler will most probably inline the call and the generated code would be equivalent to the caller providing the array and size. No extra computations needed and type safety for real arrays.
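A concrete instance of the combined approach might look like the sketch below. It assumes int elements and a summing operation purely for illustration; sum and demo_sum are hypothetical names, not part of the original code:

```cpp
#include <cassert>
#include <cstddef>

// Size-agnostic implementation: the caller must supply the element count.
int sum(const int* array, std::size_t size) {
    int total = 0;
    for (std::size_t i = 0; i < size; ++i)
        total += array[i];
    return total;
}

// Thin wrapper: N is deduced from the argument, so the size cannot be wrong.
template <std::size_t N>
int sum(const int (&array)[N]) {
    return sum(array, N);
}

void demo_sum() {
    int three[3] = {1, 2, 3};
    int five[5] = {1, 2, 3, 4, 5};
    assert(sum(three) == 6);     // N deduced as 3
    assert(sum(five) == 15);     // N deduced as 5
    assert(sum(five, 2) == 3);   // pointer overload with an explicit size
}
```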
Now, pass-by-reference is very rarely used with arrays.
My teacher in c++ told me that call by reference should only be used if I'm not going to change anything on the arrays inside the function.
Use it when you do not change anything inside the function, or when you do change things and either want the changes reflected in the original array or do not care whether they are.
Do not use it when the callee modifies the argument and you need the original values preserved after the call.
Your teacher is wrong. If you need to modify arrays, pass by reference is the way to go. If you don't want something modified, pass by const reference.
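For the vectors mentioned in the question, the two cases could be sketched as follows. The function names (scale, total, demo_vector_refs) and the element type double are assumptions made for the example:

```cpp
#include <cassert>
#include <vector>

// Modifies the caller's vector in place; no copy is made.
void scale(std::vector<double>& values, double factor) {
    for (double& v : values)
        v *= factor;
}

// Read-only access: const reference, still no copy.
double total(const std::vector<double>& values) {
    double sum = 0.0;
    for (double v : values)
        sum += v;
    return sum;
}

void demo_vector_refs() {
    std::vector<double> data = {1.0, 2.0, 3.0};
    scale(data, 2.0);              // data is now {2, 4, 6}
    assert(total(data) == 12.0);
}
```

Trying to write to values inside total would be rejected by the compiler.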
To prevent accidental changes, use pass-by-const-reference; that way, by default*, the passed-in array can't get changed by the called function.
* Can be overridden with const_cast.
You can pass by reference if:
you won't modify the passed object
you want to modify the object and don't need to keep the old object untouched
When you pass something by reference, only a pointer is passed to the function. If you pass the whole object, it has to be copied, which costs more CPU time and memory.
Generally speaking, objects should always be passed by reference. Otherwise a copy of the object will be generated and if the object is substantially big, this will affect performance.
Now if the method or function you are calling does not modify the object, it is a good idea to declare the function as follows:
void some_function(const some_object& o);
This will generate a compile error if you attempt to modify the object's state inside the function body.
Also it should be noted that built-in arrays are never copied when passed: they decay to a pointer, so the function always works on the original elements.
Hold on a second.. I'm scared at how people are answering this one. Arrays, as far as I remember, are always passed by reference.
void function(int array[])
{
std::cout << array[0] << '\n';
}
// somewhere else..
int array[2] = { 1, 2 };
function(array); // No copy happens here; it is passed by reference
Further, you can't say the array argument is a reference explicitly, as that would be the syntax for creating an array of references (something that's not allowed).
void function(int &array[]) // error here
{ /* ... */ }
So what do you mean?
Further, many are saying that you should only do that if you modify the contents of the array inside the function. Then, what about reference-to-const?
void function(const int arr[])
{
std::cout << arr[0] << '\n';
}
-- edit
Will somebody please point me out how to not pass an array by reference in C++?
-- edit
Oh, so you're talking about vectors. Okay, then the rules of thumb are:
Pass by reference only when you want to modify the contents of the vector.
Pass by reference-to-const whenever you can.
Pass by value only when the object in question is really, really small (like a struct containing an integer, for example), or when it makes sense to (can't think of a case off the top of my head).
Did I miss something?
-- edit
In the case of plain C arrays, it's a good idea to pass them by reference (like in void function(int (&array)[100])) when you want to ensure that the array has a given definite size.
Thanks, dribeas.
Usually, in introductory courses, they tell you that so you don't accidentally change something you didn't want to.
Like if you passed in userName by reference, and accidentally changed it to mrsbuxley that probably would cause errors, or at the very least be confusing later on.
I don't see any reason why you can't pass by reference. Alternatively you could pass pointers around, but I think pass by reference is sometimes better because it avoids null pointer dereferences.
If your teacher has suggested this as some kind of convention, then feel free to break it if it makes sense to. You can always document this in a comment above the function.
Our house style is to NEVER pass an object by value but to always pass a reference or const reference. Not only do we have data structures that can contain hundreds of MB of data, for which pass by value would be an application killer, but if we passed 3D points and vectors by value our applications would also grind to a halt.
It is always a good choice to pass an object by reference, but first we have to decide what the purpose of our function is.
You have to choose whether the function is only going to read the object's data or modify it.
Suppose you have an interface like
void increment_value(int& a);
Here you can modify the object you are passing, but that is a disaster when you pass sensitive data: you might lose your original data and be unable to get it back, right?
So C++ lets you forbid changes to an object whose reference you pass to a function, and it is always a good choice to pass a const reference to an object, e.g.,
double get_discounted_amount(const double &amount,double discount){
return (amount*discount/100);
}
This guarantees that the function cannot change the original value, but again it depends on the purpose of your interface whether you want to change it or only read it.
I have one quick question about the passing of arrays in C++ which I don't understand.
Basically, when you want to pass an array of type integer to another function, you have to pass an address to that array instead of directly passing the whole block of contiguous memory. Why exactly is this the case?
Also, why is it that char arrays can be passed directly to another function in C++ without the need to pass an address instead?
I have tried looking for learning materials for this online (such as cplusplus.com) but I haven't managed to find an explanation for this.
Thanks for your time, Dan.
As far as C++ is concerned, passing char arrays and int arrays is the same.
There are 2 ways to pass arrays in c++.
Address is passed
int fn(int *arrays, int len);
int fn(int arrays[], int len); // Same as above; the syntax is a holdover from ANSI C
Array reference is passed
int fn(int (&array)[SIZE]); // Actual array passed as reference
You can templatize the above function as
template<size_t SIZE>
int fn(int (&array)[SIZE]);
The above method allows you to pass an array of any size to this function. But beware: a different function is created from the template for each size. If your function has side effects on local state (a static variable, for example), use this with care.
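The caveat about local state can be made visible with a small sketch. The names count_calls and demo_per_size_state are hypothetical; the point is only that each array size gets its own instantiation and therefore its own static variable:

```cpp
#include <cassert>
#include <cstddef>

// Each distinct SIZE instantiates a separate function, and each
// instantiation owns its own copy of the static local counter.
template <std::size_t SIZE>
int count_calls(int (&)[SIZE]) {
    static int calls = 0;
    return ++calls;
}

void demo_per_size_state() {
    int a[3] = {};
    int b[3] = {};
    int c[5] = {};
    assert(count_calls(a) == 1);
    assert(count_calls(b) == 2);   // same size as a: shares the counter
    assert(count_calls(c) == 1);   // different size: separate counter
}
```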
If you don't want to change the contents, add const to the parameter.
If you want a copy of the array as the function argument, consider using a standard container like std::array or std::vector, or embed the array in your own class.
It isn't entirely clear from your question exactly what you're trying and what problems you've had, but I'll try to give you useful answers anyway.
Firstly, what you're talking about is probably int[] or int* (or some other type), which isn't an array itself... it's a pointer to a chunk of memory, which can be accessed as if it were an array. Because all you have is a pointer, the array has to be passed around by reference.
Secondly, passing around an array as a "whole block of contiguous memory" is rather inefficient... passing the pointer around might only involve moving a 32 or 64 bit value. Passing by reference is often a sensible thing with memory buffers, and you can explicitly use functions like memcpy to copy data if you need to.
Thirdly, I don't understand what you mean about char arrays being "directly" passable while other types of arrays are not. There's nothing magic about char arrays when it comes to passing or storing them... they're just arrays like any other. The principal difference is that compilers allow you to use string literals to create char arrays.
Lastly, if you're using C++11, you might want to consider the new std::array<T, N> class template. It provides various handy facilities, including value semantics (it can be copied and assigned like any other object) and keeping track of its own size. You can pass these by value, template<class T, std::size_t N> void foo(std::array<T, N> bar), or by reference, template<class T, std::size_t N> void foo(std::array<T, N>& bar), as you like.
You can't pass any array by value. What you can pass by value is a struct containing an array, or a std::array from C++11.
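Both by-value options can be sketched as below. Buffer, bump_first and demo_array_by_value are hypothetical names chosen for the example; the fixed size of 4 is likewise an assumption:

```cpp
#include <array>
#include <cassert>

// Hypothetical wrapper: a struct is copied member by member, array included.
struct Buffer {
    int data[4];
};

// Both parameters are copies; the caller's objects are untouched.
int bump_first(Buffer b) {
    b.data[0] += 100;
    return b.data[0];
}

int bump_first(std::array<int, 4> a) {
    a[0] += 100;
    return a[0];
}

void demo_array_by_value() {
    Buffer buf = {{1, 2, 3, 4}};
    std::array<int, 4> arr = {{1, 2, 3, 4}};
    assert(bump_first(buf) == 101);
    assert(bump_first(arr) == 101);
    assert(buf.data[0] == 1);   // original unchanged
    assert(arr[0] == 1);        // original unchanged
}
```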
There are many questions about "when do I use reference and when pointers?". They confused me a little bit. I thought a reference wouldn't take any memory because it's just the address.
Now I made a simple Date class and showed it to the Code Review community. They told me not to use the reference in the following example. But why?
Someone told me that it'll allocate the same memory a pointer would allocate. That's the opposite of what I learned.
class A{
int a;
public:
void setA(const int& b) { a = b; } /* Bad! - But why?*/
};
class B{
int b;
public:
void setB(int c) { b = c; } /* They told me to do this */
};
So when do I use references or pointers in arguments and when just a simple copy? Without the reference in my example, is the constant unnecessary?
It is not guaranteed to be bad. But it is unnecessary in this specific case.
In many (or most) contexts, references are implemented as pointers in disguise. Your example happens to be one of those cases. Assuming that the function does not get inlined, parameter b will be implemented "under the hood" as a pointer. So, what you really pass into setA in the first version is a pointer to int, i.e. something that provides indirect access to your argument value. In the second version you pass an immediate int, i.e. something that provides direct access to your argument value.
Which is better and which is worse? Well, a pointer in many cases is larger than an int, meaning that the first variant might pass a larger amount of data. This might be considered "bad", but since both data types will typically fit into the hardware word size, it will probably make no appreciable difference, especially if parameters are passed in CPU registers.
Also, in order to read b inside the function you have to dereference that disguised pointer. This is also "bad" from the performance point of view.
These are the formal reasons one would prefer to pass by value any parameter of small size (smaller than or equal to pointer size). For parameters of bigger size, passing by const reference becomes a better idea (assuming you don't explicitly require a copy).
However, in most cases a function that simple will probably be inlined, which will completely eliminate the difference between the two variants, regardless of which parameter type you use.
The matter of const being unnecessary in the second variant is a different story. In the first variant that const serves two important purposes:
1) It prevents you from modifying the parameter value, and thus protects the actual argument from modification. If the reference weren't const, you would be able to modify the reference parameter and thus modify the argument.
2) It allows you to use rvalues as arguments, e.g. call some_obj.setA(5). Without that const such calls would be impossible.
In the second version neither of these is an issue. There's no need to protect the actual argument from modification, since the parameter is a local copy of that argument. Regardless of what you do to the parameter, the actual argument will remain unchanged. And you can already use rvalues as arguments to setB regardless of whether the parameter is declared const or not.
For this reason people don't normally use top-level const qualifiers on parameters passed by value. But if you do declare it const, it will simply prevent you from modifying the local parameter inside the function. Some people actually like that, since it enforces the moderately popular "don't modify original parameter values" convention, which is why you might sometimes see top-level const qualifiers in parameter declarations.
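The points about const can be tried out with a slightly extended version of the two classes from the question. The get accessors, the default initializers and demo_setters are additions made here for illustration only:

```cpp
#include <cassert>

class A {
    int a = 0;
public:
    void setA(const int& b) { a = b; }   // binds to lvalues and rvalues
    int get() const { return a; }        // hypothetical accessor
};

class B {
    int b = 0;
public:
    // Top-level const only stops the body from reassigning c;
    // callers use it exactly like setB(int).
    void setB(const int c) { b = c; }
    int get() const { return b; }        // hypothetical accessor
};

void demo_setters() {
    A x;
    B y;
    int n = 7;
    x.setA(5);   // rvalue argument: allowed because the reference is const
    y.setB(5);   // rvalue argument: always allowed for a by-value parameter
    assert(x.get() == 5 && y.get() == 5);
    x.setA(n);
    y.setB(n);
    assert(x.get() == 7 && y.get() == 7);
    assert(n == 7);   // neither call can modify the argument
}
```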
If you have a light-weight type like an int or long, you should pass by value, because working through a reference adds cost without any benefit. But when you pass heavy types, you should use references.
I agree with the reviewer. And here's why:
A (const or non-const) reference to a small simple type such as int will be more complex (in terms of number of instructions). This is because the calling code has to pass the address of the argument into setA, and then inside setA the value has to be loaded through the address stored in b. When b is a plain int, the value itself is just copied, so at least one memory access is saved. This may not make much of a difference over the long runtime of a large program, but if you keep adding one extra cycle everywhere you do this, it soon adds up to noticeably slower code.
I had a look at a piece of code that went something like this:
class X
{
std::vector<int> v; // element type assumed to be int, matching b below
public:
...
bool find(int& index, int b);
....
};
bool X::find(int &index, int b)
{
while(v[index] != b)
{
if (index == v.size()-1)
{
return false;
}
index++;
}
return true;
}
Rewriting this code to:
bool X::find(int &index, int b)
{
int i = index;
while(v[i] != b)
{
if (i == v.size()-1)
{
index = i;
return false;
}
i++;
}
index = i;
return true;
}
meant that this function went from about 30% of the total execution time, in some code that called find quite a bit, to about 5% of the execution time in the same test, because the compiler put i in a register and only updated the referenced value when it finished searching.
References are implemented as pointers (that's not a requirement, but it's universally true, I believe).
So in your first one, since you're just passing an "int", passing the pointer to that int will take about the same amount of space to pass (same or more registers, or same or more stack space, depending on your architecture), so there's no savings there. Plus now you have to dereference that pointer, which is an extra operation (and will almost surely cause you to go to memory, which you might not have to do with the second one, again, depending on your architecture).
Now, if what you're passing is much larger than an int, then the first one could be better because you're only passing a pointer. [NB that there are cases where it still might make sense to pass by value even for a very large object. Those cases are usually when you plan to create your own copy anyway. In that case, it's better to let the compiler do the copy, because the overall approach may improve its ability to optimize. Those cases are very complex, and my opinion is that if you're asking this question, you should study C++ more before you try to tackle them. Although they do make for interesting reading.]
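One simple instance of the "you need your own copy anyway" case is sketched below. sorted_copy and demo_sorted_copy are hypothetical names; the example only shows the shape of such a function, not a performance claim:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// The function needs its own copy to sort, so it takes the argument by
// value and lets the compiler make (or, for a temporary, move) the copy.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

void demo_sorted_copy() {
    std::vector<int> original = {3, 1, 2};
    std::vector<int> sorted = sorted_copy(original);
    assert((sorted == std::vector<int>{1, 2, 3}));
    assert((original == std::vector<int>{3, 1, 2}));   // caller's data intact
}
```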
Passing primitives as const-reference does not save you anything. A pointer uses at least as much memory as an int on typical platforms. If you pass a const-reference, the machine has to allocate memory for a pointer and copy the address, which costs about the same as allocating and copying an integer. If your Date class uses a single 64-bit integer (or double) to store the date, then you don't need to use const-reference. However, if your Date class becomes more complex and stores additional fields, then passing the Date object by const-reference should have a lower cost than passing it by value.
Do I need to put "&" when I pass a 2D array to a function, or are 2D arrays automatically passed by reference like 1D arrays?
void fnc (int& arr[5][5]) {
}
It will be passed by value if you don't specify pass by reference &.
However arrays decay to pointers, so you're basically passing a pointer by value, meaning the memory it points to will be the same.
In common terms, modifying arr inside the function will modify the original arr (a copy is not created).
Also, 1D arrays also aren't passed "automatically" by reference, it just appears so since they decay to pointers.
If you really want to pass the array by reference it would need to be:
void fnc(int (&arr)[5][5]);
Without the inner parentheses, as Mr Anubis says, you will be attempting to pass an array of references which is unlikely to be helpful.
Normally one would just write
void fnc(int arr[][5]);
(You could write arr[5][5], but the first 5 is ignored which can cause confusion.)
This passes the address of the array, rather than the array itself, which I think is what you are trying to achieve.
You should also consider a vector of vectors or other higher-level data structure; raw arrays have many traps for the unwary.
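The three forms discussed above can be compared side by side. This is a sketch under the assumption of a 5x5 int grid; set_diagonal, sum_rows, sum_all and demo_2d are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference to an array of exactly 5x5 ints: both dimensions are checked.
void set_diagonal(int (&arr)[5][5], int value) {
    for (std::size_t i = 0; i < 5; ++i)
        arr[i][i] = value;
}

// Decayed form: arr is really int (*)[5]; the row count must be passed.
int sum_rows(int arr[][5], std::size_t rows) {
    int total = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < 5; ++c)
            total += arr[r][c];
    return total;
}

// Higher-level alternative: the vector carries its own dimensions.
int sum_all(const std::vector<std::vector<int>>& grid) {
    int total = 0;
    for (const std::vector<int>& row : grid)
        for (int v : row)
            total += v;
    return total;
}

void demo_2d() {
    int grid[5][5] = {};
    set_diagonal(grid, 2);             // modifies the caller's array
    assert(grid[3][3] == 2);
    assert(sum_rows(grid, 5) == 10);   // no copy was made either way

    std::vector<std::vector<int>> v(5, std::vector<int>(5, 1));
    assert(sum_all(v) == 25);
}
```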
Hey there,
I wonder if it's worth passing primitive single values like int, float, double or char by pointer? Probably it's not worth!? But if you would simply pass everything by pointer, is this making the program slower?
Should you always just pass arrays as pointer?
Thanks!
I wonder if it's worth passing primitive single values like int, float, double or char by pointer?
What are you trying to accomplish? Do you want to be able to write to the passed in value? Or do you just need to use it? If you want to write to it, the idiomatic way is to pass by reference. If you don't need to write to it, you're best avoiding any risk that you'll write to it accidentally and pass by value. Pass by value will make a copy of the variable for local use. (as an aside, if you don't want to make a copy AND want some level of safety, you can pass by const reference)
But if you would simply pass everything by pointer, is this making the program slower?
Difficult to say. It depends on a lot of things. In both pass by value and pass by reference (or pointer) you're making a new primitive value. In pass by value, you're making a copy. In pass by reference/pointer you're passing the address of the original. In the latter case, however, you need an extra memory fetch that may or may not be cached. It's very difficult to say for certain without measuring it.
That all being said, I doubt the difference is even noticeable. The compiler may be able to optimize out the copy in many pass-by-value cases, as indicated in this article. (thanks Space C0wb0y).
Should you always just pass arrays as pointer?
From this.
In C++ it is not possible to pass a complete block of memory by value as a parameter to a function, but we are allowed to pass its address.
To pass an array:
int foo(int bar[], unsigned int length)
{
// do stuff with bar but don't go past length
}
I'd recommend avoiding arrays and using std::vector, which has more easily understood copy semantics.
It's probably not worth passing primitive values by pointer if your concern is speed -- you then have the overhead of the "indirection" to access the value.
However, pointers often are the "width of the bus", meaning the processor can send the whole value at once, and not "shift" values to send-down-the-bus. So, it is possible pointers are transferred on the bus faster than smaller types (like char). That's why the old Cray computers used to make their char values 32 bits (the width of the bus at that time).
When dealing with large objects (such as class instances or arrays), passing a pointer is faster than copying the whole object onto the stack. This applies to OOP, for example.
Look in your favorite C++ textbook for a discussion of "output parameters".
Some advantages of using a pointer for output parameters instead of a reference are:
No surprising behavior, no action at a distance; the semantics are clear at the call site as well as in the callee.
Compatibility with C (which your question title suggests is important)
Usable by other languages, functions exported from a shared library or DLL should not use C++-only features such as references.
You should rarely have to pass anything by pointer. If you need to modify the value of the parameter, or want to prevent a copy, pass by reference, otherwise pass by value.
Note that preventing a copy can also be done by copy-elision, so you have to be very careful not to fall into the trap of premature optimization. This can actually make your code slower.
There's no real answer to your question, except for a few rules that I tend to bear in mind:
A char is 1 byte and a pointer is typically 4 or 8 bytes, so never pass a single char by pointer.
Beyond that, things like int and float are about the same size as a pointer, but a pointer has to be dereferenced, so it technically takes more time.
if we go to the pentium i386 assembler:
loading the value in a register of a parameter "a" in C which is an int:
movl 8(%ebp),%eax
the same thing but passed as a pointer:
movl 8(%ebp),%eax
movl (%eax),%eax
Having to dereference the pointer takes another memory operation, so theoretically (not sure it matters in real life) passing pointers takes longer...
After that there's the memory issue. If you want to code efficiently, every composite type (class, structure, array...) has to be passed by pointer.
Just imagine a recursive function taking a 16-byte type by copy: 1000 nested calls put 16000 bytes on the stack (you don't really want that, do you? :) )
So to make it short and clear: look at the size of your type. If it's bigger than a pointer, pass it by pointer; otherwise pass it by copy...
Pass primitive types by value and objects as const references. Avoid pointers as much as you can. Dereferencing pointers has some overhead and clutters the code. Compare the two versions of the factorial function below:
// which version of factorial is shorter and easier to use?
int factorial_1 (int* number)
{
if ((*number) <= 1)
return 1;
int tmp = (*number) - 1;
return (*number) * factorial_1 (&tmp);
}
// Usage:
int r = 10;
factorial_1 (&r); // => 3628800
int factorial_2 (int number)
{
return (number <= 1) ? 1 : (number * factorial_2 (number - 1));
}
// Usage:
// No need for the temporary variable to hold the argument.
factorial_2 (10); // => 3628800
Debugging becomes hard, as you cannot say when and where the value of an object could change:
int a = 10;
// f could modify a, so you cannot guarantee to g that a is still 10.
f (&a);
g (&a);
Prefer the vector class over arrays. It can grow and shrink as needed and keeps track of its size. The way vector elements are accessed is compatible with arrays:
int add_all (const std::vector<int>& vec)
{
size_t sz = vec.size ();
int sum = 0;
for (size_t i = 0; i < sz; ++i)
sum += vec[i];
return sum; // the original omitted the return statement
}
NO, the only time you'd pass a non-const reference is if the function requires an output parameter.
Is there some kind of subtle difference between those:
void a1(float &b) {
b=1;
};
a1(b);
and
void a1(float *b) {
(*b)=1;
};
a1(&b);
?
They both do the same (or so it seems from main() ), but the first one is obviously shorter, however most of the code I see uses second notation. Is there a difference? Maybe in case it's some object instead of float?
Both do the same, but one uses references and one uses pointers.
See my answer here for a comprehensive list of all the differences.
Yes. The * notation says that what's being passed on the stack is a pointer, i.e., the address of something. The & says it's a reference. The effect is similar but not identical:
Let's take two cases:
void examP(int* ip);
void examR(int& i);
int i;
If I call examP, I write
examP(&i);
which takes the address of the item and passes it on the stack. If I call examR,
examR(i);
I don't need it; now the compiler "somehow" passes a reference -- which practically means it gets and passes the address of i. On the code side, then
void examP(int* ip){
*ip += 1;
}
I have to make sure to dereference the pointer. ip += 1 does something very different.
void examR(int& i){
i += 1;
}
always updates the value of i.
For more to think about, read up on "call by reference" versus "call by value". The & notation gives C++ call by reference.
In the first example with references, you know that b can't be NULL. With the pointer example, b might be the NULL pointer.
However, note that it is possible to force a null "object" through a reference, but it's awkward, it is undefined behavior, and the called procedure can assume it's an error to have done so:
a1(*(float *)NULL);
In the second example the caller has to prefix the variable name with '&' to pass the address of the variable.
This may be an advantage - the caller cannot inadvertently modify a variable by passing it as a reference when they thought they were passing by value.
Aside from syntactic sugar, the only real difference is the ability for a function parameter that is a pointer to be null. So the pointer version can be more expressive if it handles the null case properly. The null case can also have some special meaning attached to it. The reference version can only operate on values of the type specified without a null capability.
Functionally in your example, both versions do the same.
The first has the advantage that it's transparent on the call-side. Imagine how it would look for an operator:
cin >> &x;
And how ugly it would look for a swap invocation:
swap(&a, &b);
You want to swap a and b, and swap(a, b) looks much better than first having to take the addresses. Incidentally, Bjarne Stroustrup writes that the major reason for references was the transparency they add at the call side, especially for operators. Also see how it's not obvious anymore whether the following
&a + 10
would add 10 to the content of a, calling its operator+, or whether it adds 10 to a temporary pointer to a. Add to that the fact that you cannot overload operators when all operands are built-in types (like a pointer and an integer). References make this crystal clear.
Pointers are useful if you want to be able to put a "null":
a1(0);
Then in a1 the method can compare the pointer with 0 and see whether the pointer points to any object.
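A variant of a1 that handles the null case might look like this sketch. The bool return value is an assumption added here so the caller can tell whether anything was written; demo_nullable is a hypothetical name:

```cpp
#include <cassert>

// Returns false and does nothing when no object is supplied.
bool a1(float* b) {
    if (b == nullptr)
        return false;
    *b = 1;
    return true;
}

void demo_nullable() {
    float x = 0;
    assert(a1(&x) && x == 1);
    assert(!a1(nullptr));   // "no object" is a valid call
}
```

A reference parameter has no equivalent of the second call.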
One big difference worth noting is what's going on outside, you either have:
a1(something);
or:
a1(&something);
I like to pass arguments by reference (always a const one :) ) when they are not modified in the function/method (you can then also pass automatic/temporary objects), and to pass them by pointer to alert the reader of the calling code that the argument may be, and probably is, intentionally modified inside.
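That convention could be sketched as follows, assuming a hypothetical min_max function; the input is a const reference and the outputs are pointers, so the & at the call site marks what gets written:

```cpp
#include <cassert>
#include <vector>

// Input by const reference (never modified), outputs by pointer so the
// '&' at the call site flags what gets written.
void min_max(const std::vector<int>& values, int* lo, int* hi) {
    assert(!values.empty() && lo != nullptr && hi != nullptr);
    *lo = *hi = values[0];
    for (int v : values) {
        if (v < *lo) *lo = v;
        if (v > *hi) *hi = v;
    }
}

void demo_min_max() {
    int lo = 0, hi = 0;
    min_max(std::vector<int>{4, -2, 9}, &lo, &hi);   // temporary binds to const&
    assert(lo == -2 && hi == 9);
}
```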