Passing integers as constant references versus copying - c++

This might be a stupid question, but I notice that in a good number of APIs, a lot of method signatures that take integer parameters that aren't intended to be modified look like:
void method(int x);
rather than:
void method(const int &x);
To me, it looks like both of these would function exactly the same. (EDIT: apparently not in some cases, see answer by R Samuel Klatchko) In the former, the value is copied and thus can't change the original. In the latter, a constant reference is passed, so the original can't be changed.
What I want to know is why one over the other - is it because the performance is basically the same or even better with the former? e.g. passing a 16-bit value or 32-bit value rather than a 32-bit or 64-bit address? This was the only logical reason I could think of, I just want to know if this is correct, and if not, why and when one should prefer int x over const int &x and vice versa.

It's not just the cost of passing a pointer (that's essentially what a reference is), but also the de-referencing in the called method's body to retrieve the underlying value.
That's why passing an int by value will be virtually guaranteed to be faster (Also, the compiler can optimize and simply pass the int via processor registers, eliminating the need to push it onto the stack).

To me, it looks like both of these would function exactly the same.
It depends on exactly what the reference is to. Here is an admittedly made up example that would change based on whether you pass a reference or a value:
static int global_value = 0;
int doit(int x)
{
++global_value;
return x + 1;
}
int main()
{
return doit(global_value);
}
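This code will behave differently depending on whether you have int doit(int) or int doit(const int &). With int doit(int), x is a copy taken before the increment, so main returns 1. With int doit(const int &), x aliases global_value, which has already been incremented by the time x + 1 is evaluated, so main returns 2.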

Integers are usually the size of the processor's native word and fit easily in a register. From this perspective, there is little difference between passing by value and passing by constant reference.
When in doubt, print the assembly language listing for your functions to find out how the compiler is passing the argument. Print out for both pass by value and pass by constant reference.
Also, when passing by value, the function can modify the copy. When passing by constant reference, the function cannot modify the variable (it's marked as const).

There will probably be a very, very small de-optimization for passing by reference, since at the very least one dereference will need to occur to get the actual value (unless the call is inlined, the compiler cannot simply pass the value due to the fact that the call site and function might be separately compiled, and it's valid and well-defined to cast away the const for a passed parameter that isn't actually const itself - see What are the benefits to passing integral types by const ref). Note, however, that the 'de-optimization' is likely to be so small as to be difficult to measure.
Most people seem to dislike pass-by-const-ref for built-ins because of this (some very much). However, I think that it might be preferable in some cases if you want the compiler to assist you in ensuring that the value isn't accidentally changed within the function. It's not a big thing, but sometimes it might help.

Depending on the underlying instruction set, an integer parameter can be passed as register or on the stack. Register is definitely faster than memory access, which would always be required in case of const refs (considering early cache-less architectures)
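You cannot pass an int literal as a non-const int&; a const int& does accept a literal, because it can bind to a temporary.
An explicit cast (const_cast, or a C-style cast to int *) lets you strip the const from a const int&, opening the possibility of changing the referenced value. This is only well-defined when the referenced object is not itself const.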

Related

benefits of passing const reference vs values in function in c++ for primitive types

I want to know what might be the possible advantages of passing by value over passing by const reference for primitive types like int, char, float, double, etc. to function? Is there any performance benefit for passing by value?
Example:
int sum(const int x,const int y);
or
int sum(const int& x,const int& y);
For the second case, I have hardly seen people using this. I know there is benefit of passing by reference for big objects.
In every ABI I know of, references are passed via something equivalent to pointers. So when the compiler cannot inline the function or otherwise must follow the ABI, it will pass pointers there.
Pointers are often larger than values; but more importantly, pointers do not point at registers, and while the top of the stack is almost always going to be in cache, what it points at may not. In addition, many ABIs have primitives passed via register, which can be faster than via memory.
The next problem is within the function. Whenever the code flow could possibly modify an int, data from a const int& parameter must be reloaded! While the reference is to const, the data it refers to can be changed via other paths.
The most common ways this can happen are when control leaves the code the compiler can see while analyzing the function body, when memory is modified through a global variable, or when a pointer is followed to touch an int elsewhere.
In comparison, an int argument whose address is not taken cannot be legally modified through other means than directly. This permits the compiler to understand it isn't being mutated.
This isn't just a problem for the compiler trying to optimize and getting confused. Take something like:
struct ui{
    enum{ defFontSize=9 };
    std::optional<int> fontSize;
    void reloadFontSize(){
        fontSize=getFontSizePref();
        fontSizeChanged(*fontSize);
    }
    void fontSizeChanged(int const& sz){
        if(sz==defFontSize)
            fontSize=std::nullopt; // destroys the int that sz refers to
        else
            fontSize=sz;
        drawText(sz); // may read sz after its referent was destroyed
    }
    void drawText(int sz){
        std::cout << "At size " << sz <<"\n";
    }
};
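Here fontSizeChanged receives a reference to the int stored inside the optional. When sz equals defFontSize, the assignment of std::nullopt destroys that int, and drawText(sz) then uses it after destruction.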
A bug like this can be far less obvious than this. If we defaulted to passing by value, it could not happen.
Typically, primitive types are not passed by reference, but sometimes there is a point in that. E.g., on x86-64 Linux with GCC, long double occupies 16 bytes while a pointer is 8 bytes (other ABIs differ; MSVC uses an 8-byte long double). So a reference might be a little better in this case.
In your example, there is no point in that: the usual int is 4 bytes long, so you can pass two integers in the space of one pointer.
You can use sizeof to measure the size of a type.

Usage of const and references in parameters in c++ [duplicate]

There are multiple ways of making a method. I'm not quite sure when to use const and reference in method parameters.
Imagine a method called 'getSum' that returns the sum of two integers. The parameters in such a method can have multiple forms.
int getSum1(int, int);
int getSum2(int&, int&);
int getSum3(const int, const int);
int getSum4(const int&, const int&);
Correct me if I'm wrong, but here's how I see these methods:
getSum1 - Copies integers and calculates
getSum2 - Doesn't copy integers, but uses the values directly from memory and calculates
getSum3 - Promises that the values won't change
getSum4 - Promises that the values won't change & doesn't copy the integers, but uses the values directly from memory
So here are some questions:
So is getSum2 faster than getSum1 since it doesn't copy the integers, but uses them directly?
Since the values aren't changed, I don't think 'const' makes any difference in this situation, but should it still be there for const correctness?
Would it be the same with doubles?
Should a reference only be used with very large parameters? e.g. if I were to give it a whole class, then it would make no sense to copy the whole thing
1. For integers, this is irrelevant in practice. Processors work with registers (and an int fits in a register in all but the most exotic hardware), copying a register is basically the cheapest operation (after a noop) and it may not even be necessary if the compiler allocates registers in a smart way.
2. Use this if you want to change the passed ints. Non-const reference parameters generally indicate that you intend to modify the argument (for example, store multiple return values).
3. This does exactly the same as 1. for basically the same reason. You cannot change the passed ints but nobody would be any the wiser if you did (i.e. used 1. instead).
4. Again, this will effectively do the same thing as 1. for ints (or doubles, if your CPU handles them natively) because the compiler understands that passing a const pointer to an int (or double) is the same as providing a copy, but the latter avoids unnecessary trips to memory. Unless you take a pointer to the arguments (in which case the compiler would have to guarantee it points to the int on the call site) this is thus pointless.
Note that the above is not in terms of the C++ abstract machine but in terms of what happens with modern hardware/compilers. If you are working on hardware without dedicated floating point capabilities or where ints don't fit in registers, you have to be more careful. I don't have an overview over current embedded hardware trends, but unless you literally write code for toasters, you should be good.
If you are not dealing with ints but with (large) classes, then the semantic differences are much stronger:
1. The function receives a copy. Note that if you pass in a temporary, that copy may be move-constructed (or even better, elided).
2. Same as in the "int section", use this over 4. only if you want to change the passed value.
3. You receive a copy that cannot be changed. This is generally not very useful outside of specific circumstances (or for marginal code clarity increases).
4. This should be the default to pass a large class (well, pretty much anything bigger than a pointer) if you intend to only read from (or call const methods on) it.
1) You are correct: the values of a and b would not be copied. But the addresses of a and b would be copied, so you would gain no speed, since an int and a pointer to int are about the same size (the pointer is often larger). You would gain speed if the arguments to the function were large, like a struct or class, as you mention in Q4.
2) const means that you cannot change the value of the parameter. If it is not declared const you can change it inside the function, but the original variable you used when calling the function will not be changed.
int getSum1(int a, int b)
{
a = a + 5;
return a + b;
}
int a, b, foo;
a = 10;
b = 5;
foo = getSum1(a, b);
In this case foo has the value 20
a equals 10
b equals 5
Since the modification of a is only local to the function getSum1()

C++ - Reference, Pointers in Arguments

There are many questions about "when do I use reference and when pointers?". They confused me a little bit. I thought a reference wouldn't take any memory because it's just the address.
Now I made a simple Date class and showed it to the Code Review community. They told me not to use the reference in the following example. But why?
Someone told me that it'll allocate the same memory a pointer would allocate. That's the opposite of what I learned.
class A{
int a;
public:
void setA(const int& b) { a = b; } /* Bad! - But why?*/
};
class B{
int b;
public:
void setB(int c) { b = c; } /* They told me to do this */
};
So when do I use references or pointers in arguments and when just a simple copy? Without the reference in my example, is the constant unnecessary?
It is not guaranteed to be bad. But it is unnecessary in this specific case.
In many (or most) contexts, references are implemented as pointers in disguise. Your example happens to be one of those cases. Assuming that the function does not get inlined, parameter b will be implemented "under the hood" as a pointer. So, what you really pass into setA in the first version is a pointer to int, i.e. something that provides indirect access to your argument value. In the second version you pass an immediate int, i.e. something that provides direct access to your argument value.
Which is better and which is worse? Well, a pointer in many cases has greater size than an int, meaning that the first variant might pass a larger amount of data. This might be considered "bad", but since both data types will typically fit into the hardware word size, it will probably make no appreciable difference, especially if parameters are passed in CPU registers.
Also, in order to read b inside the function you have to dereference that disguised pointer. This is also "bad" from the performance point of view.
These are the formal reasons one would prefer to pass by value any parameters of small size (smaller than or equal to pointer size). For parameters of bigger size, passing by const reference becomes a better idea (assuming you don't explicitly require a copy).
However, in most cases a function that simple will probably be inlined, which will completely eliminate the difference between the two variants, regardless of which parameter type you use.
The matter of const being unnecessary in the second variant is a different story. In the first variant that const serves two important purposes:
1) It prevents you from modifying the parameter value, and thus protects the actual argument from modification. If the reference weren't const, you would be able to modify the reference parameter and thus modify the argument.
2) It allows you to use rvalues as arguments, e.g. call some_obj.setA(5). Without that const such calls would be impossible.
In the second version neither of these is an issue. There's no need to protect the actual argument from modification, since the parameter is a local copy of that argument. Regardless of what you do to the parameter, the actual argument will remain unchanged. And you can already use rvalues as arguments to setB regardless of whether the parameter is declared const or not.
For this reason people don't normally use top-level const qualifiers on parameters passed by value. But if you do declare it const, it will simply prevent you from modifying the local b inside the function. Some people actually like that, since it enforces the moderately popular "don't modify original parameter values" convention, for which reason you might sometimes see top-level const qualifiers being used in parameter declarations.
If you have a light-weight type like an int or long, you should pass it by value, because that avoids the additional cost of working through a reference. But when you are passing heavy types, you should use references.
I agree with the reviewer. And here's why:
A (const or non-const) reference to a small simple type, such as int, will be more complex (in terms of number of instructions). This is because the calling code will have to pass the address of the argument into setA, and then inside setA the value has to be dereferenced from the address stored in b. In the case where b is a plain int, the value itself is simply copied, so passing by value saves at least one memory access. This may not make much difference over the long runtime of a large program, but if you keep adding one extra cycle everywhere you do this, it soon adds up to noticeably slower code.
I had a look at a piece of code that went something like this:
class X
{
    std::vector<int> v;
public:
    ...
    bool find(int& index, int b);
    ....
};
bool X::find(int &index, int b)
{
    while(v[index] != b)
    {
        if (index == v.size()-1)
        {
            return false;
        }
        index++;   // every increment writes through the reference
    }
    return true;
}
Rewriting this code to:
bool X::find(int &index, int b)
{
int i = index;
while(v[i] != b)
{
if (i == v.size()-1)
{
index = i;
return false;
}
i++;
}
index = i;
return true;
}
meant that this function went from about 30% of the total execution time of some code that called find quite a bit, to about 5% of the execution time of the same test, because the compiler put i in a register and only updated the referenced value when it finished searching.
References are implemented as pointers (that's not a requirement, but it's universally true, I believe).
So in your first one, since you're just passing an "int", passing the pointer to that int will take about the same amount of space to pass (same or more registers, or same or more stack space, depending on your architecture), so there's no savings there. Plus now you have to dereference that pointer, which is an extra operation (and will almost surely cause you to go to memory, which you might not have to do with the second one, again, depending on your architecture).
Now, if what you're passing is much larger than an int, then the first one could be better because you're only passing a pointer. [NB that there are cases where it still might make sense to pass by value even for a very large object. Those cases are usually when you plan to create your own copy anyway. In that case, it's better to let the compiler do the copy, because the overall approach may improve its ability to optimize. Those cases are very complex, and my opinion is that if you're asking this question, you should study C++ more before you try to tackle them. Although they do make for interesting reading.]
Passing primitives as const-reference does not save you anything. On common platforms a pointer is at least as large as an int. If you pass a const-reference, the machine will have to allocate space for a pointer and copy the address, which costs at least as much as allocating and copying an integer. If your Date class uses a single 64-bit integer (or double) to store the date, then you don't need to use const-reference. However, if your Date class becomes more complex and stores additional fields, then passing the Date object by const-reference should have a lower cost than passing it by value.

Passing by value, const value, reference, const reference, pointer, const pointer

After exploring more, I found the answer on how to choose in an older post (sorry for the duplicate):
If the function intends to change the argument as a side effect, take it by non-const reference.
If the function doesn't modify its argument and the argument is of primitive type, take it by value.
Otherwise take it by const reference, except in the following case:
If the function would then need to make a copy of the const reference anyway, take it by value.
[Original Post is Below]
I'd like to summarize the use of passing by value, const value, reference, const reference, pointer, const pointer and please correct me and give me your suggestions.
As for reference and pointer, use const if possible (thanks to all).
There is no performance difference between passing by reference and pointer.
When the size is not larger than a pointer (thanks to Mark Ransom), pass by value.
And some questions:
I seldom see passing by const value. Is it useful or the compiler will detect the const-ness in passing by value?
The const reference takes too much space. Can I just use passing by value? Will the modern compilers optimize it to not sacrifice the performance?
According to the article "Want Speed? Pass by Value" that juanchopanza mentioned, I add one more item.
If you will copy your arguments, pass them by value and let the compiler do the copying, rather than passing them by const reference and doing the copy yourself in the function body.
Thanks a lot!
I seldom see passing by const value. Is it useful or the compiler will detect the const-ness in passing by value?
Passing by const value doesn't really exist from the caller's point of view. When you pass by value, you can't modify the value in such a way that the changes will be visible outside of the subroutine, because a copy is made of the original value and that copy is used in the function. A const on a by-value parameter only stops the function body from modifying its own copy.
The const reference takes too much space. Can I just use passing by
value? Will the modern compilers optimize it to not sacrifice the
performance?
Passing by (const) reference is not the same as passing by value. When you pass by reference the value is NOT copied; a memory location is simply supplied, and thus through a non-const reference you may modify the caller's variable. A const reference forbids modification through that reference.
Take for example, the following:
void byValue(int x) {
    x += 1;
}
void byRef(int &x) {
    x += 1;
}
// ...
{
    int y = 10;
    byValue(y);
    cout << y << endl; // Prints 10
    byRef(y);
    cout << y << endl; // Prints 11
}
// ...
Use const as much as possible.
Passing const where necessary is always a good idea. It helps code readability, lets others know what will happen to the values they pass to the method, and helps the compiler catch any mistakes you may make in modifying the value inside the method.
There is no performance difference between passing by reference and pointer.
A negligible amount, if any. The compiler will take care of the details here. It saves you the effort of creating a pointer, and it nicely dereferences it for you.
When the size is not larger than a word, pass by value.
As Mark points out, you do this if the value is smaller than a pointer. Pointers are different sizes on 32bit and 64bit systems (hence the name) and so this is really at your discretion. I'm a fan of passing pointers for nearly everything except the primitive types (char, int8_t, int16_t, float, etc), but that is just my opinion.

Parameter Passing Etiquette (C++) const& vs. value

If all a function needs to do with a parameter is see its value, shouldn't you always pass that parameter by constant reference?
A colleague of mine stated that it doesn't matter for small types, but I disagree.
So is there any advantage to do this:
void function(char const& ch){ //<- const ref
if (ch == 'a'){
DoSomething(ch);
}
return;
}
over this:
void function(char ch){ //<- value
if (ch == 'a'){
DoSomething(ch);
}
return;
}
They appear to be the same size to me:
#include <iostream>
#include <cstdlib>
int main(){
char ch;
char& chref = ch;
std::cout << sizeof(ch) << std::endl; //1
std::cout << sizeof(chref) << std::endl; //1
return EXIT_SUCCESS;
}
But I do not know if this is always the case.
I believe I'm right, because it does not produce any additional overhead and it is self documenting.
However, I want to ask the community if my reasoning and assumptions are correct?
Your colleague is correct. For small types (char, int) it makes no sense to pass by reference when the variable is not to be modified. Passing by value would be better, since the pointer typically used to implement a reference is at least as large as these small types.
Moreover, passing by value is less typing, as well as slightly more readable.
Even though the sizeof(chref) is the same as sizeof(ch), passing character by reference does take more bytes on most systems: although the standard does not say anything specific about the implementation of references, an address (i.e. a pointer) is regularly passed behind the scenes. With optimization on, it probably would not matter. When you code template functions, items of unknown type that will not be modified should always be passed by const reference.
As far as small types go, you can pass them by value with a const qualifier to emphasize the point that you aren't going to touch the argument through the signature of your function:
void function(const char ch){ //<- value
if (ch == 'a'){
DoSomething(ch);
}
return;
}
For small values, the cost of creating a reference and dereferencing it is likely to be greater than the cost of copying it (if there is a difference at all). This is especially true when you consider that reference parameters are pretty much always implemented as a pointer. Both document equally well if you just declare your value as const (I'm using this value for input only and it will not be modified). I generally just make all of the standard built-in types by const value and all user-defined / STL types as const &.
Your sizeof example is flawed because chref is just an alias for ch. Applying sizeof to a reference yields the size of the referenced type, so sizeof(T&) equals sizeof(T) for any type T.
The sizes are not the same as passed. The result depends on the ABIs calling convention, but the sizeof(referenceVariable) produces the sizeof(value).
If all a function needs to do with a parameter is see its value, shouldn't you always pass that parameter by constant reference?
That's what I do. I know people disagree with me, and argue for passing small builtins by value, or prefer to omit the const. Passing by reference can add instructions and/or consume more space. I pass this way for consistency, and because always measuring the best way to pass for any given platform is a lot of hassle to maintain.
There isn't an advantage beyond readability (if that's your preference). Performance could suffer very slightly, but it will not be a consideration in most cases.
Passing these small builtins by value is more common. If passing by value, you can const qualify the definition (independent of the declaration).
My recommendation is that the vast majority of teams should simply choose one way to pass and stick with it, and performance should not influence that unless every instruction counts. The const never hurts.
In my opinion, your general approach of passing by const reference is a good practice (but see below for some caveats on your example). On the other hand, your friend is correct that for built-in types, passing by reference should not result in any significant performance gains, and could even result in marginal performance losses. I come from a C background, so I tend to think of references in terms of pointers (even though there are some subtle differences), and a "char*" will be bigger than a "char" on any platform with which I'm familiar.
The bottom line, in my opinion, is that when you're passing larger user-defined types, and the called function only needs to read values without modifying them, passing by "type const&" is a good practice. As you say, it's self-documenting, and helps clarify the roles of the various pieces of your internal API.