Will the two function specifications below always compile to the same thing? I can't see that a copy would be needed if you're using const. If they aren't the same, why?
void f(const int y);
void g(const int& y);
Not the same. If the argument changes after it's passed (e.g. because it's changed by another thread), the first version is unaffected because it has a copy. In the second variant, the function called may not change the argument itself, but it would be affected by changes to y. With threads, this might mean it requires a mutex lock.
Without optimization ... this is not the same.
The first line of code gets a copy of the value passed into this function.
The second line of code gets a reference to the variable, so your function always reads the value directly from the caller's variable.
In both cases the compiler is informed (by the keyword const), that these variables (inside the function) MUST not be modified. If there are any modifications in the function, an error will be generated.
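To make the difference concrete, here is a minimal single-threaded sketch. The names shared_value, change_shared, read_by_value and read_by_ref are made up for illustration, and change_shared merely stands in for "another thread" modifying the caller's variable while the function runs.

```cpp
#include <cassert>

// Hypothetical variable owned by the caller.
int shared_value = 1;

// Stand-in for any outside code that changes the caller's variable.
void change_shared() { shared_value = 42; }

// y is a private copy: later changes to the caller's variable are invisible.
int read_by_value(const int y) {
    change_shared();
    return y;
}

// y is an alias of the caller's variable: later changes show through,
// even though the function itself may not write to y.
int read_by_ref(const int& y) {
    change_shared();
    return y;
}

// Call from main() to check both behaviours.
void demo() {
    shared_value = 1;
    assert(read_by_value(shared_value) == 1);   // the copy kept the old value
    shared_value = 1;
    assert(read_by_ref(shared_value) == 42);    // the alias saw the change
}
```

So const on a reference only restricts what the function may do through that name; it says nothing about what other code does to the object.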
The & specifies that the object is passed by reference (similar to passing by pointer at least on the assembly level). Thus
void fval(type x)
{
// x is a local copy of the data passed by the caller.
// modifying x has no effect on the data held by the caller
}
type a;
fval(a); // a will not be changed
void fref(type &x)
{
// x is a mere reference to an object
// changing x affects the data held by the caller
}
type b;
fref(b); // b may get changed.
Now adding the const keyword merely expresses that the function promises not to change the object.
void fcval(const type x)
{
// x is a local copy of the data passed by the caller.
// modifying x is not allowed
}
type a;
fcval(a); // a will not be changed
void fcref(const type &x)
{
// x is a mere reference to an object
// changing x is not allowed
}
type b;
fcref(b); // b will not be changed
Since a copy may be expensive, it should be avoided if not needed. Therefore, the method of choice for passing a constant object is a la fcref (except for builtin types, where fval is just as fast).
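One way to see the copy is to count copy-constructor calls. This is only a sketch: Tracked is a hypothetical type invented for the demonstration, while fcval and fcref mirror the two const signatures above.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copied.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

void fcval(const Tracked x) { (void)x; }   // receives a local copy
void fcref(const Tracked& x) { (void)x; }  // receives a reference only

int copies_made_by_value() {
    Tracked a;
    Tracked::copies = 0;
    fcval(a);                 // copy constructor runs once for the parameter
    return Tracked::copies;
}

int copies_made_by_ref() {
    Tracked b;
    Tracked::copies = 0;
    fcref(b);                 // nothing is copied
    return Tracked::copies;
}

// Call from main() to check the counts.
void demo() {
    assert(copies_made_by_value() == 1);
    assert(copies_made_by_ref() == 0);
}
```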
When a function (callee) returns a quantity to the caller function, is it returned by value or by reference?
The thing is, I have written a function which builds a very large vector when called. I want to return this big vector to the calling function (in this case main()) by constant reference so I can do some further processing on it.
I was in doubt because I was told that when a C++ function returns and terminates, all the variables/memory associated with that function gets wiped clean.
struct node {
    string key;
    int pnum;
    node* ptr;
};
vector< vector<node> > myfun1(/*Some arguments*/)
{
    vector< vector<node> > v;
    /*Build the vector of vectors. Call it v*/
    return v;
}
int main(void)
{
    vector< vector<node> > a = myfun1(/* Some arguments */);
}
C++ functions can return by value, by reference (but don't return a local variable by reference), or by pointer (again, don't return a local by pointer).
When returning by value, the compiler can often do optimizations that make it equally as fast as returning by reference, without the problem of dangling references. These optimizations are commonly called "Return Value Optimization (RVO)" and/or "Named Return Value Optimization (NRVO)".
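As a rough check that returning a local vector by value does not deep-copy its elements, here is a sketch that assumes a C++11 or later compiler. Node, build_nodes and copies_after_return are hypothetical names; Node is a cut-down stand-in for the struct in the question that counts its own copies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Node {
    static int copies;
    int pnum;
    explicit Node(int n) : pnum(n) {}
    Node(const Node& other) : pnum(other.pnum) { ++copies; }
};
int Node::copies = 0;

std::vector<Node> build_nodes(std::size_t n) {
    std::vector<Node> v;
    v.reserve(n);                          // no reallocation while filling
    for (std::size_t i = 0; i < n; ++i) {
        v.emplace_back(static_cast<int>(i));   // constructed in place
    }
    return v;   // elided (NRVO) or moved; no element is copied
}

int copies_after_return() {
    Node::copies = 0;
    std::vector<Node> result = build_nodes(1000);
    assert(result.size() == 1000);
    return Node::copies;
}
```

With a pre-C++11 compiler the outcome would depend on whether the optimizer applied RVO/NRVO, which is exactly the uncertainty described above.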
Another way is for the caller to provide an empty vector (by reference), and have the function fill it in. Then it doesn't need to return anything.
You definitely should read this blog posting: Want Speed? Pass by value.
By default, everything in C/C++ is passed by value, including return values, as in the example below:
T foo() ;
In C++, where the types are usually considered value-types (i.e. they behave like int or double types), the extra copy can be costly if the object's construction/destruction is not trivial.
With C++03
If you want to return by reference, or by pointer, you need to change the return type to either:
T & foo() ; // return a reference
T * foo() ; // return a pointer
but in both cases, you need to make sure the object returned still exists after the return. For example, if the object returned was allocated on stack in the body of the function, the object will be destroyed, and thus, its reference/pointer will be invalid.
If you can't guarantee the object still exists after the return, your only solution is to either:
accept the cost of an extra copy, and hope for a Return Value Optimization
pass instead a variable by reference as a parameter to the function, as in the following:
void foo(T & t) ;
This way, inside the function, you set the t value as necessary, and after the function returns, you have your result.
With C++11
Now, if you have the chance to work with C++0x/C++11, that is, with a compiler that supports rvalue references and move semantics, and your object has the right move constructor and move assignment operator (types from the standard library do), then the extra temporary copy is replaced by a cheap move or elided entirely, and you can keep the notation:
T foo() ;
knowing that the compiler will not generate an unnecessary temporary copy.
C++ can return either by reference or by value. If you want to return a reference, you must specify that as part of the return type:
std::vector<int> my_func(); // returns value
std::vector<int>& my_func(); // returns reference
std::vector<int> const& my_func(); // returns constant reference
All local (stack) variables created inside of a function are destroyed when the function returns. That means you should absolutely not return locals by reference or const reference (or pointers to them). If you return the vector by value it may be copied before the local is destroyed, which could be costly. (Certain types of optimizations called "return value optimization" can sometimes remove the copy, but that's out of the scope of this question. It's not always easy to tell whether the optimization will happen on a particular piece of code.)
If you want to "create" a large vector inside of a function and then return it without copying, the easiest way is to pass the vector in to the function as a reference parameter:
void fill_vector(std::vector<int> &vec) {
// fill "vec" and don't return anything...
}
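For comparison, here is a small sketch of the out-parameter style next to the plain return-by-value style. fill_squares and make_squares are made-up names, and the values written are arbitrary.

```cpp
#include <cassert>
#include <vector>

// Out-parameter style: the caller owns the vector, the function fills it.
void fill_squares(std::vector<int>& vec, int count) {
    vec.clear();
    for (int i = 0; i < count; ++i) {
        vec.push_back(i * i);
    }
}

// Return-by-value style: same result; in C++11 and later the local is
// moved or elided on return rather than copied.
std::vector<int> make_squares(int count) {
    std::vector<int> vec;
    fill_squares(vec, count);
    return vec;
}

// Call from main() to check that both styles agree.
void demo() {
    std::vector<int> a;
    fill_squares(a, 4);                          // a now holds 0, 1, 4, 9
    const std::vector<int> b = make_squares(4);
    assert(a == b);
}
```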
Also note that in the recently ratified new version of the C++ standard (known as C++0x or C++11), returning a local vector by value from a function will not actually copy the vector; it will be efficiently moved into its new location. The code that does this looks identical to code from previous versions of C++, which could be forced to copy the vector. Check with your compiler to see whether it supports "move semantics" (the portion of the C++11 standard that makes this possible).
It's returned by whatever you declare the return type to be. vector<int> f(); and vector<int>& f(); return by value and reference respectively. However, it would be a grave error to return a reference to a local variable in the function as it will have been blown away when the function scope exits.
For good tips on how to efficiently return large vectors from a function, see this question (in fact this one is arguably a duplicate of that).
The function will return what you tell it to return. If you return a vector by value, it will, in principle, be copied to the variable held by the caller, unless you bind the result to a const reference, in which case there is no need to copy it. There are optimizations that allow functions to avoid this extra copy construction by placing the result directly in the object that will hold the return value. You should read this before changing your design for performance:
http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
Like most things in C++, the answer is "it depends on how you defined the function".
The default for the language is return-by-value. A simple declaration like "double f()" is always going to return the floating-point number by value. However, you CAN return values by pointer or by reference: you just add the extra symbol '&' or '*' to the return type:
// Return by pointer (*)
T* f();
// Return by reference (a single '&')
T& f();
However, these are ridiculously unsafe in many situations. If the value the function is returning was declared within the function, the returned reference or pointer will point to random garbage instead of valid data. Even if you can guarantee that the pointed-to data is still around, this kind of return is usually more trouble than it is worth given the optimizations all modern C++ compilers will do for you. The idiomatic, safe way to return something by reference is to pass a named reference in as a parameter:
// Return by 'parameter' (a logical reference return)
void f(T& output);
Now the output has a real name, and we KNOW it will survive the call because it has to exist before the call to 'f' is even made. This is a pattern you will see often in C++, especially for things like populating an STL std::vector. It's ugly, but until the advent of C++11 it was often faster than simply returning the vector by value. Now that return by value is both simpler and faster even for many complex types, you will probably not see many functions following the reference return parameter pattern outside of older libraries.
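All variables defined on the stack are destroyed upon exit.
If an object has to outlive the function and you do not return it by value, you can allocate it on the heap with the new keyword (or malloc for plain data) and return a pointer to it; the caller is then responsible for releasing it.
Note that in C++, classes and structs are passed around by value just like the primitive types, unless you explicitly declare a reference or a pointer.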
I know that where possible you should use the const keyword when passing parameters around by reference or by pointer for readability reasons. Are there any optimizations that the compiler can do if I specify that an argument is constant?
There could be a few cases:
Function parameters:
Constant reference:
void foo(const SomeClass& obj)
Pointer to a constant SomeClass object:
void foo(const SomeClass* pObj)
And constant pointer to SomeClass:
void foo(SomeClass* const pObj)
Variable declarations:
const int i = 1234;
Function declarations:
const char* foo()
What kind of compiler optimizations does each one offer (if any)?
Source
Case 1:
When you declare a const in your program,
int const x = 2;
The compiler can optimize away this const by not providing storage for the variable; instead it can record the value in its symbol table. A subsequent read then uses the known value directly rather than instructions that fetch it from memory.
Note: If you do something like:
const int x = 1;
const int* y = &x;
Then this would force the compiler to allocate space for x. So, that degree of optimization is not possible for this case.
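A small sketch of what "the compiler knows the value" means in practice. Whether storage is really omitted is an implementation detail that cannot be observed portably, so this only shows that a const int with a constant initializer is usable at compile time, and that taking its address still works. buffer and address_of_x are names invented here.

```cpp
#include <cassert>

const int x = 2;   // constant initializer: the value is known at compile time

// Both uses below require a compile-time constant, so no memory read
// of x is involved in evaluating them.
static_assert(x * 21 == 42, "x is usable in constant expressions");
int buffer[x];

// Taking the address is still allowed; this is the case where the
// compiler has to give x a real location.
const int* address_of_x() { return &x; }

// Call from main() to check both observations.
void demo() {
    assert(sizeof(buffer) / sizeof(buffer[0]) == 2);
    assert(*address_of_x() == 2);
}
```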
In terms of function parameters const means that parameter is not modified in the function. As far as I know, there's no substantial performance gain for using const; rather it's a means to ensure correctness.
Case 2:
"Does declaring the parameter and/or the return value as const help the compiler to generate more optimal code?"
const Y& f( const X& x )
{
// ... do something with x and find a Y object ...
return someY;
}
What could the compiler do better? Could it avoid a copy of the parameter or the return value?
No, as the argument is already passed by reference.
Could it put a copy of x or someY into read-only memory?
No, as both x and someY live outside its scope and come from and/or are given to the outside world. Even if someY is dynamically allocated on the fly within f() itself, it and its ownership are given up to the caller.
What about possible optimizations of code that appears inside the body of f()? Because of the const, could the compiler somehow improve the code it generates for the body of f()?
Even when you call a const member function, the compiler can't assume that the bits of object x or object someY won't be changed. Further, there are additional problems (unless the compiler performs global optimization): The compiler also may not know for sure that no other code might have a non-const reference that aliases the same object as x and/or someY, and whether any such non-const references to the same object might get used incidentally during the execution of f(); and the compiler may not even know whether the real objects, to which x and someY are merely references, were actually declared const in the first place.
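Both obstacles can be shown in a few lines. This is a sketch with invented names: X here has a mutable member (so a const member function changes its bits), and global_x plus touch_global play the role of "other code holding a non-const alias".

```cpp
#include <cassert>

struct X {
    int value = 0;
    mutable int reads = 0;     // may change even through a const reference
    int get() const { ++reads; return value; }
};

X global_x;                    // hypothetical object reachable without const

// Modifies the object through a non-const path.
void touch_global() { global_x.value = 7; }

// f only has a const reference, yet the object changes underneath it,
// so the compiler may not assume x.get() returns the same value twice.
int f(const X& x) {
    const int before = x.get();
    touch_global();            // may alias x
    const int after = x.get();
    return after - before;
}

// Call from main() to check the behaviour.
void demo() {
    global_x.value = 0;
    global_x.reads = 0;
    assert(f(global_x) == 7);      // value changed during the call
    assert(global_x.reads == 2);   // const member function modified the object
}
```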
Case 3:
void f( const Z z )
{
// ...
}
Will there be any optimization in this?
Yes because the compiler knows that z truly is a const object, it could perform some useful optimizations even without global analysis. For example, if the body of f() contains a call like g( &z ), the compiler can be sure that the non-mutable parts of z do not change during the call to g().
Before giving any answer, I want to emphasize that the reason to use or not use const really ought to be for program correctness and for clarity for other developers more so than for compiler optimizations; that is, making a parameter const documents that the method will not modify that parameter, and making a member function const documents that that member will not modify the object of which it is a member (at least not in a way that logically changes the output from any other const member function). Doing this, for example, allows developers to avoid making unnecessary copies of objects (because they don't have to worry that the original will be destroyed or modified) or to avoid unnecessary thread synchronization (e.g. by knowing that all threads merely read and do not mutate the object in question).
In terms of optimizations a compiler could make, at least in theory, albeit in an optimization mode that allows it to make certain non-standard assumptions that could break standard C++ code, consider:
for (int i = 0; i < obj.length(); ++i) {
f(obj);
}
Suppose the length function is marked as const but is actually an expensive operation (let's say it actually operates in O(n) time instead of O(1) time). If the function f takes its parameter by const reference, then the compiler could potentially optimize this loop to:
int cached_length = obj.length();
for (int i = 0; i < cached_length; ++i) {
f(obj);
}
... because the fact that the function f does not modify the parameter guarantees that the length function should return the same values each time given that the object has not changed. However, if f is declared to take the parameter by a mutable reference, then length would need to be recomputed on each iteration of the loop, as f could have modified the object in a way to produce a change in the value.
As pointed out in the comments, this is assuming a number of additional caveats and would only be possible when invoking the compiler in a non-standard mode that allows it to make additional assumptions (such as that const methods are strictly a function of their inputs and that optimizations can assume that code will never use const_cast to convert a const reference parameter to a mutable reference).
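Since a conforming compiler generally cannot do this transformation for you, the practical takeaway is to hoist the call by hand when you know it is safe. A sketch with a hypothetical Sequence type whose length() counts how often it is called:

```cpp
#include <cassert>

// Hypothetical type; length() is const but counts its own invocations.
struct Sequence {
    int size = 3;
    mutable int length_calls = 0;
    int length() const { ++length_calls; return size; }
};

void f(const Sequence&) {}   // does nothing; stands in for real work

int calls_without_hoisting() {
    Sequence obj;
    for (int i = 0; i < obj.length(); ++i) {   // length() runs on every test
        f(obj);
    }
    return obj.length_calls;
}

int calls_with_hoisting() {
    Sequence obj;
    const int cached_length = obj.length();    // length() runs once
    for (int i = 0; i < cached_length; ++i) {
        f(obj);
    }
    return obj.length_calls;
}

// Call from main() to compare the two loops.
void demo() {
    assert(calls_without_hoisting() == 4);   // three iterations plus the final test
    assert(calls_with_hoisting() == 1);
}
```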
Function parameters:
const is not significant for referenced memory. It's like tying a hand behind the optimizer's back.
Suppose you call another function (e.g. void bar()) in foo which has no visible definition. The optimizer will have a restriction because it has no way of knowing whether or not bar has modified the function parameter passed to foo (e.g. via access to global memory). Potential to modify memory externally and aliasing introduce significant restrictions for optimizers in this area.
Although you did not ask, const values for function parameters do allow optimizations because the optimizer is guaranteed a const object. Of course, the cost to copy that parameter may be much higher than the optimizer's benefits.
See: http://www.gotw.ca/gotw/081.htm
Variable declarations: const int i = 1234
This depends on where it is declared, when it is created, and the type. This category is largely where const optimizations exist. It is undefined to modify a const object or known constant, so the compiler is allowed to make some optimizations; it assumes you do not invoke undefined behavior and that introduces some guarantees.
const int A(10);
foo(A);
// compiler can assume A's not been modified by foo
Obviously, an optimizer can also identify variables which do not change:
for (int i(0), n(10); i < n; ++i) { // << n is not const
std::cout << i << ' ';
}
Function declarations: const char* foo()
Not significant. The referenced memory may be modified externally. If the referenced variable returned by foo is visible, then an optimizer could make an optimization, but that has nothing to do with the presence/absence of const on the function's return type.
Again, a const value or object is different:
extern const char foo[];
The exact effects of const differ for each context where it is used. If const is used while declaring a variable, it is physically const and potentially resides in read-only memory.
const int x = 123;
Trying to cast the const-ness away and then write to the object is undefined behavior:
Even though const_cast may remove constness or volatility from any pointer or reference, using the resulting pointer or reference to write to an object that was declared const or to access an object that was declared volatile invokes undefined behavior. cppreference/const_cast
So in this case, the compiler may assume that the value of x is always 123. This opens up some optimization potential (constant propagation).
For functions it's a different matter. Suppose:
void doFancyStuff(const MyObject& o);
Our function doFancyStuff may do any of the following things with o:
1. not modify the object.
2. cast the constness away, then modify the object
3. modify a mutable data member of MyObject
Note that if you call our function with an instance of MyObject that was declared as const, you'll invoke undefined behavior with #2.
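Here is a sketch of the second and third possibilities, using an invented MyObject layout (value, access_count) and invented function names. Every object passed in is non-const, so the const_cast write is well-defined; doing the same to an object declared const would be the undefined case described above.

```cpp
#include <cassert>

// Hypothetical layout for MyObject; the original does not define it.
struct MyObject {
    int value = 0;
    mutable int access_count = 0;
};

// Modifies a mutable data member through a const reference.
void countAccess(const MyObject& o) { ++o.access_count; }

// Casts the constness away, then modifies the object.
// Only valid because callers pass objects that were not declared const.
void castAndModify(const MyObject& o) {
    const_cast<MyObject&>(o).value = 99;
}

// Call from main() to check both effects.
void demo() {
    MyObject obj;              // NOT declared const
    castAndModify(obj);
    countAccess(obj);
    assert(obj.value == 99);
    assert(obj.access_count == 1);
}
```

Either way, the caller cannot tell from the signature alone that obj is left untouched, which is why the optimizer cannot rely on it.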
Guru question: will the following invoke undefined behavior?
const int x = 1;
auto lam = [x]() mutable {const_cast<int&>(x) = 2;};
lam();
SomeClass* const pObj creates a constant object of pointer type. There exists no safe method of changing such an object, so the compiler can, for example, cache it into a register with only one memory read, even if its address is taken.
The others don't enable any optimizations specifically, although the const qualifier on the type will affect overload resolution and possibly result in different and faster functions being selected.
If I have no use for a variable after I pass it to a function, does it matter whether I pass it as a non-const lvalue reference or use std::move to pass it as an rvalue reference? The assumption is that there are two different overloads. The only difference in the two cases is the lifetime of the passed object, which ends earlier if I pass by rvalue reference. Are there other factors to consider?
If I have a function foo overloaded like:
void foo(X& x);
void foo(X&& x);
X x;
foo(std::move(x)); // Does it matter if I called foo(x) instead?
// ... no further accesses to x
// end-of-scope
The lifetime of an object does not end when it is passed by rvalue reference. The rvalue reference merely gives foo permission to take ownership of its argument and potentially change its value to nonsense. This might involve deallocating its members, which is a kind of end of lifetime, but the argument itself lives to the end of the scope of its declaration.
Using std::move on the last access is idiomatic. There is no potential downside. Presumably if there are two overloads, the rvalue reference one has the same semantics but higher efficiency. Of course, they could do completely different things, just for the sake of insane sadism.
It depends on what you do in foo():
Inside foo(), if you store the argument in some internal storage, then yes it does matter, from readability point of view, because it is explicit at the call site that this particular argument is being moved and it should not be used here at call site, after the function call returns.
If you simply read/write its value, then it doesn't matter. Note that even if you pass by T&, the argument can still be moved to some internal storage, but that is less preferred approach — in fact it should be considered a dangerous approach.
Also note that std::move does NOT actually move the object. It simply casts it to an rvalue reference, which makes the object eligible to be moved from. An object is actually moved only when a move constructor or move assignment operator is invoked:
void f(X && x) { return; }
void g(X x) { return; }
X x1,x2;
f(std::move(x1)); //x1 is NOT actually moved (no move constructor invocation).
g(std::move(x2)); //x2 is actually moved (by the move-constructor).
//here it is safe to use x1
//here it is unsafe to use x2
Alright it is more complex than this. Consider another example:
void f(X && x) { vec_storage.push_back(std::move(x)); return; }
void g(X x) { return; }
X x1,x2;
f(std::move(x1)); //x1 is actually moved (move-constructor invocation in push_back)
g(std::move(x2)); //x2 is actually moved (move-constructor invocation when passing argument by copy).
//here it is unsafe to use x1 and x2 both.
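If you want to verify this yourself, count the move-constructor calls. The sketch below defines its own X with a static counter (an assumption made for the demonstration); f and g have the same signatures as in the first example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical X that counts move-constructor invocations.
struct X {
    static int moves;
    X() = default;
    X(const X&) = default;
    X(X&&) noexcept { ++moves; }
};
int X::moves = 0;

void f(X&& x) { (void)x; }   // binds a reference; nothing is constructed
void g(X x) { (void)x; }     // the parameter is a new X built from the argument

int moves_when_binding_reference() {
    X x1;
    X::moves = 0;
    f(std::move(x1));        // std::move is only a cast here
    return X::moves;
}

int moves_when_passing_by_value() {
    X x2;
    X::moves = 0;
    g(std::move(x2));        // the move constructor initializes the parameter
    return X::moves;
}

// Call from main() to check the counts.
void demo() {
    assert(moves_when_binding_reference() == 0);
    assert(moves_when_passing_by_value() == 1);
}
```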
Hope that helps.
I've been going back and forth on this problem for a while, especially since I started to use the OpenCV library. The fact is, in OpenCV, there are several methods used:
1st: funcA((const) CvMat arg)
2nd: funcA((const) CvMat& arg)
3rd: funcA((const) CvMat* arg)
4th: funcA((const) CvMat*& arg) => I've just seen this one and am currently stuck on it
and of course, for each method, the call syntax and the function implementation are different.
What is the significance of all of these variants, especially the last one (I haven't yet understood its usage)?
Ignoring the (const) for now, and using int for clarity:
Pass by value. The body of the function works on a copy of the argument.
void funcA(int arg) {
// arg here is a copy
// anything I do to arg has no effect on caller side.
arg++; // only has effect locally
}
Note that it semantically makes a copy, but the compiler is allowed to elide the copies under certain conditions. Look up copy elision.
Pass by reference. I can modify the argument passed by the caller.
void funcA(int& arg) {
// arg here is a reference
// anything I do to arg is seen on caller side.
arg++;
}
Pass pointer by value. I get a copy of the pointer, but it points to the same object pointed at by the caller's argument
void funcA(int* arg) {
// changes to arg do not affect caller's argument
// BUT I can change the object pointed to
(*arg)++; // pointer unchanged, pointee changed. Caller sees it.
}
Pass reference to pointer. I can change the pointer itself and the caller will see the change.
void funcA(int*& arg) {
// changes to arg affect caller's argument
// AND I can change the object pointed to.
(*arg)++; // pointee changed
arg++; // pointer changed. Caller sees it.
}
As you can see, the second two are just the same as the first two, except that they deal with pointers. If you understand what pointers do, then there is no difference conceptually.
Concerning const, it specifies whether the argument can be modified or not, or, if the arguments are references or pointers, whether what they point to/refer to can be modified. The positioning of const is important here. See const correctness for example.
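To illustrate how the position of const changes what is protected, here is a short sketch using int instead of CvMat. The function names are invented; the commented-out lines are the ones the compiler would reject.

```cpp
#include <cassert>

// Pointer to const int: the pointee is read-only through p,
// but the (local copy of the) pointer may be changed.
int read_next(const int* p) {
    // *p = 0;      // error: pointee is const
    ++p;            // fine: only the local pointer moves
    return *p;
}

// Const pointer to int: the pointer is fixed, the pointee is writable.
void increment(int* const p) {
    // ++p;         // error: pointer is const
    ++*p;           // fine: caller sees the new value
}

// Reference to a pointer to const int: the caller's pointer can be
// re-aimed, but the data it points to stays read-only.
void advance(const int*& p) { ++p; }

// Call from main() to check all three.
void demo() {
    int data[2] = {10, 20};
    assert(read_next(data) == 20);
    increment(&data[0]);
    assert(data[0] == 11);
    const int* cursor = data;
    advance(cursor);           // caller's pointer now refers to data[1]
    assert(*cursor == 20);
}
```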