return by value function optimization - c++

Sample Function 1
int func1 (int arg)
{
return arg + 10;
}
Sample Function 2
int func1 (int arg)
{
int retval = arg + 10;
return retval;
}
func_xyz (int x);
int main ()
{
int a = 10;
int p = func1 (a);
func_xyz(p);
}
Is there any difference between runtime behaviour of these functions (sample 1 and sample 2)?
I have a function in my code that is defined in the sample 1 style. When I call it a million times (the problem is not reproducible with fewer iterations) and pass the result to func_xyz, I get a segfault. When I switch to the sample 2 style, the segfault goes away, but I am unable to understand the reason for this behavior.

In THEORY, sample 2 creates a local variable (which takes slightly more space), evaluates the expression, and stores the result in that variable.
The variable is then copied into the return value, so that is one extra copy operation.
In REALITY, compilers perform this optimization at compile time and remove variables that are not actually needed, so both samples typically compile to the same code.

Here are some details about the return value optimization in compilers.
Try with a class that has a non-trivial copy constructor to see what is actually happening.

There is absolutely no difference. Any compiler can see that the code is just
int main()
{
func_xyz(20);
}
What does the called function (func_xyz) do?

Related

How can I initialize a variable but when said function is initiated the value of the variable is not reset?

I've made this example to show what I'm talking about. I want to know if there is a way to run through main() without resetting value to 0.
int main(){
int value = 0;
value++;
cout << value << endl;
main();
}
Before answering the question, note that your example has two big problems:
Calling, or even taking the address of, main is not allowed.
Your function has infinite recursion which makes your program have undefined behavior.
A different example where value is saved between calls could look like this. It uses a static variable, initialized to 0 the first time the function is called, and is never initialized again during the program execution.
#include <iostream>
int a_function() {
static int value = 0;
++value;
if(value < 100) a_function();
return value;
}
int main(){
std::cout << a_function(); // prints 100
}
If you want to keep the variable value local to the main function, you can declare it as static int value = 0;.
As has been pointed out in various comments, though, recursively calling any function without an escape mechanism, as you are doing, is a bad idea. Doing it with main is worse still: it is not even allowed.

Why can't my function access and modify the variable passed to it?

I have written this little program to illustrate my point: my variable a remains unchanged, and the program prints 4. I later learned that I need to use pointers or references; why is that?
#include <iostream>
void setToTen(int x) { x = 10; }
int main(){
int a = 4;
setToTen(a);
std::cout << a << std::endl;
}
In C++, arguments to functions are passed by value unless the parameter is declared as a reference. This means that when you write
setToTen(a);
the parameter int x in setToTen is given a copy of the value stored in the variable a. In other words, you're not actually handing off the variable a into the setToTen function. Instead, you're giving a copy of that value to setToTen, so the changes made in that function affect the copy rather than the original.
On the other hand, if you change setToTen so that it takes its parameter by reference, like this:
void setToTen(int& x) {
x = 10;
}
the story is different. Here, calling setToTen(a) essentially hands the variable a into the function setToTen, rather than a copy of the value. That means that changes made to the parameter x in setToTen will change the variable a.
Your code requests a copy of x by having the signature void setToTen(int x).
Being able to take things by copy means that reasoning about the behavior of a function is far easier. This is true both for you, and for the compiler.
For example, imagine this:
int increase_some( int x, int y, int z ) {
for (int i = 0; i < y; ++i )
x+=z;
return x;
}
Because x, y and z are copies, you can reason about what this does. If they were references to values "outside" of increase_some, the x+=z step could change y or z, and things could get crazy.
But because we know they are copies, we can say increase_some returns x if y<=0, and otherwise returns x+y*z.
Which means that the optimizer could change it to exactly that:
int increase_some( int x, int y, int z ) {
if (y<=0) return x;
return x + y*z;
}
and generate that output.
This is a toy example, but we took a complex function and turned it into a simple one. Real optimizers do this all the time with pieces of your complex function.
Going one step further, by taking things by immutable value, and never touching global state, we can treat your code as "functional", only depending on its arguments. Which means the compiler can take repeated calls to a function and reduce them to one call.
This is so valuable that compilers will transform code that doesn't have immutable copies of primitive data into code that does before trying to optimize -- this is known as static single assignment form.
In theory, a complex program with lots of functions taking things by reference could be optimized this same way, and nothing be lost. But in practice that gets hard, and it is really easy to accidentally screw it up.
That is the other side; making it easier to reason about by people.
And all you have to embrace is the idea of taking arguments by value.
Function parameters are local variables of the function; they no longer exist after the function returns.
You can imagine the function definition and its call
int a = 4;
setToTen(a);
//...
void setToTen(int x) { x = 10; }
the following way
int a = 4;
setToTen(a);
//...
void setToTen( /* int x */ ) { int x = a; x = 10; }
As you can see, a local variable x is declared within the function and initialized from the argument a. Any changes to the local variable x have no effect on the original argument a.
If you want to change the original variable itself, you should pass it by reference, so that the function works with a reference to the variable. For example
void setToTen(int &x) { x = 10; }
In this case you can imagine the function definition and its call the following way
int a = 4;
setToTen(a);
//...
void setToTen( /* int &x */ ) { int &x = a; x = 10; }
As you can see, the reference x is local as usual, but it refers to the original argument a. In this case the argument is changed through the local reference.
Another way is to declare the parameter as pointer. For example
void setToTen(int *x) { *x = 10; }
In this case you have to pass the original argument indirectly by its address.
int a = 4;
setToTen( &a );

References - Why do the following two programs produce different output?

I recently read about references in C++. I am aware of basic properties of references but I am still not able to figure out why following two programs produce different output.
#include<iostream>
using namespace std;
int &fun()
{
static int x = 10;
return x;
}
int main()
{
fun() = 30;
cout << fun();
return 0;
}
This program prints 30 as output. As per my understanding, the function fun() returns a reference to memory location occupied by x which is then assigned a value of 30 and in the second call of fun() the assignment statement is ignored.
Now consider this program:
#include<iostream>
using namespace std;
int &fun()
{
int x = 10;
return x;
}
int main()
{
fun() = 30;
cout << fun();
return 0;
}
This program produces the output as 10. Is it because, after the first call, x is assigned 30, and after second call it is again overwritten to 10 because it is a local variable? Am I wrong anywhere? Please explain.
In the first case, fun() returns a reference to the same variable no matter how many times you call it.
In the second case, fun() returns a dangling reference to a different variable on every call. The reference is not valid after the function returns.
When you use
fun() = 30;
in the second case, you are writing to a variable whose lifetime has already ended. That is undefined behavior.
When you call fun() the second time in the second case, the variable x is set to 10. That is independent of the first call to the same function.
Just adding to what has been said. The reason behind the first case's behavior is because it is a static variable, which has a static duration. Static duration means that the object or variable is allocated when the program starts and is deallocated when the program ends.
This means that once x in the first case has been initialized the first time with 10, the next function call will ignore static int x = 10; because x cannot be instantiated again, as it has already been allocated, and will simply proceed to return x;, which will be the same x that was assigned 30 in main.
Basically, your understanding is right, except that in the second case you are using a dangling reference to a local variable that no longer exists. That is undefined behaviour, which means anything is possible. What you described is just one possibility; it could also result in something else, such as printing a random number or crashing the program.

Behaviour of reference(&) data member pointing to stack variable

I have come across sample source code that uses a reference data member, and I am confused about its output. Here is the sample code.
class Test {
private:
int &t;
public:
Test (int y):t(y) { }
int getT() { return t; }
};
int main() {
int x = 20;
Test t1(x);
cout << t1.getT() << "\n"; // Prints 20 as output. however y has already been destroyed but still prints 20.
x = 30;
cout << t1.getT() << endl; // Prints Garbage as output Why ? Ideally both steps should be Garbage.
return 0;
}
To add to the confusion, here is one more piece of code using the same class:
int main() {
int x = 20;
int z = 60;
Test t1(x);
Test t2(z);
cout<<t1.getT()<<"\n"; // Prints 60! WHY? Should print garbage
cout<<t2.getT() << "\n"; // Prints Garbage
cout<<t1.getT() << endl; // Prints Same Garbage value as previous expression
return 0;
}
x is passed by value, so the parameter y is a copy of x, and t is a reference to that copy, not to x. The copy is destroyed when the constructor returns. Your code has undefined behavior, so anything can come up as output. The problem can be solved by taking a reference to x, like
Test (int& y):t(y) { }
but this is not a good idea. There can be cases where x goes out of scope while the Test object is still in use, and then the same problem appears.
Your constructor:
Test (int y):t(y) { }
sets t to be a reference to y, the local (temporary) variable on the stack, and not the variable in the calling function. When you change the variable value in the calling function it does not change anything in the object you created.
The fact that the reference is to a temporary variable that is lost at the end of the life of the constructor means that getT() returns an undefined value.
Every call to int getT() reads the memory address that y occupied. That stack slot was released at the end of the constructor, so it no longer belongs to any live object and may be reused at any time. When it is reused is not defined and depends on what other code the compiler and the libraries run. The return value of int getT() therefore depends on the compiler and its version, the OS, and other factors that affect memory.
Now I get it. Yes, it is undefined, but to answer my own question of why it prints 20 or 60 before printing garbage: both 20 and 60 are garbage values too, and ideally every getT call should print garbage. The first call does not, because no other instruction touches that stack slot between Test t2(z); and
cout<<t1.getT()<<"\n";
By the next statement, printing "\n" has executed more code, and in the meantime the stale value on the stack has been overwritten.

Returning function parameter, possible, bad style?

So I just had a thought: is it possible to return a parameter that was passed in when a function is called? And if it is, is this considered fine, or is it bad style?
Example:
int main()
{
...
int value = 1;
value = Foo(value);
...
}
int Foo(int i)
{
i = i * 2;
return (i);
}
As the parameter is being passed in and returned by value, this is fine - there is an implicit copy occurring when you call the function and when it returns.
For example
int value=1,other=0;
other=Foo(value);
other is now 2, value will still be 1
If you were passing in a reference or pointer then you would potentially run risks.
e.g. if the signature of Foo was
int Foo( int &i )
Then after the code chunk I used above, both other and value would be 2
There's no problem with "returning a parameter" in your example. You are not really "returning a parameter" at all. You are simply using the parameter in the argument expression of return. It is the result of that expression (the value of i) that gets returned, not the parameter itself.
One can argue that the "undesirable" property of your code sample is the fact that you are modifying the parameter inside the function, i.e. you are using the parameter as an ordinary local variable. There's nothing formally wrong with it, but sometimes people prefer to preserve the original parameter values throughout the function body. I.e. from that point of view your function would look better as
int Foo(int i)
{
return i * 2;
}
or as
int Foo(int i)
{
int i2 = i * 2;
return i2;
}
but, again, it is not really about "not returning a parameter", but rather about leaving the original value of i untouched inside the function.
There's no problem with doing that and it makes it very clear what's going on.
That's one valid approach to do this, but you might also like the idea of passing by reference:
int main()
{
...
int value = 1;
Foo(value);
...
}
void Foo(int &i)
{
i = i * 2;
}
The drawback to this approach is that you have to pass what's called an lvalue into the function: basically, something that can be on the left side of an assignment statement, which here means a variable. A call with a literal or temporary, such as Foo(2), will fail to compile. The way you had written it originally will instead do an implicit copy by value into the local scope of the Foo function. Note that the return type is now also void.
Technically, there is no problem, but semantically, it is not advisable: in most cases the input of the function and the return value of the function do not mean the same thing, so you are reusing the variable to mean something different. This is clearer in the next example:
int main()
{
double i = 5;
i = getSquareSurface(i); // i was a length and is now a surface
}
This should be:
int main()
{
double length = 5;
double surface = getSquareSurface(length);
}
Of course, there are cases, like an addOne() function or the Foo() function here, where the meaning doesn't change.