Consider we have a function that returns by value:
int func() {
int x = 10; // create local variable x with value of 5
return x; // create temporary copy of x which is returned, local variable x is destroyed
}
int main()
{
int y = func(); // temporary copy of x is copied to y, when it hits`;` the temporary object is destroyed
return 0;
}
Correct me if I'm wrong in something what i said in comments above.
Now we can extend the life time of temporary object just by making a constant reference to it.
int main()
{
const int & y = func(); // now the temporary object (R-value) is not destroyed when it hits `;` thus the life time is lenghtened.
return 0;
}
The question is: Since I created a constant reference to temporary object which should be destroyed, does it mean that cout << &y << endl will print the address of that temporary object since reference is just "alias" ? Also where are those temporary objects (R-values) stored in memory( i used primitive type int but it could be class )?
As already stated in the comments, making a const reference (or rvalue reference) point to a temporary extends its lifetime to that one of the reference.
Also where are those temporary objects (R-values) stored in memory( i
used primitive type int but it could be class )?
This is not specified by the standard. However, you can just check on the most common compilers on what they will do. Starting out with GCC, without any optimizations, you will get
func():
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], 10
mov eax, DWORD PTR [rbp-4]
pop rbp
ret
main:
push rbp
mov rbp, rsp
sub rsp, 16
call func()
mov DWORD PTR [rbp-12], eax
lea rax, [rbp-12]
mov QWORD PTR [rbp-8], rax
mov eax, 0
leave
ret
https://godbolt.org/g/8wPQqx
So the return value is pushed into EAX register inside the function and once it returns, the value is pushed onto the stack frame of main(), just as if you had created the variable in the main function. You will find similar results with other compilers. When turning on optimizations, well, the obvious will happen: the compiler sees the function just returns some constant value and elides it out entirely:
func():
mov eax, 10
ret
.LC0:
.string "%i"
main:
sub rsp, 24
mov edi, OFFSET FLAT:.LC0
xor eax, eax
lea rsi, [rsp+12]
mov DWORD PTR [rsp+12], 10
call printf
xor eax, eax
add rsp, 24
ret
Here, I added some printf() call that will output the address of the temporty s.t. the program is not totally trivial.
So it creates the function, but won't bother calling it and just writes 10, again, to some space in the local stack frame. If you just use it by value, it will be just put into a register as no address is required.
Related
I have tried to understand this basing on a square function in c++ at godbolt.org . Clearly, return, parameters and local variables use “rbp - alignment” for this function.
Could someone please explain how this is possible?
What then would rbp + alignment do in this case?
int square(int num){
int n = 5;// just to test how locals are treated with frame pointer
return num * num;
}
Compiler (x86-64 gcc 11.1)
Generated Assembly:
square(int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi. ;\\Both param and local var use rbp-*
mov DWORD PTR[rbp-4], 5. ;//
mov eax, DWORD PTR [rbp-20]
imul eax, eax
pop rbp
ret
This is one of those cases where it’s handy to distinguish between parameters and arguments. In short: arguments are the values given by the caller, while parameters are the variables holding them.
When square is called, the caller places the argument in the rdi register, in accordance with the standard x86-64 calling convention. square then allocates a local variable, the parameter, and places the argument in the parameter. This allows the parameter to be used like any other variable: be read, written into, having its address taken, and so on. Since in this case it’s the callee that allocated the memory for the parameter, it necessarily has to reside below the frame pointer.
With an ABI where arguments are passed on the stack, the callee would be able to reuse the stack slot containing the argument as the parameter. This is exactly what happens on x86-32 (pass -m32 to see yourself):
square(int): # #square(int)
push ebp
mov ebp, esp
push eax
mov eax, dword ptr [ebp + 8]
mov dword ptr [ebp - 4], 5
mov eax, dword ptr [ebp + 8]
imul eax, dword ptr [ebp + 8]
add esp, 4
pop ebp
ret
Of course, if you enabled optimisations, the compiler would not bother with allocating a parameter on the stack in the callee; it would just use the value in the register directly:
square(int): # #square(int)
mov eax, edi
imul eax, edi
ret
GCC allows "leaf" functions, those that don't call other functions, to not bother creating a stack frame. The free stack is fair game to do so as these fns wish.
Let's assume I pass a new object to a function like this:
loadContainer->addControlView( new BmpView( BMP_PICTURE ) );
Now, I want to change a specific characteristic of the BmpView before I pass it to addControlView. The way I do this is like this:
Control* newView = new BmpView( BMP_PICTURE );
newView->changeColor( WHITE );
loadContainer->addControlView( newView );
Does this create an extra temporary/local object? Or is there an equal amount of memory allocated in both cases?
The only added memory allocated in your function is a new pointer *newView, which its size is pretty low and doesn't affect by the actual size of the BmpView. It doesn't allocate twice memory for BmpView.
I'm not considering any memory overhead of calling changeColor, which I assume wasn't the point of this question.
In both cases, there is single call to new, hence the amount of memory used is equal. (Note, this is without speculating about any allocation possibly requested by BmpView constructor, changeColor(), etc.)
However, you may wish to refactor your code to ensure some exception safety, avoid potential leaks hence ensure the amount of memory used is under control:
// C++11
std::unique_ptr<Control> newView(new BmpView(BMP_PICTURE));
// C++14, preferred
//auto newView = std::make_unique<BmpView>(BMP_PICTURE);
newView->changeColor( WHITE );
loadContainer->addControlView( newView.release() );
Reference for the below code/assembly :
https://godbolt.org/g/8DgmC1
#include <cstdio>
class ValueClass {
public:
// Class content not important...
int someValue;
};
void PrintValueClass(ValueClass* ptr) {
printf("%d\n", ptr->someValue);
}
int main() {
PrintValueClass(new ValueClass());
ValueClass* pValueClass = new ValueClass();
pValueClass->someValue = 55;
PrintValueClass(pValueClass);
return 1;
}
Compiled Assembly (PrintValueClass redacted as not important to question at hand) :
Example where you pass the (new ValueClass) directly to the function.
main:
push rbp
mov rbp, rsp
mov edi, 4
call operator new(unsigned long)
mov DWORD PTR [rax], 0
mov rdi, rax
call PrintValueClass(ValueClass*)
mov eax, 1
pop rbp
ret
Example where you create a local variable holding the pointer, do something to it, and then pass it to the function.
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov edi, 4
call operator new(unsigned long)
mov DWORD PTR [rax], 0
mov QWORD PTR [rbp-8], rax
mov rax, QWORD PTR [rbp-8]
mov DWORD PTR [rax], 55
mov rax, QWORD PTR [rbp-8]
mov rdi, rax
call PrintValueClass(ValueClass*)
mov eax, 1
leave
ret
Before diving into the assembly, if your question is does the new operation occur twice if you store the pointer in a variable first, the answer is no. As shown through the assembly, the new 'function' is only called once, so only sizeof(ValueClass) is ever being allocated here through some sort of heap allocation function. But I feel the need to answer the question fully, even if it was not exactly intended to ask this question. Is extra memory used? Technically yes, realistically no.
The only difference between these two pieces of code is stack 'allocation', noted by the sub rsp, 16, which essentially means 'allocate' 16 bytes on the stack for local variables. So truly the only difference here is 16 bytes, which will greatly change on what compiler you use, what architecture you target, and many more factors.
At the end of the day, I would go as far to say, you would never care about the extra 16 bytes.
I have the following code:
int func(int a)
{
int b=2;
int c,d,e,f,g;
//some code which involves a,b,c,d,e,f,g
}
int main()
{
int s=3;
func(s);
}
Now what happens is that when main begins execution:
1.It pushes s onto stack
2.It calls func()
3.func() pushes b,c,d,e,f,g onto stack
4.Now when the code involving a,b,c,d,e,f.g is executed, in order to know the value of a all local variables of func() will have to be popped. Then a's value is retrieved. Now if again b.c.d.e.f.g are to be used, how will their values be retrieved (because they have already been popped)?
The local variables, as well as the argument, aren't actually pushed on the stack. Instead the compiler adds code to change the stack pointer by enough to fit all variables, and then when referencing a local variable the compiler have code to get the value from an offset from the stack pointer.
I recommend that you look at the assembler output of your example program to understand how it works.
The equivalent code for void func(int a)
void func(int a)
{
00413880 push ebp
00413881 mov ebp,esp
00413883 sub esp,108h
00413889 push ebx
0041388A push esi
0041388B push edi
0041388C lea edi,[ebp-108h]
00413892 mov ecx,42h
00413897 mov eax,0CCCCCCCCh
0041389C rep stos dword ptr es:[edi]
int b=2;
0041389E mov dword ptr [b],2
int c,d,e,f,g;
//some code which involves a,b,c,d,e,f,g
}
Now lets see the equivalent assembly code for the below code::
void func(int a)
{
int b=2;
int c,d,e,f,g;
c = 10 ;
d = 15 ;
e = 20 ;
a = a + 2 ;
}
Assembly Code::
void func(int a)
{
00413880 push ebp
00413881 mov ebp,esp
00413883 sub esp,108h
00413889 push ebx
0041388A push esi
0041388B push edi
0041388C lea edi,[ebp-108h]
00413892 mov ecx,42h
00413897 mov eax,0CCCCCCCCh
0041389C rep stos dword ptr es:[edi]
int b=2;
0041389E mov dword ptr [b],2
int c,d,e,f,g;
c = 10 ;
004138A5 mov dword ptr [c],0Ah
d = 15 ;
004138AC mov dword ptr [d],0Fh
e = 20 ;
004138B3 mov dword ptr [e],14h
a = a + 2 ;
004138BA mov eax,dword ptr [a]
004138BD add eax,2
004138C0 mov dword ptr [a],eax
}
So, although they are pushed in Stack (ESP or SP) but the pointer to each of the local variables are stored too, so that, they can be accessed easily when needed.
Like, when the code needed to use variable a, the dword ptr [a] is simply moved to register EAX.
NOTE: Technically not pushed but adjusted to fit in all variables. ( Courtesy: Joachim Pileborg)
in order to know the value of a all local variables of func() will have to be popped
This line is although grammatically ambiguous. I ll assume you mean to say the varibles are popped when being used by function..
But the actual case is local variable are popped out only when function returns to the caller.
Provided they are automatic.
On the other hand when function wants to access(read/write) the variable. It uses the offset(distance) from the base to access them, so there is no question of variables being popped out while being accessed for evaluation.
I recently had a serious bug, where I forgot to return a value in a function. The problem was that even though nothing was returned it worked fine under Linux/Windows and only crashed under Mac. I discovered the bug when I turned on all compiler warnings.
So here is a simple example:
#include <iostream>
class A{
public:
A(int p1, int p2, int p3): v1(p1), v2(p2), v3(p3)
{
}
int v1;
int v2;
int v3;
};
A* getA(){
A* p = new A(1,2,3);
// return p;
}
int main(){
A* a = getA();
std::cerr << "A: v1=" << a->v1 << " v2=" << a->v2 << " v3=" << a->v3 << std::endl;
return 0;
}
My question is how can this work under Linux/Windows without crashing? How is the returning of values done on lower level?
On Intel architecture, simple values (integers and pointers) are usually returned in eax register. This register (among others) is also used as temporary storage when moving values in memory and as operand during calculations. So whatever value left in that register is treated as the return value, and in your case it turned out to be exactly what you wanted to be returned.
Probably by luck, 'a' left in a register that happens to be used for returning single pointer results, something like that.
The calling/ conventions and function result returns are architecture-dependent, so it's not surprising that your code works on Windows/Linux but not on a Mac.
There are two major ways for a compiler to return a value:
Put a value in a register before returning, and
Have the caller pass a block of stack memory for the return value, and write the value into that block [more info]
The #1 is usually used with anything that fits into a register; #2 is for everything else (large structs, arrays, et cetera).
In your case, the compiler uses #1 both for the return of new and for the return of your function. On Linux and Windows, the compiler did not perform any value-distorting operations on the register with the returned value between writing it into the pointer variable and returning from your function; on Mac, it did. Hence the difference in the results that you see: in the first case, the left-over value in the return register happened to co-inside with the value that you wanted to return anyway.
First off, you need to slightly modify your example to get it to compile. The function must have at least an execution path that returns a value.
A* getA(){
if(false)
return NULL;
A* p = new A(1,2,3);
// return p;
}
Second, it's obviously undefined behavior, which means anything can happen, but I guess this answer won't satisfy you.
Third, in Windows it works in Debug mode, but if you compile under Release, it doesn't.
The following is compiled under Debug:
A* p = new A(1,2,3);
00021535 push 0Ch
00021537 call operator new (211FEh)
0002153C add esp,4
0002153F mov dword ptr [ebp-0E0h],eax
00021545 mov dword ptr [ebp-4],0
0002154C cmp dword ptr [ebp-0E0h],0
00021553 je getA+7Eh (2156Eh)
00021555 push 3
00021557 push 2
00021559 push 1
0002155B mov ecx,dword ptr [ebp-0E0h]
00021561 call A::A (21271h)
00021566 mov dword ptr [ebp-0F4h],eax
0002156C jmp getA+88h (21578h)
0002156E mov dword ptr [ebp-0F4h],0
00021578 mov eax,dword ptr [ebp-0F4h]
0002157E mov dword ptr [ebp-0ECh],eax
00021584 mov dword ptr [ebp-4],0FFFFFFFFh
0002158B mov ecx,dword ptr [ebp-0ECh]
00021591 mov dword ptr [ebp-14h],ecx
The second instruction, the call to operator new, moves into eax the pointer to the newly created instance.
A* a = getA();
0010484E call getA (1012ADh)
00104853 mov dword ptr [a],eax
The calling context expects eax to contain the returned value, but it does not, it contains the last pointer allocated by new, which is incidentally, p.
So that's why it works.
As Kerrek SB mentioned, your code has ventured into the realm of undefined behavior.
Basically, your code is going to compile down to assembly. In assembly, there's no concept of a function requiring a return type, there's just an expectation. I'm the most comfortable with MIPS, so I shall use MIPS to illustrate.
Assume you have the following code:
int add(x, y)
{
return x + y;
}
This is going to be translated to something like:
add:
add $v0, $a0, $a1 #add $a0 and $a1 and store it in $v0
jr $ra #jump back to where ever this code was jumped to from
To add 5 and 4, the code would be called something like:
addi $a0, $0, 5 # 5 is the first param
addi $a1, $0, 4 # 4 is the second param
jal add
# $v0 now contains 9
Note that unlike C, there's no explicit requirement that $v0 contain the return value, just an expectation. So, what happens if you don't actually push anything into $v0? Well, $v0 always has some value, so the value will be whatever it last was.
Note: This post makes some simplifications. Also, you're computer is likely not running MIPS... But hopefully the example holds, and if you learned assembly at a university, MIPS might be what you know anyway.
The way of returning of value from the function depends on architecture and the type of value. It could be done thru registers or thru stack.
Typically in the x86 architecture the value is returned in EAX register if it is an integral type: char, int or pointer.
When you don't specify the return value, that value is undefined. This is only your luck that your code sometimes worked correctly.
When popping values from the stack in IBM PC architecture there is no physical destruction of the old values of data stored there. They just become unavailable through the operation of the stack, but still remain in the same memory cell.
Of course, the previous values of these data will be destroyed during the subsequent pushing of new data on the stack.
So probably you are just lucky enough, and nothing is added to stack during your function's call and return surrounding code.
Regarding the following statement from n3242 draft C++ Standard, paragraph 6.6.3.2, your example yields undefined behavior:
Flowing off the end of a function is equivalent to a return with no
value; this results in undefined behavior in a value-returning
function.
The best way to see what actually happens is to check the assembly code generated by the given compiler on a given architecture. For the following code:
#pragma warning(default:4716)
int foo(int a, int b)
{
int c = a + b;
}
int main()
{
int n = foo(1, 2);
}
...VS2010 compiler (in Debug mode, on Intel 32-bit machine) generates the following assembly:
#pragma warning(default:4716)
int foo(int a, int b)
{
011C1490 push ebp
011C1491 mov ebp,esp
011C1493 sub esp,0CCh
011C1499 push ebx
011C149A push esi
011C149B push edi
011C149C lea edi,[ebp-0CCh]
011C14A2 mov ecx,33h
011C14A7 mov eax,0CCCCCCCCh
011C14AC rep stos dword ptr es:[edi]
int c = a + b;
011C14AE mov eax,dword ptr [a]
011C14B1 add eax,dword ptr [b]
011C14B4 mov dword ptr [c],eax
}
...
int main()
{
011C14D0 push ebp
011C14D1 mov ebp,esp
011C14D3 sub esp,0CCh
011C14D9 push ebx
011C14DA push esi
011C14DB push edi
011C14DC lea edi,[ebp-0CCh]
011C14E2 mov ecx,33h
011C14E7 mov eax,0CCCCCCCCh
011C14EC rep stos dword ptr es:[edi]
int n = foo(1, 2);
011C14EE push 2
011C14F0 push 1
011C14F2 call foo (11C1122h)
011C14F7 add esp,8
011C14FA mov dword ptr [n],eax
}
The result of addition operation in foo() is stored in eax register (accumulator) and its content is used as a return value of the function, moved to variable n.
eax is used to store a return value (pointer) in the following example as well:
#pragma warning(default:4716)
int* foo(int a)
{
int* p = new int(a);
}
int main()
{
int* pn = foo(1);
if(pn)
{
int n = *pn;
delete pn;
}
}
Assembly code:
#pragma warning(default:4716)
int* foo(int a)
{
000C1520 push ebp
000C1521 mov ebp,esp
000C1523 sub esp,0DCh
000C1529 push ebx
000C152A push esi
000C152B push edi
000C152C lea edi,[ebp-0DCh]
000C1532 mov ecx,37h
000C1537 mov eax,0CCCCCCCCh
000C153C rep stos dword ptr es:[edi]
int* p = new int(a);
000C153E push 4
000C1540 call operator new (0C1253h)
000C1545 add esp,4
000C1548 mov dword ptr [ebp-0D4h],eax
000C154E cmp dword ptr [ebp-0D4h],0
000C1555 je foo+50h (0C1570h)
000C1557 mov eax,dword ptr [ebp-0D4h]
000C155D mov ecx,dword ptr [a]
000C1560 mov dword ptr [eax],ecx
000C1562 mov edx,dword ptr [ebp-0D4h]
000C1568 mov dword ptr [ebp-0DCh],edx
000C156E jmp foo+5Ah (0C157Ah)
std::operator<<<std::char_traits<char> >:
000C1570 mov dword ptr [ebp-0DCh],0
000C157A mov eax,dword ptr [ebp-0DCh]
000C1580 mov dword ptr [p],eax
}
...
int main()
{
000C1610 push ebp
000C1611 mov ebp,esp
000C1613 sub esp,0E4h
000C1619 push ebx
000C161A push esi
000C161B push edi
000C161C lea edi,[ebp-0E4h]
000C1622 mov ecx,39h
000C1627 mov eax,0CCCCCCCCh
000C162C rep stos dword ptr es:[edi]
int* pn = foo(1);
000C162E push 1
000C1630 call foo (0C124Eh)
000C1635 add esp,4
000C1638 mov dword ptr [pn],eax
if(pn)
000C163B cmp dword ptr [pn],0
000C163F je main+51h (0C1661h)
{
int n = *pn;
000C1641 mov eax,dword ptr [pn]
000C1644 mov ecx,dword ptr [eax]
000C1646 mov dword ptr [n],ecx
delete pn;
000C1649 mov eax,dword ptr [pn]
000C164C mov dword ptr [ebp-0E0h],eax
000C1652 mov ecx,dword ptr [ebp-0E0h]
000C1658 push ecx
000C1659 call operator delete (0C1249h)
000C165E add esp,4
}
}
VS2010 compiler issues warning 4716 in both examples. By default this warning is promoted to an error.
when ever a function has a object passed by value it uses either copy constructor or bit wise copy to create a temporary to place on stack to use inside the function,How about some object returned from function ?
//just a sample code to support the qn
rnObj somefunction()
{
return rnObj();
}
and also explain how the return value is taken to the called function.
As can be judged by the other answers - the compiler can optimize this.
A concrete example generated using MSVC, to explain how this is possible (as asked in one of the comments) -
Take a class -
class AClass
{
public:
AClass( int Data1, int Data2, int Data3 );
int GetData1();
private:
int Data1;
int Data2;
int Data3;
};
With the following trivial implementation -
AClass::AClass( int Data1, int Data2, int Data3 )
{
this->Data1 = Data1;
this->Data2 = Data2;
this->Data3 = Data3;
}
int AClass::GetData1()
{
return Data1;
}
And the following calling code -
AClass Func( int Data1, int Data2, int Data3 )
{
return AClass( Data1, Data2, Data3 );
}
int main()
{
AClass TheClass = Func( 10, 20, 30 );
printf( "%d", TheClass.GetData1() );
}
(the printf() added just to make sure the compiler doesn't optimize everything away...).
In a non-optimized code, we would expect Func() to create a local AClass on its stack, construct it there and copy it as its return variable.
However, the generated assembly actually looks like (removing unneeded lines) -
_TEXT SEGMENT
___$ReturnUdt$ = 8 ; size = 4
_Data1$ = 12 ; size = 4
_Data2$ = 16 ; size = 4
_Data3$ = 20 ; size = 4
mov eax, DWORD PTR _Data3$[esp-4]
mov ecx, DWORD PTR _Data2$[esp-4]
mov edx, DWORD PTR _Data1$[esp-4]
push esi
mov esi, DWORD PTR ___$ReturnUdt$[esp]
push eax
push ecx
push edx
mov ecx, esi
call ??0AClass##QAE#HHH#Z ; AClass::AClass
mov eax, esi
pop esi
ret 0
The 3 function variables are extracted from the stack and placed into eax, ecx and edx.
Another forth value is placed into esi (and passed on to ecx).
The constructor is called with the 3 parameters on the stack, and ecx still containing the forth value.
Let's take a look in the constructor -
_TEXT SEGMENT
_Data1$ = 8 ; size = 4
_Data2$ = 12 ; size = 4
_Data3$ = 16 ; size = 4
mov edx, DWORD PTR _Data2$[esp-4]
mov eax, ecx
mov ecx, DWORD PTR _Data1$[esp-4]
mov DWORD PTR [eax], ecx
mov ecx, DWORD PTR _Data3$[esp-4]
mov DWORD PTR [eax+4], edx
mov DWORD PTR [eax+8], ecx
ret 12 ; 0000000cH
The 3 constructor parameters are read into offsets of eax - eax being a copy of ecx, the forth parameter from the call above.
So, the constructor builts the object where it is told to - the forth parameter of Func().
And, you guessed it, the forth parameter of Func() is actually the real single place in the entire program where a constructed AClass exists. Let's look at the relevant part of main() -
_TEXT SEGMENT
_TheClass$ = -12 ; size = 12
_main PROC
sub esp, 12
push 30
push 20
lea eax, DWORD PTR _TheClass$[esp+20]
push 10
push eax
call ?Func##YA?AVAClass##HHH#Z ; Func
12 bytes are reserved for an AClass and the three arguments to Func() are passed, along with the forth one - pointing to those 12 bytes.
This is a specific example with a specific compiler. Other compilers do this differently. But that's the spirit of things.
The C++ Standard states that the compiler can choose to create a temporary object, using the copy constructor, or it can choose to optimise away the temporary.
There is a wikipedia article on the subject, with numerous references here.
Most modern compilers will be able to avoid the temporary object creation if the optimization is turned on. See Named Return Value Optimization for more details.
The compiler is allowed to perform a copy on return. Most compilers won't. You must, however, make sure that rnObj has an accessible copy constructor.