I have a very simple program that I wrote to study how the compiler moves values into registers. The behaviour turned out to be much more complicated than I expected, at least in debug mode.
What is going on here?
#include <cstdio>
struct A
{
int B;
A() : B(0) { }
};
int main()
{
A a;
A b(a);
printf("%d", b.B);
printf("%d", a.B);
return 0;
}
This is what the disassembly looks like in Visual Studio:
int main()
{
01048210 push ebp
01048211 mov ebp,esp
01048213 sub esp,0D8h
01048219 push ebx
0104821A push esi
0104821B push edi
0104821C lea edi,[ebp-0D8h]
01048222 mov ecx,36h
01048227 mov eax,0CCCCCCCCh
0104822C rep stos dword ptr es:[edi]
A a;
0104822E lea ecx,[a]
01048231 call A::A (104678Ah)
A b(a);
01048236 mov eax,dword ptr [a]
01048239 mov dword ptr [b],eax
printf("%d", b.B);
0104823C mov eax,dword ptr [b]
0104823F push eax
01048240 push offset string "%d" (1093C6Ch)
01048245 call #ILT+3885(_printf) (1046F32h)
0104824A add esp,8
printf("%d", a.B);
0104824D mov eax,dword ptr [a]
01048250 push eax
01048251 push offset string "%d" (1093C6Ch)
01048256 call #ILT+3885(_printf) (1046F32h)
0104825B add esp,8
}
The first lines are explained in this answer: they are there to keep the frame pointer so that nice stack traces can be generated.
But the next lines are confusing: why subtract 216 (0D8h) from esp?
What are these lines after main, but before the first line of code A a; doing?
Edit: after setting the runtime checks to default the disassembly is much smaller:
int main()
{
00247110 push ebp
00247111 mov ebp,esp
00247113 sub esp,48h
00247116 push ebx
00247117 push esi
00247118 push edi
A a;
Edit 2: in Release mode (/Ox) a and b are completely optimized away and no memory is allocated on the stack at all:
int main()
{
A a;
A b(a);
printf("%d", b.B);
00B41000 push 0
00B41002 push 0B499A0h
00B41007 call printf (0B4102Dh)
printf("%d", a.B);
00B4100C push 0
00B4100E push 0B499A4h
00B41013 call printf (0B4102Dh)
00B41018 add esp,10h
return 0;
0127101B xor eax,eax
}
0127101D ret
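One way to see why the optimizer can simply push 0 is to note that both values are already known at compile time. The sketch below assumes a C++11 compiler and adds constexpr to the constructor purely for illustration (the original struct does not have it); value_of_b and self_check are made-up names.

```cpp
#include <cassert>

struct A {
    int B;
    constexpr A() : B(0) {}  // constexpr added for this sketch only
};

// Both objects can be evaluated at compile time, so nothing has to
// live on the stack for their values to be known.
constexpr A a{};
constexpr A b(a);

static_assert(a.B == 0, "a.B is a compile-time constant");
static_assert(b.B == 0, "the copy is a compile-time constant too");

// Hypothetical helper: returns the folded value at run time.
int value_of_b() { return b.B; }

// Small usage check.
void self_check() {
    assert(value_of_b() == 0);
    assert(a.B == b.B);
}
```

This does not show what the optimizer does internally; it only shows that the language itself can prove both printf arguments are 0.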
Edit 3: this is the result using gcc -m32 -O3 -mpreferred-stack-boundary=2 (thanks to @CodyGray).
.LC0:
.string "%d"
Test():
push 0
push OFFSET FLAT:.LC0
call printf
pop eax
pop edx
push 0
push OFFSET FLAT:.LC0
call printf
pop ecx
pop eax
ret
00CC8223 sub esp,0D8h
Allocates the stack space for the local variables.
What are these lines after main, but before the first instruction doing?
What are you referring to?
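On the numbers in the question's debug prologue: 36h dwords of 4 bytes each is exactly the 0D8h bytes subtracted from esp, so the rep stos loop appears to paint the whole local area with 0CCCCCCCCh (the listing shrinks once the runtime checks are turned off, which fits). Below is a rough C++ model of those instructions, not what the compiler emits; fill_frame, self_check and the constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values copied from the disassembly in the question.
constexpr std::size_t kFrameBytes = 0xD8;            // sub esp,0D8h
constexpr std::size_t kDwordCount = 0x36;            // mov ecx,36h
constexpr std::uint32_t kFillPattern = 0xCCCCCCCCu;  // mov eax,0CCCCCCCCh

// The fill covers the frame exactly: 0x36 * 4 == 0xD8 == 216.
static_assert(kDwordCount * sizeof(std::uint32_t) == kFrameBytes,
              "rep stos count matches the allocated frame size");

// Hypothetical model of `rep stos dword ptr es:[edi]`:
// store `pattern` into `count` consecutive dwords.
std::vector<std::uint32_t> fill_frame(std::size_t count, std::uint32_t pattern) {
    return std::vector<std::uint32_t>(count, pattern);
}

// Small usage check.
void self_check() {
    const std::vector<std::uint32_t> frame = fill_frame(kDwordCount, kFillPattern);
    assert(frame.size() * sizeof(std::uint32_t) == 216);
    assert(frame.front() == kFillPattern && frame.back() == kFillPattern);
}
```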
I'm trying to get an idea of how the stack works on x86 and x64 machines. What I've observed, however, is that when I write code myself and disassemble it, the result differs from what I see in the code other people provide (e.g. in their questions and tutorials). Here is a little example:
Source
int add(int a, int b) {
int c = 16;
return a + b + c;
}
int main () {
add(3,4);
return 0;
}
x86
add(int, int):
push ebp
mov ebp, esp
sub esp, 16
mov DWORD PTR [ebp-4], 16
mov edx, DWORD PTR [ebp+8]
mov eax, DWORD PTR [ebp+12]
add edx, eax
mov eax, DWORD PTR [ebp-4]
add eax, edx
leave (!)
ret
main:
push ebp
mov ebp, esp
push 4
push 3
call add(int, int)
add esp, 8
mov eax, 0
leave (!)
ret
Now x64
add(int, int):
push rbp
mov rbp, rsp
(?) where is `sub rsp, X`?
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-24], esi
mov DWORD PTR [rbp-4], 16
mov edx, DWORD PTR [rbp-20]
mov eax, DWORD PTR [rbp-24]
add edx, eax
mov eax, DWORD PTR [rbp-4]
add eax, edx
(?) where is `mov rsp, rbp` before popping rbp?
pop rbp
ret
main:
push rbp
mov rbp, rsp
mov esi, 4
mov edi, 3
call add(int, int)
mov eax, 0
(?) where is `mov rsp, rbp` before popping rbp?
pop rbp
ret
As you can see, my main confusion is that when I compile for x86, I see what I expect. When it's x64, the leave instruction (or the equivalent sequence mov rsp, rbp followed by pop rbp) is missing. What's wrong?
UPDATE
It seems leave is missing simply because rsp was never altered in the first place, so there is nothing to restore. But that raises another question: why is there no allocation for local variables in the frame?
@melpomene gives a pretty straightforward answer to this: the "red zone". It basically means that a function that calls no further functions (a leaf function) can use the first 128 bytes below the stack pointer without allocating space. So if I insert a call to any other dummy function inside add(), sub rsp, X appears in the prologue and a matching restore (add rsp, X or leave) in the epilogue.
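To check this, something like the following could be compiled for x86-64 with gcc -O0 and the two prologues compared. I would expect only add_nonleaf to get a sub rsp, but that is an assumption about the compiler and the System V ABI, not something the language guarantees. identity and self_check are made-up helpers.

```cpp
#include <cassert>

// Leaf function: it calls nothing, so an unoptimized x86-64 System V build
// may keep `c` in the red zone without moving rsp.
int add_leaf(int a, int b) {
    int c = 16;
    return a + b + c;
}

// Hypothetical helper whose only purpose is to be called.
int identity(int x) { return x; }

// Non-leaf function: the call to identity() may clobber memory below rsp,
// so the compiler is expected to reserve space for `c` explicitly.
int add_nonleaf(int a, int b) {
    int c = 16;
    return identity(a) + b + c;
}

// Both versions compute the same value; only the prologue should differ.
void self_check() {
    assert(add_leaf(3, 4) == 23);
    assert(add_nonleaf(3, 4) == 23);
}
```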
I have this encryption program in C++ and ASM that contains the encryption routines, but I need to know what the decryption routine for it should look like.
This is the code:
//-ENCRYPTION ROUTINES
void encrypt_chars (int length, char EKey)
{ char temp_char;
for (int i = 0; i < length; i++)
{ temp_char = OChars [i];
__asm {
push eax
push ecx
movzx ecx,temp_char
lea eax,EKey
call encrypt
mov temp_char,al
pop ecx
pop eax
}
EChars [i] = temp_char;
}
return;
// --- Start of Assembly code
__asm {
encrypt5: push eax
mov al,byte ptr [eax]
push ecx
and eax,0x7C
ror eax,1
ror eax,1
inc eax
mov edx,eax
pop ecx
pop eax
mov byte ptr [eax],dl
xor edx,ecx
mov eax,edx
rol al,1
ret
encrypt:
mov eax,ecx
inc eax
ret
}
//--- End of Assembly code
}
The best clue for decryption (and as general as the question):
undo everything
I guess every instruction in that code has an inverse (unless it destroys data, but hey).
So if the code ends with:
inc eax
ret
You start with
[load the return in eax]
dec eax
and so on.
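As a sketch of "undo everything" applied to the routine that is actually called: the encrypt label only does mov eax,ecx / inc eax, and the caller keeps al, so each byte is incremented by one with wraparound. The inverse is then a decrement. The function names below are made up, and the rol8/ror8 pair is included only to show that a rotate (as used in encrypt5) is undone by the opposite rotate.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// What `encrypt` does to one byte: inc, keeping only the low 8 bits (al).
std::uint8_t encrypt_char(std::uint8_t c) { return static_cast<std::uint8_t>(c + 1); }

// The inverse: dec instead of inc.
std::uint8_t decrypt_char(std::uint8_t c) { return static_cast<std::uint8_t>(c - 1); }

// rol al,1 and its inverse ror al,1 on an 8-bit value.
std::uint8_t rol8(std::uint8_t v) { return static_cast<std::uint8_t>((v << 1) | (v >> 7)); }
std::uint8_t ror8(std::uint8_t v) { return static_cast<std::uint8_t>((v >> 1) | (v << 7)); }

// Hypothetical counterpart to encrypt_chars: decrypt a whole buffer.
std::string decrypt_chars(const std::string& encrypted) {
    std::string plain;
    plain.reserve(encrypted.size());
    for (unsigned char c : encrypted) {
        plain.push_back(static_cast<char>(decrypt_char(c)));
    }
    return plain;
}

// Round trips: every step followed by its inverse gives the input back.
void self_check() {
    assert(decrypt_char(encrypt_char('A')) == 'A');
    assert(decrypt_char(encrypt_char(0xFF)) == 0xFF);  // wraparound case
    assert(ror8(rol8(0x81)) == 0x81);
}
```

Steps that throw bits away, such as the and eax,0x7C in encrypt5, have no such inverse, which is the "destroys data" caveat above.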
string reverse(string str) pure nothrow
{
string reverse_impl(string temp, string str) pure nothrow
{
if (str.length == 0)
{
return temp;
}
else
{
return reverse_impl(str[0] ~ temp, str[1..$]);
}
}
return reverse_impl("", str);
}
As far as I know, this code should be subject to tail-call optimization, but I can't tell if DMD is doing it or not. Which of the D compilers support tail-call optimization, and will they perform it on this function?
From looking at the disassembly, DMD performs TCO on your code:
_D4test7reverseFNaNbAyaZAya12reverse_implMFNaNbAyaAyaZAya comdat
assume CS:_D4test7reverseFNaNbAyaZAya12reverse_implMFNaNbAyaAyaZAya
L0: sub ESP,0Ch
push EBX
push ESI
cmp dword ptr 018h[ESP],0
jne L1C
LC: mov EDX,024h[ESP]
mov EAX,020h[ESP]
pop ESI
pop EBX
add ESP,0Ch
ret 010h
L1C: push dword ptr 024h[ESP]
mov EAX,1
mov EDX,offset FLAT:_D12TypeInfo_Aya6__initZ
push dword ptr 024h[ESP]
mov ECX,024h[ESP]
push ECX
push EAX
push EDX
call near ptr __d_arraycatT
mov EBX,02Ch[ESP]
mov ESI,030h[ESP]
mov 034h[ESP],EAX
dec EBX
lea ECX,1[ESI]
mov 01Ch[ESP],EBX
mov 020h[ESP],ECX
mov 02Ch[ESP],EBX
mov 030h[ESP],ECX
mov 038h[ESP],EDX
add ESP,014h
cmp dword ptr 8[ESP],0
jne L1C
jmp short LC
_D4test7reverseFNaNbAyaZAya12reverse_implMFNaNbAyaAyaZAya ends
end
A very good resource for quickly looking at the code generated by gdc is http://d.godbolt.org/. We currently don't have a dmd equivalent.
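For readers who don't want to trace the listing: the jne L1C at the bottom jumps back to the top of the body instead of making a call, which is the loop that tail-call optimization produces. Here is a sketch of that transformation, written in C++ rather than D so that the recursive and loop forms sit side by side. Both function names are made up, and a C++ compiler may well not apply TCO to the recursive version because of the temporaries' destructors; it is only there to show the shape.

```cpp
#include <cassert>
#include <string>

// Same structure as reverse_impl: the recursive call is the last thing done.
std::string reverse_recursive(const std::string& temp, const std::string& str) {
    if (str.empty()) {
        return temp;
    }
    return reverse_recursive(str[0] + temp, str.substr(1));
}

// What the optimized code amounts to: the parameters are updated in place
// and control jumps back to the top instead of calling again.
std::string reverse_loop(std::string temp, std::string str) {
    while (!str.empty()) {
        temp = str[0] + temp;  // str[0] ~ temp
        str = str.substr(1);   // str[1..$]
    }
    return temp;
}

// Both forms should agree.
void self_check() {
    assert(reverse_recursive("", "hello") == "olleh");
    assert(reverse_loop("", "hello") == "olleh");
}
```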
#include<stdio.h>
int a[100];
int main(){
char UserName[100];
char *n=UserName;
char *q=NULL;
char Serial[200];
q=Serial;
scanf("%s",UserName);
//this is about
__asm{
pushad
mov eax,q
push eax
mov eax,n
push eax
mov EAX,EAX
mov EAX,EAX
CALL G1
LEA EDX,DWORD PTR SS:[ESP+10H]
jmp End
G1:
SUB ESP,400H
XOR ECX,ECX
PUSH EBX
PUSH EBP
MOV EBP,DWORD PTR SS:[ESP+40CH]
PUSH ESI
PUSH EDI
MOV DL,BYTE PTR SS:[EBP]
TEST DL,DL
JE L048
LEA EDI,DWORD PTR SS:[ESP+10H]
MOV AL,DL
MOV ESI,EBP
SUB EDI,EBP
L014:
MOV BL,AL
ADD BL,CL
XOR BL,AL
SHL AL,1
OR BL,AL
MOV AL,BYTE PTR DS:[ESI+1]
MOV BYTE PTR DS:[EDI+ESI],BL
INC ECX
INC ESI
TEST AL,AL
JNZ L014
TEST DL,DL
JE L048
MOV EDI,DWORD PTR SS:[ESP+418H]
LEA EBX,DWORD PTR SS:[ESP+10H]
MOV ESI,EBP
SUB EBX,EBP
L031:
MOV AL,BYTE PTR DS:[ESI+EBX]
PUSH EDI
PUSH EAX
CALL G2
MOV AL,BYTE PTR DS:[ESI+1]
ADD ESP,8
ADD EDI,2
INC ESI
TEST AL,AL
JNZ L031
MOV BYTE PTR DS:[EDI],0
POP EDI
POP ESI
POP EBP
POP EBX
ADD ESP,400H
RETN
L048:
MOV ECX,DWORD PTR SS:[ESP+418H]
POP EDI
POP ESI
POP EBP
MOV BYTE PTR DS:[ECX],0
POP EBX
ADD ESP,400H
RETN
G2:
MOVSX ECX,BYTE PTR SS:[ESP+4]
MOV EAX,ECX
AND ECX,0FH
SAR EAX,4
AND EAX,0FH
CMP EAX,0AH
JGE L009
ADD AL,30H
JMP L010
L009:
ADD AL,42H
L010:
MOV EDX,DWORD PTR SS:[ESP+8]
CMP ECX,0AH
MOV BYTE PTR DS:[EDX],AL
JGE L017
ADD CL,61H
MOV BYTE PTR DS:[EDX+1],CL
RETN
L017:
ADD CL,45H
MOV BYTE PTR DS:[EDX+1],CL
RETN
End:
mov eax,eax
popad
}
printf("%s\n",Serial);
return 0;
}
Can you help me?
This problem is about asm; I don't know what causes this result.
The program is very simple; it is about a program's internal code.
Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.
It seems the two parameters that are pushed onto the stack before the call to G1 are never popped from it.
Inside G1 the stack looks balanced: SUB ESP,400H is matched by ADD ESP,400H, and the ADD ESP,8 after L031 removes the two arguments pushed for G2. Nothing similar follows CALL G1, and G1 returns with a plain RETN, so at End ESP is 8 less than it was right after pushad, and popad restores the registers from the wrong place.
EDIT: Regarding the coding style of assembly functions, please see this. It briefly describes the caller's and the callee's responsibilities with regard to ESP.
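The imbalance can be checked with simple bookkeeping. The sketch below is only a model that adds up how many bytes each instruction moves ESP by on 32-bit x86; it does not execute any assembly, and the function names are made up. The "fixed" variant assumes the caller cleans up with an add esp,8 right after the call, which is one possible fix rather than the only one.

```cpp
#include <cassert>

constexpr int kPushadBytes = 32;  // pushad stores 8 registers of 4 bytes
constexpr int kPushBytes = 4;     // one 32-bit push

// Net change of ESP across the asm block as posted.
int esp_delta_original() {
    int esp = 0;
    esp -= kPushadBytes;  // pushad
    esp -= kPushBytes;    // push eax (q)
    esp -= kPushBytes;    // push eax (n)
    // CALL G1 ... RETN: the return address is pushed and popped, net 0.
    esp += kPushadBytes;  // popad
    return esp;
}

// Same block with the two arguments removed by the caller.
int esp_delta_fixed() {
    int esp = 0;
    esp -= kPushadBytes;    // pushad
    esp -= 2 * kPushBytes;  // push q, push n
    esp += 2 * kPushBytes;  // assumed fix: add esp,8 after CALL G1
    esp += kPushadBytes;    // popad
    return esp;
}

void self_check() {
    assert(esp_delta_original() == -8);  // 8 bytes are left on the stack
    assert(esp_delta_fixed() == 0);      // balanced
}
```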
__RTC_CheckEsp is a call that verifies the correctness of esp, the stack pointer register. It is called to ensure that the value of esp was preserved across a function call.
Does anyone know how it's implemented?
Well, a little bit of inspection of the assembly gives it away:
0044EE35 mov esi,esp
0044EE37 push 3039h
0044EE3C mov ecx,dword ptr [ebp-18h]
0044EE3F add ecx,70h
0044EE42 mov eax,dword ptr [ebp-18h]
0044EE45 mov edx,dword ptr [eax+70h]
0044EE48 mov eax,dword ptr [edx+0Ch]
0044EE4B call eax
0044EE4D cmp esi,esp
0044EE4F call #ILT+6745(__RTC_CheckEsp) (42BA5Eh)
There are two lines to note here. First, at 0x44EE35 the current value of esp is stored in esi.
Then, after the function call has completed, it does a cmp between esi and esp. They should both be the same now. If they aren't, then someone has either unwound the stack twice or not unwound it at all.
The _RTC_CheckEsp function looks like this:
_RTC_CheckEsp:
00475A60 jne esperror (475A63h)
00475A62 ret
esperror:
00475A63 push ebp
00475A64 mov ebp,esp
00475A66 sub esp,0
00475A69 push eax
00475A6A push edx
00475A6B push ebx
00475A6C push esi
00475A6D push edi
00475A6E mov eax,dword ptr [ebp+4]
00475A71 push 0
00475A73 push eax
00475A74 call _RTC_Failure (42C34Bh)
00475A79 add esp,8
00475A7C pop edi
00475A7D pop esi
00475A7E pop ebx
00475A7F pop edx
00475A80 pop eax
00475A81 mov esp,ebp
00475A83 pop ebp
00475A84 ret
As you can see, the first thing it checks is whether the result of the earlier comparison was "not equal", i.e. esi != esp. If that's the case, it jumps to the failure code. If they ARE the same, the function simply returns.
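The same idea can be modelled in a few lines. This is only a sketch with a simulated stack pointer, not the real runtime code: FakeCpu, esp_preserved and self_check are made-up names, and the callee is an ordinary C++ callable that may or may not leave the simulated esp where it found it.

```cpp
#include <cassert>
#include <functional>

// Simulated CPU state: just the stack pointer.
struct FakeCpu {
    int esp;
};

// Model of the sequence around the call:
//   mov esi,esp / call eax / cmp esi,esp / jne esperror
// Returns true when esp survived the call unchanged.
bool esp_preserved(FakeCpu& cpu, const std::function<void(FakeCpu&)>& callee) {
    const int saved = cpu.esp;  // mov esi,esp
    callee(cpu);                // call eax
    return saved == cpu.esp;    // cmp esi,esp (a mismatch would take the jne)
}

void self_check() {
    FakeCpu cpu{1000};
    // A well-behaved callee leaves esp alone.
    assert(esp_preserved(cpu, [](FakeCpu&) {}));
    // A callee that leaves 8 bytes on the stack trips the check.
    assert(!esp_preserved(cpu, [](FakeCpu& c) { c.esp -= 8; }));
}
```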
If you're any good at asm, maybe this helps:
jne (Jump if Not Equal) jumps if the zero flag is clear (NZ, Not Zero)
_RTC_CheckEsp:
004C8690 jne esperror (4C8693h)
004C8692 ret
esperror:
004C8693 push ebp
004C8694 mov ebp,esp
004C8696 sub esp,0
004C8699 push eax
004C869A push edx
004C869B push ebx
004C869C push esi
004C869D push edi
004C869E mov eax,dword ptr [ebp+4]
004C86A1 push 0
004C86A3 push eax
004C86A4 call _RTC_Failure (4550F8h)
004C86A9 add esp,8
004C86AC pop edi
004C86AD pop esi
004C86AE pop ebx
004C86AF pop edx
004C86B0 pop eax
004C86B1 mov esp,ebp
004C86B3 pop ebp
004C86B4 ret
004C86B5 int 3
004C86B6 int 3
004C86B7 int 3
004C86B8 int 3
004C86B9 int 3
004C86BA int 3
004C86BB int 3
004C86BC int 3
004C86BD int 3
004C86BE int 3
004C86BF int 3