In the following output of disassemble 28 is in decimal or hex ?
mov edx,DWORD PTR [ebp-28]
In the following output of disassemble 28 is in decimal or hex ? mov edx,DWORD PTR [ebp-28]
Very likely decimal. Is that output from GDB?
Newer versions of GDB use hex, and are explicit about it:
0x000cadb6 <+6>: mov eax,DWORD PTR [ebp+0x8]
Related
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 1 year ago.
Improve this question
When using Compiler Explorer (https://godbolt.org/) to compare assembly output of simple programs, why D language assembly output is so long compared to C or C++ output. The simple square function output is the same for C, C++, and D, but the D output has additional lines that are not highlighted when hovering over the square function in the source code.
What are these additional lines?
How I can remove these lines from being generated?
Let's say I have https://godbolt.org/z/64EsWo5Ke a template function both in C++ and D, the Intel asm output for D is 29309 lines long, while the C++ Intel asm output is 73 lines only.
These are the codes in question:
For D:
int example.square(int):
push rbp
mov rbp, rsp
mov dword ptr [rbp - 4], edi
mov eax, dword ptr [rbp - 4]
imul eax, dword ptr [rbp - 4]
pop rbp
ret
ldc.register_dso:
sub rsp, 40
mov qword ptr [rsp + 8], 1
lea rax, [rip + ldc.dso_slot]
mov qword ptr [rsp + 16], rax
lea rax, [rip + __start___minfo]
mov qword ptr [rsp + 24], rax
lea rax, [rip + __stop___minfo]
mov qword ptr [rsp + 32], rax
lea rax, [rsp + 8]
mov rdi, rax
call _d_dso_registry#PLT
add rsp, 40
ret
example.__ModuleInfo:
.long 2147483652
.long 0
.asciz "example"
example.__moduleRef:
.quad example.__ModuleInfo
ldc.dso_slot:
.quad 0
C/C++:
square(int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov eax, DWORD PTR [rbp-4]
imul eax, eax
pop rbp
ret
As you can see the actual implementation in assembly is very similar (almost identical). The program constructs the stack frame:
push rbp
mov rbp, rsp
Takes the argument and multiplies it with itself leaving it in the return value (eax register):
mov dword ptr [rbp - 4], edi
mov eax, dword ptr [rbp - 4]
imul eax, dword ptr [rbp - 4]
in D and
mov DWORD PTR [rbp-4], edi
mov eax, DWORD PTR [rbp-4]
imul eax, eax
in C++/C, and then deconstructs stack frame and returns:
pop rbp
ret
Now I don't claim to know what the D compiler is doing, but I assume the rest of the code is so that this piece of compiled code can work well with other D code. Basically metadata and other fun stuff. I assume this because nowhere does our function use any of the defined symbols nor do the other function call square. This code is therefore probably to do something with inclusion into other D programs, or the like, and therefore you might not be able to/should not remove it.
In the case of your second example, most of the code is the output library implemented. Using only the function defined it is actually 66 lines long. While still longer than the equivalent 22 lines of C++ generated assembly it is not several thousand.
Edit:
As I explained in a comment would recommend to analyse the output binaries with something like Cutter or Ghidra, which give you a more complete picture of what is actually produced in a binary, because I can tell you that even in 'shorter' C++ code you will find a lot of function calls such as _entry before getting to main.
I attached my asm into my source code of dll and hooked it to my exe, and it works like a charm but when I packed my exe using exe packer. the dll with asm not working on exe packed. any idea how to solve this problem?
#include "StdAfx.h"
void __declspec(naked) MyStuff() {
__asm {
PUSH EBP
MOV EBP, ESP
MOV EAX, DWORD PTR SS : [EBP + 0x8]
MOV EAX, DWORD PTR DS : [EAX]
XOR EAX, ENCPACKET
MOV DWORD PTR SS : [EBP + 0x8], EAX
MOV AX, WORD PTR SS : [EBP + 0xA]
POP EBP
RETN 0x4
}
}
void SetStuff(){
SetJmp((LPVOID)0x00424B1C, MyStuff);
}
The result was:
005DB4E0 /> \55 PUSH EBP
005DB4E1 |. 8BEC MOV EBP,ESP
005DB4E3 |. 56 PUSH ESI
005DB4E4 |. FF75 0C PUSH DWORD PTR SS:[EBP+C]
005DB4E7 |. B9 403E0801 MOV ECX,01083E40
005DB4EC |. E8 38CCE3FF CALL 00418129
005DB4F1 |. 8BF0 MOV ESI,EAX
005DB4F3 |. 85F6 TEST ESI,ESI
005DB4F5 |. 74 1E JE SHORT 005DB515
005DB4F7 |. FF75 08 PUSH DWORD PTR SS:[EBP+8]
005DB4FA |. 8BCE MOV ECX,ESI
005DB4FC |. E8 D799E4FF CALL 00424ED8
005DB501 |. 8B4D 10 MOV ECX,DWORD PTR SS:[EBP+10]
005DB504 |. FF75 08 PUSH DWORD PTR SS:[EBP+8]
005DB507 |. 8901 MOV DWORD PTR DS:[ECX],EAX
005DB509 |. 8BCE MOV ECX,ESI
005DB50B |. E8 EEA5E4FF CALL 00425AFE
005DB510 |. 8B4D 14 MOV ECX,DWORD PTR SS:[EBP+14]
005DB513 |. 8901 MOV DWORD PTR DS:[ECX],EAX
005DB515 |> 5E POP ESI
005DB516 |. 5D POP EBP
005DB517 \. C2 1000 RETN 10
and may I ask is it possible to run an asm into specific offset like this?
#include "StdAfx.h"
void __declspec(naked) MyStuff() {
__asm {
005DB4E0-> PUSH EBP
005DB4E1-> MOV EBP, ESP
005DB4E3-> MOV EAX, DWORD PTR SS : [EBP + 0x8]
005DB4E4-> MOV EAX, DWORD PTR DS : [EAX]
005DB4E7-> XOR EAX, ENCPACKET
005DB4EC-> MOV DWORD PTR SS : [EBP + 0x8], EAX
005DB4E1-> MOV AX, WORD PTR SS : [EBP + 0xA]
005DB4F3-> POP EBP
005DB4F4-> RETN 0x4
}
}
Why printf function causes the change of prologue?
C code_1:
#include <cstdio>
int main(){
int a = 11;
printf("%d", a);
}
GCC -m32 generated one:
.LC0:
.string "%d"
main:
lea ecx, [esp+4] // What's purpose of this three
and esp, -16 // lines?
push DWORD PTR [ecx-4] //
push ebp
mov ebp, esp
push ecx
sub esp, 20 // why sub 20?
mov DWORD PTR [ebp-12], 11
sub esp, 8
push DWORD PTR [ebp-12]
push OFFSET FLAT:.LC0
call printf
add esp, 16
mov eax, 0
mov ecx, DWORD PTR [ebp-4]
leave
lea esp, [ecx-4]
ret
C code_2:
#include <cstdio>
int main(){
int a = 11;
}
GCC -m32:
main:
push ebp
mov ebp, esp
sub esp, 16
mov DWORD PTR [ebp-4], 11
mov eax, 0
leave
ret
What is the purpose of first three lines added in first code?
Please, explain first assembly code, if you can.
EDIT:
64-bit mode:
.LC0:
.string "%d"
main:
push rbp
mov rbp, rsp
sub rsp, 16
mov DWORD PTR [rbp-4], 11
mov eax, DWORD PTR [rbp-4]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
mov eax, 0
leave
ret
The insight is that the compiler keep the stack aligned at function calls.
The alignment is 16 byte.
lea ecx, [esp+4] ;Save original ESP to ECX (ESP+4 actually)
and esp, -16 ;Align stack on 16 bytes (Lower esp)
push DWORD PTR [ecx-4] ;Push main return address (Stack at 16B + 4)
;My guess is to aid debugging tools that expect the RA
;to be at [ebp+04h]
push ebp
mov ebp, esp ;Prolog (Stack at 16B+8)
push ecx ;Save ECX (Original stack pointer) (Stack at 16B+12)
sub esp, 20 ;Reserve 20 bytes (Stack at 16B+0, ALIGNED AGAIN)
;4 for alignment + 1x16 for a variable (variable space is
;allocated in multiple of 16)
mov DWORD PTR [ebp-12], 11 ;a = 11
sub esp, 8 ;Stack at 16B+8 for later alignment
push DWORD PTR [ebp-12] ;a
push OFFSET FLAT:.LC0 ;"%d" (Stack at 16B)
call printf
add esp, 16 ;Remove args+pad from the stack (Stack at 16B)
mov eax, 0 ;Return 0
mov ecx, DWORD PTR [ebp-4] ;Restore ECX without the need to add to esp
leave ;Restore EBP
lea esp, [ecx-4] ;Restore original ESP
ret
I don't know why the compiler saves esp+4 in ecx instead of esp (esp+4 is the address of the first parameter of main).
CreateCursor function takes HINSTANCE as a first argument as described here:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms648385(v=vs.85).aspx
It can be NULL but why is it required on the first place? My guess is that it can be used to find app main window and determine display driver to be used for cursor creation. But maybe someone have a better explanation?
On my Windows Server 2008 R2 system the handle is not used at all. Here is the disassembly with a few added comments:
0:000> u createcursor
USER32!CreateCursor:
00000000`77a40cb4 488bc4 mov rax,rsp
00000000`77a40cb7 48895808 mov qword ptr [rax+8],rbx
00000000`77a40cbb 48896810 mov qword ptr [rax+10h],rbp
00000000`77a40cbf 48897018 mov qword ptr [rax+18h],rsi
00000000`77a40cc3 48897820 mov qword ptr [rax+20h],rdi
00000000`77a40cc7 4154 push r12
00000000`77a40cc9 4881ecc0000000 sub rsp,0C0h
00000000`77a40cd0 33db xor ebx,ebx
0:000> u
USER32!CreateCursor+0x1e:
00000000`77a40cd2 458be1 mov r12d,r9d
00000000`77a40cd5 418bf8 mov edi,r8d
00000000`77a40cd8 3bd3 cmp edx,ebx
00000000`77a40cda 8bf2 mov esi,edx
; If xHotSpot is negative, return error
00000000`77a40cdc 0f8c60070100 jl USER32!CreateCursor+0xb0 (00000000`77a51442)
00000000`77a40ce2 413bd1 cmp edx,r9d
; If xHotSpot is greater than nWidth return error
00000000`77a40ce5 0f8f57070100 jg USER32!CreateCursor+0xb0 (00000000`77a51442)
00000000`77a40ceb 443bc3 cmp r8d,ebx
0:000> u
USER32!CreateCursor+0x36:
; If yHotSpot is negative, return error
00000000`77a40cee 0f8c4e070100 jl USER32!CreateCursor+0xb0 (00000000`77a51442)
00000000`77a40cf4 8bac24f0000000 mov ebp,dword ptr [rsp+0F0h]
00000000`77a40cfb 443bc5 cmp r8d,ebp
; If yHotSpot is greater than nHeight return error
00000000`77a40cfe 0f8f3e070100 jg USER32!CreateCursor+0xb0 (00000000`77a51442)
; This LEA overwrites whatever was passed in as the application handle!!!
; RCX was not saved before this point, so handle is lost
00000000`77a40d04 488d4c2430 lea rcx,[rsp+30h]
00000000`77a40d09 33d2 xor edx,edx
00000000`77a40d0b 41b888000000 mov r8d,88h
00000000`77a40d11 e8828affff call USER32!memset (00000000`77a39798)
I am trying to output the result to the user after getting 3 inputs from scanf.
When I run my code, I am able to get the input I need. However it crashes after I collect the input and begin the calculation.
By the way, I am using Ubuntu 14.04 with g++ and NASM 64bit.
Here's how it should look:
This program is brought to you by Chris Tarazi
Welcome to Areas of Trapezoids
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
****//Crashes here with Segmentation fault (core dumped)****
The area of a trapezoid with sizes 5.799999999999999365, 2.200000000000000153,
and 6.500000000000000000 is 26.000000000000000328
Have a nice day. Enjoy your trapezoids.
C++ file:
#include <stdio.h>
#include <stdint.h>
extern "C" double ComputeArea(); // links with global in assembly
using namespace std;
int main()
{
double area;
printf("This program is brought to you by Chris Tarazi.\n");
area = ComputeArea();
printf("Have a nice day. Enjoy your trapezoids.\n");
return 0;
}
Assembly file:
extern printf ; This function will be linked later.
extern scanf
global ComputeArea ; Declare function global to link with "extern" from C++.
;---------------------------------Declare variables-------------------------------------------
segment .data
welcome: db "Welcome to the area of trapezoids.", 10, 0
input: db "Please enter one of the base numbers: ", 0
secInput: db "Please enter the other base number: ", 0
output: db "The area of a trapezoid with sizes %1.18lf, %1.18lf, and %1.18lf is %1.18lf .", 10, 0
hInput: db "Please enter the height: ", 0
inputformat: db "%lf", 0
stringformat: db "%s", 0
fourfloatformat: db "%1.18lf %1.18lf %1.18lf %1.18lf", 0
;---------------------------------Begin segment of executable code------------------------------
segment .text
ComputeArea: ; Area of trapezoid = ((a + b) / 2) * h.
push rbp ; Save a copy of the stack base pointer
mov rbp, rsp ; We do this in order to be 100% compatible with C and C++.
push rbx ; Back up rbx
push rcx ; Back up rcx
push rdx ; Back up rdx
push rsi ; Back up rsi
push rdi ; Back up rdi
push r8 ; Back up r8
push r9 ; Back up r9
push r10 ; Back up r10
push r11 ; Back up r11
push r12 ; Back up r12
push r13 ; Back up r13
push r14 ; Back up r14
push r15 ; Back up r15
pushf ; Back up rflags
;---------------------------------Output messages to user---------------------------------------
mov qword rax, 0
mov rdi, stringformat
mov rsi, welcome
call printf
mov qword rax, 0
mov rdi, stringformat
mov rsi, input
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;firstbase
call scanf
movsd xmm0, [rsp]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, secInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;secondbase
call scanf
movsd xmm1, [rsp + 4]
pop rax
mov qword rax, 0
mov rdi, stringformat
mov rsi, hInput
call printf
push qword 0
mov qword rax, 0
mov rdi, inputformat
mov rsi, rsp ;height
call scanf
movsd xmm2, [rsp + 8]
pop rax
;---------------------------------Begin ComputeArea Calculation-----------------------------------
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
;---------------------------------Output result to user-------------------------------------------
mov rax, 3
mov rdi, output
call printf
First off, why on earth are you saving ALL of those registers?!? The ABI for 64 bit Linux says you only need to save rbx, rbp, and r12 - r15 if you use those registers in your function. Also, you using Assembler, there is no need to create a stack frame in 64bit land (plus you aren't even using rbp! so why create a stack frame?) The only thing that is very important is to make sure your stack is aligned on a 16 byte boundary - call pushes an 8 byte return address, so all you need in your ComputeArea function is sub rsp, 8 and add rsp, 8 right before your ret.
In your first scanf you are using rsp without adjusting it, you just overwrote something!
You do some computations here:
mov rax, 2
cvtsi2sd xmm3, rax
addsd xmm0, xmm1
divsd xmm0, xmm3
mulsd xmm0, xmm2
ret
You return from the procedure here but do not pop all of those registers you just pushed!! So basically your stack pointer is all messed up! The CPU does not know what the return address is!
What you do in the prologue, must be reversed in the epilogue before you return!
Maybe, you should start simple, read in 3 floats and try to print them!
When I correct your code, this is my output:
Welcome to the area of trapezoids.
Please enter one of the base numbers: 5.8
Please enter the other base number: 2.2
Please enter the height: 6.5
The area of a trapezoid with sizes 5.799999999999999822, 2.200000000000000178, and 6.500000000000000000 is 26.000000000000000000 .