Let's say we have the following C++ code:
int var1;
__asm {
mov var1, 2;
}
Now, what I'd like to know is if I didn't want to define var1 outside the __asm directive, what would I have to do to put it inside it. Is it even possible?
Thanks
To do that, you'll need to create a "naked" function with __declspec(naked) and write the prolog and epilog yourself, since the compiler normally generates them.
The aim of a prolog is to:
set up EBP and ESP
reserve space on stack for local variables
save registers that will be modified in the body of the function
An epilog has to:
restore the saved register values
clean up the reserved space for local variables
Here is a standard prolog
push ebp ; Save ebp
mov ebp, esp ; Set stack frame pointer
sub esp, localbytes ; Allocate space for locals
push <registers> ; Save registers
and a standard epilog:
pop <registers> ; Restore registers
mov esp, ebp ; Restore stack pointer
pop ebp ; Restore ebp
ret ; Return from function
Your local variables will then begin at (ebp - 4) and extend downward to (ebp - localbytes). The function parameters will start at (ebp + 8) and go upward.
It's impossible to create a C variable in assembler: the C compiler has to know about the variable (ie its type and address), which means it has to be declared in the C code.
What can be done is accessing symbols defined in assembler via extern declarations in C. That won't work for variables with automatic storage duration, though, as these don't have a fixed address but are referenced relative to the base pointer.
If you don't want to access the variables outside of the asm block, you can use the stack for storing assembler-local data. Just keep in mind that you have to restore the stack pointer to its previous value when leaving the asm block, eg
sub esp, 12 ; space for 3 asm-local 32bit vars
mov dword ptr [esp+8], 42 ; set value of local var
[...]
push 0xdeadbeaf ; use stack
[...] ; !!! 42 resides now in [esp+12] !!!
add esp, 16 ; restore esp
If you don't want the relative addresses of the local variables to change whenever you manipulate the stack (ie use push or pop), you have to establish a stack frame (ie save the base of the stack in ebp and address locals relative to this value) as described in cedrou's answer.
Local variables are allocated and freed by manipulating the available space on the call stack via the ESP register, ie:
__asm
{
sub esp, 4
mov dword ptr [esp], 2
...
add esp, 4
}
Generally, this is better handled by establishing a "stack frame" for the function and then accessing local variables (and function parameters) using offsets within the frame, instead of using the ESP register directly, ie:
__asm
{
push ebp
mov ebp, esp
sub esp, 4
...
mov dword ptr [ebp-4], 2
...
mov esp, ebp
pop ebp
}
I am searching for a way to define variables in C++ inline assembly. I found an interesting way to do it, but I am confused about how it can work.
__asm
{
push ebp
mov ebp, esp
add esp, 4
mov [ebp - 4], 2
mov esp, ebp
pop ebp
}
As I read this code: push the base pointer onto the stack, then move the stack pointer into the base pointer (logically the stack should collapse here, because this looks like the usual epilogue that cleans up the stack). Then we add 4 to the esp address (not even to a value), and then remove that 4 from esp again, so we get back to the same esp address. The strange thing for me is that it even compiles, and it works. But when I try to test it by outputting the value:
uint32_t output;
__asm
{
push ebp
mov ebp, esp
add esp, 4
mov [ebp - 4], 2
mov output,[ebp-4]
mov esp, ebp
pop ebp
}
std::cout << output;
It does not compile, showing "Operand size conflict", which seems weird to me, because I use a 32-bit integer and the register is also 32-bit. When using ebp-4 without the brackets, it gives garbage values, as expected.
So, maybe someone could explain how this works without giving an error :)
And one additional question: why does db not work in inline assembly?
It doesn't work; that doesn't define a C++ variable.
It just messes with the stack to reserve some new storage below the stack frame created by the compiler. And you modify EBP, so compiler-generated addressing modes that use EBP will be broken. (See footnote 1.)
If you want to define or declare a C++ variable, do it with C++ syntax like int tmp.
asm doesn't really have variables. It has registers and memory. Keep track of where values are using comments. If you want to use some extra stack space from MSVC inline asm, I think that's safe, but don't modify EBP if you also want to reference C++ local variables.
Footnote 1:
That would be the case if your code assembled at all, which it won't, because mov output,[ebp-4] has 2 explicit memory operands. MSVC inline asm can't allocate C++ variables in registers.
Also mov [ebp - 4], 2 has ambiguous operand-size: neither operand has a size associated with it because neither is a register. Maybe you want mov dword ptr [ebp - 4], 2
I'm studying assembly language from the book "Assembly Language Step-by-Step: Programming with Linux" by Jeff Duntemann, and have come across an interesting paragraph which I'm most likely misunderstanding, so I would appreciate some clarification on it:
"The stack must be clean before we destroy the stack frame and return control. This simply means that any temporary values that we may have pushed onto the stack during the program’s run must be gone. All that is left on the stack should be the caller’s EBP, EBX, ESI, and EDI values.
...
Once the stack is clean, to destroy the stack frame we must first pop the caller’s register values back into their registers, ensuring that the pops are in the correct order.
...
We restore the caller’s ESP by moving the value from EBP into ESP, and finally pop the caller’s EBP value off the stack."
Consider the following code generated from Visual Studio 2008:
int myFuncSum( int a, int b)
{
001B1020 push ebp
001B1021 mov ebp,esp
001B1023 push ecx <------------------
int c;
c = a + b;
001B1024 mov eax,dword ptr [ebp+8]
001B1027 add eax,dword ptr [ebp+0Ch]
001B102A mov dword ptr [ebp-4],eax
return c;
001B102D mov eax,dword ptr [ebp-4]
}
001B1030 mov esp,ebp
001B1032 pop ebp
001B1033 ret
The value of ecx (indicated), pushed to make space on the stack for my variable c, is, as far as I can see, only gone from the stack when we reset ESP; however, as quoted, the book states that the stack must be clean before we reset ESP. Can someone please clarify whether or not I am missing something?
The example from Visual Studio 2008 doesn't contradict the book. The book is covering the most elaborate case of a call. See the x86-32 Calling Convention as a cross-reference which spells it out with pictures.
In your example, no caller registers were saved on the stack, so there are no pop instructions to perform. Those pops are part of the "clean up" that the book says must occur before mov esp, ebp. More specifically, suppose the callee saves esi and edi for the caller; then the prelude and postlude for the function might look like this:
push ebp ; save base pointer
mov ebp, esp ; setup stack frame in base pointer
sub esp, 4 ; reserve 4 bytes of local data
push esi ; save caller's registers
push edi
; do some stuff, reference 32-bit local variable [ebp-4], etc
; Use esi and edi for our own purposes...
; clean up
pop edi ; do the stack clean up
pop esi ; restoring the caller's values
mov esp, ebp ; restore the stack pointer
pop ebp
ret
In your simple example, there were no saved caller registers, so no final pop instructions needed at the end.
Perhaps because it's simpler or faster, the compiler elected to do the following instruction in place of sub esp, 4:
push ecx
But the effect is the same: reserve 4 bytes for a local variable.
Notice the instructions:
push ebp
mov ebp,esp ; <<<<=== saves the stack pointer in ebp
and the instructions:
mov esp,ebp ; <<<<<== restores the stack pointer from ebp
pop ebp
So after this sequence the stack is clean again.
This is the program:
#include <stdio.h>
void test_function(int a, int b, int c, int d){
int flag;
flag = 31337;
}
int main(){
test_function(1,2,3,4);
}
From GDB:
Breakpoint 1, main () at stack_example.c:14
14 test_function(1,2,3,4);
(gdb) i r esp ebp eip
esp 0xffffcf88 0xffffcf88
ebp 0xffffcf98 0xffffcf98
eip 0x8048402 0x8048402 <main+6>
(gdb) cont
Continuing.
Breakpoint 2, test_function (a=1, b=2, c=3, d=4) at stack_example.c:8
8 flag = 31337;
(gdb) i r esp ebp eip
esp 0xffffcf70 0xffffcf70
ebp 0xffffcf80 0xffffcf80
eip 0x80483f3 0x80483f3 <test_function+6>
(gdb)
Dump of assembler code for function test_function:
0x080483ed <+0>: push ebp
0x080483ee <+1>: mov ebp,esp
0x080483f0 <+3>: sub esp,0x10
=> 0x080483f3 <+6>: mov DWORD PTR [ebp-0x4],0x7a69
0x080483fa <+13>: leave
0x080483fb <+14>: ret
From GDB: why does EBP not have the same value that ESP had in main()? Shouldn't mov ebp, esp make EBP == 0xffffcf88?
I thought this sets EBP to what ESP is.
EDIT:
I think I may have answered my own question. Please correct me.
ESP is moved when the return address and saved frame pointer are pushed onto the stack.
The ESP value was 0xffffcf88 before the two values (both 4 bytes) were pushed onto the stack. Afterwards, its value is 0xffffcf88 - 0x8 == 0xffffcf80. This is what EBP's current value is. Then ESP -= 0x10.
How is the value of ESP modified? Is it something like mov ESP, ESP - 0x8 ?
I hope I understand your question.
First, you are correct about the new value of the frame pointer. You also need to know that "break test_function" stops gdb after the function prologue has been executed, so you are already past the point where ebp is pushed onto the stack.
See detailed explanation.
ESP is decreased by four for every push (or call). It's also decreased (some immediate value subtracted) by the amount of space required by the local variables, etc.
Different "calling conventions" are being used on x86, so these details may vary a bit.
Some parts of this are also compiler-specific. For example, GCC by default keeps the stack 16-byte (four-word) aligned on x86, one reason being to keep the SSE unit happy. See this SO thread.
BTW: the frame pointer is actually redundant as long as the compiler knows how to unwind the stack from any point in the function, and it is typically omitted in optimized code.
I am doing a 64-bit migration and I need to port inline assembly code to C++. Here is the code:
void ExternalFunctionCall::callFunction(ArgType resultType, void* resultBuffer)
{
// I386
// just copy the args buffer to the stack (it's already layed out correctly)
int* begin = m_argsBegin;
int* ptr = m_argsEnd;
while (ptr > begin) {
int val = *(--ptr);
__asm push val
}
}
I want to migrate this __asm push val to C++. This function is called four times, and for every call we get different values of m_argsBegin and m_argsEnd (both m_argsBegin and m_argsEnd are dynamic arrays).
The while loop executes 4 times for every call of callFunction, so in total 4x4 = 16 values are to be stored in a "CONTINUOUS memory location"; I guess this is what __asm push val does. I need to implement this in C++. I tried every way I could think of (stack, array, linked list, queue; I even moved this into a separate asm file), but none of them work.
Can anyone help?
I moved this inline assembly into a separate assembly file. Here is the code:
.386
.model c,flat
public callFunction_asm
CSEG segment public 'CODE'
callFunction_asm PROC
push ebp
mov ebp, esp
mov ecx, [ebp+8] ;val
push dword ptr [ecx]
mov esp, ebp
pop ebp
RETN
callFunction_asm ENDP
CSEG ends
END
where callFunction_asm is an extern function, which I declared as:
extern "C"
void callFunction_asm(int val);
and I am calling this function as:
while (ptr > begin) {
int val = *(--ptr);
callFunction_asm(val); //possible replacement
}
but even this is not working. Can anyone tell me where I am going wrong? I am new to assembly coding.
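push decrements the stack pointer and then stores its operand at the new top of the stack.
If you looked at the address the stack pointer holds right after the push ([esp], or 0(%esp) in AT&T syntax), you should see the value (but if you wanted it back, you'd typically use pop). Note that a push performed inside a separate function cannot outlive that function: the mov esp, ebp in its epilogue discards the pushed value before ret.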
I got the following simple C++ code:
#include <stdio.h>
int main(void)
{
::printf("\nHello,debugger!\n");
}
And from WinDbg, I got the following disassembly code:
SimpleDemo!main:
01111380 55 push ebp
01111381 8bec mov ebp,esp
01111383 81ecc0000000 sub esp,0C0h
01111389 53 push ebx
0111138a 56 push esi
0111138b 57 push edi
0111138c 8dbd40ffffff lea edi,[ebp-0C0h]
01111392 b930000000 mov ecx,30h
01111397 b8cccccccc mov eax,0CCCCCCCCh
0111139c f3ab rep stos dword ptr es:[edi]
0111139e 8bf4 mov esi,esp
011113a0 683c571101 push offset SimpleDemo!`string' (0111573c)
011113a5 ff15b0821101 call dword ptr [SimpleDemo!_imp__printf (011182b0)]
011113ab 83c404 add esp,4
011113ae 3bf4 cmp esi,esp
011113b0 e877fdffff call SimpleDemo!ILT+295(__RTC_CheckEsp) (0111112c)
011113b5 33c0 xor eax,eax
011113b7 5f pop edi
011113b8 5e pop esi
011113b9 5b pop ebx
011113ba 81c4c0000000 add esp,0C0h
011113c0 3bec cmp ebp,esp
011113c2 e865fdffff call SimpleDemo!ILT+295(__RTC_CheckEsp) (0111112c)
011113c7 8be5 mov esp,ebp
011113c9 5d pop ebp
011113ca c3 ret
I have some difficulty fully understanding it. What are the SimpleDemo!ILT things doing here?
What's the point of the instruction comparing ebp and esp at 011113c0?
Since I don't have any local variables in the main() function, why is there still a sub esp,0C0h at location 01111383?
Many thanks.
Update 1
Though I still don't know what ILT means, __RTC_CheckEsp is for runtime checks. This code can be eliminated by placing the following pragma before the main() function.
#pragma runtime_checks( "su", off )
Reference:
http://msdn.microsoft.com/en-us/library/8wtf2dfz.aspx
http://msdn.microsoft.com/en-us/library/6kasb93x.aspx
Update 2
The sub esp,0C0h instruction allocates another 0C0h bytes of space on the stack. Then EAX is loaded with 0xCCCCCCCC, which is 4 bytes; since ECX=30h and 4*30h=0C0h, the instruction rep stos dword ptr es:[edi] fills exactly that extra space with 0xCC. But what is this extra space on the stack for? Is it some kind of safety belt? I also notice that if I turn off the runtime checks as Update 1 shows, there is still such extra space on the stack, though much smaller, and it is not filled with 0xCC.
The assembly code without runtime checks looks like this:
SimpleDemo!main:
00231250 55 push ebp
00231251 8bec mov ebp,esp
00231253 83ec40 sub esp,40h <-- Still extra space allocated from stack, but smaller
00231256 53 push ebx
00231257 56 push esi
00231258 57 push edi
00231259 683c472300 push offset SimpleDemo!`string' (0023473c)
0023125e ff1538722300 call dword ptr [SimpleDemo!_imp__printf (00237238)]
00231264 83c404 add esp,4
00231267 33c0 xor eax,eax
00231269 5f pop edi
0023126a 5e pop esi
0023126b 5b pop ebx
0023126c 8be5 mov esp,ebp
0023126e 5d pop ebp
0023126f c3 ret
Most of the instructions are part of MSVC runtime checking, enabled by default for debug builds. Just calling printf and returning 0 in an optimized build takes much less code. (Godbolt compiler explorer). Other compilers (like GCC and clang) don't do as much stuff like stack-pointer comparison after calls, or poisoning stack memory with a recognizable 0xCC pattern to detect use-uninitialized, so their debug builds are like MSVC debug mode without its extra runtime checks.
I've annotated the assembler, hopefully that will help you a bit. Lines starting 'd' are debug code lines, lines starting 'r' are run time check code lines. I've also put in what I think a debug with no runtime checks version and release version would look like.
; The ebp register is used to access local variables that are stored on the stack,
; this is known as a stack frame. Before we start doing anything, we need to save
; the stack frame of the calling function so it can be restored when we finish.
push ebp
; These two instructions create our stack frame, in this case, 192 bytes
; This space, although not used in this case, is useful for edit-and-continue. If you
; break the program and add code which requires a local variable, the space is
; available for it. This is much simpler than trying to relocate stack variables,
; especially if you have pointers to stack variables.
mov ebp,esp
d sub esp,0C0h
; C/C++ functions shouldn't alter these three registers in 32-bit calling conventions,
; so save them. These are stored below our stack frame (the stack moves down in memory)
r push ebx
r push esi
r push edi
; This puts the address of the stack frame bottom (lowest address) into edi...
d lea edi,[ebp-0C0h]
; ...and then fill the stack frame with the uninitialised data value (ecx = number of
; dwords, eax = value to store)
d mov ecx,30h
d mov eax,0CCCCCCCCh
d rep stos dword ptr es:[edi]
; Stack checking code: the stack pointer is stored in esi
r mov esi,esp
; This is the first parameter to printf. Parameters are pushed onto the stack
; in reverse order (i.e. last parameter pushed first) before calling the function.
push offset SimpleDemo!`string'
; This is the call to printf. Note the call is indirect, the target address is
; specified in the memory address SimpleDemo!_imp__printf, which is filled in when
; the executable is loaded into RAM.
call dword ptr [SimpleDemo!_imp__printf]
; In C/C++, the caller is responsible for removing the parameters. This is because
; the caller is the only code that knows how many parameters were put on the stack
; (thanks to the '...' parameter type)
add esp,4
; More stack checking code - this sets the zero flag if the stack pointer is pointing
; where we expect it to be pointing.
r cmp esi,esp
; ILT - Import Lookup Table? This is a statically linked function which throws an
; exception/error if the zero flag is cleared (i.e. the stack pointer is pointing
; somewhere unexpected)
r call SimpleDemo!ILT+295(__RTC_CheckEsp)
; The return value is stored in eax by convention
xor eax,eax
; Restore the values we shouldn't have altered
r pop edi
r pop esi
r pop ebx
; Destroy the stack frame
r add esp,0C0h
; More stack checking code - this sets the zero flag if the stack pointer is pointing
; where we expect it to be pointing.
r cmp ebp,esp
; see above
r call SimpleDemo!ILT+295(__RTC_CheckEsp)
; This is the usual way to destroy the stack frame, but here it's not really necessary
; since ebp==esp
mov esp,ebp
; Restore the caller's stack frame
pop ebp
; And exit
ret
; Debug only, no runtime checks
push ebp
mov ebp,esp
d sub esp,0C0h
d lea edi,[ebp-0C0h]
d mov ecx,30h
d mov eax,0CCCCCCCCh
d rep stos dword ptr es:[edi]
push offset SimpleDemo!`string'
call dword ptr [SimpleDemo!_imp__printf]
add esp,4
xor eax,eax
mov esp,ebp
pop ebp
ret
; Release mode (The optimiser is clever enough to drop the frame pointer setup with no VLAs or other complications)
push offset SimpleDemo!`string'
call dword ptr [SimpleDemo!_imp__printf]
add esp,4
xor eax,eax
ret
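First, your code's main() has no return statement even though it is declared to return int. C++ does permit this for main() specifically (falling off the end is treated as return 0), but spelling it out is clearer. With an explicit return, we get: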
#include <stdio.h>
int main(int argc, char *argv[])
{
::printf("\nHello,debugger!\n");
return 0;
}
Additionally, it is unusual these days to see #include <stdio.h> in a C++ program. I believe you want #include <cstdio>.
In all cases, space must be made on the stack for arguments and for return values. main()'s return value requires stack space. main()'s context to be saved during the call to printf() requires stack space. printf()'s arguments require stack space. printf()'s return value requires stack space. That's what the 0c0h byte stack frame is doing.
The first thing that happens is that the incoming base pointer is pushed onto the top of the stack. Then the new stack pointer is copied into the base pointer. We'll be checking later to be sure that the stack winds up back where it started from (because you have runtime checking turned on). Then we build the (0C0h bytes long) stack frame to hold our context and printf()'s arguments during the call to printf(). We jump to printf(). When we get back, we hop over the return value which you didn't check in your code (the only thing left on its frame) and make sure the stack after the call is in the same place it was before the call. We pop our context back off the stack. We then check that the final stack pointer matches the value we saved way up at the front. Then we pop the prior value of the base pointer off the very top of the stack and return.
That is code that is inserted by the compiler when you build with runtime checking (/RTC). Disable those options and it should be clearer. /GZ could also be causing this depending on your VS version.
For the record, I suspect that ILT means "Incremental Linking Thunk".
The way incremental linking (and Edit&Continue) works is the following: the linker adds a layer of indirection for every call via thunks which are grouped at the beginning of executable, and adds a huge reserved space after them. This way, when you're relinking the updated executable it can just put any new/changed code into the reserved area and patch only the affected thunks, without changing the rest of the code.
The 40h bytes is the worst-case stack allocation for any called or subsequently called function. This is explained in glorious detail here.
What is this space reserved on the top of the stack for? First, space is created for any local variables. In this case, FunctionWith6Params() has two. However, those two local variables only account for 0x10 bytes. What’s the deal with the rest of the space created on the top of the stack?
On the x64 platform, when code prepares the stack for calling another function, it does not use push instructions to put the parameters on the stack as is commonly the case in x86 code. Instead, the stack pointer typically remains fixed for a particular function. The compiler looks at all of the functions the code in the current function calls, it finds the one with the maximum number of parameters, and then creates enough space on the stack to accommodate those parameters. In this example, FunctionWith6Params() calls printf() passing it 8 parameters. Since that is the called function with the maximum number of parameters, the compiler creates 8 slots on the stack. The top four slots on the stack will then be the home space used by any functions FunctionWith6Params() calls.