While reverse engineering I came across a very odd program that uses a calling convention that passes one argument in eax (a very odd compiler?). I want to call that function now, but I don't know how to declare it. IDA defines it as
bool __usercall foo<ax>(int param1<eax>, int param2);
where param1 is passed in the eax register. I tried something like
bool MyFoo(int param1, int param2)
{
__asm mov eax, param1;
return reinterpret_cast<bool(__stdcall *)(int)>(g_FooAddress)(param2);
}
Unfortunately, my compiler uses the eax register when pushing param2 onto the stack. Is there a clean way to do this without writing the whole call in inline assembler? (I am using Visual Studio, if that matters.)
There are "normal" calling conventions which pass arguments via registers. If you are using MSVC, for example, there is __fastcall.
http://en.wikipedia.org/wiki/X86_calling_conventions#fastcall
You cannot define your own calling conventions, but I would suggest that you create a wrapper function which does its own calling and cleanup via inline assembly. This is probably the most practical way to achieve this effect, though you could probably also do it faster by declaring the wrapper __fastcall, doing a bit of register swapping, and then jumping (jmp) to the correct function.
There's more to a calling convention than argument passing though, so the first option (the inline-assembly wrapper) is probably the best, as you'll get full control over how the caller acts.
I am writing a program that requires one function in assembly. It would be pretty helpful to encapsulate the assembly function in a C++ class, so its own data is isolated and I can create multiple instances.
If I create a class and call an external function from a C++ method, the function is reentrant even if it has its own stack and local "variables" into the stack frame.
Is there some way to make the assembly function a C++ method, maybe using name mangling, so the function is implemented in assembly but the prototype is declared inside the C++ class?
If that is not possible, is there some way to create multiple instances (dynamically) of the assembly function although it is not part of the class? Something like cloning the function in memory and just calling it, obviously using relocatable code (adding a delta displacement for variables and data if required)...
I am writing a program that requires one function in assembly.
Then, by definition, your program becomes much less portable, and it depends upon the calling conventions and ABI of your C++ implementation and your operating system.
It would then be coherent to use some compiler-specific features (which are not in portable standard C++11, e.g. as drafted in n3337).
My recommendation is then to take advantage of GCC extended assembly. Read the chapter on using assembly language with C (it also, and of course, applies to C++).
By directly embedding some extended asm inside a C++ member function, you avoid the hassle of calling some function. Probably, your assembler code is really short and executed quickly. So it is better to embed it in C or C++ functions, avoiding the costs of function call prologue and epilogue.
NB: In 2019, it makes no economic sense to spend effort writing large amounts of assembly code: most optimizing compilers produce better assembler code than a reasonable programmer can (in a reasonable time). So you have an incentive to use small assembler code chunks in larger C++ or C functions.
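As a minimal sketch of that advice, assuming GCC or Clang targeting x86 (the class name Accumulator and its members are hypothetical, not taken from the question), a one-instruction extended asm statement can sit directly inside a member function, so each instance keeps its own data. Other compilers and targets take the plain C++ fallback:

```cpp
// Hypothetical class: every instance has its own total_, and the member
// function embeds a tiny piece of GCC-style extended asm.
class Accumulator {
    int total_ = 0;

public:
    int add(int x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
        // AT&T syntax: addl src, dst.
        // total_ is read and written ("+r"), x is only read ("r"),
        // and the flags register is clobbered ("cc").
        __asm__("addl %1, %0" : "+r"(total_) : "r"(x) : "cc");
#else
        total_ += x;  // portable fallback for other compilers or targets
#endif
        return total_;
    }

    int total() const { return total_; }
};
```

Because the operands are described through constraints, the compiler picks the registers and can still inline and optimize around the asm statement.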
Yes, you can. Either define it as an inline wrapper that passes all the args (including the implicit this pointer) to an external function, or figure out the name-mangling to define the right symbol for the function entry point in asm.
An example of the wrapper way:
extern "C" int asm_function(myclass *p, int a, double b);
class myclass {
int q, r, member_array[4];
int my_method(int a, double b) { return asm_function(this, a, b); }
};
A stand-alone definition of my_method for x86-64 would be just jmp asm_function, a tailcall, because the args are identical. So after inlining, you'll have call asm_function instead of call _Zmyclass_mymethodZd or whatever the actual name mangling is. (I made that up).
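For a variant of the wrapper approach that can be compiled and run without an assembler, here is a sketch in which the external routine is written in C++ purely as a stand-in for the real asm implementation. The names Scaler and scaler_compute, and the arithmetic itself, are hypothetical:

```cpp
struct Scaler;  // forward declaration so the prototype can name the type

// In a real project this would be implemented in a separate asm file;
// it is written in C++ here only so the example is self-contained.
extern "C" int scaler_compute(Scaler* self, int a, double b);

struct Scaler {
    int q;
    int r;

    // Inline wrapper: passes the implicit this pointer as an explicit argument.
    int my_method(int a, double b) { return scaler_compute(this, a, b); }
};

// Stand-in body: (truncated b + self->r) * a.
extern "C" int scaler_compute(Scaler* self, int a, double b) {
    return (static_cast<int>(b) + self->r) * a;
}
```

Each Scaler object carries its own q and r, so the same external routine serves any number of instances.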
In GNU C / C++, there's also the asm keyword to set the asm symbol name for a function, instead of letting the normal name-mangling rules generate it from the class and member-function name, and arg types. (Or with extern "C", usually just a leading underscore or not, depending on the platform.)
class myclass {
int q, r, member_array[4];
public:
int my_method(int a, double b)
asm("myclass_my_method_int_double"); // symbol name for separate asm
};
Then in your .asm file (e.g. NASM syntax, for the x86-64 System V calling convention)
global myclass_my_method_int_double
myclass_my_method_int_double:
;; inputs: myclass *this in RDI, int a in ESI, double b in XMM0
cvtsd2si eax, xmm0
add eax, [rdi+4] ;; this->r
imul eax, esi
ret
(You can pick any name you want for your asm function; it doesn't have to encode the args. But doing that will let you overload it without conflicting symbol names.)
Example on Godbolt of a test caller calling the asm("") way:
void foo(myclass *p){
p->my_method(1, 1.0);
}
compiles to
foo(myclass*):
movsd xmm0, qword ptr [rip + .LCPI0_0] # xmm0 = mem[0],zero
mov esi, 1
jmp myclass_my_method_int_double # TAILCALL
Note that the caller emitted jmp myclass_my_method_int_double, using your name, not a mangled name.
I have two functions, looking like this in C++:
void f1(...);
void f2(...);
I can change the body of f1, but f2 is defined in another library I cannot change. I absolutely have to (tail) call f2 inside f1, and I must pass all arguments provided to f1 to f2, but as far as I know, this is impossible in pure C or C++. There is no alternative of f2 that accepts a va_list, unfortunately. The call to f2 happens last in the function, so I need some form of tailcall.
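For contrast, this is a sketch of the standard forwarding pattern that only works when the callee has a va_list variant, which is exactly what f2 lacks. It uses std::vsnprintf as the callee; format_message is a hypothetical name:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Variadic front end that forwards all of its arguments to a
// va_list-taking function.
std::string format_message(const char* fmt, ...) {
    char buf[128];
    va_list args;
    va_start(args, fmt);                         // capture everything after fmt
    std::vsnprintf(buf, sizeof buf, fmt, args);  // callee consumes the va_list
    va_end(args);
    return std::string(buf);
}
```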
I decided to use assembly to pop the stack frame of the current function, then jump to f2 (it is actually received as a function pointer in a variable, which is why I first store it in a register):
__asm {
mov eax, f2
leave
jmp eax
}
In MSVC++, in Debug, it appears to work at first, but it somehow messes with the return values of other functions, and sometimes it crashes. In Release, it always crashes.
Is this assembly code incorrect, or do some optimizations of the compiler somehow break this code?
The compiler will make no guarantees at the point you are digging around. A trampoline function might work, but you have to save state between them, and do a lot of digging around.
Here is a skeleton, but you will need to know a lot about calling conventions, class method invocation, etc...
/* argn, ..., arg0, retaddr */
trampoline:
push < all volatile regs >
call <get thread local storage >
copy < volatile regs and ret addr > to < local storage >
pop < volatile regs >
remove ret addr
call f2
call < get thread local storage >
restore < volatile regs and ret addr>
jmp f1
ret
You have to write f1 in pure asm for it to be guaranteed-safe.
In all the major x86 calling conventions, the callee "owns" the args, and can modify the stack-space that held them. (Whether or not the C source changes them and whether or not they're declared const).
e.g. void foo(int x) { x += 1; bar(x); } might modify the stack space above the return address that holds x, if compiled with optimization disabled. Making another call with the same args requires storing them again unless you know the callee hasn't stepped on them. The same argument applies for tailcalling from the end of one function.
I checked on the Godbolt compiler explorer; both MSVC and gcc do in fact modify x on the stack in debug builds. gcc uses add DWORD PTR [ebp+8], 1 before pushing [ebp+8].
Compilers in practice may not actually take advantage of this for variadic functions, though, so depending on the definitions of your functions, you might get away with it if you can convince them to make a tailcall.
Note that void bar(...); is not a valid prototype in C, though:
# gcc -xc on Godbolt to force compiling as C, not C++
<source>:1:10: error: ISO C requires a named argument before '...'
It is valid in C++, or at least g++ accepts it while gcc doesn't. MSVC accepts it in C++ mode, but not in C mode. (Godbolt has a whole separate C mode with a different set of compilers, which you can use to get MSVC to compile code as C instead of C++. I don't know a command-line option to flip it to C mode the way gcc has -xc and -xc++)
Anyway, it might work (in optimized builds) to write f2(); at the end of f1, but that's nasty and completely lies to the compiler about what args are passed. And of course it only works for a calling convention with no register args. (But you were showing 32-bit asm, so you might well be using a calling convention with no register args.)
Any decent compiler will use jmp f2 to make an optimized tail-call in this case, because they both return void. (For non-void, you would return f2();)
BTW, if mov eax, f2 works, then jmp f2 will also work.
Your code can't work in an optimized build, though, because you're assuming that the compiler made a legacy stack-frame, and that the function won't inline anywhere.
It's unsafe even in a debug build because the compiler may have pushed some call-preserved registers that need to be popped before leaving the function (and before running leave to destroy the stack frame).
The trampoline idea that @mevets showed could maybe be simplified: if there's a reasonable fixed upper size limit on the args, you can copy maybe 64 or 128 bytes of potential-args from your incoming args into args for f1. A few SIMD vectors will do it. Then you can call f1 normally, then tail-call f2 from your asm wrapper.
If there are potentially register args, save them to stack space before the args you copy, and restore them before tailcalling.
I'm working on a hook in C++ and ASM. Currently I have made a simple inline hook that places a jump at the first instruction of the target function, which in this case is OutputDebugString, just for testing purposes.
My hook finally works after about 3 days of research and figuring out the bits and pieces of how things work, but there is one problem: I have no idea how to change the parameters that come in to my "dummy" function before jumping on to the rest of the original function.
As you can see in my code, I have tried to change the parameter simply in C++, but of course this does not work, as I'm popping all the registers afterwards.
Anyway, here is my dummy function, which is what the hooked function jumps to:
static void __declspec(naked) MyDebugString(LPCTSTR lpOutputString) {
__asm {
PUSHAD
}
// Where I suppose I could run my code, but without being able to interfere with the parameters
lpOutputString = L"new message!";
__asm {
POPAD
MOV EDI, EDI
PUSH EBP
MOV EBP, ESP
JMP Addr
}
original_DebugString(lpOutputString);
}
As I said, I understand why the code is not working; I just can't see a proper solution to this. Any help is greatly appreciated.
Every compiler has a protocol for calling functions using assembly language. The protocol may be stated deep in their manuals.
A faster method to find the function protocols is to have the compiler generate an assembly language listing for your function.
The best method for writing inline assembly is to:
First write the function in C++ source code
Next print out the assembly listing of the function.
Review and understand how the compiler generated assembly works.
Lastly, modify the internal assembly to suit your needs.
My preference is to write the C++ code as efficiently as I can (or to help the compiler use optimal assembly language). I then review the assembly listing. I only change the inline assembly to invoke special processor features (such as block move instructions).
I want to ask about the order of function signature, call, and definition:
which one would the computer look at first, second, and third?
So:
#include <iostream>
using namespace std;
void max(void);
void min(void);
int main() {
max();
min();
return 0;
}
void max() {
return;
}
void min() {
return;
}
So this is what I think:
the computer will go to main and look at the function call, then it will look at the
function signature, and last, it will look at the definition.
Is that right?
Thanks
Is that right?
No.
You need to understand the difference between function declarations and function definitions, the difference between compilation, linking, and execution, and the difference between non-virtual and virtual functions.
Function declarations
This is a function declaration: void max(void);. It doesn't tell the compiler anything about what the function does. What it does is to tell the compiler how to call the function and how to interpret the result. When the compiler is compiling the body of some function, call it function A, the compiler doesn't need to know what other functions do. All it needs to know is what to do with the functions that function A calls. The compiler might generate code in assembly or some intermediate language that corresponds to your C++ function calls. Or it might reject your C++ code because your code doesn't make sense.
Determining whether your code makes sense is another key purpose of those function declarations. This is particularly important in C++ where multiple functions can have the same name. How would the compiler know which of the half dozen or so max functions to call if it didn't know about those functions? When your C++ code calls some function, the compiler must find one best match (possibly involving type conversions) with one of those function declarations. Your code doesn't make sense if the compiler can't find a match at all, or if it finds more than one match but can't distinguish one as the best match.
When the compiler does find a best match, the generated code will be in the form of a call to an undefined external reference to that function. Where that function lives is not the job of the compiler.
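A small sketch of that matching process, with a hypothetical overloaded function describe. The calls compile against the declarations alone, and the definitions could just as well live in another object file:

```cpp
#include <string>

// Declarations only: enough for the compiler to check and resolve calls.
std::string describe(int);
std::string describe(double);
std::string describe(const char*);

std::string pick_for_char()  { return describe('a'); }   // char promotes to int
std::string pick_for_float() { return describe(1.5f); }  // float promotes to double
std::string pick_for_text()  { return describe("hi"); }  // array decays to const char*

// describe(1L) would be rejected: long converts equally well to int and
// to double, so there is no single best match.

// Definitions may come later, or in a different translation unit.
std::string describe(int)         { return "int"; }
std::string describe(double)      { return "double"; }
std::string describe(const char*) { return "const char*"; }
```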
Function definitions
That void max(void) was a function declaration. The corresponding void max() {...} is the definition of that function. When the compiler is processing void max() {...} it doesn't have to worry about what other functions have called it. It just has to worry about processing void max() {...}. The body of this function becomes assembly or intermediate language code that is inserted into some compiled object file. The compiler marks the address of the entry point to this generated code as such.
Compilation versus linking
So far I've talked about what the compiler does. It generates chunks of low-level code that correspond to your C++ code. That generated code is not ready for prime time because of those external references. Resolving those undefined external references is the job of the linker. The linker is what builds your executable from multiple object files and libraries. It keeps track of where it has put those chunks of code in the executable. What about those undefined external references? If the linker has already placed the definition for that reference in the executable, the linker simply fills in the placeholder for that reference. If the linker hasn't come across the definition for that reference, it puts the reference and the placeholder onto a list of still-unresolved references. Every time the linker adds a chunk of code to the executable, it checks that list to see if it can fix any of those still-unresolved references. At the end, you will either have all references resolved or you will still have some outstanding ones. The latter is an error. The former means that you have an executable.
Execution
When your code runs, those function calls are really just some stack management wrapped around the machine language equivalent of that evil goto statement. There's no examining your function declarations; those don't even exist by the time the code is executed. Return? That's a goto also.
Non-virtual versus virtual functions
What I said above pertains to non-virtual functions. Run-time dispatching does occur for virtual functions. That run-time dispatching has nothing to do with examining function declarations. Those virtual functions are perhaps an issue for a different question.
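To make the distinction concrete, here is a minimal sketch with hypothetical types Shape and Circle. The virtual call is resolved at run time from the object's actual type, while the non-virtual call is fixed at compile time from the static type of the reference:

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }  // virtual
    std::string kind() const { return "shape"; }          // non-virtual
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
    std::string kind() const { return "circle"; }  // hides Shape::kind
};

// Dispatched at run time through the object's dynamic type.
std::string name_of(const Shape& s) { return s.name(); }

// Bound at compile time to Shape::kind, whatever the object really is.
std::string kind_of(const Shape& s) { return s.kind(); }
```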
One last thing:
Get out of the habit of using namespace std; Think of it as akin to smoking. It's a bad habit.
As you may know, the compiler converts the program into machine code (via several intermediate steps). Here is the disassembly of the machine code for main() when compiled with Visual Studio 2012 in debug mode on Windows 8:
int main() {
00C24400 push ebp # Setup stack frame
00C24401 mov ebp,esp
00C24403 sub esp,0C0h
00C24409 push ebx
00C2440A push esi
00C2440B push edi
00C2440C lea edi,[ebp-0C0h] # Fill with guard bytes
00C24412 mov ecx,30h
00C24417 mov eax,0CCCCCCCCh
00C2441C rep stos dword ptr es:[edi]
max();
00C2441E call max (0C21302h) # Call max
min();
00C24423 call min (0C2126Ch) # Call min
return 0;
00C24428 xor eax,eax
}
00C2442A pop edi # Restore stack frame
00C2442B pop esi
00C2442C pop ebx
00C2442D add esp,0C0h
00C24433 cmp ebp,esp
}
00C24435 call __RTC_CheckEsp (0C212D5h) # Check for memory corruption
00C2443A mov esp,ebp
00C2443C pop ebp
00C2443D ret
The exact details will vary from compiler to compiler and operating system to operating system. If min() or max() had arguments or return values, they would be passed as appropriate for the architecture. The key point is that the compiler has already worked out what the arguments and return values are and created machine code that just passes or accepts them.
You can learn more details if you wish to help with debugging or to do low level calls but be aware that the machine code emitted can be highly variable. For example, here is the same code compiled on the same system in release mode (i.e. with optimizations on):
return 0;
01151270 xor eax,eax
}
01151272 ret
As you can see, it has detected that min() and max() do nothing and removed them completely. Since there is now no stack frame to set up and restore, that is gone, leaving a single instruction to set eax to 0 followed by the return (since the return value is in the eax register).
I've come across __stdcall a lot these days.
MSDN doesn't explain very clearly what it really means, when and why should it be used, if at all.
I would appreciate if someone would provide an explanation, preferably with an example or two.
This answer covers 32-bit mode. (Windows x64 only uses 2 conventions: the normal one (which is called __fastcall if it has a name at all) and __vectorcall, which is the same except for how SIMD vector args like __m128i are passed).
Traditionally, C function calls are made with the caller pushing some parameters onto the stack, calling the function, and then popping the stack to clean up those pushed arguments.
/* example of __cdecl */
push arg1
push arg2
push arg3
call function
add esp,12 ; effectively "pop; pop; pop"
Note: The default convention — shown above — is known as __cdecl.
The other most popular convention is __stdcall. In it the parameters are again pushed by the caller, but the stack is cleaned up by the callee. It is the standard convention for Win32 API functions (as defined by the WINAPI macro in <windows.h>), and it's also sometimes called the "Pascal" calling convention.
/* example of __stdcall */
push arg1
push arg2
push arg3
call function // no stack cleanup - callee does this
This looks like a minor technical detail, but if there is a disagreement on how the stack is managed between the caller and the callee, the stack will be destroyed in a way that is unlikely to be recovered.
Since __stdcall does stack cleanup, the (very tiny) code to perform this task is found in only one place, rather than being duplicated in every caller as it is in __cdecl. This makes the code very slightly smaller, though the size impact is only visible in large programs.
(Optimizing compilers can sometimes leave space for args allocated across multiple cdecl calls made from the same function and mov args into it, instead of always add esp, n / push. That saves instructions but can increase code-size. For example gcc -maccumulate-outgoing-args always does this, and was good for performance on older CPUs before push was efficient.)
Variadic functions like printf() are impossible to get right with __stdcall, because only the caller really knows how many arguments were passed in order to clean them up. The callee can make some good guesses (say, by looking at a format string), but it's legal in C to pass more args to printf than the format-string references (they'll be silently ignored). Hence only __cdecl supports variadic functions, where the caller does the cleanup.
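A short sketch of why the callee cannot do the cleanup: a variadic function only reads as many arguments as it is told to, and has no way of discovering how many were really passed. The function sum_ints is a hypothetical example that takes an explicit count:

```cpp
#include <cstdarg>

// The callee reads exactly `count` ints because the caller said so;
// any extra arguments are never looked at.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(2, 10, 20, 30) passes three values but only two are read. With caller cleanup that is harmless, because the caller removes exactly what it pushed.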
Linker symbol name decorations:
As mentioned above, calling a function with the "wrong" convention can be disastrous, so Microsoft has a mechanism to prevent this from happening. It works well, though it can be maddening if one does not know what the reasons are.
They have chosen to resolve this by encoding the calling convention into the low-level function names with extra characters (which are often called "decorations"), and these are treated as unrelated names by the linker. The default calling convention is __cdecl, but each one can be requested explicitly with the /G? parameter to the compiler.
__cdecl (cl /Gd ...)
All function names of this type are prefixed with an underscore, and the number of parameters does not really matter because the caller is responsible for stack setup and stack cleanup. It is possible for a caller and callee to be confused over the number of parameters actually passed, but at least the stack discipline is maintained properly.
__stdcall (cl /Gz ...)
These function names are prefixed with an underscore and appended with @ plus the number of bytes of parameters passed. By this mechanism, it's not possible to call a function with the wrong amount of parameters. The caller and callee definitely agree on returning with a ret 12 instruction for example, to pop 12 bytes of stack args along with the return address.
You'll get a link-time or runtime DLL error instead of having a function return with ESP pointing somewhere the caller isn't expecting. (For example if you added a new arg and didn't recompile both the main program and the library. Assuming you didn't fool the system by making an earlier arg narrower, like int64_t -> int32_t.)
__fastcall (cl /Gr ...)
These function names start with an @ sign and are suffixed with the @bytes count, much like __stdcall. The first 2 args are passed in ECX and EDX, the rest are passed on the stack. The byte count includes the register args. As with __stdcall, a narrow arg like char still uses up a 4-byte arg-passing slot (a register, or a dword on the stack).
Examples:
Declaration                           ----->  decorated name
void __cdecl foo(void);               ----->  _foo
void __cdecl foo(int a);              ----->  _foo
void __cdecl foo(int a, int b);       ----->  _foo
void __stdcall foo(void);             ----->  _foo@0
void __stdcall foo(int a);            ----->  _foo@4
void __stdcall foo(int a, int b);     ----->  _foo@8
void __fastcall foo(void);            ----->  @foo@0
void __fastcall foo(int a);           ----->  @foo@4
void __fastcall foo(int a, int b);    ----->  @foo@8
Note that in C++, the normal name-mangling mechanism that allows function overloading is used instead of the @8 suffix, not in addition to it. So you'll only see actual byte counts in extern "C" functions. For example, see https://godbolt.org/z/v7EaWs.
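The decoration rules for extern "C" names on 32-bit MSVC can be summarized in a few lines of code. This is only an illustrative sketch: decorate and CallConv are hypothetical names, and the caller is assumed to supply the argument byte count with each argument already rounded up to 4 bytes:

```cpp
#include <string>

enum class CallConv { Cdecl, Stdcall, Fastcall };

// Builds the decorated linker name for an extern "C" function.
// arg_bytes is the total argument size, register args included.
std::string decorate(CallConv conv, const std::string& name, unsigned arg_bytes) {
    switch (conv) {
        case CallConv::Cdecl:
            return "_" + name;  // byte count is not encoded
        case CallConv::Stdcall:
            return "_" + name + "@" + std::to_string(arg_bytes);
        case CallConv::Fastcall:
            return "@" + name + "@" + std::to_string(arg_bytes);
    }
    return name;  // not reached for valid enum values
}
```

Because _foo@4 and _foo@8 are different strings, the linker treats them as unrelated symbols, which is how a mismatch turns into a link error instead of a corrupted stack.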
All functions in C/C++ have a particular calling convention. The point of a calling convention is to establish how data is passed between the caller and callee and who is responsible for operations such as cleaning out the call stack.
The most popular calling conventions on Windows are:
__stdcall: parameters are pushed on the stack in reverse order (right to left)
__cdecl: parameters are pushed on the stack in reverse order (right to left)
__clrcall: parameters are loaded onto the CLR expression stack in order (left to right)
__fastcall: parameters are stored in registers, then pushed on the stack
__thiscall: parameters are pushed on the stack; the this pointer is stored in ECX
Adding this specifier to the function declaration essentially tells the compiler that you want this particular function to have this particular calling convention.
The calling conventions are documented here
https://learn.microsoft.com/en-us/cpp/cpp/calling-conventions
Raymond Chen also did a long series on the history of the various calling conventions (5 parts) starting here.
https://devblogs.microsoft.com/oldnewthing/20040102-00/?p=41213
__stdcall is a calling convention: a way of determining how parameters are passed to a function (on the stack or in registers) and who is responsible for cleaning up after the function returns (the caller or the callee).
Raymond Chen wrote a blog about the major x86 calling conventions, and there's a nice CodeProject article too.
For the most part, you shouldn't have to worry about them. The only case in which you should is if you're calling a library function that uses something other than the default -- otherwise the compiler will generate the wrong code and your program will probably crash.
Unfortunately, there is no easy answer for when to use it and when not.
__stdcall means that the arguments to a function are pushed onto the stack from last to first (right to left) and that the callee removes them before returning. This is as opposed to __cdecl, which pushes the arguments in the same order but leaves the cleanup to the caller, and __fastcall, which places the first two arguments in registers (ECX and EDX on 32-bit MSVC), while the rest go on the stack.
You just need to know what the callee expects, or if you are writing a library, what your callers are likely to expect, and make sure you document your chosen convention.
That's a calling convention that WinAPI functions need to be called properly. A calling convention is a set of rules on how the parameters are passed into the function and how the return value is passed from the function.
If the caller and the called code use different conventions you run into undefined behaviour (like such a strange-looking crash).
C++ compilers don't use __stdcall by default; they use other conventions. So in order to call WinAPI functions from C++ you need to specify that they use __stdcall. This is usually done in the Windows SDK header files, and you also do it when declaring function pointers.
It specifies a calling convention for a function. A calling convention is a set of rules how parameters are passed to a function: in which order, per address or per copy, who is to clean up the parameters (caller or callee) etc.
__stdcall denotes a calling convention (see this PDF for some details). This means it specifies how function arguments are pushed and popped from the stack, and who is responsible.
__stdcall is just one of several calling conventions, and is used throughout the WINAPI. You must use it if you provide function pointers as callbacks for some of those functions. In general, you do not need to denote any specific calling convention in your code, but just use the compiler's default, except for the case noted above (providing callbacks to 3rd party code).
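A sketch of what denoting the convention on a callback looks like. DEMO_CALLBACK, BinaryOp, add_numbers and apply are hypothetical names; the macro expands to __stdcall only under MSVC and to nothing elsewhere, so the snippet also builds with other compilers. The Windows headers provide the WINAPI and CALLBACK macros for the same purpose:

```cpp
#if defined(_MSC_VER)
#  define DEMO_CALLBACK __stdcall
#else
#  define DEMO_CALLBACK  // other compilers: use their default convention
#endif

// The calling convention is part of the function pointer type.
using BinaryOp = int (DEMO_CALLBACK *)(int, int);

// The callback itself must be declared with the same convention.
int DEMO_CALLBACK add_numbers(int a, int b) { return a + b; }

// Stand-in for a library routine that calls back into user code.
int apply(BinaryOp op, int a, int b) { return op(a, b); }
```

On 32-bit MSVC, passing a function of a different convention to apply is a type mismatch that the compiler reports, rather than a crash at run time.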
Simply put, when you call a function, its arguments get loaded onto the stack or into registers. __stdcall is one convention for doing that (rightmost argument first, then the one to its left, and so on); __cdecl is another.
If you specify one, you instruct the compiler to pass the arguments and clean them up in that specific way, so the caller and callee do not mismatch and crash.
Otherwise the callee and the caller might use different conventions, causing the program to crash.
__stdcall is the calling convention used for the function. This tells the compiler the rules that apply for setting up the stack, pushing arguments and getting a return value. There are a number of other calling conventions like __cdecl, __thiscall, __fastcall and __naked.
__stdcall is the standard calling convention for Win32 system calls.
More details can be found on Wikipedia.