I am using Visual C++ 2010, and MASM as my x64-Assembler.
This is my C++ code:
// include directive
#include "stdafx.h"
// functions
extern "C" int Asm();
extern "C" int (convention) sum(int x, int y) { return x + y; }
// main function
int main()
{
// print asm
printf("Asm returned %d.\n", Asm());
// get char, return
_getch();
return EXIT_SUCCESS;
}
And my assembly code:
; external functions
extern sum : proc
; code segment
.code
Asm proc
; reserve 32 bytes of shadow space plus 8 bytes to keep rsp 16-byte aligned
sub rsp, 28h
; setup parameters
mov ecx, 10
mov edx, 15
; call
call sum
; release the shadow space
add rsp, 28h
; return
ret
Asm endp
end
The reason I am doing this is so I can learn the different calling conventions.
I would make sum's calling convention stdcall, and modify the asm code so it would call sum the "stdcall" way. Once I got that working, I would make it, say, fastcall, and then call it in asm the "fastcall" way.
But look at my assembly code right now. When I use that code, no matter if sum is stdcall, fastcall or cdecl, it will compile, execute fine, and print 25 as my sum.
My question: How, and why can __cdecl, __stdcall and __fastcall all be called the exact same way?
The problem is that you're compiling for x64 targets. From MSDN:
"Given the expanded register set, x64 just uses the __fastcall calling
convention and a RISC-based exception-handling model. The __fastcall
model uses registers for the first four arguments and the stack frame
to pass the other parameters."
Switch over to compiling for x86 targets, and you should be able to see the various calling conventions in action.
As far as I know, x64 only uses the __fastcall convention. __cdecl and __stdcall are accepted but will just be compiled as __fastcall.
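One way to see this from the C++ side is to compare function pointer types that differ only in their convention keyword. The following is a minimal sketch, not part of the original answers: the CC_* macros and the sum_* function names are made up for illustration, and the keywords are hidden behind macros because they are MSVC-specific.

```cpp
#include <type_traits>

// Assumption: the convention keywords only exist on MSVC, so other
// compilers get empty macros.
#ifdef _MSC_VER
#define CC_CDECL    __cdecl
#define CC_STDCALL  __stdcall
#define CC_FASTCALL __fastcall
#else
#define CC_CDECL
#define CC_STDCALL
#define CC_FASTCALL
#endif

// Three copies of sum that differ only in the requested convention.
extern "C" int CC_CDECL    sum_cdecl(int x, int y)    { return x + y; }
extern "C" int CC_STDCALL  sum_stdcall(int x, int y)  { return x + y; }
extern "C" int CC_FASTCALL sum_fastcall(int x, int y) { return x + y; }

using cdecl_fn   = int (CC_CDECL *)(int, int);
using stdcall_fn = int (CC_STDCALL *)(int, int);

// True when the compiler treats the two pointer types as the same type,
// i.e. when the convention keyword makes no difference.
constexpr bool conventions_collapse =
    std::is_same<cdecl_fn, stdcall_fn>::value;
```

On an x64 MSVC build conventions_collapse should be true, which matches the behaviour in the question; on a 32-bit MSVC build the two pointer types are distinct and it should be false.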
My question is very specific: I want to force the compiler to take the code of a function and copy it inside another one, like the inline or __forceinline keywords can do, but I want to pass the function to be copied to the other function as an argument. Here is a simple example.
using pFunc = void(*)();
void func_1() { /*some code*/ }
void func_2(pFunc function) { /*some code*/ } // after compiling, I want this function to take no argument and contain a copy of func_1.
int main()
{
func_2(func_1);
}
With this example, the compiler passes the pointer to func_1 as an argument to func_2, as expected.
I tried adding the inline keyword to func_1 and also tried passing the argument of func_2 by reference, but the compiler didn't copy func_1 into func_2.
Any idea how I can do that?
I use the Visual Studio compiler (MSVC) with the 2017 toolset (v141).
My project platform is x64.
You can use a noinline template function to get the asm you want
So you want the compiler to do constant-propagation into a clone of void func_2(pFunc f){ f(); }? Like what GCC might do with __attribute__((noinline)) but not noclone?
For example,
using pFunc = void(*)();
int sink, sink2;
#ifdef _MSC_VER
#define NOINLINE __declspec(noinline)
#define ALWAYS_INLINE /* */
#else
#define NOINLINE __attribute__((noinline)) // not noclone and/or noipa: we want the clone
#define ALWAYS_INLINE __attribute__((always_inline))
#endif
ALWAYS_INLINE // without this, gcc chooses to clone .constprop.0 with just a jmp func_1
void func_1() { sink = 1; sink2 = 2; }
NOINLINE static void func_2(pFunc function) { function(); }
int main()
{
func_2(func_1);
}
produces, with GCC11.3 -O2 or higher, or -O1 -fipa-cp, on Godbolt. (Clang is similar):
# GCC11 -O3 with C++ name demangling
func_1():
mov DWORD PTR sink[rip], 1
mov DWORD PTR sink2[rip], 2
ret
func_2(void (*)()) [clone .constprop.0]:
mov DWORD PTR sink[rip], 1
mov DWORD PTR sink2[rip], 2
ret
main:
# note no arg passed, calling a special version of the function
# specialized for function = func_1
call func_2(void (*)()) [clone .constprop.0]
xor eax, eax
ret
Of course if we hadn't disabled inlining of func_2, main would just call func_1. Or inline that body of func_1 into main and not do any calls.
MSVC might not be willing to do that "optimization", instead preferring to just inline func_2 into main as call func_1.
If you want to force it to make clunky asm that duplicates func_1 unnecessarily, you could use a template to do the same thing as constprop, taking the function pointer as a template arg, so you can instantiate func_2<func_1> as a stand-alone non-inline function if you really want. (Perhaps with __declspec(noinline)).
Your func_2 can accept func_1 as an unused argument if you want.
using pFunc = void(*)();
int sink, sink2;
#ifdef _MSC_VER
#define NOINLINE __declspec(noinline)
#define ALWAYS_INLINE /* */
#else
#define NOINLINE __attribute__((noinline)) // not noclone or noipa, we *want* those to happen
#define ALWAYS_INLINE __attribute__((always_inline))
#endif
//ALWAYS_INLINE // Seems not needed for this case, with the template version
void func_1() { sink = 1; sink2 = 2; }
template <pFunc f>
NOINLINE void func_2() { f(); }
int main()
{
func_2<func_1>();
}
This compiles as desired with MSVC -O2 (Godbolt), and with GCC/clang:
int sink DD 01H DUP (?) ; sink
int sink2 DD 01H DUP (?) ; sink2
void func_2<&void func_1(void)>(void) PROC ; func_2<&func_1>, COMDAT
mov DWORD PTR int sink, 1 ; sink
mov DWORD PTR int sink2, 2 ; sink2
ret 0
void func_2<&void func_1(void)>(void) ENDP ; func_2<&func_1>
void func_1(void) PROC ; func_1, COMDAT
mov DWORD PTR int sink, 1 ; sink
mov DWORD PTR int sink2, 2 ; sink2
ret 0
void func_1(void) ENDP ; func_1
main PROC ; COMDAT
$LN4:
sub rsp, 40 ; 00000028H
call void func_2<&void func_1(void)>(void) ; func_2<&func_1>
xor eax, eax
add rsp, 40 ; 00000028H
ret 0
main ENDP
Note the duplicated bodies of func_1 and func_2.
You should check (with a disassembler) that the linker doesn't do identical code folding and just attach both symbol names to one block of machine code.
I don't think this looks like much of an obfuscation technique; IDK why having a 2nd copy of a function with identical machine code would be a problem to reverse engineer. I guess it would maybe create more overall work, and people wouldn't notice that two calls to different functions are actually doing the same thing.
I mostly answered as an exercise in making a compiler spit out the asm I wanted it to, whether or not that has value to anyone else.
Obviously it only works for compile-time-constant function pointers; commenters have been discussing self-modifying code and scripting languages. If you wanted this for non-const function pointer args to func_2, you're completely out of luck in a language like C++ that's designed for strictly ahead-of-time compilation.
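The compile-time-constant restriction can be seen in a small sketch. The names one, two and func_2_runtime are made up for illustration, and the functions return values only so that the effect is easy to observe.

```cpp
using pFunc = int (*)();

inline int one() { return 1; }
inline int two() { return 2; }

// The callee is a template argument, so every func_2<f> is a separate
// function in which f is known at compile time.
template <pFunc f>
int func_2() { return f() * 10; }

// A pointer that is only known at run time cannot select an instantiation;
// it has to be passed as an ordinary argument and called indirectly.
inline int func_2_runtime(pFunc f) { return f() * 10; }
```

Calling func_2<one>() and func_2<two>() gives two independent instantiations, while func_2_runtime(two) keeps the indirect call unless the optimizer happens to see the constant.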
I'm currently learning masm and I am having a problem calling an external function.
I have a function in c++ which is called writei, it receives a uint64 and outputs it.
int writei(uint64_t a)
{
cout << a;
return 1;
}
I tried declaring it with extern and calling it from an .asm file, but the build fails with "unresolved external symbol writei referenced in function mai".
This is the MASM code (I'm using Visual Studio 2019):
extern writei : proto
.code
mai proc
push rbp
push rsp
mov ecx,3
call writei
pop rsp
pop rbp
ret
mai endp
end
Among other things, you need extern "C" on your C++ function declaration.
For example:
extern "C" {
int writei(uint64_t a);
}
int writei(uint64_t a)
{
cout << a;
return 1;
}
Here's a good article that explains this in more detail:
ISO C++ FAQ: How to mix C and C++
Like the title says, I want to trace ALL functions calls in my application (from inside).
I tried using "_penter" but I get either a recursion limit reached error or an access violation when I try to prevent the recursion.
Is there any way to achieve this?
Update
What I tried:
extern "C"
{
void __declspec(naked) _cdecl _penter()
{
_asm {
push eax
push ecx
push edx
mov ecx, [esp + 0Ch]
push ecx
mov ecx, offset Context::Instance
call Context::addFrame
pop edx
pop ecx
pop eax
ret
}
}
} // extern "C"
class Context
{
public:
__forceinline void addFrame(const void* addr) throw() {}
static thread_local Context Instance;
};
Sadly, this still gives a stack overflow due to recursion.
Your approach is correct: the /Gh and /GH compiler switches plus the _penter and _pexit functions are the way to go.
I think there are errors in your implementation of these functions. That's very low-level stuff: for 32-bit builds you have to use __declspec(naked), and for 64-bit builds you have to use assembler. Both are quite tricky to implement correctly.
Take a look at this repository for an example of how to do it right:
https://github.com/tyoma/micro-profiler Specifically, look at this source file: https://github.com/tyoma/micro-profiler/blob/master/micro-profiler/collector/hooks.asm As you can see, they decided to use assembler for both platforms, and from that they call a C++ function to record call information. Also note how the C++ collector implementation uses __forceinline to avoid recursion.
"but I get either a recursion limit reached error"
This can happen if the compiler also inserts a call to _penter inside the implementation of Context::addFrame, which then calls Context::addFrame again, recursively.
But what about __forceinline, you may ask? It does nothing here. The C/C++ compiler can insert a copy of a function body only at call sites in code that it generates itself. It cannot insert the body into code that it did not compile. So when a function marked __forceinline is called from assembler code, it is called in the usual way rather than expanded in place, and your __forceinline simply has no effect.
You need to implement Context::addFrame (and every function it calls) in a separate C++ file (say, context.cpp) compiled without the /Gh option.
You can set /Gh for every file in the project except context.cpp.
If the project has too many .cpp files for that, you can set /Gh for the whole project. How do you then remove it for the single file context.cpp? There is one unusual way: copy the <cmdline> for this file and set a custom build tool for it, with
Command Line: CL.exe <cmdline> $(InputFileName) (do not forget to remove /Gh) and Outputs: $(IntDir)\$(InputName).obj. It is unorthodox, but it works perfectly.
So in context.cpp you can have the following code:
class Context
{
public:
void __fastcall addFrame(const void* addr);
int _n;
static thread_local Context Instance;
};
thread_local Context Context::Instance;
void __fastcall Context::addFrame(const void* addr)
{
#pragma message(__FUNCDNAME__)
DbgPrint("%p>%u\n", addr, _n++);
}
If Context::addFrame calls any other internal function (explicitly or implicitly), put that function in this file too, so that it is also compiled without /Gh.
It is better to implement _penter in a separate asm file rather than as inline asm (which is not supported on x64 anyway).
So for x86 you can create code32.asm ( ml /c /Cp $(InputFileName) -> $(InputName).obj):
.686p
.MODEL flat
extern ?addFrame@Context@@QAIXPBX@Z:proc
extern ?Instance@Context@@2V12@A:byte
_TEXT segment 'CODE'
__penter proc
push edx
push ecx
mov edx,[esp+8]
lea ecx,?Instance@Context@@2V12@A
call ?addFrame@Context@@QAIXPBX@Z
pop ecx
pop edx
ret
__penter endp
_TEXT ends
end
Note: you need to save only ecx and edx (this matters if functions outside context.cpp use __fastcall).
For x64, create code64.asm ( ml64 /c /Cp $(InputFileName) -> $(InputName).obj):
extern ?addFrame@Context@@QEAAXPEBX@Z:proc
extern ?Instance@Context@@2V12@A:byte
_TEXT segment 'CODE'
_penter proc
mov [rsp+8],rcx
mov [rsp+16],rdx
mov [rsp+24],r8
mov [rsp+32],r9
mov rdx,[rsp]
sub rsp,28h
lea rcx,?Instance@Context@@2V12@A
call ?addFrame@Context@@QEAAXPEBX@Z
add rsp,28h
mov r9,[rsp+32]
mov r8,[rsp+24]
mov rdx,[rsp+16]
mov rcx,[rsp+8]
ret
_penter endp
_TEXT ENDS
end
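Independently of which files are compiled with /Gh, a re-entrancy flag inside the recorder is a cheap extra safeguard against the hook calling itself. The sketch below is an assumption-laden illustration, not the code from the answer: TraceContext, helper and the fixed-size frames array are hypothetical, and helper simply stands in for any instrumented function that would call back into the hook.

```cpp
#include <cstddef>

// Hypothetical recorder with a re-entrancy guard.
struct TraceContext {
    static constexpr std::size_t kMaxFrames = 64;
    const void* frames[kMaxFrames] = {};
    std::size_t count = 0;
    bool busy = false;  // true while addFrame is already running

    void addFrame(const void* addr) {
        if (busy) return;   // re-entered from inside the hook: ignore the call
        busy = true;
        if (count < kMaxFrames) frames[count++] = addr;
        helper();           // stands in for an instrumented helper function
        busy = false;
    }

    // Simulates what a compiler-inserted _penter call would do inside a
    // helper that was compiled with /Gh: it calls straight back into addFrame.
    void helper() { addFrame(this); }
};
```

In a real hook the context would need to be per thread (for example a thread_local instance, as in the answer), and the guard does not replace compiling the recorder without /Gh; it only limits the damage if an instrumented function slips through.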
Here is what I use:
Configuration Properties > C/C++ > Command Line
Add the compiler options to the Additional Options box:
Add flag /Gh for the _penter hook
Add flag /GH for the _pexit hook
Code I use for tracing / logging:
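#include <windows.h>
#include <dbghelp.h>  // SymInitialize, SymFromAddr, SYMBOL_INFO; link with dbghelp.lib
#include <stdlib.h>   // calloc
#include <intrin.h>   // _ReturnAddress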
extern "C" void __declspec(naked) __cdecl _penter(void) {
__asm {
push ebp; // standard prolog
mov ebp, esp;
sub esp, __LOCAL_SIZE
pushad; // save registers
}
// _ReturnAddress always returns the address directly after the call, but that is not the start of the function!
PBYTE addr;
addr = (PBYTE)_ReturnAddress() - 5;
SYMBOL_INFO* mysymbol;
HANDLE process;
process = GetCurrentProcess();
SymInitialize(process, NULL, TRUE);
mysymbol = (SYMBOL_INFO*)calloc(sizeof(SYMBOL_INFO) + 256 * sizeof(char), 1);
mysymbol->MaxNameLen = 255;
mysymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
SymFromAddr(process, (DWORD64)((void*)addr), 0, mysymbol);
myprintf("Entered Function: %s [0x%X]\n", mysymbol->Name, addr);
free(mysymbol); // release the buffer allocated with calloc above
_asm {
popad; // restore regs
mov esp, ebp; // standard epilog
pop ebp;
ret;
}
}
extern "C" void __declspec(naked) __cdecl _pexit(void) {
__asm {
push ebp; // standard prolog
mov ebp, esp;
sub esp, __LOCAL_SIZE
pushad; // save registers
}
// _ReturnAddress always returns the address directly after the call, but that is not the start of the function!
PBYTE addr;
addr = (PBYTE)_ReturnAddress() - 5;
SYMBOL_INFO* mysymbol;
HANDLE process;
process = GetCurrentProcess();
SymInitialize(process, NULL, TRUE);
mysymbol = (SYMBOL_INFO*)calloc(sizeof(SYMBOL_INFO) + 256 * sizeof(char), 1);
mysymbol->MaxNameLen = 255;
mysymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
SymFromAddr(process, (DWORD64)((void*)addr), 0, mysymbol);
myprintf("Exit Function: %s [0x%X]\n", mysymbol->Name, addr);
free(mysymbol); // release the buffer allocated with calloc above
_asm {
popad; // restore regs
mov esp, ebp; // standard epilog
pop ebp;
ret;
}
}
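The "- 5" in the code above relies on the hook being reached through a 5-byte near call (opcode E8 followed by a 32-bit relative offset) placed at the start of the instrumented function, which is what /Gh is expected to emit on x86. A minimal sketch of that arithmetic, with hypothetical helper names, might look like this:

```cpp
#include <cstddef>

// Assumption: the call that reached the hook is "E8 rel32", 5 bytes long.
constexpr std::ptrdiff_t kNearCallSize = 5;

// Hypothetical helper: maps the return address seen inside the hook back
// to the address of the call instruction itself.
inline const unsigned char* call_site_from_return_address(const void* return_address) {
    return static_cast<const unsigned char*>(return_address) - kNearCallSize;
}

// Sanity check that the computed address really holds a near call opcode.
inline bool looks_like_near_call(const unsigned char* site) {
    return site[0] == 0xE8;
}
```

If the compiler ever emitted a different call encoding, the subtraction would land in the wrong place, so checking the opcode before trusting the result seems prudent.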
We all know that private methods and members are only accessible inside the class, in the same way that protected methods and members are accessible inside the class and in classes derived from it. But where is this «access control» enforced? Does it happen at compile time, or does the compiler add additional machine code that checks it at run time?
Can I create a class like this:
class Print
{
public:
void printPublic();
private:
void printPrivate();
};
int main()
{
Print print;
print.printPublic(); // Change this to printPrivate() after compiling the code
return(EXIT_SUCCESS);
}
And then after compiling the code edit the machine code to call printPrivate() instead of printPublic() method without error?
Once you've fiddled around with the machine code, you're no longer compiling C++, but you're programming directly in machine code.
Your question is therefore somewhat moot.
You can regard the access specifiers as being essentially compile time directives, but note that the compiler can make optimisation choices based on them. In other words, it could be either. The C++ standard doesn't have to say anything about this either.
The «access control» happens at compile time, and only for C++ code. You don't even need to edit the machine code: you can easily call private methods from assembly language, which demonstrates that the restriction exists only in C++. And of course there is no additional machine code that checks access at run time; a method has no way to check who is calling it.
A simple demo follows. Note that the mangled function names depend on whether you compile for x86 or x64, and probably on the compiler. My demo is for the CL compiler on the x64 platform, but it can easily be changed for x86 or another compiler.
C++ code:
class Print
{
public:
void printPublic();
private:
void printPrivate();
};
// must not be inline, or must be referenced from C++ code; otherwise the compiler will drop it!
void Print::printPrivate()// thiscall
{
DbgPrint("%s<%p>\n", __FUNCTION__, this);
}
void Print::printPublic()// thiscall
{
DbgPrint("%s<%p>\n", __FUNCTION__, this);
}
extern "C"
{
// stubs implemented in asm
void __fastcall Print_printPrivate(Print* This);
void __fastcall Print_printPublic(Print* This);
};
Print p;
//p.printPrivate();//error C2248
p.printPublic();
Print_printPrivate(&p);
Print_printPublic(&p);
and asm code (for ml64)
_TEXT segment 'CODE'
extern ?printPrivate@Print@@AEAAXXZ:proc
extern ?printPublic@Print@@QEAAXXZ:proc
Print_printPrivate proc
jmp ?printPrivate@Print@@AEAAXXZ
Print_printPrivate endp
Print_printPublic proc
jmp ?printPublic@Print@@QEAAXXZ
Print_printPublic endp
_TEXT ENDS
END
Also note, for x86 only, that C++ methods use the thiscall calling convention by default: the first parameter, this, is passed in the ECX register and the rest on the stack, as for __stdcall. So if a method has no parameters (really just the one, this), we can declare the asm function as __fastcall as is; if it has parameters, the assembler stub needs to push EDX onto the stack. x64 has no such problem, since there is only one calling convention, but none of this is related to the main question.
An example for x86 with extra parameters, to show how to transform __fastcall into __thiscall:
class Print
{
public:
void printPublic(int a, int b)// thiscall
{
DbgPrint("%s<%p>(%x, %x)\n", __FUNCTION__, this, a, b);
}
private:
void printPrivate(int a, int b);
};
// must not be inline, or must be referenced from C++ code; otherwise the compiler will drop it!
void Print::printPrivate(int a, int b)// thiscall
{
DbgPrint("%s<%p>(%x, %x)\n", __FUNCTION__, this, a, b);
}
extern "C"
{
// stubs implemented in asm
void __fastcall Print_printPrivate(Print* This, int a, int b);
void __fastcall Print_printPublic(Print* This, int a, int b);
};
Print p;
//p.printPrivate(1,2);//error C2248
p.printPublic(1, 2);
Print_printPrivate(&p, 1, 2);
Print_printPublic(&p, 1, 2);
and asm
.686p
_TEXT segment
extern ?printPublic@Print@@QAEXHH@Z:proc
extern ?printPrivate@Print@@AAEXHH@Z:proc
@Print_printPrivate@12 proc
xchg [esp],edx
push edx
jmp ?printPrivate@Print@@AAEXHH@Z
@Print_printPrivate@12 endp
@Print_printPublic@12 proc
xchg [esp],edx
push edx
jmp ?printPublic@Print@@QAEXHH@Z
@Print_printPublic@12 endp
_TEXT ends
end
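The same point can be made in plain C++, without any assembly. In the sketch below (which changes the methods to return int so the result is observable, and adds a hypothetical privatePtr accessor), the access check is made once, at compile time, at the place where the pointer to the private member is formed; the later call through that pointer is not checked at all.

```cpp
class Print {
public:
    using Method = int (Print::*)();

    int printPublic() { return 1; }

    // Access is checked here, inside the class, when the pointer is formed.
    static Method privatePtr() { return &Print::printPrivate; }

private:
    int printPrivate() { return 2; }
};

// Code outside the class calls the private method through the pointer.
// Writing p.printPrivate() here would be a compile-time error (C2248 on
// MSVC), but the indirect call compiles and runs with no run-time check.
inline int call_private(Print& p) {
    Print::Method fn = Print::privatePtr();
    return (p.*fn)();
}
```

Here call_private returns the value produced by printPrivate, which shows that nothing in the generated code guards the private method.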
I am trying to run a function on a separately allocated stack.
I want to keep the stack for later so I can restore it and resume the function.
The following code compiles and runs, but nothing prints to the screen.
#include <cstdlib>
#include <csetjmp>
#include <iostream>
using namespace std;
unsigned char stack[65535];
unsigned char *base_ptr = stack + 65535 - 1;
unsigned char *old_stack;
unsigned char *old_base;
void function()
{
cout << "hello world" << endl;
}
int main()
{
__asm
{
mov old_base, ebp
mov old_stack, esp
mov ebp, base_ptr
mov esp, base_ptr
call function
mov ebp, old_base
mov esp, old_stack
}
}
I am using VS2012 / Windows 8 / Intel Q9650.
Welcome to C++ and name mangling. Function names in C++ are mangled by the compiler (such that using gcc function becomes _Z8functionv for me). This is to facilitate function overloading. The compiler keeps track of the actual names that it has given the different functions in the background so you aren't aware of it. This is a problem for any other language that tries to interact with C++.
This code won't link on my computer.
The solutions:
1) Compile with g++ and pass the -S flag (so g++ -S test.cpp). Then take a look at the assembly output (cat test.s) to see what the function is called, and change the name in "call function" to "call _Z8functionv" (for me; it could easily be different for you).
2) Use C: change the cout << to a printf statement and the above should work.
I take it that you aren't using gcc, though (the operand order is back to front for gas, so I had to swap all the operands around).
Actually I don't see any problem with your code.
Your sample taken as-is compiles, links and runs as expected.
Perhaps the problem is with your console settings, or with some global STL/CRT initialization, or something else. Anyway, you can put a breakpoint inside your function to make sure you're getting there.
According to Intel's x86 documentation for MOV, page 3-403, you should load the SS register immediately before loading a new ESP value. That blocks any interrupts from running until ESP has been assigned.
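If the goal is to suspend a function and resume it later on its own stack, an operating system facility can do the register and stack switching instead of hand-written assembly. The sketch below uses the POSIX ucontext API, which is an assumption about the platform and not what the question uses; on Windows the fiber functions (CreateFiber, SwitchToFiber) play a similar role. The names worker, steps_done and run_on_private_stack are made up for illustration.

```cpp
#include <ucontext.h>
#include <vector>

static ucontext_t main_ctx;    // where to return to when the worker yields
static ucontext_t worker_ctx;  // the worker's saved registers and stack pointer
static int steps_done = 0;

static void worker() {
    steps_done = 1;
    swapcontext(&worker_ctx, &main_ctx);  // suspend; the private stack is kept
    steps_done = 2;
    // Returning from worker resumes the context stored in uc_link.
}

int run_on_private_stack() {
    static std::vector<unsigned char> stack(64 * 1024);  // separately allocated stack

    steps_done = 0;
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = stack.data();
    worker_ctx.uc_stack.ss_size = stack.size();
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);  // run worker until it suspends
    int after_first_run = steps_done;
    swapcontext(&main_ctx, &worker_ctx);  // resume worker where it stopped
    return after_first_run * 10 + steps_done;
}
```

The library takes care of stack alignment and of saving the callee-saved registers, which are the details that are easy to get wrong when only ebp and esp are swapped by hand.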