Passing integer to x86 ASM in C++

I am trying to do some script hooking in C++, and have set up a simple test function for this case.
void __declspec(naked) testFunct()
{
int myInt;
myInt = 2000;
__asm{
mov eax, myInt
jmp [jmp_back_address]
}
}
When I use this to pass in the integer, the function fails when it is called and the project crashes. However, when I use the following instead, with an immediate value rather than an integer variable, it passes through successfully.
void __declspec(naked) testFunct()
{
__asm{
mov eax, 2000
jmp [jmp_back_address]
}
}
How can I successfully pass the integer?

The correct solution for my situation was simply to do everything inside ourFunct() in ASM, as mixing C++ and ASM to pass variables was producing buggy assembly code. Example with a function call that works:
int CalculateTotalScore()
{
return (int)*Game::current_speed_score;
}
DWORD jmpBackAddress;
void __declspec(naked) ourFunct()
{
__asm{
call CalculateTotalScore
jmp [jmpBackAddress]
}
}

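The likely culprit is the naked function rather than the integer itself: with __declspec(naked) the compiler emits no prologue, so no stack frame is set up for the local "myInt", and the asm block ends up accessing memory relative to whatever ebp the caller left behind. Most compilers support inline assembly with a way to pass values in explicitly. For instance, with GCC, you may try to define a macro like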
#define MY_ASM_MACRO(myInt) ({ asm volatile("mov eax,%0\n\t \
jmp [jmp_back_address]" : : "r"(myInt) : ); })
And use it like
void __declspec(naked) testFunct()
{
int myInt;
myInt = 2000;
MY_ASM_MACRO(myInt)
}
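As a hedged illustration of what "passing values" means with GCC-style extended asm, the sketch below hands a C++ variable to the assembler through an operand constraint instead of referring to it by name. The helper name load_via_asm is hypothetical, the instruction uses GCC's default AT&T operand order (source first), and the code falls back to plain C++ on other compilers or architectures. It does not reproduce the naked hook from the question.

```cpp
#include <cassert>

// Hypothetical helper: copies `value` through a register using extended asm.
int load_via_asm(int value) {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    int result;
    // %1 is the input operand (value), %0 is the output operand (result).
    // The "r" constraint lets the compiler pick a general-purpose register,
    // so the asm never depends on where the variable lives on the stack.
    __asm__ volatile("movl %1, %0" : "=r"(result) : "r"(value));
    return result;
#else
    // Assumption: no GCC-style x86 inline asm is available here.
    return value;
#endif
}
```

Calling load_via_asm(2000) returns the value that was routed through the register. The point of the constraint syntax is that the compiler, not the programmer, decides how the variable reaches the instruction.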


Do branch likelihood hints carry through function calls?

I've come across a few scenarios where I want to mark a function's return value as likely from inside the body of the function, rather than in the if statement that calls it.
For example, say I want to port code from using a LIKELY macro to using the new [[likely]] annotation. But these go in syntactically different places:
#define LIKELY(...) __builtin_expect(!!(__VA_ARGS__),0)
if(LIKELY(x)) { ... }
vs
if(x) [[likely]] { ... }
There's no easy way to redefine the LIKELY macro to use the annotation. Would defining a function like
inline bool likely(bool x) {
if(x) [[likely]] return true;
else return false;
}
propagate the hint out to an if? Like in
if(likely(x)) { ... }
Similarly, in generic code, it can be difficult to directly express algorithmic likelihood information in the actual if statement, even if this information is known elsewhere. For example, a copy_if where the predicate is almost always false. As far as I know, there is no way to express that using attributes, but if branch weight info can propagate through functions, this is a solved problem.
So far I haven't been able to find documentation about this and I don't know a good setup to test this by looking at the outputted assembly.
The story appears to be mixed for different compilers.
On GCC, I think your inline likely function works, or at least has some effect. Using Compiler Explorer to test differences on this code:
inline bool likely(bool x) {
if(x) [[likely]] return true;
else return false;
}
//#define LIKELY(x) likely(x)
#define LIKELY(x) x
int f(int x) {
if (LIKELY(!x)) {
return -3548;
}
else {
return x + 1;
}
}
This function f adds 1 to x and returns it, unless x is 0, in which case it returns -3548. The LIKELY macro, when it's active, indicates to the compiler that the case where x is zero is more common.
This version, with no change, produces this assembly under GCC 10 -O1:
f(int):
test edi, edi
je .L3
lea eax, [rdi+1]
ret
.L3:
mov eax, -3548
ret
With the #define changed to the inline function with the [[likely]], we get:
f(int):
lea eax, [rdi+1]
test edi, edi
mov edx, -3548
cmove eax, edx
ret
That's a conditional move instead of a conditional jump. A win, I guess, albeit for a simple example.
This indicates that branch weights propagate through inline functions, which makes sense.
On clang, however, there is limited support for the likely and unlikely attributes, and where it exists it does not seem to propagate through inline function calls, according to @Peter Cordes's report.
There is, however, a hacky macro solution that I think also works:
#define EMPTY()
#define LIKELY(x) x) [[likely]] EMPTY(
Then anything like
if ( LIKELY(x) ) {
becomes like
if ( x) [[likely]] EMPTY( ) {
which then becomes
if ( x) [[likely]] {
Example: https://godbolt.org/z/nhfehn
Note, however, that this probably only works in if-statements, or in other cases where the LIKELY is enclosed in parentheses.
gcc 10.2 at least is able to make this deduction (with -O2).
If we consider the following simple program:
void foo();
void bar();
void baz(int x) {
if (x == 0)
foo();
else
bar();
}
then it compiles to:
baz(int):
test edi, edi
jne .L2
jmp foo()
.L2:
jmp bar()
However if we add [[likely]] on the else clause, the generated code changes to
baz(int):
test edi, edi
je .L4
jmp bar()
.L4:
jmp foo()
so that the not-taken case of the conditional branch corresponds to the "likely" case.
Now if we pull the comparison out into an inline function:
void foo();
void bar();
inline bool is_zero(int x) {
if (x == 0)
return true;
else
return false;
}
void baz(int x) {
if (is_zero(x))
foo();
else
bar();
}
we are again back to the original generated code, taking the branch in the bar() case. But if we add [[likely]] on the else clause in is_zero, we see the branch reversed again.
clang 10.0.1 however does not demonstrate this behavior and seems to ignore [[likely]] altogether in all versions of this example.
Yes, it will probably inline, but this is quite pointless.
The __builtin_expect will continue to work even after you upgrade to a compiler that supports those C++20 attributes. You can refactor them later, but it will be for purely aesthetic reasons.
Also, your implementation of the LIKELY macro is erroneous (it is actually UNLIKELY); the correct implementations are below.
#define LIKELY( x ) __builtin_expect( !! ( x ), 1 )
#define UNLIKELY( x ) __builtin_expect( !! ( x ), 0 )
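A minimal sketch of the corrected macros in use, assuming GCC or clang for __builtin_expect and falling back to a plain boolean conversion elsewhere. The function checked_div is a hypothetical example; f mirrors the function from the earlier answer. The hints only influence code layout, so the results are the same with or without them.

```cpp
#include <cassert>

#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
// Assumption: no __builtin_expect available, so the macros are no-op hints.
#define LIKELY(x)   (!!(x))
#define UNLIKELY(x) (!!(x))
#endif

// The x == 0 path is marked as the common one.
int f(int x) {
    if (LIKELY(!x)) {
        return -3548;
    }
    return x + 1;
}

// Hypothetical example: the error path is marked as the rare one.
int checked_div(int a, int b) {
    if (UNLIKELY(b == 0)) {
        return 0;
    }
    return a / b;
}
```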

Contextual differences in inline assembly code

Along the lines of the first answer here, I tried to encapsulate some assembly code in a C++ function.
When I put this code in a function (inline or not) and pass the shellcode as an argument to the function it gives me an access violation 0xC0000005 at the call instruction, with or without DEP enabled. However, when I define the shellcode inside the function just before VirtualProtect, it works fine.
Current function code:
inline void ExecuteShellcode(char shellcode[])
{
/*char shellcode[] = \
"shellcode"; // If I use this local variable instead of the argument it works tho
*/
DWORD tempstore;
if (VirtualProtect(shellcode, sizeof(shellcode), PAGE_EXECUTE_READWRITE, &tempstore))
{
__asm lea eax, shellcode;
__asm call eax; Access violation 0xC0000005
}
}
Why does __asm call not work with non-local variables in this instance?
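One likely explanation, offered as an assumption rather than a verified diagnosis of the poster's build: a parameter declared as char shellcode[] is really a char*, so sizeof(shellcode) is the size of a pointer, and lea eax, shellcode takes the address of the pointer slot on the stack rather than the address of the buffer (mov eax, shellcode would load the buffer address instead). The portable sketch below only demonstrates the array-to-pointer adjustment; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A parameter declared as `char buf[]` is adjusted to `char* buf`,
// so both declarations name the same function type.
static_assert(std::is_same<void(char[]), void(char*)>::value,
              "array parameters are adjusted to pointers");

// What sizeof(shellcode) evaluates to when shellcode is a parameter.
std::size_t size_seen_by_callee() {
    return sizeof(char*);
}

// What sizeof evaluates to for a local array holding the same bytes.
std::size_t size_of_local_array() {
    char local[] = "shellcode";  // 9 characters plus the terminating '\0'
    return sizeof(local);
}
```

With the local array, both sizeof and lea refer to the buffer itself, which would be consistent with the local version working and the parameter version faulting.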

C++ Base64 String "Z2V0UGFzc3dvcmQ=" as function name

some quick info about me
I'm a MalwareResearcher since 2008 and C++/MASM Developer since 2013. Atm I improve and test my skills with malware samples and CrackMe's.
I found a really nice one and got stuck at the coding part :(
Code Snippet from crackme:
MOV EAX,004260AC ; ASCII "TUFMQ0hPLkRMTA=="
CALL 00407B10
JMP SHORT 004049FB
XOR EAX,EAX
MOV DWORD PTR SS:[LOCAL.1],-1
TEST EAX,EAX
JZ 00404AC3
MOV EAX,DWORD PTR DS:[EAX]
PUSH EAX ; /FileName
CALL DWORD PTR DS:[<&kernel32.LoadLibraryA>] ; \KERNEL32.LoadLibraryA
TEST EAX,EAX
JZ 00404AC3
PUSH 004260C0 ; /Procname = "Z2V0UGFzc3dvcmQ="
PUSH EAX ; |hModule
CALL DWORD PTR DS:[<&kernel32.GetProcAddress>] ; \KERNEL32.GetProcAddress
The crackme tries to load a DLL called MALCHO.dll with LoadLibraryA and then tries to execute one of its functions, named Z2V0UGFzc3dvcmQ=.
After that it decrypts one of its resources with the password gathered from the dll's function Z2V0UGFzc3dvcmQ=.
As a part of the crackme it seems that I have to make this dll.
I was able to get the password which is needed for decryption by analysing another part of this specimen.
So "only" dll coding is needed to reach the end of the crackme :)
While TUFMQ0hPLkRMTA== is decoded to MALCHO.dll, it seems that the function name Z2V0UGFzc3dvcmQ= is not decoded to getPassword before it is used.
I don't know how to use a base64-encoded string as a function name in C++.
I get a syntax error because of the = in Z2V0UGFzc3dvcmQ= :(
My MALCHO.dll source:
MALCHO.h:
#ifdef MALCHODLL_EXPORTS
#define MALCHOFUNCSDLL_API __declspec(dllexport)
#else
#define MALCHOFUNCSDLL_API __declspec(dllimport)
#endif
namespace MALCHO
{
//This class is exported from the MalchoFuncsDll.dll
class MalchoFuncs
{
public:
// Returns password
static MALCHOFUNCSDLL_API char* Z2V0UGFzc3dvcmQ=(char* p);
};
}
MALCHO.cpp
#include "stdafx.h"
#include "MALCHO.h"
namespace MALCHO
{
char* MalchoFuncs::Z2V0UGFzc3dvcmQ=(char* p)
{
char* pw = "Yes I did it!";
return pw;
}
}
thanks in advance
MasDie
You can't use = as part of a name in C++, but GetProcAddress is an OS function which doesn't care about the language that you used. It just does string matching, and not very fancy either. It really cares only about \0 because that terminates the string. So, if you pass Z2V0UGFzc3dvcmQ=\0 it will look for an export named Z2V0UGFzc3dvcmQ=\0.
The syntax of a linker definition file for LINK.EXE won't allow you to add such a name, but again GetProcAddress doesn't care who put the name in the export table. The easiest solution is probably to add Z2V0UGFzc3dvcmQ_\0 and then overwrite the _ with = in the built DLL.
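For reference, both strings in the disassembly are ordinary base64. The sketch below is a minimal decoder written for illustration (base64_decode is a hypothetical helper, not something taken from the crackme) that can be used to confirm what they decode to. Note that the disassembly pushes the encoded literal straight to GetProcAddress, so the export itself has to carry the encoded name.

```cpp
#include <cassert>
#include <string>

// Minimal base64 decoder for the standard alphabet. It stops at '=' padding
// or at any other character outside the alphabet.
std::string base64_decode(const std::string& in) {
    const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned int buffer = 0;  // accumulates 6 bits per input character
    int bits = 0;             // number of not-yet-emitted bits in buffer
    for (char c : in) {
        std::string::size_type pos = alphabet.find(c);
        if (pos == std::string::npos) {
            break;
        }
        buffer = (buffer << 6) | static_cast<unsigned int>(pos);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }
    return out;
}
```

Decoding Z2V0UGFzc3dvcmQ= gives getPassword, and TUFMQ0hPLkRMTA== gives MALCHO.DLL.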

Inlining and static function call operators

I have a function template parameterized by a template parameter T to give it different behavior depending on what T it is instantiated with. The specific variations desired are very simple, a call to a static function T::foo(some_args) would suffice, because no state is involved.
However, I do not want that foo to appear in the body of the function template.
I would rather call T(some_args); to avoid syntactic noise. I believe declaring the function call operator () to be static is not possible (or is it?). T has no state, and therefore no instance-specific variables.
In the event the above is not possible, what has more chance of getting inlined / optimized (in G++, Clang, ICC)
T::foo(some_args); // foo being a static function
or
T()(some_args); // operator () declared inline
I don't know assembly well enough to check the output, and the question is more from an academic/curiosity point of view than about actual performance.
Does T()(some_args) really allocate an object at runtime? Or is it typically optimized away?
Simple example:
struct T
{
int operator()(int i) const {
return i+1;
}
};
int main()
{
return T()(1);
}
Compiled with -O2 this will yield:
(gdb) disassemble main
Dump of assembler code for function main():
0x0000000000400400 <+0>: mov eax,0x2
0x0000000000400405 <+5>: ret
End of assembler dump.
Even with -O0, no constructor call is emitted when T uses the implicit default constructor; the only trace of the temporary is a byte of stack whose address is passed as this:
(gdb) disassemble main
Dump of assembler code for function main():
0x00000000004004ec <+0>: push rbp
0x00000000004004ed <+1>: mov rbp,rsp
0x00000000004004f0 <+4>: sub rsp,0x10
0x00000000004004f4 <+8>: lea rax,[rbp-0x1]
0x00000000004004f8 <+12>: mov esi,0x1
0x00000000004004fd <+17>: mov rdi,rax
0x0000000000400500 <+20>: call 0x400508 <T::operator()(int) const>
0x0000000000400505 <+25>: leave
0x0000000000400506 <+26>: ret
End of assembler dump.
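A small sketch comparing the two spellings from the question inside a function template. The policy types AddOneStatic and AddOneFunctor and the templates call_static and call_functor are hypothetical names. Because everything is constexpr, the compiler can evaluate both forms at compile time, which shows that the stateless temporary in T()(i) carries no inherent cost. C++23 additionally allows operator() to be declared static, but the sketch does not rely on that.

```cpp
#include <cassert>

// Policy exposing the behaviour as a static member function.
struct AddOneStatic {
    static constexpr int foo(int i) { return i + 1; }
};

// Policy exposing the same behaviour through operator().
struct AddOneFunctor {
    constexpr int operator()(int i) const { return i + 1; }
};

// Calls the policy via T::foo(...).
template <typename T>
constexpr int call_static(int i) { return T::foo(i); }

// Calls the policy via a stateless temporary: T()(...).
template <typename T>
constexpr int call_functor(int i) { return T()(i); }

// Both forms are usable in constant expressions.
static_assert(call_static<AddOneStatic>(1) == 2, "static call folds");
static_assert(call_functor<AddOneFunctor>(1) == 2, "functor call folds");
```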

Arbitrary pointer to unknown class function - invalid type conversion

I have a hack program; it injects some functions into a target process to control it. The program is written in C++ with inline assembly.
class GameProcMain {
// this is just a class
};
GameProcMain* mainproc; // there is no problem I can do =(GameProcMain*)0xC1EA90
Now I want to define a class function (which sets ecx to the class pointer) instead of writing assembly.
PPLYDATA GetNearblyMob(__Vector3* cordinate) {
__asm {
mov ecx, 0xC1EA90
push cordinate
mov edi, 0x4A8010
call edi
}
}
I want to define it and call it like this:
PPLYDATA (DLPL::*GetNearblyMob)(__Vector3* cordinate);
mainproc->GetNearblyMob(ADDR_CHRB->kordinat)
When I try GetNearblyMob=(PPLYDATA (DLPL::*)(__Vector3*)) 0x4A8010;
It says something like error: invalid type conversion: "int" to "PPLYDATA (DLPL::*)(int, int)"
but I can do this to set the pointer:
void initializeHack() {
__asm {
LEA edi, GetNearblyMob
MOV eax, 0x4A8010
MOV [edi], eax
}
}
Now I want to learn "how I can set GetNearblyMob without using assembly and legitimately in C++".
The problem is that member functions automatically get an extra parameter for the this pointer. Sometimes you can cast between member and non-member functions, but I don't see the need to cast anything.
Typically it's easier to reverse-engineer into C functions than into C++. C typically has a more straightforward ABI, so you can keep the data structures straight as you work them out.
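So, I would recommend
// An integer does not convert to a function pointer implicitly, so cast explicitly.
PPLYDATA (*GetNearblyMob)(DLPL *main_obj, __Vector3* cordinate) =
reinterpret_cast<PPLYDATA (*)(DLPL*, __Vector3*)>(0x12345UL);
and then define your own function
class DLPL {
public:
// Forwards to the free function pointer, passing this explicitly.
PPLYDATA GetNearblyMob( __Vector3* cordinate ) {
return ::GetNearblyMob( this, cordinate );
}
// ... other program functions
};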
I am a bit surprised that it won't let you cast like that.
You can try to do something like
GetNearblyMob=reinterpret_cast<PPLYDATA (DLPL::*)(__Vector3*)> (0x4A8010);
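If that still does not work, try writing the address through a pointer (this assumes the member function pointer is stored as a single 32-bit address, which the standard does not guarantee):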
*(int*)(&GetNearblyMob) = 0x4A8010;
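The wrapper idea from the first answer can be sketched portably as follows. All names here (Vector3, PlayerData, GameProc, GetNearbyMobFn, fake_get_nearby_mob, g_get_nearby_mob) are hypothetical stand-ins for the question's types, and a local function replaces the hard-coded game address so the example can run anywhere. In the real hook the pointer would instead be initialized with a reinterpret_cast from the address, and on 32-bit MSVC it may also need a calling convention that matches the target, since the question's asm passes the object in ecx.

```cpp
#include <cassert>

struct Vector3 { float x, y, z; };   // stand-in for __Vector3
struct PlayerData { int id; };       // stand-in for the type behind PPLYDATA

class GameProc;                      // stand-in for DLPL / GameProcMain

// Free-function pointer type that takes the object explicitly as first argument.
using GetNearbyMobFn = PlayerData* (*)(GameProc* self, Vector3* coordinate);

// Stand-in for the game's function; a real hook would point at a fixed address.
PlayerData* fake_get_nearby_mob(GameProc*, Vector3* coordinate) {
    static PlayerData mob{0};
    mob.id = static_cast<int>(coordinate->x);
    return &mob;
}

// The pointer is set with an ordinary assignment, no assembly needed.
GetNearbyMobFn g_get_nearby_mob = &fake_get_nearby_mob;

class GameProc {
public:
    // Member wrapper: forwards to the free function, passing this explicitly.
    PlayerData* GetNearbyMob(Vector3* coordinate) {
        return g_get_nearby_mob(this, coordinate);
    }
};
```

Callers then write proc.GetNearbyMob(&position) as if it were an ordinary member function, while the only place that knows about the raw address is the pointer's initializer.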