I'm trying to emit a random instruction chosen by a function, but I keep getting the error "Improper operand type".
#include <iostream>
#include <time.h>
#define PUSH 0x50
#define POP 0x58
#define NOP 0x90
auto generate_instruction() -> int {
int instruction_list[] = { NOP };
return instruction_list[rand() % (sizeof(instruction_list) / sizeof(*instruction_list))];
}
#define JUNK_INSTRUCTION(x) \
__asm _emit PUSH \
__asm _emit x \
__asm _emit POP
#define JUNK JUNK_INSTRUCTION(generate_instruction)
int main() {
srand(static_cast<int>(time(NULL)));
JUNK;
std::cout << "Hello World!" << std::endl;
}
However when I replace #define JUNK JUNK_INSTRUCTION(generate_instruction) with #define JUNK JUNK_INSTRUCTION(NOP) , the program runs fine. I'm unsure as to why it's not working when they both return the same value.
Not sure what you are trying to do.
JUNK expands to JUNK_INSTRUCTION(generate_instruction), which will expand to:
__asm _emit PUSH
__asm _emit generate_instruction
__asm _emit POP
generate_instruction is simply the name of a function. The preprocessor pastes that name in as text; the compiler is not going to call the function and substitute its result just because you named it.
According to the docs, _emit needs a constant byte value, like the ones you provide in the other two lines.
I think you are really confused with the concepts of run-time calls, compile-time computation and macros.
The function I want to call is a function of a class:
void D3DBase::SetTexture(const std::string& path);
When I call it from an asm block it works, but it gave an error when I built it in release mode. When I inspected memory, I realized that I needed to shift the string's offset by 4 bytes, and with that change it worked.
My question is: why do I have to do that? What is the reason for it?
std::string __tmpString = "";
void SetTexture(DWORD table, const std::string& str)
{
__tmpString = str;
__asm {
#ifdef NDEBUG
push offset __tmpString - 0x4
#else
push offset __tmpString
#endif
mov ecx, table
mov eax, 0x401FC0
call eax
}
}
I have a simple class using a kind of ATL database access.
All functions are defined in a header file.
The problematic functions all do the same. There are some macros in use. The generated code looks like this
void InitBindings()
{
if (sName) // Static global char*
m_sTableName = sName; // Save into member
{ AddCol("Name", some_constant_data... _GetOleDBType(...), ...); };
{ AddCol("Name1", some_other_constant_data_GetOleDBType(...), ...); };
...
}
AddCol returns a reference to a structure, but as you see it is ignored.
When I look at the assembler code for a function that makes 6 AddCol calls, I can see that the function requires 2176 bytes of stack space. I have functions that require 20 KB and more. And in the debugger I can see that the stack isn't used at all (it is all initialized to 0xCC and never touched).
See assembler code at the end.
The problem can be seen with VS-2015 and VS-2017, but only in Debug mode.
In Release mode the function reserves no extra stack space at all.
The only rule I can see is that more AddCol calls cause more stack to be reserved: approximately 500 bytes per AddCol call.
Again: The function returns no object, it returns a reference to the binding information.
I already used the following pragmas in front of the function (but inside the class definition in the header):
__pragma(runtime_checks("", off)) __pragma(optimize("ts", on)) __pragma(strict_gs_check(push, off))
But to no avail. These pragmas should turn optimization on and switch off runtime checks and stack checks. How can I reduce this unneeded stack allocation? In some cases I see stack overflows in the debug version when these functions are used. There are no problems in the release version.
; 325 : BIND_BEGIN(CMasterData, _T("tblMasterData"))
push ebp
mov ebp, esp
sub esp, 2176 ; 00000880H
push ebx
push esi
push edi
mov DWORD PTR _this$[ebp], ecx
mov eax, OFFSET ??_C@_1BM@GOLNKAI@?$AAt?$AAb?$AAl?$AAM?$AAa?$AAs?$AAt?$AAe?$AAr?$AAD?$AAa?$AAt?$AAa?$AA?$AA@
test eax, eax
je SHORT $LN2@InitBindin
push OFFSET ??_C@_1BM@GOLNKAI@?$AAt?$AAb?$AAl?$AAM?$AAa?$AAs?$AAt?$AAe?$AAr?$AAD?$AAa?$AAt?$AAa?$AA?$AA@
mov ecx, DWORD PTR _this$[ebp]
add ecx, 136 ; 00000088H
call DWORD PTR __imp_??4?$CStringT@_WV?$StrTraitMFC_DLL@_WV?$ChTraitsCRT@_W@ATL@@@@@ATL@@QAEAAV01@PB_W@Z
$LN2@InitBindin:
; 326 : // Columns:
; 327 : B$C_IDENT (_T("Id"), m_lId);
push 0
push 0
push 1
push 4
push 0
call ?_GetOleDBType@ATL@@YAGAAJ@Z ; ATL::_GetOleDBType
add esp, 4
movzx eax, ax
push eax
push 0
push OFFSET ??_C@_15NCCOGFKM@?$AAI?$AAd?$AA?$AA@
mov ecx, DWORD PTR _this$[ebp]
call ?AddCol@CDBAccess@DB@@QAEAAUS_BIND@2@PB_WKGKW4TYPE@32@0_N@Z ; DB::CDBAccess::AddCol
; 328 : B$C (_T("Name"), m_szName);
push 0
push 0
push 0
push 122 ; 0000007aH
mov eax, 4
push eax
call ?_GetOleDBType@ATL@@YAGQA_W@Z ; ATL::_GetOleDBType
add esp, 4
movzx ecx, ax
push ecx
push 4
push OFFSET ??_C@_19DINFBLAK@?$AAN?$AAa?$AAm?$AAe?$AA?$AA@
mov ecx, DWORD PTR _this$[ebp]
call ?AddCol@CDBAccess@DB@@QAEAAUS_BIND@2@PB_WKGKW4TYPE@32@0_N@Z ; DB::CDBAccess::AddCol
; 329 : B$C (_T("Data"), m_data);
push 0
push 0
push 0
push 4
push 128 ; 00000080H
call ?_GetOleDBType@ATL@@YAGAAVCComBSTR@1@@Z ; ATL::_GetOleDBType
add esp, 4
movzx eax, ax
push eax
push 128 ; 00000080H
push OFFSET ??_C@_19IEEMEPMH@?$AAD?$AAa?$AAt?$AAa?$AA?$AA@
mov ecx, DWORD PTR _this$[ebp]
call ?AddCol@CDBAccess@DB@@QAEAAUS_BIND@2@PB_WKGKW4TYPE@32@0_N@Z ; DB::CDBAccess::AddCol
It is a compiler bug, already known on Connect.
EDIT: The problem seems to be fixed in VS-2017 15.5.1.
The problem has to do with a bug in the built-in offsetof.
It is not possible for me to #undef _CRT_USE_BUILTIN_OFFSETOF as written in this case.
For me it only works to #undef offsetof and to use one of these:
#define myoffsetof1(s,m) ((size_t)&reinterpret_cast<char const volatile&>((((s*)0)->m)))
#define myoffsetof2(s, m) ((size_t)&(((s*)0)->m))
#undef offsetof
#define offsetof myoffsetof1
All ATL DB consumers are affected.
Here is a minimal repro that shows the bug. Set a breakpoint on the Init function, look at the assembler code, and see how much stack is used!
// StackUsage.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <string>
#include <list>
#include <iostream>
using namespace std;
struct CRec
{
char t1[20];
char t2[20];
char t3[20];
char t4[20];
char t5[20];
int i1, i2, i3, i4, i5;
GUID g1, g2, g3, g4, g5;
DBTIMESTAMP d1, d2, d3, d4, d5;
};
#define sizeofmember(s,m) sizeof(reinterpret_cast<const s *>(0)->m)
#define typeofmember(c,m) _GetOleDBType(((c*)0)->m)
#define myoffsetof1(s,m) ((size_t)&reinterpret_cast<char const volatile&>((((s*)0)->m)))
#define myoffsetof2(s, m) ((size_t)&(((s*)0)->m))
// Uncomment these lines to work around the bug
// #undef offsetof
// #define offsetof myoffsetof1
#define COL(n,v) { AddCol(n,offsetof(CRec,v),typeofmember(CRec,v),sizeofmember(CRec,v)); }
class CFoo
{
public:
CFoo()
{
Init();
}
void Init()
{
COL("t1", t1);
COL("t2", t2);
COL("t3", t3);
COL("t4", t4);
COL("t5", t5);
COL("i1", i1);
COL("i2", i2);
COL("i3", i3);
COL("i4", i4);
COL("i5", i5);
COL("g1", g1);
COL("g2", g2);
COL("g3", g3);
COL("g4", g4);
COL("g5", g5);
COL("d1", d1);
COL("d2", d2);
COL("d3", d3);
COL("d4", d4);
COL("d5", d5);
}
void AddCol(PCSTR szName, ULONG nOffset, DBTYPE wType, ULONG nSize)
{
cout << szName << '\t' << nOffset << '\t' << wType << '\t' << nSize << endl;
}
};
int main()
{
CFoo foo;
return 0;
}
I have been wondering how V8 JavaScript Engine and any other JIT compilers execute the generated code.
Here are the articles I read during my attempt to write a small demo.
http://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introduction
http://nullprogram.com/blog/2015/03/19/
I know very little about assembly, so I initially used http://gcc.godbolt.org/ to write a function and get the disassembled output, but the code does not work on Windows.
I then wrote a small C++ program, compiled it with -g -Og, and got the disassembled output with gdb.
#include <stdio.h>
int square(int num) {
return num * num;
}
int main() {
printf("%d\n", square(10));
return 0;
}
Output:
Dump of assembler code for function square(int):
=> 0x00000000004015b0 <+0>: imul %ecx,%ecx
0x00000000004015b3 <+3>: mov %ecx,%eax
0x00000000004015b5 <+5>: retq
I copy-pasted the output ('%' removed) into an online x86 assembler and got { 0x0F, 0xAF, 0xC9, 0x89, 0xC1, 0xC3 }.
Here is my final code. If I compile it with gcc, I always get 1. If I compile it with VC++, I get a random number. What is going on?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <windows.h>
typedef unsigned char byte;
typedef int (*int0_int)(int);
const byte square_code[] = {
0x0f, 0xaf, 0xc9,
0x89, 0xc1,
0xc3
};
int main() {
byte* buf = reinterpret_cast<byte*>(VirtualAlloc(0, 1 << 8, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
if (buf == nullptr) return 0;
memcpy(buf, square_code, sizeof(square_code));
{
DWORD old;
VirtualProtect(buf, 1 << 8, PAGE_EXECUTE_READ, &old);
}
int0_int square = reinterpret_cast<int0_int>(buf);
int ans = square(100);
printf("%d\n", ans);
VirtualFree(buf, 0, MEM_RELEASE);
return 0;
}
Note
I am trying to learn how JIT works, so please do not suggest that I use LLVM or any other library. I promise I will use a proper JIT library in a real project rather than writing one from scratch.
Note: as Ben Voigt points out in the comments, this is really only valid for x86, not x86_64. For x86_64 you just have some errors in your assembly (which are errors on x86 as well), as Ben Voigt also points out in his answer.
This is happening because your compiler could see both sides of the function call when you generated your assembly. Since the compiler was in control of generating code for both the caller and the callee, it didn't have to follow the cdecl calling convention, and it didn't.
The default calling convention for MSVC is cdecl. Basically, function parameters are pushed onto the stack in the reverse of the order they're listed, so a call to foo(10, 100) could result in the assembly:
push 100
push 10
call foo(int, int)
In your case, the compiler will generate something like the following at the call site:
push 100
call esi ; assuming the address of your code is in the register esi
That's not what your code is expecting though. Your code is expecting its argument to be passed in the register ecx, not the stack.
The compiler has used what looks like the fastcall calling convention. If I compile a similar program (I get slightly different assembly) I get the expected result:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <windows.h>
typedef unsigned char byte;
typedef int (_fastcall *int0_int)(int);
const byte square_code[] = {
0x8b, 0xc1,
0x0f, 0xaf, 0xc0,
0xc3
};
int main() {
byte* buf = reinterpret_cast<byte*>(VirtualAlloc(0, 1 << 8, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
if (buf == nullptr) return 0;
memcpy(buf, square_code, sizeof(square_code));
{
DWORD old;
VirtualProtect(buf, 1 << 8, PAGE_EXECUTE_READ, &old);
}
int0_int square = reinterpret_cast<int0_int>(buf);
int ans = square(100);
printf("%d\n", ans);
VirtualFree(buf, 0, MEM_RELEASE);
return 0;
}
Note that I've told the compiler to use the _fastcall calling convention. If you want to use cdecl, the assembly would need to look more like this:
push ebp
mov ebp, esp
mov eax, DWORD PTR _n$[ebp]
imul eax, eax
pop ebp
ret 0
(DISCLAIMER: I'm not great at assembly, and that was generated by Visual Studio)
You wrote: "I copy-pasted the output ('%' removed)"
Well, that means your second instruction was
mov ecx, eax
which makes no sense at all (it overwrites the result of the multiplication with the uninitialized return value).
On the other hand
mov eax, foo
ret
is a very common pattern for ending a function with non-void return type.
The difference between the two assembly syntaxes (AT&T style vs Intel style) is more than just the % marker: the operand order is reversed, and pointers and offsets are denoted very differently as well.
You'll want to issue the set disassembly-flavor intel command in gdb.
I'm trying to get the PEB address of the current process in assembly.
the cpp file:
#include <iostream>
//#include <windows.h>
extern "C" int* __ptr64 Get_Ldr_Addr();
int main(int argc, char **argv)
{
std::cout << "asm " << Get_Ldr_Addr() << "\n";
//std::cout <<"peb "<< GetModuleHandle(0) << "\n";
return 0;
}
the asm file:
.code
Get_Ldr_Addr proc
push rax
mov rax, GS:[30h]
mov rax, [rax + 60h]
pop rax
ret
Get_Ldr_Addr endp
end
But I get different addresses from GetModuleHandle(0) and Get_Ldr_Addr()!
What is the problem? Aren't they supposed to be the same?
Q: If the function is external, will it read the PEB of the process that called it, or of the function's DLL (it is supposed to be in a DLL)?
Thanks
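If you don't mind C++, this works in Microsoft Visual Studio 2015.
It uses the "__readgsqword()" intrinsic.
#include <windows.h>   // pulls in winnt.h; including winnt.h directly does not compile
#include <winternl.h>
#include <intrin.h>    // __readgsqword / __readfsdword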
// Thread Environment Block (TEB)
#if defined(_M_X64) // x64
PTEB tebPtr = reinterpret_cast<PTEB>(__readgsqword(reinterpret_cast<DWORD_PTR>(&static_cast<NT_TIB*>(nullptr)->Self)));
#else // x86
PTEB tebPtr = reinterpret_cast<PTEB>(__readfsdword(reinterpret_cast<DWORD_PTR>(&static_cast<NT_TIB*>(nullptr)->Self)));
#endif
// Process Environment Block (PEB)
PPEB pebPtr = tebPtr->ProcessEnvironmentBlock;
Just two comments.
No need to push/pop rax because it's a scratch or volatile register on Windows, see the caller/callee saved registers. In particular, rax will hold the return value for your function.
It often helps to step through the machine code when you call GetModuleHandle() and compare it with your own assembly code. You'll probably encounter something like this implementation.
I like Sirmabus' answer but I much prefer it with simple C casts and the offsetof macro:
PPEB get_peb()
{
#if defined(_M_X64) // x64
PTEB tebPtr = (PTEB)__readgsqword(offsetof(NT_TIB, Self));
#else // x86
PTEB tebPtr = (PTEB)__readfsdword(offsetof(NT_TIB, Self));
#endif
return tebPtr->ProcessEnvironmentBlock;
}
Get_Ldr_Addr doesn't preserve its result.
You should not save and restore rax with push and pop, because rax holds the return value.
I want to ask if there's some way to "repeat" a macro n times automatically. By automatically I mean at compile time. I want to do something like this:
#define foo _asm mov eax, eax
#define bar(x) //I don't know how can I do it
int main()
{
bar(5); //would generate 5 times _asm mov eax, eax
return 0;
}
I know I can embed macros in other macros, but I don't know how to repeat something exactly n times. I want to use it in a random-sized junk generator.
You can do this using a recursive template:
#include <cstddef>
// recursive step
template <std::size_t count>
void n_asm() {
_asm mov eax, eax
n_asm<count - 1>();
}
// base of recursion
template<>
void n_asm<0>() {
}
int main()
{
n_asm<5>();
return 0;
}