Extract a static value which may change in the future - c++

I got the following code in ASM:
MOV EAX,DWORD PTR DS:[EBX+F8]
EBX contains an address, and F8 is the offset added to it. If I understand this correctly, EAX holds the value read from address+offset after the instruction executes.
What I want to do now is write some pattern in C++, using inline asm, that lets me retrieve F8 without changing my code in case F8 changes.
Is there any pattern search method (like regex) which I can use here? Is the offset possibly saved in any register? Or is this quite impossible to do?
Hopefully the information provided is enough, I could add some more lines of code if you wish.

You can add offset to the value in EBX register and then get the value from the updated address.
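As for the pattern-search part of the question, here is a minimal sketch of one possible approach, assuming the instruction is encoded as 8B 83 followed by a 32-bit little-endian displacement (the 32-bit encoding of mov eax,[ebx+disp32]) and that the code bytes have already been copied into a buffer. The function name findEbxDisplacement is made up for illustration. A two-byte signature like this can also match in the middle of unrelated instructions, so a real scan would normally include more surrounding bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: scans a buffer of machine code for the byte pattern
// "8B 83 dd dd dd dd" (mov eax, [ebx+disp32]) and returns the displacement.
std::optional<std::uint32_t> findEbxDisplacement(const std::vector<std::uint8_t>& code) {
    for (std::size_t i = 0; i + 6 <= code.size(); ++i) {
        if (code[i] == 0x8B && code[i + 1] == 0x83) {
            // The displacement is stored little-endian right after the ModRM byte.
            std::uint32_t disp = 0;
            for (int b = 0; b < 4; ++b) {
                disp |= static_cast<std::uint32_t>(code[i + 2 + b]) << (8 * b);
            }
            return disp;
        }
    }
    return std::nullopt;  // pattern not present in the buffer
}
```

If the offset later changes from F8 to something else, the same scan returns the new value without the calling code being edited, as long as the registers used by the instruction stay the same.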

Related

How to handle forward referencing when Label size is not fixed

I am trying to write 8086 Emulator in C++.
But I am facing a problem.
Suppose the code is:
MOV AL, BL
JMP X
MOV BL, CL
MOV DL, CL
.
.
.
X:
ADD AX, BX
HLT
Now the machine code for JMP X depends on X, i.e. on whether it is a short or a near jump:
short: 8-bit displacement [00-ff]
near: 16-bit displacement (ff-ffff]
If the size of the JMP instruction were constant (fixed size), I could just move on, and whenever I found X I could go back and fill in its address.
But here I can't move on, because the next location also depends on JMP X, whose size is not fixed.
I have no idea how to deal with it.
You may have even more problems for jmp. See the following possible opcodes and their meaning:
EB cb JMP rel8 Jump short, relative, displacement relative to next instruction.
E9 cw JMP rel16 Jump near, relative, displacement relative to next instruction.
E9 cd JMP rel32 Jump near, relative, displacement relative to next instruction.
FF /4 JMP r/m16 Jump near, absolute indirect, address given in r/m16.
FF /4 JMP r/m32 Jump near, absolute indirect, address given in r/m32.
EA cd JMP ptr16:16 Jump far, absolute, address given in operand.
EA cp JMP ptr16:32 Jump far, absolute, address given in operand.
FF /5 JMP m16:16 Jump far, absolute indirect, address given in m16:16.
FF /5 JMP m16:32 Jump far, absolute indirect, address given in m16:32.
So, you need to consider more special cases.
The solution is to implement a multipass assembler. You need to store all the opcodes and operands anyway, in a std::vector or some other container. Then you can set the correct data in a second pass.
If you define a struct for the opcodes and operands and store all these structs in a std::vector, changing one entry has no influence on the other opcodes/operands. You can also run multiple passes until everything is correct.
And then, when everything is fixed, you can go over the std::vector again and emit the actual bytes.
You may come up with:
struct Operation {
bool updateNeeded{false};
unsigned int opcode{};
unsigned long operand1{};
unsigned long operand2{};
unsigned long operand3{};
size_t indexOfRelated{};
};
std::vector<Operation> operation;
Of course you can add more attributes as needed.
Then you can read the source data and fill the std::vector. After having read the complete source code, you go over the data again and fix the open issues.
Then, hand this over to the virtual machine, or emit the final instructions.
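The repeated passes could look roughly like the following sketch. It is deliberately simplified and rests on several assumptions: only unconditional relative jumps are handled, a short jump is taken to be 2 bytes (EB rel8) and a near jump 3 bytes (E9 rel16), and the names Item, fixed, jumpTo, layout and relax are invented for this example rather than taken from the struct above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified record: either a fixed-size instruction or a jump.
struct Item {
    bool isJump{false};
    std::size_t size{};     // current encoded size in bytes
    std::size_t target{};   // jumps only: index of the item the label precedes
    std::size_t address{};  // filled in by layout()
};

Item fixed(std::size_t size) { Item it; it.size = size; return it; }
Item jumpTo(std::size_t target) { Item it; it.isJump = true; it.target = target; return it; }

// One layout pass: assign addresses from the current sizes, return the end address.
std::size_t layout(std::vector<Item>& items) {
    std::size_t addr = 0;
    for (auto& it : items) { it.address = addr; addr += it.size; }
    return addr;
}

// Start with every jump short, then grow jumps whose displacement does not
// fit in a signed byte. Sizes only ever grow, so the loop terminates.
void relax(std::vector<Item>& items) {
    for (auto& it : items) if (it.isJump) it.size = 2;
    bool changed = true;
    while (changed) {
        changed = false;
        const std::size_t end = layout(items);
        for (auto& it : items) {
            if (!it.isJump || it.size != 2) continue;
            assert(it.target <= items.size());
            const std::size_t targetAddr =
                it.target < items.size() ? items[it.target].address : end;
            // Displacement is relative to the address of the next instruction.
            const std::int64_t disp = static_cast<std::int64_t>(targetAddr) -
                                      static_cast<std::int64_t>(it.address + it.size);
            if (disp < -128 || disp > 127) { it.size = 3; changed = true; }
        }
    }
    layout(items);  // final addresses after the last size change
}
```

Once relax() returns, every address is final, so a last pass over the vector can compute each displacement and emit the bytes.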

Mixing C++ and assembly: can't pass multiple parameters from C++ function to assembly

I've been frustrated by passing parameters from a c++ function to assembly. I couldn't find anything that helped on Google and would really like your help. I am using Visual Studio 2017 and masm to compile my assembly code.
This is a simplified version of my c++ file where I call the assembly procedure set_clock
int main()
{
TimeInfo localTime;
char clock[4] = { 0,0,0,0 };
set_clock(clock,&localTime);
system("pause");
return 0;
}
I run into problems in the assembly file. I can't figure out why the second parameter passed to the function turns out huge. I was going off my textbook, which shows similar code with PROC followed by parameters. I don't know why the first parameter is passed successfully and the second one isn't. Can someone tell me the correct way to pass multiple parameters?
.code
set_clock PROC,
array:qword,address:qword
mov rdx,array ; works fine memory address: 0x1052440000616
mov rdi,address ; value of rdi is 14757395258967641292
mov al, [rdx]
mov [rdi],al ; ERROR: cant access that memory location
ret
set_clock ENDP
END
MASM's high-level crap is biting you in the ass. x64 Windows passes the first 4 args in rcx, rdx, r8, r9 (for any of those 4 that are integer/pointer).
mov rdx,array
mov rdi,address
assembles to
mov rdx, rcx ; clobber 2nd arg with a copy of the 1st
mov rdi, rdx ; copy array again
Use a disassembler to check for yourself. It's always a good idea to check the real machine code, by disassembling or by using your debugger's disassembly view instead of source mode, if anything weird is happening with assembler macros.
I'm not sure why this would result in an inaccessible memory location. If both args really are pointers to locals, then it should just be loading and storing back into the same stack location. But if char clock[4] is a const in static storage, it might be in a read-only memory page which would explain the store failing.
Either way, use a debugger and find out.
BTW, rdi is a call-preserved (aka non-volatile) register in the x64 Windows convention. (https://msdn.microsoft.com/en-us/library/9z1stfyw.aspx). Use call-clobbered registers for scratch regs unless you run out and need to save/restore some call-preserved regs. See also Agner Fog's calling conventions doc (http://agner.org/optimize/), and other links in the x86 tag wiki.
It's call-clobbered in x86-64 System V, which also passes args in different registers. Maybe you were looking at a different example?
Hopefully-fixed version, using movzx to avoid a false dependency on RAX when loading a byte.
set_clock PROC,
array:qword,address:qword
movzx eax, byte ptr [array]
mov [address], al
ret
set_clock ENDP
I don't use MASM, but I think array:qword makes array an alias for rcx. Or you could skip declaring the parameters and just use rcx and rdx directly, and document it with comments. That would be easier for everyone to understand.
You definitely don't want useless mov reg,reg instructions cluttering your code; if you're writing in asm in the first place, wasted instructions would cut into any speedups you're getting.

macOS - Reading part of other app library code and disassembling it to get offset

My application reads another application's memory in order to get a pointer. First I need to read an offset from a static library before I can start working with the application itself.
Some function in the dylib contains the offset to the pointer, "0x41b1110". I know this offset works when used manually, but I need my application to read it automatically, without checking the value by hand. If I do a simple memory read from address SomeAddressX as uint64_t, I get a ridiculous address which is not equal to 0x41b1110; I'm pretty sure what I got is simply the instruction itself. I then tried reading it as a byte array, and that byte array was equal to the bytes from the plain binary at this address. How can I read just "0x41b1110" and not the entire instruction? Do I need to disassemble the byte code into an x64 instruction and then parse it to get the address, or is there a smarter way? I'm not very experienced with asm.
SomeAddressX - rax, qword [ds:0x41b1110]
Adding Example byte code and instruction
lea rax, qword [ds:0x1043740]
which gives
48 8D 05 6F D9 99 00
The first three bytes, 48 8D 05, appear to be lea rax, qword, but the remaining part, 6F D9 99 00, does not look like 01 04 37 40 (0x1043740)?
It's x64, and PIC (position-independent code) is enforced on OSX (it doesn't allow non-PIC executables, as it uses ASLR).
So that disassembly is hiding an important bit of information from you. The true nature of that instruction is revealed here (ba dum ts):
lea rax,[rip+0x99d96f]
It's using the current instruction pointer rip to address its data relatively.
The 0x1043740 is result of addressOfInstruction + 7 + 0x99d96f.
The 0x99d96f part is clearly visible in the bytecode itself.
The +7 is the size of the instruction in bytes. It is added because a rip-relative displacement is measured from the address of the next instruction, but do your own math to confirm, as you know "addressOfInstruction".
And check out your debugger options, to see if you can switch between the friendly absolute memory display vs. true rip+offset disassembly.
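The arithmetic above can be sketched in C++ as follows. This is only an illustration under stated assumptions: it handles exactly the 7-byte form 48 8D 05 followed by a 32-bit displacement, the function name ripRelativeTarget is made up, and the caller is assumed to have already read the instruction bytes and to know the address they were read from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: given the 7 bytes of "lea rax, [rip+disp32]"
// (48 8D 05 dd dd dd dd) and the address the instruction lives at,
// return the absolute address the instruction refers to.
std::uint64_t ripRelativeTarget(const std::uint8_t* code, std::uint64_t instructionAddress) {
    assert(code[0] == 0x48 && code[1] == 0x8D && code[2] == 0x05);
    // Assemble the little-endian 32-bit displacement.
    std::uint32_t raw = 0;
    for (int i = 0; i < 4; ++i) {
        raw |= static_cast<std::uint32_t>(code[3 + i]) << (8 * i);
    }
    // The displacement is signed, so it may point backwards as well.
    const std::int32_t disp = static_cast<std::int32_t>(raw);
    const std::uint64_t nextInstruction = instructionAddress + 7;  // instruction length
    return nextInstruction + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}
```

For the example bytes 48 8D 05 6F D9 99 00, the result is 0x1043740 when the instruction is located at 0x6A5DCA; that location is not given in the question and is simply the value that makes the numbers agree.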

C++ __asm Generating different bytes

In my function I use
__asm
{
mov ecx,dword ptr [0x28F14131]
mov ecx,ds:[0x28F14131]
}
which should produce the following bytes: 0x8B0D (mov ecx, dword ptr []). However the first instruction produces 0xB9 (mov ecx,0x28F14131) and the second one 0x3E:8B0D
So my question is, what instruction should I use to get the desired result inside the C++ __asm?
If you know for 100% certain what the byte sequence for your inline assembly is supposed to be, you can always emit those bytes explicitly. The exact syntax escapes me, but inside an MSVC __asm block the _emit pseudo-instruction does this (with GCC, the counterpart is a .byte directive inside asm("...")):
__asm {
_emit 0x##
_emit 0x##
...
}
This approach only works if you know with 100% certainty what the byte sequences for the whole instruction are. AND if you are going to do this, be sure to comment appropriately.
(For what it is worth, I have had to use this approach in the past to work around a compiler bug where no matter what it would otherwise use the wrong byte sequence for one of the instructions.)
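To work out which bytes to emit, the expected encoding can be computed by hand or with a few lines of code. The sketch below assumes 32-bit code and the 8B 0D form mentioned in the question (opcode 8B, ModRM byte 0D, then the absolute address as a 32-bit little-endian value); the function name encodeMovEcxAbs32 is invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds the 6-byte 32-bit encoding of
// "mov ecx, dword ptr [address]" as 8B 0D followed by the address.
std::array<std::uint8_t, 6> encodeMovEcxAbs32(std::uint32_t address) {
    std::array<std::uint8_t, 6> bytes{};
    bytes[0] = 0x8B;  // MOV r32, r/m32
    bytes[1] = 0x0D;  // ModRM: reg = ecx, r/m = disp32 with no base register
    for (int i = 0; i < 4; ++i) {
        // The address is stored least significant byte first.
        bytes[2 + i] = static_cast<std::uint8_t>(address >> (8 * i));
    }
    return bytes;
}
```

For the address 0x28F14131 this yields 8B 0D 31 41 F1 28, which are the values that would replace the 0x## placeholders.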

How to hook C++ functions with asm

I want to hook a C++ function. But I don't want to use the trampoline mechanism of ms detours, instead of it I want to fully patch it. I can get the handle to the DLL where the function is located, and I have the right offset (imageBase stuff ...). So how do I hook it? Also, I don't know the data types of the arguments (var_4 and arg_0); or aren't they needed? In general I want to replace the following function with my own one (my function is nearly the same, only one line is changed):
sub_39001A40 proc near
var_4 = dword ptr -4
arg_0 = dword ptr 4
push ecx
cmp dword_392ADAB4, 0
jnz short loc_39001A4F
call loc_39024840
loc_39001A4F:
push esi
mov esi, [esp+8+arg_0]
lea eax, [esp+8+var_4]
push eax
push esi
call dword_392ADA98
mov ecx, [esp+10h+var_4]
add esp, 8
add dword_392ADA80, ecx
adc dword_392ADA84, 0
add dword_392ADA90, esi
pop esi
adc dword_392ADA94, 0
add dword_392ADA7C, 1
pop ecx
retn
sub_39001A40 endp
It's bad, that I only can hook functions, which names I know with ms detours. I cannot hook those asm functions with detours, cause I need the data types of the arguments passed for creating the function structures!
EDIT::::
"What's wrong with detours, exactly?"
I wrote: "I don't want to use the trampoline mechanism of ms detours, instead of it I want to fully patch it." and "It's bad, that I only can hook functions, which names I know with ms detours. I cannot hook those asm functions with detours, cause I need the data types of the arguments passed for creating the function structures!" and I don't have the source code of the C++ files. I only have the hex-dump.
"Trampoline is an actual technical term :) I'm just wondering why @lua can't use it."
I write: Read my sentences again, if you still don't understand why, my english is bad.
"Overriding just the named function should work, of course you may need to re-implement the whole DLL (depending on if it is of any further use to you). Given your grasp of assembler you might get away with using a hex editor to edit (a copy of) the original DLL you are seeking to subvert."
I want to hook the function, because I don't want to edit the file. I can't overwrite my function, because I don't know the datatypes of the arguments and the function's name.
@asveikau: Thanks for your real help, but I don't want to use a trampoline mechanism, I want to overwrite the function.
A good trick is to replace the first few instructions with this:
push dword xxxx ; where xxxx = new code location
ret
This is sort of like an obfuscated jmp. I write it this way because, in the assembled version, it is very easy to replace the push operand with your pointer at runtime. It assembles to:
68 XX XX XX XX c3
Where "XX XX XX XX" is your address in little-endian.
Then you can make a "call the old version of the function" code location, where the first few instructions are the ones you replaced with the sequence above, followed by a jump to the next valid instruction in the original code.
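Building those six bytes at runtime could be sketched like this. It assumes a 32-bit process, so the replacement function's address fits in 32 bits, and the name makePushRet is made up for illustration. Writing the bytes over the start of the target function (including making that memory writable first) is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds "push imm32; ret" (68 XX XX XX XX C3)
// with the given 32-bit address as the push operand.
std::array<std::uint8_t, 6> makePushRet(std::uint32_t newCodeLocation) {
    std::array<std::uint8_t, 6> patch{};
    patch[0] = 0x68;  // push imm32
    for (int i = 0; i < 4; ++i) {
        // Address bytes go in little-endian order.
        patch[1 + i] = static_cast<std::uint8_t>(newCodeLocation >> (8 * i));
    }
    patch[5] = 0xC3;  // ret: pops the pushed address and jumps to it
    return patch;
}
```

For a replacement located at 0x12345678, the patch is 68 78 56 34 12 C3.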
Overriding just the named function should work, of course you may need to re-implement the whole DLL (depending on if it is of any further use to you). Given your grasp of assembler you might get away with using a hex editor to edit (a copy of) the original DLL you are seeking to subvert.