I've been writing a C application and found that I needed some x86 assembly language. I'm pretty new to assembly, and the following snippet causes recursion:
unsigned int originalBP;
unsigned fAddress;
void f(unsigned short aa) {
printf("Function %d\n", aa);
}
unsigned short xx = 77;
void redirect() {
asm {
pop originalBP
mov fAddress, offset f
push word ptr xx
push fAddress
push originalBP
}
}
If I call redirect, it repeatedly outputs: "Function 1135"
First, here is some information about the environment in which this code is executed:
This code is written to be executed under NTVDM
Tiny memory model is used ( all segment pointer registers point to the same segment )
Here's my expectation of what the code above should do ( this is most likely the culprit of the error ) :
Pop the stack and store value in originalBP; I believe the value is actually the address of the current function i.e. redirect
Push f's argument value ( value of xx ) to the stack
Push address of f to stack ( since there's only one segment, only offset is needed )
Push back the address of redirect
Of course, if this were the correct flow, the recursion would be expected (except for the part where 1135 is printed instead of 77). But interestingly, doing the same with a function that takes no arguments produces only one line of output:
unsigned int originalBP;
unsigned fAddress;
void f() {
printf("Function");
}
void redirect() {
asm {
pop originalBP
mov fAddress, offset f
push fAddress
push originalBP
}
}
This probably means that my understanding of the above code is completely wrong. What is the real issue in this code?
EDIT: I probably left some of the things unsaid:
This is a 16 bit application
Compiler used is Borland C++ 3.1, as Eclipse plugin
redirect is called from main as redirect()
EDIT (regarding Margaret Bloom's answer) Here's an example of instruction execution once redirect is called. Values in brackets represent stack pointer register and the value at that location before each instruction is executed:
call redirect
(FFF4-04E6) push bp
(FFF2-FFF6) mov bp, sp
(FFF2-FFF6) mov fAddress, offset f
(FFF2-FFF6) pop originalBP
(FFF4-04E6) pop originalRIP
(FFF6-0000) push xx (I've changed xx to 1187)
(FFF4-0755) push originalRIP
(FFF2-04E6) push fAddress
(FFF0-04AC) push originalBP
(FFEE-FFF6) pop bp
(FFF0-04AC) ret
(in f) (FFF2-04E6) push bp
(FFF0-FFF6) mov bp,sp
printf executes
(FFF0-FFF6) pop bp
(FFF2-04E6) ret
The next statement seems to be return 0;, which is the end of main.
Execution continues through a bunch of lines and somehow comes back to the line calling redirect.
For your second snippet, the one without arguments, the stack states are as follows:
Where                 | Stack (growing to the right)
----------------------+----------------------------
after redirect prolog | redirect rip, redirect bp
pop originalBP        | redirect rip
push fAddress         | redirect rip, fAddress
push originalBP       | redirect rip, fAddress, redirect bp
after redirect epilog | redirect rip, fAddress
after redirect return | redirect rip (control moved to f)
after f prolog        | redirect rip, f bp
after f epilog        | redirect rip
after f return        | (control moved to redirect caller)
Where redirect rip means the return address (return IP) of the function redirect.
As you can see, on entry to f the top of the stack correctly holds redirect rip, the return address of redirect.
On exit, control flows back to redirect's caller.
For your first snippet, the stack goes as follows:
Where                 | Stack (growing to the right)
----------------------+----------------------------
after redirect prolog | redirect rip, redirect bp
pop originalBP        | redirect rip
push word ptr xx      | redirect rip, xx
push fAddress         | redirect rip, xx, fAddress
push originalBP       | redirect rip, xx, fAddress, redirect bp
after redirect epilog | redirect rip, xx, fAddress
after redirect return | redirect rip, xx (control moved to f)
after f prolog        | redirect rip, xx, f bp
after f epilog        | redirect rip, xx
after f return        | (control moved to xx)
On entry to f we have redirect rip, xx on the stack when we should really have xx, redirect rip.
With the former layout, the parameter aa contains the return address of redirect, and the return address of f is the value of xx.
Based on your answer to my comment, the code looped by accident.
If you want to call f with arguments, be sure to push them before the return address:
pop originalBP
pop originalRIP
;Arguments go here
push xx
push originalRIP
push fAddress
push originalBP
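To make the two orderings easier to compare, here is a small sketch in plain C that replays the pushes and pops on a model stack. It is only an illustration: the Stack type, run_redirect and the Outcome struct are made-up names, and it assumes the usual 16-bit near-call frame where, after push bp / mov bp, sp, the return address sits at [bp+2] and the first argument at [bp+4].

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Model of a 16-bit stack: sp is the index of the top word and the
   stack grows toward lower indices (hypothetical helper type). */
typedef struct {
    uint16_t mem[STACK_WORDS];
    int sp;
} Stack;

static void push(Stack *s, uint16_t v)
{
    assert(s->sp > 0);
    s->mem[--s->sp] = v;
}

static uint16_t pop(Stack *s)
{
    assert(s->sp < STACK_WORDS);
    return s->mem[s->sp++];
}

/* What f ends up with once its own prologue has run. */
typedef struct {
    uint16_t arg_seen;     /* word at [bp+4], read as the parameter aa */
    uint16_t returns_to;   /* word at [bp+2], used by f's ret */
} Outcome;

/* Replays redirect's asm block; fixed != 0 selects the corrected order. */
Outcome run_redirect(uint16_t caller_ip, uint16_t caller_bp,
                     uint16_t xx, uint16_t f_addr, int fixed)
{
    Stack s = { {0}, STACK_WORDS };
    Outcome out;
    uint16_t original_bp, original_ip = 0, bp;

    push(&s, caller_ip);               /* call redirect */
    push(&s, caller_bp);               /* redirect prologue: push bp */

    original_bp = pop(&s);             /* pop originalBP */
    if (fixed)
        original_ip = pop(&s);         /* pop originalRIP */
    push(&s, xx);                      /* push xx */
    if (fixed)
        push(&s, original_ip);         /* push originalRIP */
    push(&s, f_addr);                  /* push fAddress */
    push(&s, original_bp);             /* push originalBP */

    bp = pop(&s);                      /* redirect epilogue: pop bp */
    (void)pop(&s);                     /* ret: control moves to f */

    push(&s, bp);                      /* f prologue: push bp; mov bp, sp */
    out.returns_to = s.mem[s.sp + 1];  /* [bp+2] */
    out.arg_seen = s.mem[s.sp + 2];    /* [bp+4] */
    return out;
}
```

Calling run_redirect(0x04E6, 0xFFF6, 77, 0x04AC, 0) with values borrowed from your trace gives f the return address as its argument and xx as its return address; passing 1 as the last argument swaps them into the intended places.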
You didn't post which compiler and compiler options you use to build that redirect.
With optimizations on, you can't assume the full C function prologue/epilogue will be emitted, so you are manipulating the stack without any idea of its layout. (With no prologue/epilogue at all, you would inject two values ahead of the return address, so redirect would simply return to its caller (main?), which may essentially just exit. That means no call to f, so this is not your case.)
As inside the asm block you already have the fn address, why don't you simply call it? The stack would be like: somebody calls redirect -> redirect calls some address -> address fn() -> returns to redirect -> returns to caller.
It looks to me like you are trying to modify it to: somebody calls redirect -> redirect calls some address -> address fn() -> returns to caller (skipping the return to redirect). As the redirect epilogue is a tiny bit of code, I don't see much benefit in that modification (I also don't see how it is "context switch" related).
Anyway, check your compiler options for how to produce an assembly listing of the final code to see how it really compiles, or even better, check it with a debugger (step per instruction at assembly level).
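For comparison, the plain-call variant needs no inline assembly at all. This is a minimal sketch, not your actual code: the calls_made counter is a made-up addition so the effect can be checked, and the function pointer stands in for fAddress.

```c
#include <assert.h>
#include <stdio.h>

static unsigned calls_made = 0;   /* hypothetical counter, only for checking */

void f(unsigned short aa)
{
    calls_made++;
    printf("Function %d\n", aa);
}

unsigned short xx = 77;

/* The compiler builds the call itself: it pushes xx, pushes the return
   address, jumps to the target and removes the argument afterwards. */
void redirect(void)
{
    void (*target)(unsigned short) = f;   /* plays the role of fAddress */

    assert(target != NULL);
    target(xx);
}
```

Here f returns into redirect, which then returns to its own caller, so the stack pointer is back where the caller expects it.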
EDIT (after providing the debug info):
When you get to return 0, there is an additional, alien xx left on the stack (sp is 0xFFF4) instead of sp being the original FFF6 pointing to the 0.
The end of main probably does not handle this correctly (doing pop bp / ret, I guess), since it assumes sp is correct on return. (If it used the other C epilogue, which includes mov sp,bp, it would probably survive your stack tampering.)
Then again, if the compiler used that other epilogue in all functions, it would use it in redirect() too, so you would have to modify bp as well to make the end of redirect() ret into fAddress. Something like dec bp, dec bp would probably suffice, as you have grown the stack by injecting 2 bytes into the parameter space.
Check in the debugger once more how return 0 in main is implemented when it is hit, and whether it can cope with a modified sp (well, obviously it can't, as it loops by accident into redirect).
If that's the case, you should probably patch main to restore sp before return 0;. I wonder whether a simple mov sp,bp would do (bp should be FFF6 at that point).
Conclusion: tampering with stack frames across several calls is always tricky business. ;)
So, are you heading toward something like this? (I can't quite put my finger on where the code from your question would be used; it looks like a basic stack exercise to show how code execution can be affected, which may later evolve into something like this... or maybe not.)
A fake context switch in 16-bit code, in C-like pseudo code (ok, more like comments only :) ). It has to be installed as some timed interrupt:
// should be some "far" type function to preserve "cs" as well
far void fakeThreadSwitch() {
asm {
cli ; or other means to disable thread switch (re-entry)
; store the current values of all registers
pusha
pushf
push ds
push es
; set `ds` to thread contexts data section
; figure out, which thread is currently running
; (have some "size_t currently_running = index;" in context section)
; if none, then pick some SLEEPING
; but have some [root_context] updated (.stack), so you can
; do final switch to it upon terminating the OS.
; verify the ss points to that thread stack ->
; if you by accident did interrupt OS kernel,
; then just return without touching anything (jump to "pop es")
; store ss:sp to [current_thread_context.stack]
; decide if you want to switch to some other context
; (or kill current) simulating "preemptive multitasking"
; if switch, set up all flags correctly (RUNNING/SLEEPING/index)
; load ss:sp from [next_thread_context.stack]
pop es
pop ds
popf
popa
sti ; or enable thread switch interrupt by other means
}
}
Then to start some new thread executing code at fAddress:
void startNewThread(void far *fAddress) {
// allocate some new context for the new thread
// (probably fixed array for max threads, searching for "FREE" one)
// ... (inits fields in some struct [new_thread_context])
// allocate some new stack memory for the new thread
// ... (sets [new_thread_context.stack_allocated])
// set up the stack for initial threadSwitch
uint16_t far * stackEnd = [new_thread_context.stack]
// reserve: es,ds + flags + all + cs:ip (to be executed) + OS exit trap (3x)
stackEnd -= (2 + 1 + 8 + 2 + 3);
// init the values in "stack"
stackEnd[0] = stackEnd[1] = [new_thread_context.ds]; // es, ds
stackEnd[2] = 0; // flags
stackEnd[3] = stackEnd[4] = stackEnd[5] = 0; // di, si, bp
stackEnd[6] = offset(stackEnd+11); // sp ahead of "pusha"
stackEnd[7] = stackEnd[8] = 0; // bx, dx
stackEnd[9] = stackEnd[10] = 0; // cx, ax
stackEnd[11] = offset(fAddress); // "return" to fAddress (a far return pops ip first, then cs)
stackEnd[12] = segment(fAddress);
// thread_exit_return is some trap function to handle
// far return inside fAddress code, which would probably require
// different design to make this truly usable (to fit C epilogue of f())
stackEnd[13] = offset(&thread_exit_return);
stackEnd[14] = segment(&thread_exit_return);
stackEnd[15] = thread_id;
[new_thread_context.stack] = stackEnd;
// all context data are ready for context switch, mark this thread "ready"
[new_thread_context.running] = SLEEPING;
// now in some future the context-switch may pick this thread from
// pool of sleeping threads, and will switch execution to it
// (through this artificially prepared stack image)
}
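The stack image that startNewThread prepares can also be written out as ordinary C over an array of 16-bit words. This is only a sketch under assumptions: FarPtr, build_initial_frame and the FRAME_* names are invented here, stack_offset is assumed to be the offset of stack[0] inside its segment, and the entry address is stored offset first because a far return pops ip before cs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices into the prepared image, lowest address first. The order
   mirrors pop es, pop ds, popf, popa and then the far return. */
enum {
    FRAME_ES, FRAME_DS, FRAME_FLAGS,
    FRAME_DI, FRAME_SI, FRAME_BP, FRAME_SP,
    FRAME_BX, FRAME_DX, FRAME_CX, FRAME_AX,
    FRAME_ENTRY_IP, FRAME_ENTRY_CS,
    FRAME_EXIT_IP, FRAME_EXIT_CS, FRAME_THREAD_ID,
    FRAME_WORDS
};

/* Hypothetical stand-in for a far pointer. */
typedef struct {
    uint16_t seg;
    uint16_t off;
} FarPtr;

/* Fills the top FRAME_WORDS words of a thread stack and returns the
   index of the new top-of-stack word. */
size_t build_initial_frame(uint16_t *stack, size_t stack_words,
                           uint16_t stack_offset, uint16_t data_seg,
                           FarPtr entry, FarPtr exit_trap,
                           uint16_t thread_id)
{
    size_t top, i;
    uint16_t *frame;

    assert(stack != NULL && stack_words >= FRAME_WORDS);
    top = stack_words - FRAME_WORDS;
    frame = stack + top;

    for (i = 0; i < FRAME_WORDS; i++)
        frame[i] = 0;                    /* flags and registers start at 0 */

    frame[FRAME_ES] = data_seg;
    frame[FRAME_DS] = data_seg;
    /* Value sp had before the simulated pusha (popa skips it anyway). */
    frame[FRAME_SP] = (uint16_t)(stack_offset + 2 * (top + FRAME_ENTRY_IP));
    frame[FRAME_ENTRY_IP] = entry.off;
    frame[FRAME_ENTRY_CS] = entry.seg;
    frame[FRAME_EXIT_IP] = exit_trap.off;
    frame[FRAME_EXIT_CS] = exit_trap.seg;
    frame[FRAME_THREAD_ID] = thread_id;
    return top;
}
```

Naming the slots through an enum keeps the frame layout in one place, so the switch routine and the frame builder are less likely to drift apart.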
One of the kernel handlers, designed as the "landing" point for any f() that finishes normally and simply returns (it can also be called explicitly).
void thread_exit_return() {
// get the exited thread_id somehow
[thread_context.running] = FINISHED;
// deallocate [thread_context.stack_allocated]
// deallocate thread context (marking it as "FREE"?)
}
This would require more thought and design on how to run the kernel itself (in just another thread, or in the original app context) and how to give it running time, as well as how to make the kernel start new threads or kill/exit old ones.
Anyway, the important part of this exercise is the push-all-onto-the-thread-stack / pop-all-from-the-other-stack step, which gives you a rough idea of how preemptive multitasking works (although in a 32-bit protected-mode OS this involves much more trickery, with the CPU switching to a protected layer and back to user land, a different stack for the kernel, etc., so only the principle is the same).
Of course, in 16-bit unprotected mode this is quite a fragile construction that can easily be damaged across threads (and I very likely overlooked something important, so it would probably need some heavy bug fixing to make it work).
I would like to learn whether the existing GDB for RISC-V supports program-context-aware breakpoints.
By program-context-aware breakpoints I mean this: when there is a JAL or JALR instruction, the PC changes because there is a function call. In other cases, on a function call ==> PC = PC + (Current Program Counter + 4),
and on a function return: PC = PC - (Return address (ra register value)).
I have installed Fedora (RISC-V) on my Ubuntu (virtual machine). Since it is a virtual machine I can't print the PC register value, which is why I couldn't check whether it supports program-context-aware breakpoints.
My second question is: how can I print the PC register value on my QEMU RISC-V virtual machine?
#include<stdio.h>
int check_prime(int a)
{
int c;
for (c=2;c<a;c++)
{
if (a%c == 0 ) return 0;
if (c == a-1 ) return 1;
}
}
void oddn(int a)
{
printf("oddn --> %d is an odd number \n",a);
if (check_prime(a)) printf("oddn --> %d is a prime number\n",a);
}
int main()
{
int a;
a=7;
if (check_prime(a)) printf("%d is a prime number \n",a);
if (a%2==1) oddn(a);
}
This is the program I am trying to breakpoint using GDB.
As you can see in the picture, it breaks twice (it should break only once).
It also gives an error:
Error in testing breakpoint condition:
Invalid data type for function to be called
What you're looking for is documented here:
https://sourceware.org/gdb/current/onlinedocs/gdb/Convenience-Funs.html#index-_0024_005fstreq_002c-convenience-function
You should look at $_caller_is, $_caller_matches, $_any_caller_is, and $_any_caller_matches.
As an example, to check if the immediate caller is a particular function we could do this:
break functionD if ($_caller_is ("functionC"))
Then main -> functionD will not trigger the breakpoint, while main -> functionC -> functionD will trigger the breakpoint.
The convenience functions I listed all take a frame offset that can be used to specify which frame GDB will check (for $_caller_is and $_caller_matches) or to limit the range of frames checked (for $_any_caller_is and $_any_caller_matches).
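To try the two call paths described above, a tiny test program is enough. The sketch below is an assumption-laden illustration: run_both_paths stands in for main, the functionD_calls counter exists only to make the run checkable, and it should be built with debug info and without optimization (for example -g -O0) so that functionC is not inlined and GDB can identify the caller frame.

```c
#include <assert.h>
#include <stdio.h>

static int functionD_calls = 0;   /* hypothetical counter, only for checking */

void functionD(void)
{
    functionD_calls++;
    puts("in functionD");
}

void functionC(void)
{
    functionD();   /* immediate caller is functionC: the condition holds */
}

/* Stand-in for main: reaches functionD once directly, once via functionC. */
int run_both_paths(void)
{
    int before = functionD_calls;

    functionD();   /* immediate caller is not functionC: no stop expected */
    functionC();
    assert(functionD_calls == before + 2);
    return functionD_calls;
}
```

With the conditional breakpoint from the example set on functionD, only the call that goes through functionC should stop; the direct call runs through.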
I am learning the Cortex-M with the MDK uVision IDE. I wrote a simple SysTick_Handler() to replace the WEAK default SysTick_Handler(), which is a simple dead loop.
My SysTick_Handler():
The disassembly:
I am confused by the highlighted assembly line. It is simply a dead loop.
Why is it there? Why did the toolchain still generate it even though I had already overridden the WEAK default implementation with my own SysTick_Handler?
I can still place a breakpoint at that line, and it can be hit. In that case, my code is never executed.
But the strange thing is, if I remove the breakpoint at that line, my code can then be reached. How is that possible?
(Thanks to all the hints the community provided. I think I can explain it now.)
The dead loop is part of my main() function, which is like below. The main() function is just above my SysTick_Handler in the same C file.
int main (void)
{
LED_Initialize();
SysTick->VAL = 0x9000; //Start value for the sys Tick counter
SysTick->LOAD = 0x9000; //Reload value
SysTick->CTRL = SYSTICK_INTERRUPT_ENABLE|SYSTICK_COUNT_ENABLE; //Start and enable interrupt
while(1)
{
; // <========= This is the dead loop I saw!
}
}
To double-check, I modified the while loop as below:
int main (void)
{
volatile int32_t jj = 0;
LED_Initialize();
SysTick->VAL = 0x9000; //Start value for the sys Tick counter
SysTick->LOAD = 0x9000; //Reload value
SysTick->CTRL = SYSTICK_INTERRUPT_ENABLE|SYSTICK_COUNT_ENABLE; //Start and enable interrupt
while(1)
{
;
jj+=0x12345; // <====== add some landmark value
}
}
The generated code now looks like this:
It is still placed under the SysTick_Handler, though. I placed a breakpoint there to check what's really going on:
The R1 is the constant 0x12345. The R0 is the local variable jj. We can see the R1 does contain the landmark value 0x12345, which is added to R0 (jj). So it must be part of my while(1) loop in the main().
So, the disassembly is correct. It is only that the debugger failed to provide a correct interleaving of the source and the disassembly.
And by the way, remember to rebuild the target after modifying the code, otherwise the uVision IDE debugger will not reflect the latest change.
I am modifying some sections of executable code compiled into a DLL. But a single byte at a fixed address in the segment that I am modifying can't be changed, or even read.
The code is very simple:
SEGMENT_DATA segInfo = getSegmentInfo(mHandle, segmentName);
if (segInfo.inFileSegmentAddr == 0) return false;
DWORD mOlProtection;
DWORD mOlProtection_1;
if (segInfo.architecture != MY_ARCH) {
printf(" Not the same architecture!\n");
return 0;
}
if(VirtualProtect((LPVOID)segInfo.segmentAddr, segInfo.segmentSize, PAGE_EXECUTE_READWRITE, &mOlProtection)==0) return false;
DWORD i=0;
for (size_t k = 0; k < segInfo.segmentSize; k++) {
BYTE *lpByteValue = (BYTE*)(segInfo.segmentAddr + k);
BYTE temp = *lpByteValue;
*lpByteValue = temp ^ lDecryptionKey[i];
i++;
i %= decryptionKeyLength;
}
if(VirtualProtect((LPVOID)segInfo.segmentAddr, segInfo.segmentSize, mOlProtection, &mOlProtection_1)==0) return false;
Observations:
Before I modify the memory, I "unprotect" the region with PAGE_EXECUTE_READWRITE flag.
Memory View in Visual Studio clearly shows me the value at that particular address. Even weirder, the second I modify the value manually from the debugger, my code is also able to change that value.
temp variable in the example code contains the value 0xCC
This byte is literally the only one unchanged in a sea of hundred other bytes. It is the only byte marked black in Memory View (the rest are red because they were changed)
Dll is compiled in Debug/x86 . /MTd flag set. No random address (/DYNAMICBASE : NO , /FIXED: NO). No Whole program optimization.
The unmodified byte IS NOT a variable. So it can't be "uninitialized". It is actually a very important byte: it is the instruction opcode. Everything crashes on that byte.
The decryption routine (XOR code) has no effect on the error. I step into the code and look at temp's value before it reaches the xor. This means the decryption key is never used and therefore it can't cause the problem.
Virtual protect succeeds.
Snapshots:
Visual studio can read the address
Can't read byte inside program
I know it's not the value of the byte at that single address that is causing problems (because I found other bytes with the same value that were processed successfully). Perhaps the byte is still "protected"?
Why is this happening?
You may well be dealing with a very common scenario: software breakpoints.
Software breakpoints are in fact set by replacing the instruction to be breakpointed with a breakpoint instruction.
A breakpoint instruction is present in most CPUs and is usually as short as the shortest instruction, so only one byte on x86 (0xCC, INT 3). The 0xCC you see in temp matches that opcode.
As I don't know whether any breakpoints are set in your source, I can only assume that this is your problem.
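A simplified model of what may be going on, written as portable C rather than real debugger code: the debugger remembers the original byte and plants 0xCC, the in-process XOR loop then reads 0xCC instead of the encrypted byte, and the debugger's memory view keeps showing the byte it saved. The Breakpoint struct and the plant_breakpoint and debugger_view helpers are made-up names for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC

/* What a debugger keeps for one software breakpoint (simplified). */
typedef struct {
    uint8_t *addr;    /* where the breakpoint was planted */
    uint8_t saved;    /* original byte that was overwritten */
} Breakpoint;

Breakpoint plant_breakpoint(uint8_t *code, size_t index)
{
    Breakpoint bp;

    bp.addr = &code[index];
    bp.saved = code[index];
    code[index] = INT3;            /* the process itself now reads 0xCC here */
    return bp;
}

/* In this model the memory view hides the breakpoint and shows the saved byte. */
uint8_t debugger_view(const Breakpoint *bp, const uint8_t *code, size_t index)
{
    return (&code[index] == bp->addr) ? bp->saved : code[index];
}

/* Same shape as the XOR loop in the question. */
void xor_decrypt(uint8_t *code, size_t size,
                 const uint8_t *key, size_t key_len)
{
    size_t k;

    assert(key_len > 0);
    for (k = 0; k < size; k++)
        code[k] ^= key[k % key_len];
}
```

In this model the byte under the breakpoint ends up as 0xCC XOR key instead of the decrypted opcode, while the memory view still reports the old value, which would explain a single "unchanged" byte. Removing the breakpoint before the decryption runs avoids the problem.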
I have a game; I disassembled it and located a jump that I want to rewrite,
but whenever I try to write to the address I get an access violation exception, even when I use VirtualProtect and set the READWRITE permission.
The instruction at 0x0042BD5F is this:
0x0046AACF E9 FF FF 89 FC | jmp some address here
Now, when I try to write to 0x0042BD5F to change the relative jump address, I get an access violation exception.
How do I change the jump on that address?
Code was requested, so here it is:
#define AddVar(Type,Name,Address) Type& Name = *reinterpret_cast<Type*>(Address)
/*
Hooker
1b 0x0042BD5F == E9 <relative jmp>
4b 0x0042BD60 - relative jump offset (always the value 0xFFFF89FC)
*/
AddVar(uqbyte, jump_hook_bytes, 0x0042BD60);
//the user tick function
void(*tick)(void);
void SetTick(void(*passed)(void))
{
tick = passed;
}
void Ticker();
void OnDLLLoad(void(*passed)(void) = nullptr)
{
tick = passed;
//point the game loop end to Ticker()
//replace the jump address
//jmp (DESTINATION_RVA - CURRENT_RVA - 5 [sizeof(E9 xx xx xx xx)])
DWORD old;
VirtualProtect(
(LPVOID)0x0042BD5F,
0x05,
PAGE_EXECUTE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE,
&old
);
jump_hook_bytes = (((uqbyte)((uqbyte*)&Ticker) - (uqbyte)0x0042BD5F) - (uqbyte)0x0000005);
}
void Ticker()
{
if (tick != nullptr)
{
tick();
}
__asm
{
MOV EAX, 0x0042B9EA;//old address
JMP EAX;
}
}
uqbyte is an unsigned long.
When calling GetLastError, the code returns decimal error 87 (ERROR_INVALID_PARAMETER).
The documentation for the memory protection constants says:
The following are the memory-protection options; you must specify one of the following values when allocating or protecting a page in memory.
It then lists a number of values, including the three that you combined together. When the documentation says "specify one of the following values" it means exactly one. You must not combine them.
You need to use PAGE_EXECUTE_READWRITE on its own.
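Why the OR-ed value is rejected can be seen from the constants themselves: each basic protection option is a separate bit in the low byte, so combining three of them produces a value that is none of the listed options. The sketch below repeats the three constants with their Windows SDK values so that it builds anywhere; is_single_protection and hook_protection are hypothetical helpers, not part of the Windows API.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the Windows SDK headers, repeated here so the
   sketch does not need windows.h. */
#define PAGE_EXECUTE           0x10u
#define PAGE_EXECUTE_READ      0x20u
#define PAGE_EXECUTE_READWRITE 0x40u

/* Non-zero when exactly one basic protection bit (low byte) is set. */
int is_single_protection(uint32_t protect)
{
    uint32_t basic = protect & 0xFFu;

    return basic != 0 && (basic & (basic - 1)) == 0;
}

/* The value to pass for a page that must stay executable and be writable. */
uint32_t hook_protection(void)
{
    uint32_t protect = PAGE_EXECUTE_READWRITE;

    assert(is_single_protection(protect));
    return protect;
}
```

The combination from the question evaluates to 0x70, which fails this check, while PAGE_EXECUTE_READWRITE alone (0x40) passes.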
I recommend that you add error checking around all your API calls. I also think that you could avoid hard coding the addresses.
You need to use a debugger or a program that enables the process token privilege for debugger mode (like a trainer). I assume this isn't being used for an online cheat (offline shouldn't matter).
Passing PAGE_EXECUTE | PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE is wrong; you need to pass only PAGE_EXECUTE_READWRITE to VirtualProtect. Now it works.
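Once the page is writable, the value to store is the one described in the code comment: destination minus the address of the jump minus 5. A minimal sketch of that arithmetic follows, kept separate from any Windows calls; jmp_rel32 and encode_jmp are hypothetical helper names, and the addresses are treated as plain 32-bit numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { JMP_REL32_SIZE = 5 };   /* opcode E9 plus a 4-byte displacement */

/* Displacement is relative to the end of the jump instruction; the
   unsigned subtraction wraps modulo 2^32, which covers backward jumps. */
uint32_t jmp_rel32(uint32_t insn_addr, uint32_t target_addr)
{
    return target_addr - (insn_addr + (uint32_t)JMP_REL32_SIZE);
}

/* Writes the 5 instruction bytes into out, displacement little-endian. */
void encode_jmp(uint8_t *out, size_t out_size,
                uint32_t insn_addr, uint32_t target_addr)
{
    uint32_t rel = jmp_rel32(insn_addr, target_addr);

    assert(out != NULL && out_size >= JMP_REL32_SIZE);
    out[0] = 0xE9;
    out[1] = (uint8_t)(rel & 0xFFu);
    out[2] = (uint8_t)((rel >> 8) & 0xFFu);
    out[3] = (uint8_t)((rel >> 16) & 0xFFu);
    out[4] = (uint8_t)((rel >> 24) & 0xFFu);
}
```

Building the bytes in a local buffer first and copying them over the instruction afterwards also makes it easy to log or check the patch before it is applied.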
I am using memcpy in my application. memcpy crashes randomly, and below are the logs I got in the Dr. Watson files.
100181b5 8bd1 mov edx,ecx
100181b7 c1e902 shr ecx,0x2
100181ba 8d7c030c lea edi,[ebx+eax+0xc]
100181be f3a5 rep movsd
100181c0 8bca mov ecx,edx
100181c2 83e103 and ecx,0x3
FAULT ->100181c5 f3a4 rep movsb ds:02a3b000=?? es:01b14e64=00
100181c7 ff1508450210 call dword ptr [Debug (10024508)]
100181cd 83c424 add esp,0x24
100181d0 6854580210 push 0x10025854
100181d5 ff1508450210 call dword ptr [Debug (10024508)]
100181db 83c404 add esp,0x4
Below is the code
memcpy((char *)dep + (int)sizeof(EntryRec) + (int)adp->fileHdr.keySize, data, dataSize );
Where:
dep is a structure
EntryRec is a character pointer
adp is a structure
data is not NULL in this case
Has anyone faced this issue and can help me?
I have tried to debug the program,
and I got the following error:
Unhandled exception in Prog.exe (MSVCRTD.DLL): 0xC0000005: Access Violation
data is an argument passed in to this code, and it is a void*
Further Info:
I have tried to debug the code. The adapter crashes in the following area; this function is in OUTPUT.c (I think this is a library function):
#else /* _UNICODE */
if (flags & (FL_LONG|FL_WIDECHAR)) {
if (text.wz == NULL) /* NULL passed, use special string */
text.wz = __wnullstring;
bufferiswide = 1;
pwch = text.wz;
while ( i-- && *pwch )
++pwch;
textlen = pwch - text.wz;
/* textlen now contains length in wide chars */
} else {
if (text.sz == NULL) /* NULL passed, use special string */
text.sz = __nullstring;
p = text.sz;
while (i-- && *p) //Crash points here
++p;
textlen = p - text.sz; /* length of the string */
}
Value for variables:
p= ""(not initialised)
i= 2147483598
There are two very likely explanations:
You are using memcpy across overlapping addresses -- the behavior of this situation is undefined. If you require the ability to handle overlapping addresses, std::memmove is the "equivalent" tool.
You are using memcpy to copy to/from memory that is inaccessible to your program.
From the code you've shown, it looks like (2) is the more likely scenario. Since you are able to debug the source, try setting a breakpoint before the memcpy occurs, and verify that the arguments to memcpy all match up (i.e. source + num <= dest or dest + num <= source).
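The overlap condition can also be checked in code while debugging. The following is a sketch only: regions_overlap and checked_copy are hypothetical helpers, and the comparison goes through uintptr_t, which assumes a flat address space as on typical desktop platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Non-zero when [dst, dst+n) and [src, src+n) share at least one byte. */
int regions_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    return d < s + n && s < d + n;
}

/* Falls back to memmove when the regions overlap. */
void *checked_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (regions_overlap(dst, src, n))
        return memmove(dst, src, n);
    return memcpy(dst, src, n);
}
```

This only rules out explanation (1); it cannot tell whether either buffer is actually large enough, which still has to be checked against how the buffers were allocated.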
From the disassembled code it appears that the source pointer is not in your address space.
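rep movsb copies from ds:esi to es:edi (ds:si to es:di in 16-bit code). The ?? indicates that the memory at the source address could not be read. The faulting source address, 02a3b000, lies exactly on a 4 KB page boundary, which is consistent with the copy running off the end of the last readable page of the source buffer.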
Is the data pointed to by (char *)dep + (int)sizeof(EntryRec) + (int)adp->fileHdr.keySize always at least dataSize long?
I have come across similar crashes where variable-length strings are later treated like fixed-width strings.
e.g.
char * ptr = strdup("some string");
// ...
memcpy(dest, ptr, fixedLength);
Here fixedLength is greater than 12, the size of the allocation (11 characters plus the terminating null). Obviously these were in different functions, so the length issue was not noticed. Most of the time this will work: dest will contain "some string", and after the null there will be random garbage. If you treat dest as a null-terminated string you will never notice, as you don't see the garbage after the null.
However if ptr is allocated at the end of a page of memory, you can only read to the end of the allocated memory and no further. As soon as you read past the end of the page the operating system will rightly crash your program.
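One way to avoid reading past the end of the source in a case like this is to copy only the bytes the string actually has and fill the rest of the fixed-width field yourself. This is a sketch under the assumption that the source really is a null-terminated string; copy_to_fixed is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies a null-terminated string into a fixed-width field without
   reading past the terminator; the rest of the field is zero-filled. */
void copy_to_fixed(char *dest, size_t fixed_len, const char *src)
{
    size_t len;

    assert(dest != NULL && src != NULL);
    len = strlen(src);
    if (len > fixed_len)
        len = fixed_len;    /* truncated: the field then has no terminator */
    memcpy(dest, src, len);
    memset(dest + len, 0, fixed_len - len);
}
```

The read from src never goes beyond its terminating null, so it does not matter where in a page the string happens to be allocated.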
It looks like you've run over the end of a buffer and generated an access violation.
Edit: There is still not enough information. We cannot spot a bug without knowing much more about how the buffer you are trying to copy to is allocated, whether it has enough space (I suspect it does not), and whether dataSize is valid.
If memcpy crashes, the usual reason is that you passed illegal arguments.
Note that with memcpy the source and destination must not overlap.
In such a case use memmove.
From your code "memcpy((char *)dep + (int)sizeof(EntryRec) + (int)adp->fileHdr.keySize, data, dataSize)" and the debug information, "data" looks like a local (on-stack) variable. You should use "data = malloc(DATA_SIZE)" instead of "char data[DATA_SIZE]" or similar; otherwise, by the time this line runs, "data" may already have been popped off the stack, which can cause random memory access faults.
I'd suggest using memmove, as it handles overlapping buffers; when memcpy is used in this situation the result is unpredictable.