How do I find out what writes to an address using C++? - c++

I have an address that gets written to 1000x per second by 300 different instructions. How can I use C++ to find out the last instruction to write to that address?
I have already made it so it alerts me the instant a specific value is written to the address, but how can I make it print the address of the last instruction that wrote that specific value?
I would do this in a debugger, but all of the debuggers I've found cannot handle a conditional breakpoint on an address that changes 1000x per second without freezing the program.
If I can't do this in C++, what other ways are there? I need to find which instruction writes a specific value to a memory address that receives over 1000 writes per second from different instructions.
Update:
I am using Windows 7 x32 for those wondering.

Take a look at Pin. Briefly, Pin allows you to instrument your code at the x86 instruction level, allowing you to track reads and/or writes as you please. I've used it myself to model cache performance and found it fairly fast.

I already have made it so it alerts me the instant a specific value is written to an address, but how can I make it print the last instruction address that wrote that specific value?
If it's just for one-off debugging, have the code that raises the alert call system() or popen() to run pstack (http://www.linuxcommand.org/man_pages/pstack1.html) or similar: some external program that dumps your call stack. Exactly which program to use is highly OS dependent, and you've said nothing of your environment. (This is a common technique for generating call stacks from signal handlers after invalid memory accesses etc.)

Related

How to Read the program counter / Instruction pointer of a specific core in Kernel Mode?

Windows 10, x64, x86
My current knowledge:
Let's say it is a quad core. There will be 4 individual program counters, which will point to 4 different locations of code for parallel execution.
Each of these program counters indicates where its core is in its program sequence.
The address it points to changes after a context switch, when another thread's saved program counter gets loaded into the program counter to execute.
What I want to do:
I'm in kernel mode, my thread is running on core 1, and I want to read the current instruction pointer of core 2.
Expected Results:
0x203123 is the address of the instruction pointer and this address belongs to this thread and this thread belongs to this process... etc.
Does anyone know how to do it, or can anyone give me good book references, links, etc.?
Although I don't believe it's officially documented, there is a ZwGetContextThread exported from ntdll.dll. Being undocumented, things can change (and I haven't tried it in quite a while) but at least when I last tried it, you called it with a thread handle and a pointer to a CONTEXT structure, and it would return that thread's context.
I'm not certain exactly how up-to-date that is though. It's never mattered to me, so I haven't checked, but my guess would be that the IP in the CONTEXT you get is whatever was saved the last time the thread was suspended. So, if you want something (reasonably) current, you'd use ZwSuspendThread, get the context, then ZwResumeThread to start it running again.
Here I suppose I'm probably supposed to give the standard lines about undocumented functions being subject to change, using them being a bad idea, and that you should generally leave all of this alone. Ah well, I've been disappointing teachers and other authority figures for years, and I guess I'm not changing right now.
There may be a practical problem here, though. If you really need data that's really current, this probably isn't going to work very well for you. What it gives you will be kind of current at best. On the other hand, "really current" is almost a meaningless concept for information that goes out of date every clock cycle.
Does anyone know how to do it, or can anyone give me good book references, links, etc.?
For 80x86 hardware (regardless of operating system); there are only 3 ways to do this (that I know of):
a) send an inter-processor interrupt to the other CPU, and have an interrupt handler that stores the "return EIP" (from its stack) at a known address in memory so that your CPU can read "value of EIP immediately before interrupt" (with synchronization so that your CPU doesn't read before the value is written, etc).
b) put the other CPU into some kind of "debug mode" (single-stepping, last branch recording, ...) so that either code in a debug exception handler or the CPU's hardware itself is constantly writing EIP values to memory that you can read.
Of course both of these options will ruin performance, and the value you get will probably be useless (because EIP would've changed after you obtain it but before you can use the obtained value). To ensure the value is still useful; you'd need the other CPU to wait until after you've consumed the obtained value (and are ready for the next value); and to do that you'd have to resort to single-step debugging facilities (with the waiting in the debug exception handler), where you'll be lucky if you can get performance better than a thousand times slower (and can probably improve performance by simply disabling other CPUs completely).
Also note that they still won't accurately tell you EIP in all cases (e.g. if the CPU is in SMM/System Management Mode and is beyond the control of the OS); and I doubt Windows kernel supports any of it (e.g. kernel should support single-stepping of user-space processes/threads to allow debuggers to work, but won't support single-stepping of kernel and will probably lock up the computer due to various "waiting for lock to be released for 6 days" problems).
The last of the 3 options is:
c) Run the OS inside an emulator/simulator instead of running it on real hardware. In that case you can probably modify the emulator/simulator's code to inject EIP values somewhere (maybe some kind of virtual "EIP reporting device"?). This will ruin performance of the emulator/simulator, but you may be able to hide that (e.g. "virtual time inside the emulator passes at a rate of one second per 1000 seconds of real time outside the emulator").

Is it Possible to Check How Many Times a Remote Process Accesses an Address in Windows

Using the Windows API debugging functions or any other Windows API functions, and assuming full permissions, is it possible to determine whether a process has accessed an address in its address space?
My ultimate goal is to create a tool that gets the address of a mapped file in memory and determines how many times each part of the file is accessed by the debuggee.
My first thought was memory breakpoints, but I'm not sure if I can set those in a remote process or even what commands I would use to set those.
Edit
Assembly answers using debug registers are welcome, note that I am using the ARM v4 based Intel XScale. I am aware of DBRCON and DBRx registers, but I'm not sure how to get a remote process to use them.
Efficiency is not a concern, so single stepping through a process and determining how many times an address is accessed is fine, but I'm not sure how I would accomplish this programmatically.

Debug a program with equal memory address locations over multiple runs?

I have a program that I'm debugging in Visual Studio 2010. I have a reproducible error that occurs in the program and I am printing some diagnostic information. The error leaves the program in a bad state so I have to constantly restart the program. Each time I run the program the addresses for my structs are different. There are many of them and it would be much easier to debug if the addresses would stay the same each time I run the program.
The addresses look similar but are different. For example, one struct has an address of 0x003F5540 one time, 0x003E5540 the next time, then 0x00605540 and 0x004F5540 on later runs.
The code executes exactly the same every time so I don't know why I see the slightly different addresses. I have turned off ASLR and DEP. What can I do to get the same addresses every time I run the program?
Thanks
Edit- It may not be possible to disable heap and stack randomization:
1st call to "new" always returns different addresses. How do I get it to return the same address?
There's no "may" about it, address randomization has been the core of every OS since 16 bit protected mode ones. Otherwise you couldn't run the same process twice. Or two processes that chose overlapping virtual base addresses.
Use symbol names instead of pointer values, that's what debug symbols are for!

How does a debugger peek into another process' memory?

When every process has its own private memory space that no external process has access to, how does a debugger access a process' memory space?
For example, I can attach gdb to a running process using gdb -p <pid>
Then I can access all the memory of this process via gdb.
How is gdb able to do this?
I read the relevant questions in SO and no post seems to answer this point.
Since the question is tagged Linux and Unix, I'll expand a little on what David Schwartz says, which in short is "there is an API for that in the OS". The same basic principle applies in Windows as well, but the actual implementation is different. Although I suspect the implementation inside the OS does the same thing, there's no REAL way to know that, since we can't inspect the source code for Windows. (With an understanding of how an OS and a processor work, one can, however, sort of figure out what must be happening!)
Linux has a function called ptrace, that allows one process (following some checking of privileges) to inspect another process in various ways. It is one call, but the first parameter is a "what do you want to do". Here are some of the most basic examples - there are a couple of dozen others for less "common" operations:
PTRACE_ATTACH - connect to the process.
PTRACE_PEEKTEXT - look at the attached process' code memory (for example to disassemble the code)
PTRACE_PEEKDATA - look at the attached process' data memory (to display variables)
PTRACE_POKETEXT - write to process' code memory
PTRACE_POKEDATA - write to process' data memory.
PTRACE_GETREGS - copy the current register values.
PTRACE_SETREGS - change the current register values (e.g. a debug command of set variable x = 7, if x happens to be in a register)
In Linux, since memory is "all the same", PTRACE_PEEKTEXT and PTRACE_PEEKDATA are actually the same functionality, so you can give an address in code for PTRACE_PEEKDATA and an address, say, on the stack for PTRACE_PEEKTEXT and it will perfectly happily copy that back for you. The distinction is made for OS/processor combinations where memory is "split" between DATA memory and CODE memory. Most modern OS's and processors do not make that distinction. Same obviously applies to PTRACE_POKEDATA and PTRACE_POKETEXT.
So, say that the "debugger process" uses:
long data = ptrace(PTRACE_PEEKDATA, pid, 0x12340128, NULL);
When the OS is called with PTRACE_PEEKDATA for address 0x12340128, it looks up the memory mapping that covers that address (the page containing it starts at 0x12340000). If the mapping exists, the page gets mapped into the kernel, the data at address 0x12340128 is copied into local memory, the page is unmapped again, and the copied data is passed back as the return value.
The manual states the initiating of the usage as:
The parent can initiate a trace by calling fork(2) and having the
resulting child do a PTRACE_TRACEME, followed (typically) by an exec(3).
Alternatively, the parent may commence trace of an existing process
using PTRACE_ATTACH.
For several more pages of information, see man ptrace.
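The PTRACE_TRACEME route from the manual excerpt can be sketched in a few lines. This is only an illustration for Linux, not how gdb itself is written: peek_child_word and g_secret are names invented for this example, and error handling is kept to a minimum. It relies on the fact that fork copies the address space, so &g_secret is the same virtual address in the child as in the parent.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical variable that the "debugger" (parent) reads out of the child.
static volatile long g_secret = 0;

// Forks a child that sets its own copy of g_secret to `value` and stops.
// The parent then reads that word from the child's memory with ptrace.
// Returns true and stores the word in *out on success.
bool peek_child_word(long value, long* out) {
    assert(out != nullptr);
    pid_t pid = fork();
    if (pid < 0) return false;

    if (pid == 0) {                              // child: the process being traced
        g_secret = value;                        // changes only the child's copy
        if (ptrace(PTRACE_TRACEME, 0, nullptr, nullptr) != 0) _exit(1);
        raise(SIGSTOP);                          // stop so the parent can inspect us
        _exit(0);
    }

    int status = 0;                              // parent: the "debugger"
    if (waitpid(pid, &status, 0) < 0) return false;
    if (!WIFSTOPPED(status)) return false;       // child exited: tracing was refused

    // PEEKDATA returns the word itself, so -1 can be valid data;
    // errno is the only reliable error indicator.
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid,
                       const_cast<long*>(&g_secret), nullptr);
    bool ok = (errno == 0);

    kill(pid, SIGKILL);                          // done with the child
    waitpid(pid, nullptr, 0);
    if (ok) *out = word;
    return ok;
}
```

Calling peek_child_word(0x1234, &word) from the parent should leave the parent's own g_secret at 0 while word holds the value read from the child, which is the point: the data came out of another process' address space. Some sandboxes and hardened systems restrict ptrace, in which case the function simply returns false.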
When every process has its own private memory space that no external process has access to ...
That's false. External processes with the correct permissions and using the correct APIs can access other process' memory.
For Linux debugging there is a system call, ptrace, which makes it possible to control another process on the system. You do need the rights to do that, which you typically have if you are the owner of the process and have not removed the permissions manually.
The ptrace call itself gives read and write access to memory, the program counter, registers and nearly all other related state.
Please see man ptrace for details.
If you are interested in how it works in a debugger, please have a look at the file
gdb-x.x.x/gdb/linux-nat.c. There you can find the core code for accessing other processes to debug them.

'Hooking' a memory address with C++?

How reliable is hooking for changing a single static memory address when it hits certain values?
What I'm used to doing is reading/writing memory from a basic C++ application, though I find this is sometimes not reliable for addresses that change 1000+ times per second. Often my application cannot catch the value at the address with a case statement in time to change it to another value. How exactly does this concept of hooking work, and does it ever miss a value change? I'm using Win 7 Ult. x86
(reusing an answer I gave to a question I thought was related, but turned out not to be.)
There are environment-specific ways to detect when a variable is changed. You can use the MMU access control flags (via mprotect or VirtualProtect) to generate an exception on the first write, and set a dirty flag from inside the handler. (Almost every modern OS does this with memory-mapped files, to find out whether it needs to be written back to disk). Or you can use a hardware breakpoint to match a write to that address (debuggers use this to implement breakpoints on variables).
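As a rough illustration of the MMU approach, here is a minimal sketch for Linux using mprotect and a SIGSEGV handler (on Windows, VirtualProtect and an exception handler would play the same roles). All of the watch_* names are made up for this example. Reading the instruction pointer assumes x86-64 with glibc and is skipped elsewhere. Calling mprotect from a signal handler is common practice on Linux but is not guaranteed to be safe by POSIX.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <signal.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

// State shared with the signal handler (one watched page, for brevity).
static char* g_page = nullptr;
static std::size_t g_page_size = 0;
static volatile sig_atomic_t g_dirty = 0;
static void* volatile g_fault_addr = nullptr;  // data address that was written
static void* volatile g_fault_pc = nullptr;    // writing instruction (x86-64 only)

static void on_segv(int, siginfo_t* info, void* ctx) {
    auto addr = reinterpret_cast<std::uintptr_t>(info->si_addr);
    auto base = reinterpret_cast<std::uintptr_t>(g_page);
    if (addr < base || addr >= base + g_page_size) {
        signal(SIGSEGV, SIG_DFL);  // unrelated fault: crash normally on retry
        return;
    }
    g_dirty = 1;
    g_fault_addr = info->si_addr;
#if defined(__x86_64__) && defined(REG_RIP)
    auto* uc = static_cast<ucontext_t*>(ctx);
    g_fault_pc = reinterpret_cast<void*>(uc->uc_mcontext.gregs[REG_RIP]);
#else
    (void)ctx;                     // assumption: no portable way to read the PC
#endif
    // Allow writes again; the faulting store is retried when the handler returns.
    mprotect(g_page, g_page_size, PROT_READ | PROT_WRITE);
}

// Allocates one read-only page and installs the handler.
// Returns the page, or nullptr on failure.
char* watch_begin() {
    assert(g_page == nullptr);     // this sketch supports a single watched page
    g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* p = mmap(nullptr, g_page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    g_page = static_cast<char*>(p);

    struct sigaction sa {};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, nullptr) != 0) return nullptr;
    return g_page;
}

bool watch_dirty() { return g_dirty != 0; }
void* watch_fault_address() { return g_fault_addr; }
void* watch_fault_pc() { return g_fault_pc; }
```

After watch_begin(), the first store into the returned page (for example through a volatile char*) faults once, the handler records it, and the store then completes. Protection is per page and is lifted after the first write, so this only catches the first writer. Catching every write would mean re-protecting the page after each store, for example by single-stepping, which is slow.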
Hooking can be done in many ways.
Most require you to have code inside your target process, which makes ReadProcessMemory unnecessary (just use pointers and dereference them).
If you want to hook, though, you can do it like this:
Find out what instruction(s) write to that address (debugger memory breakpoint). They will most likely be inside a function, so what I usually do is patch some bytes near the beginning of that function to redirect execution flow to my code, which then runs every time the function is called. Sometimes I also alter the return address on the stack, so that I can examine and control the return value as well as run code after the function has finished (for example, to get some info from the stack, either because I am too lazy to dig out the structures used to store it or because it is temporary and will be discarded and never saved).