I need to know if there is a way, using the Linux debugger GDB, to detect whether any function of a specific C++ class (implemented in the file Chord.cc) accesses a specific memory location (let's say 0xffffbc).
That will help me a lot.
Thanks.
GDB watchpoints are what you're looking for:
Quote from that page:
You can use a watchpoint to stop execution whenever the value of an
expression changes, without having to predict a particular place where
this may happen. (This is sometimes called a data breakpoint.) The
expression may be as simple as the value of a single variable, or as
complex as many variables combined by operators. Examples include:
A reference to the value of a single variable.
An address cast to an appropriate data type. For example, `*(int *)0x12345678' will watch a 4-byte region at the specified address (assuming an int occupies 4 bytes).
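Note that a plain watch only triggers when the value is written; GDB's rwatch and awatch commands trigger on reads and on any access, respectively. You can then try to apply the techniques from this post to make it a conditional watchpoint, and see if you can find a way to restrict it to particular function calls from that class. You may also find this discussion relevant in that respect.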
I was writing some logging logic and wanted to add indentation. The easiest way to tell whether a function call has happened or a function has returned is to look at the current address of the stack/frame. Let's suppose that the stack grows downward. Then if the stack address in the log() call is smaller than during the previous call, we can increase the indent, since a function call must have happened in between. I know there are functions like backtrace() that know how to dump it, or you can use some assembly. However, I remember reading about external variables that can be used to retrieve this information. Can someone name these variables or give a reference where I can find them (as far as I remember, it was in some computer systems book like "Computer Systems: A Programmer's Perspective")? Otherwise, what is the most convenient/fast way of getting this information?
Update: I have accidentally found the link I was referring to - Print out value of stack pointer
TLDR: There is no portable way to do what I have described...
This method is highly nonportable and will break under various transformations, but if you're just using it for debug logging it might be suitable.
The easiest way to get something resembling the current stack frame address is just take the address of any automatic-storage (local, non-static) variable. If you want a baseline to compare it against, save the address of some local in main or similar to a global variable. If your program is or might be multi-threaded, use a thread-local variable for this if needed.
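A minimal sketch of that idea, assuming GCC or Clang (for [[gnu::noinline]]) and a conventional contiguous call stack. All names here (g_stack_base, set_stack_base, stack_distance and the two demo functions) are made up for illustration. The distance is taken as an absolute difference, so the code does not assume which way the stack grows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Baseline stack address, one per thread. Hypothetical name, not a system variable.
thread_local std::uintptr_t g_stack_base = 0;

// Record the address of a local variable in the outermost function of interest.
void set_stack_base(const void* local) {
    g_stack_base = reinterpret_cast<std::uintptr_t>(local);
}

// Distance in bytes between `local` and the baseline. Using the absolute
// difference avoids assuming a particular stack growth direction.
std::size_t stack_distance(const void* local) {
    assert(g_stack_base != 0 && "call set_stack_base() first");
    const auto addr = reinterpret_cast<std::uintptr_t>(local);
    return addr > g_stack_base ? addr - g_stack_base : g_stack_base - addr;
}

// Innermost demo function: measures the distance of its own local.
[[gnu::noinline]] std::size_t inner_distance() {
    char marker = 0;
    return stack_distance(&marker);
}

// Outer demo function: calls the inner one first, then measures its own local.
// Doing work after the call keeps it from being turned into a tail call.
[[gnu::noinline]] std::size_t outer_then_inner(std::size_t* inner_out) {
    char marker = 0;
    *inner_out = inner_distance();
    return stack_distance(&marker);
}
```

A logger could map the returned distance to an indent level by comparing it with the distance seen on the previous log() call. Inlining, tail calls, or sanitizers that relocate locals can all distort the numbers, which is why this is only suitable for debug logging.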
I've been looking a bit into Cheat Engine, which allows you to inspect and manipulate the memory of running processes on Windows: You scan for variables based on their value, then you can modify them, e.g. to cheat in a game.
In order to write a bot or something similar, you need to find a static address for the variable you want to change - i.e. one that stays the same if the process is restarted. The method for that goes roughly like this:
Look for the address of the variable you're interested in, searching by value
Look for code using that address, e.g. to find the address of the struct it belongs to (since struct offsets are fixed)
Look for another pointer pointing to that pointer until you find one with a static address (shows as green in Cheat Engine)
It seems to work just fine judging from the tutorials I've looked at, but I have trouble understanding why it works.
Don't all variables, including global static ones, get a pretty random address at runtime?
Bonus questions:
How can Cheat Engine tell if an address is static (i.e. will stay the same on restart)?
A tutorial referred to the fact that many older and some modern games (e.g. Call of Duty 4) use only static addresses. How is that possible?
I will answer the bonus questions first because they introduce some concepts you may need to know to understand the answer for the main question.
Answering the first bonus question is easy if you know how an executable file works: all the global/static variables live in the .data section, and the .exe stores the address offset of each section, so Cheat Engine just checks whether the variable falls within that address range (from the start of this section to the start of the next one).
For the second question, it is possible to use only static addresses, but that is nearly impossible for a game. Even the older ones. What the tutorial creator was probably trying to say is that all variables that he wants, actually had a static pointer pointing to them. But solely by the fact that you create a local variable, or even pass an argument to a function, their values are being stored into the stack. That's why it is nearly impossible to have a "static-only" program. Even if you compile a program that actually doesn't do anything, it will probably have some stuff being stored in the stack.
As for the main question, not every variable with a dynamic address is pointed to by a global variable. It depends entirely on the programmer. For example, in a C program I can create a local variable and never assign its address to a global/static pointer. The only certain way to find that address in this case is to know the code at the point where the variable was first assigned a value on the stack.
Some variables have a dynamic address because they are just local variables, which are stored in the stack the first time they have a value assigned to them.
Some other variables have a static address because they are declared either as a global or a static variable to the compiler. These variables have a fixed address offset that is part of the .data section in the executable file.
The executable file has a fixed offset address for each section inside it, and the .data section is no exception.
But it is worth noting that only the offset inside the executable itself is fixed. In the operating system things might be different (all random addresses), but that is the job of an OS: abstracting this kind of stuff for you (creating the executable's virtual address space in this case). So static variables only look static inside the executable's memory space. In physical RAM they might be anywhere.
Finally, it is difficult to try to explain this to you because you'll have to understand how executable files work. A good start would be to search for some explanations regarding low-level programming, like stack frame, calling conventions, the Assembly language itself and how compilers use some well-known techniques to manage functions (scopes in general), global/static/local/constant variables, and the memory system (sections, the stack, etc.), and maybe some research into PE (and even ELF) files.
As far as I understand it, variables declared static have a permanent offset within the program data. This means that when the program is loaded into RAM, the offset of the variable will always be the same. Because the beginning address of the program is known globally, finding a static variable based on offset, as you mentioned, should be a trivial task. Therefore, while a pointer to a static variable might be random in the scheme of things, its offset to the beginning of program memory should remain the same no matter when the program starts. So Cheat Engine (though I don't know the software) most likely stores the offset of the static variable, and then when the software starts, applies this logic to find that variable.
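To make the "base plus fixed offset" idea concrete, here is a small sketch. It is not how Cheat Engine is implemented; g_anchor, g_score, score_offset and locate are hypothetical names, and g_anchor merely stands in for a known reference point inside the module. The sketch assumes a flat address space in which pointers convert to integers and back, as on mainstream desktop platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

int g_anchor = 1;   // hypothetical stand-in for a known reference point in the module
int g_score = 100;  // hypothetical static variable an external tool wants to find

// Offset of g_score relative to the anchor. Both objects have static storage,
// so this difference is fixed when the program is linked, even if the absolute
// addresses change from run to run.
std::ptrdiff_t score_offset() {
    return static_cast<std::ptrdiff_t>(reinterpret_cast<std::intptr_t>(&g_score) -
                                       reinterpret_cast<std::intptr_t>(&g_anchor));
}

// What a tool effectively does after a restart: take the (possibly new) base
// address and add the offset it stored earlier.
int* locate(std::uintptr_t base, std::ptrdiff_t offset) {
    return reinterpret_cast<int*>(base + static_cast<std::uintptr_t>(offset));
}
```

Calling locate(reinterpret_cast<std::uintptr_t>(&g_anchor), score_offset()) yields a pointer to g_score in the current run, whatever absolute address the loader happened to pick.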
As to how it can tell it's a static variable in the first place... well, this is partially a guess, but when you declare a variable static in C, I'm assuming the compiler/linker sets some kind of flag so the OS knows that it's a static variable. It could also be that all static variables are stored in a certain way, or at a certain address offset, for all programs compiled for a certain target system. Again, not too sure about that, but from what I understand about memory management, that seems to make the most sense. With these assumptions, it's quite possible for a program to contain solely static variables. The difference is that their memory is assigned statically when the program starts, as opposed to dynamically (as with a call to malloc() or similar). If the variables were stored dynamically, I'm sure there'd be a way to find them easily, so I don't think it matters to Cheat Engine whether a variable is static or not. However, as I'm assuming Cheat Engine wants to modify a game upon startup (just like the old GameSharks used to... ahh, miss those days), it's probably more reliable to modify variables that are static instead of trying to locate pointers, disassemble the code, etc.
If you're interested in learning more, I'd recommend checking out something like this tutorial over at OSDev!
This is something that recently crossed my mind, quoting from Wikipedia: "To initialize a function pointer, you must give it the address of a function in your program."
So, I can't make it point to an arbitrary memory address, but what if I overwrite the memory at the address of the function with a piece of data the same size as before and then invoke it via the pointer? If such data corresponds to an actual function and the two functions have matching signatures, the latter should be invoked instead of the first.
Is it theoretically possible?
I apologize if this is impossible due to some very obvious reason that I should be aware of.
If you're writing something like a JIT, which generates native code on the fly, then yes you could do all of those things.
However, in order to generate native code you obviously need to know some implementation details of the system you're on, including how its function pointers work and what special measures need to be taken for executable code. For one example, on some systems after modifying memory containing code you need to flush the instruction cache before you can safely execute the new code. You can't do any of this portably using standard C or C++.
You might find when you come to overwrite the function, that you can only do it for functions that your program generated at runtime. Functions that are part of the running executable are liable to be marked write-protected by the OS.
The issue you may run into is Data Execution Prevention. It tries to keep you from executing data as code or allowing code to be written to like data. You can turn it off on Windows. Some compilers/OSes may also place code into const-like sections of memory that the OS/hardware protects. The standard says nothing about what should or should not work when you write an array of bytes to a memory location and then call a function that includes jumping to that location. It's all dependent on your hardware and your OS.
While the standard does not provide any guarantees as to what would happen if you make a function pointer that does not refer to a function, in real life, in your particular implementation and knowing the platform, you may be able to do that with raw data.
I have seen example programs that created a char array with the appropriate binary code and have it execute by doing careful casting of pointers. So in practice, and in a non-portable way you can achieve that behavior.
It is possible, with the caveats given in other answers. You definitely do not want to overwrite memory at some existing function's address with custom code, though. Not only is executable memory typically not writable, but you have no guarantees as to how the compiler might have used that code. For all you know, the code may be shared by many functions that you think you're not modifying.
So, what you need to do is:
1. Allocate one or more memory pages from the system.
2. Write your custom machine code into them.
3. Mark the pages as non-writable and executable.
4. Run the code. There are two ways of doing it:
   - Cast the address of the pages you got in #1 to a function pointer, and call the pointer.
   - Execute the code in another thread. You're passing the pointer to the code directly to a system API or framework function that starts the thread.
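The steps above can be sketched roughly as follows, using the cast-and-call variant of the last step. This assumes Linux with GCC or Clang on x86-64 or AArch64. The function name run_generated_code and the hand-assembled bytes (a function that just returns 42) are illustrative assumptions, and error handling is kept minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Hand-assembled machine code equivalent to `int f() { return 42; }`.
#if defined(__x86_64__)
static const unsigned char kCode[] = {0xB8, 0x2A, 0x00, 0x00, 0x00,  // mov eax, 42
                                      0xC3};                          // ret
#elif defined(__aarch64__)
static const unsigned char kCode[] = {0x40, 0x05, 0x80, 0x52,   // mov w0, #42
                                      0xC0, 0x03, 0x5F, 0xD6};  // ret
#else
#error "this sketch only carries machine code for x86-64 and AArch64"
#endif

// Returns the value produced by the generated code, or -1 if a system call fails.
int run_generated_code() {
    const long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    const std::size_t size = static_cast<std::size_t>(page);

    // Step 1: allocate one page, readable and writable but not yet executable.
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // Step 2: write the machine code into the page.
    std::memcpy(mem, kCode, sizeof kCode);

    // Some CPUs need the instruction cache synchronized after code is written.
    // __builtin___clear_cache is a GCC/Clang builtin.
    char* begin = static_cast<char*>(mem);
    __builtin___clear_cache(begin, begin + sizeof kCode);

    // Step 3: make the page non-writable and executable.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }

    // Step 4: cast the page address to a function pointer and call it.
    auto fn = reinterpret_cast<int (*)()>(mem);
    const int result = fn();

    munmap(mem, size);
    return result;
}
```

On Windows the same shape would use VirtualAlloc and VirtualProtect instead of mmap and mprotect. Converting a data pointer to a function pointer is only conditionally supported by the C++ standard, so this remains platform-specific.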
Your question is confusingly worded.
You can reassign function pointers, and you can set them to null. The same goes for member pointers. Unless you declare them const, you can reassign them, and yes, the new function will be called instead. The signatures must match exactly. Use std::function instead.
You should not "overwrite the memory at the address of a function". You probably can do it in some way, but just don't. You're writing into your program code and are likely to screw it up badly.
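For illustration, a short sketch of reassigning a plain function pointer and a std::function. The functions add_one and times_two and the two demo wrappers are made-up names.

```cpp
#include <cassert>
#include <functional>

int add_one(int x) { return x + 1; }
int times_two(int x) { return x * 2; }

// Reassigns a plain function pointer between two functions with the same
// signature, then sets it to null.
int demo_function_pointer() {
    int (*op)(int) = add_one;
    int result = op(5);   // calls add_one
    op = times_two;       // reassign: later calls go to times_two
    result += op(5);
    op = nullptr;         // a null pointer must not be called
    return op ? -1 : result;
}

// std::function can hold anything callable with a compatible signature,
// including lambdas, and can also be emptied by assigning nullptr.
int demo_std_function() {
    std::function<int(int)> op = add_one;
    op = [](int x) { return x - 3; };
    const int result = op(5);
    op = nullptr;
    return op ? -1 : result;
}
```

In both cases only the pointer (or wrapper) changes. The machine code of add_one and times_two is never touched.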
How can I statically tell Visual C++ to place a global variable at a given absolute address in memory, like what __attribute__((at(address))) does?
It can be done, but I don't believe there is a predefined way to do it, so it will take some experimentation. That said, I don't see much benefit compared with simply creating your variable at run time, right at the start of user code execution.
So first specify the section/segment in which to place your variable, using the Microsoft-specific allocate specifier, __declspec(allocate("segname")). Then start your application in a realistic scenario, dump it or debug it, and see where your variable ends up. Watch out for relocations (there are some ways to try to enforce no relocation, but they are not guaranteed to be honored all the time). Another way is to use some code in your app like this one to find the address of the section you defined.
If for some reason you cannot get consistent behavior, you can use this utility to manipulate the virtual address of your object file. All in all, expect hurdles along the way, but overall I don't see why you wouldn't be able to get it to work for your specific scenario if you are persistent enough.
I have a large inherited C/C++ project. Are there any good tools or techniques to produce a report on the "sizeof" of all the datatypes, and a breakdown of the stack footprints of each function in such a project?
I'm curious to know why you want to do this, but that's merely a curiosity.
Determining the sizeof for every class used should be simple, unless they've been templated, in which case you'd have to check every instantiation, also.
Likewise, determining the per call sizeof on a function is simple: it's a sizeof on each passed parameter plus some function overhead.
Determining the full memory usage of the whole program, if it's not all statically defined, can't be done without a runtime profiler.
Writing a shell script that collects all the class names into a file would be pretty simple. That file could be constructed as a .cpp file consisting of a series of calls to sizeof, one for each class. If the file also #included each header file, it could be compiled and run to output the memory footprint of just the classes.
Likewise, culling all of the function definitions to see when they're not using reference or pointer arguments (i.e. copying the entire class instance onto the stack) should be pretty straightforward.
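The generated file could look something like the sketch below. Point and Record are hypothetical placeholder types standing in for the project's own classes (a real generated file would #include the project headers instead), and SIZE_ENTRY, collect_sizes and print_report are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Placeholder types; in a generated file these would come from project headers.
struct Point { int x; int y; };
struct Record { Point origin; double weight; char tag; };

struct SizeEntry {
    std::string name;
    std::size_t bytes;
};

// Stringizes the type name so the script only has to emit one line per class.
#define SIZE_ENTRY(T) SizeEntry{#T, sizeof(T)}

// One SIZE_ENTRY line per class name collected by the script.
std::vector<SizeEntry> collect_sizes() {
    return {
        SIZE_ENTRY(Point),
        SIZE_ENTRY(Record),
    };
}

// Writes one "name: N bytes" line per type to the given stream.
void print_report(std::ostream& out) {
    for (const auto& entry : collect_sizes()) {
        out << entry.name << ": " << entry.bytes << " bytes\n";
    }
}
```

Calling print_report(std::cout) from main prints the report. Template instantiations would need one SIZE_ENTRY line per concrete instantiation.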
All this goes to say that I know of no existing tool, but writing one shouldn't be difficult.
I'm not aware of any tools, but if you're working under MSVC you can use the DIA SDK to extract size information from .PDB files. Sadly, this won't work for stack footprints, IIRC.
I'm not sure if the concept of the stack footprint actually exists with modern compilers. That is to say, I think that determining the amount of stack space used depends on the branches taken, which in turn depends on input parameters, and in general requires solving the halting problem.
I am looking for the same information about stack footprint for functions, and I don't believe what warren said is true. Yes, part of what impacts the stack in a function is the parameters, but I've also found that every local variable in a function, regardless of the scoping of said variable, is used to determine the amount of stack space to reserve for the function.
In the particular poor code example I am working with, there are >200 local class instances, each guarded by if (blah-blah) clauses, but the stack space reserved is modified by these guarded local variables.
I know that what I need is to be able to read the function prologue for each method to determine the amount of space being reserved for the function. Now, how would I do that?