How do I trick Linux into thinking a memory read/write was successful? I am writing a C++ library in which all reads/writes are redirected and handled transparently to the end user. Any time a variable is written to or read from, the library needs to catch that access and ship it off to a hardware simulation, which handles the data from there.
Note that my library is platform dependent on:
Linux ubuntu 3.16.0-39-generic #53~14.04.1-Ubuntu SMP x86_64 GNU/Linux
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
Current Approach: catch SIGSEGV and increment REG_RIP
My current approach involves getting a memory region using mmap() and shutting off access using mprotect(). I have a SIGSEGV handler that reads the siginfo_t containing the faulting address, exports the read/write elsewhere, and then increments the context's REG_RIP.
void handle_sigsegv(int code, siginfo_t *info, void *ctx)
{
    void *addr = info->si_addr;
    ucontext_t *u = (ucontext_t *)ctx;
    int err = u->uc_mcontext.gregs[REG_ERR];
    bool is_write = (err & 0x2);
    // send data read/write to simulation...
    // then continue execution of program by incrementing RIP
    u->uc_mcontext.gregs[REG_RIP] += 6;
}
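Roughly, the surrounding setup looks like this (a minimal sketch; the region size and function/variable names are illustrative, not the library's actual code):

#include <signal.h>
#include <sys/mman.h>

static void *region = nullptr;
static const size_t region_size = 4096;   // illustrative: one page

void install_handler_and_region()
{
    // Reserve a region, then revoke all access so every load/store faults.
    region = mmap(nullptr, region_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mprotect(region, region_size, PROT_NONE);

    struct sigaction sa = {};
    sa.sa_sigaction = handle_sigsegv;     // the handler shown above
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, nullptr);
}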
This works for very simple cases, such as:
int *num_ptr = (int *)nullptr;
*num_ptr = 10; // write segfault
But for anything even slightly more complex, the program dies with an illegal-instruction fault (SIGILL) instead:
30729 Illegal instruction (core dumped) ./$target
Using mprotect() within SIGSEGV handler
If I do not increment REG_RIP, handle_sigsegv() is called over and over again by the kernel until the memory region becomes available for reading or writing. I could run mprotect() for that specific address, but that has multiple caveats:
Subsequent memory accesses will not trigger a SIGSEGV because the memory region now has PROT_WRITE access. I have tried creating a thread that continuously marks the region as PROT_NONE, but that does not get around the next point:
After mprotect(), the read or write will still actually hit memory, defeating the purpose of my library.
Writing a device driver
I have also attempted to write a device driver module so that the library can mmap() the char device and the driver can handle the reads and writes from there. This makes sense in theory, but I have not been able to (or do not have the knowledge to) catch every load/store the processor issues against the mapping. I have attempted to overwrite the mapped vm_operations_struct and/or the inode's address_space_operations struct, but those hooks are only called when a page is faulted in or flushed to backing store.
Perhaps I could use mmap() and mprotect(), as explained above, on a device that writes data nowhere (similar to /dev/null), and then have a process that recognizes the reads/writes and routes the data from there (?).
Utilize syscall() and provide a restorer assembly function
The following was pulled from the segvcatch project1 that converts segfaults into exceptions.
#define RESTORE(name, syscall) RESTORE2(name, syscall)
#define RESTORE2(name, syscall)\
asm(\
".text\n"\
".byte 0\n"\
".align 16\n"\
"__" #name ":\n"\
" movq $" #syscall ", %rax\n"\
" syscall\n"\
);
RESTORE(restore_rt, __NR_rt_sigreturn)
void restore_rt(void) asm("__restore_rt") __attribute__((visibility("hidden")));
extern "C" {
struct kernel_sigaction {
void (*k_sa_sigaction)(int, siginfo_t *, void *);
unsigned long k_sa_flags;
void (*k_sa_restorer)(void);
sigset_t k_sa_mask;
};
}
// then within main ...
struct kernel_sigaction act;
act.k_sa_sigaction = handle_sigsegv;
sigemptyset(&act.k_sa_mask);
act.k_sa_flags = SA_SIGINFO|0x4000000;
act.k_sa_restorer = restore_rt;
syscall(SYS_rt_sigaction, SIGSEGV, &act, NULL, _NSIG / 8);
But this ends up behaving no differently from a regular sigaction() configuration. If I do not set the restorer function, the signal handler is not called more than once, even when the memory region is still not available. Perhaps there is some other trickery I could do with the kernel signal setup here.
Again, the entire objective of the library is to transparently handle reads and writes to memory. Perhaps there is a much better way of doing things, maybe with ptrace() or even updating the kernel code that generates the segfault signal, but the important part is that the end-user's code does not require changes. I have seen examples using setjmp() and longjmp() to continue after a segfault, but that would require adding those calls to every memory access. The same goes for converting a segfault to a try/catch.
1 segvcatch project
You can use mprotect and avoid the first problem you note by also having the SIGSEGV handler set the T flag in the flags register. Then, you add a SIGTRAP handler that restores the mprotected memory and clears the T flag.
The T flag causes the processor to single step, so when the SEGV handler returns it will execute that single instruction, and then immediately TRAP.
This still leaves you with your second problem -- the read/write instruction will actually occur. You may be able to get around that problem by carefully modifying the memory before and/or after the instruction in the two signal handlers...
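A minimal sketch of the idea on x86-64 Linux (assuming both handlers are installed with sigaction() and SA_SIGINFO, and that region/region_size describe the protected area; untested, illustrative code):

#include <signal.h>
#include <sys/mman.h>
#include <ucontext.h>

#define X86_TRAP_FLAG 0x100          // TF bit in RFLAGS

static void *region;                 // the mprotect()ed area
static size_t region_size;

void handle_sigsegv(int, siginfo_t *info, void *ctx)
{
    ucontext_t *u = (ucontext_t *)ctx;
    // ... redirect the access described by info->si_addr to the simulation ...
    // Temporarily allow the access and single-step over the faulting instruction.
    mprotect(region, region_size, PROT_READ | PROT_WRITE);
    u->uc_mcontext.gregs[REG_EFL] |= X86_TRAP_FLAG;
}

void handle_sigtrap(int, siginfo_t *, void *ctx)
{
    ucontext_t *u = (ucontext_t *)ctx;
    // The instruction has now executed; re-arm the trap and stop single-stepping.
    mprotect(region, region_size, PROT_NONE);
    u->uc_mcontext.gregs[REG_EFL] &= ~X86_TRAP_FLAG;
}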
Related
I am trying to implement a functional ISA simulator; the targets are RISC-V and MIPS.
It is a step-by-step instruction interpreter.
The abstract step:
while (num_steps)
{
    try
    {
        take_interrupt(); // take pending interrupts
        fetch();          // fetch instruction from memory
        decode();         // find handler for the instruction
        execute();        // perform the instruction
    }
    catch (Trap& e)
    {
        take_trap(e);     // configure appropriate system registers and jump to trap vector
    }
}
As you can see, C++ exceptions are used to transfer control flow.
Maybe there is a more elegant design?
Question: What is the best way/practice to implement traps in functional ISA simulators? I am also interested in exception/trap implementation in translating simulators, like QEMU.
Note: by trap I mean ISA-defined traps, not application errors: misaligned memory access, illegal instruction, system register access fault, privilege level change, etc.
QEMU uses the C setjmp()/longjmp() mechanism for dealing with most exceptions: when we detect something like a page fault we set some flags to indicate the type of exception, and then longjmp() out to the top-level "execute code" loop. That loop then looks at the flags and sets the CPU state up for "enter exception handler" before continuing to execute guest code.
So we use the C equivalent of throwing an exception; as NonNumeric says there is no requirement to implement guest exceptions like this (the coincidence of names is just coincidence). But since a memory access triggering a page fault is the non-common case, it's more efficient to longjmp or throw a C++ exception rather than include "handle failure return" in all the memory access codepaths. Guest memory access is a particular hotspot and QEMU implements its memory access fastpath with a bit of custom inline assembly, so we care about the extra handful of instructions that would be required to exit to the top level loop on a page fault without doing the longjmp. A simulator which uses a simple "fetch/decode/execute" loop without doing JIT of guest code doesn't care so much about performance, so your choice may come down to preferences for code style and maintainability.
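A condensed sketch of that control flow (this is not QEMU's actual code; the helper names are made up for illustration):

#include <setjmp.h>

static jmp_buf exec_loop_jmp;
static int pending_exception;            // illustrative: set before longjmp()ing out

void enter_guest_exception_handler(int excp);   // assumed helper
void run_guest_code(void);                       // assumed helper; may call raise_exception()

void raise_exception(int excp)           // called from deep inside a memory access
{
    pending_exception = excp;
    longjmp(exec_loop_jmp, 1);           // unwind straight back to the top-level loop
}

void cpu_exec_loop(void)
{
    for (;;) {
        if (setjmp(exec_loop_jmp)) {
            // We got here via raise_exception(): set the guest CPU state up
            // for "enter exception handler", then resume executing guest code.
            enter_guest_exception_handler(pending_exception);
        }
        run_guest_code();
    }
}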
QEMU is written in C so it doesn't use C++ exceptions. You don't have to handle ISA traps via C++ exceptions either. Exceptions should be used when useful to you as implementer, nothing more.
Also note that traps are not anything too special; they are still part of the emulated system's workflow. It is perfectly legal to encode division like:
if (reg[divisor] != 0)
    reg[target] = reg[dividend] / reg[divisor];
else
    trap(TRAP_DIV0);
Here the trap() function directly updates the architecture state so that the next instruction to emulate comes from the exception handler:
void trap(int trap_id)
{
    // save relevant registers according to platform spec
    ...
    // set instruction pointer to trap handler start
    reg[IP_INDEX] = trap_table[trap_id].ip;
    // update other registers according to spec
    ...
}
C++ exceptions can make your life easier. For example, memory accesses on many platforms need to convert virtual addresses to physical ones. This conversion may result in a trap (due to insufficient access rights or wrong configuration). It may be easier to write:
void some_isa_instruction_handler()
{
    int value1 = read_memory(address1);
    int value2 = read_memory(address2);
    int res = perform_something(value1, value2);
    write_memory(address3, res);
}
where read_memory() and write_memory() simply throw a C++ exception when an ISA trap is needed, rather than manually checking whether each operation generated a trap. The take_trap() function would then roll back whatever changes were performed by the interrupted instruction handler (if needed) and set execution up to emulate the trap handler, just as trap() did above.
Emulating a CISC system may benefit more from such a style.
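For example, a read_memory() in that style might look like this (a sketch; Trap, translate(), and the trap cause number are illustrative placeholders, not a real ISA's definitions):

#include <cstdint>

struct Trap { int cause; uint64_t tval; };        // caught by the fetch/decode/execute loop

enum { TRAP_LOAD_PAGE_FAULT = 13 };               // illustrative cause number

extern uint8_t physical_memory[];                 // simulated RAM
bool translate(uint64_t vaddr, uint64_t *paddr);  // hypothetical MMU walk

uint8_t read_memory(uint64_t vaddr)
{
    uint64_t paddr;
    if (!translate(vaddr, &paddr))
        throw Trap{TRAP_LOAD_PAGE_FAULT, vaddr};  // take_trap() handles it in the loop
    return physical_memory[paddr];
}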
I have to use C++ classes which are not properly written: there is no information about whether a function in the loop executed properly or not.
If it did not, I receive a segmentation fault and I lose everything that was calculated. I would like to convert the SIGSEGV signal into breaking out of the loop. Is there any possibility?
Using signal handlers from #include <csignal> doesn't help.
A segmentation fault may happen in two ways:
uncontrolled segfaults, where a process accesses addresses for which the access is not well defined.
@JSB: this is the case you're dealing with, and there's little you can do about it other than getting the offending code fixed.
When an uncontrolled segfault happens, which is the case for buggy code in 99.999% of all cases, the only reasonable thing to do is cut your losses (you may write to already opened files from a SIGSEGV handler) and terminate the process.
@JSB: the following does not apply to you! This is just for completeness!
"controlled" segfaults, where a process accesses addresses which are allocated by the process, but for which read/write/execute access has been disabled.
A controlled segfault may be induced in the following way:
size_t const sz_p = sysconf(_SC_PAGESIZE);   // one page
char *p = (char *)mmap(NULL, sz_p, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
strcpy(p, "sigsegv");                        // faults: the page is mapped, but all access is disabled
So why is this a controlled segfault? Because you can actually react to it in a sensible way. In the SIGSEGV handler you can set the memory protection of the pages whose access caused the segfault to allow the access:
void sigsegv_handler(int, siginfo_t *info, void *)
{
    // Only react to faults inside the region we deliberately protected.
    if ((char *)info->si_addr - p >= 0
        && (char *)info->si_addr - p < (ptrdiff_t)sz_p) {
        mprotect(p, sz_p, PROT_READ | PROT_WRITE);
    }
}
It is important to understand that this kind of SIGSEGV handler is well behaved and well defined only if the segfault was caused by access to an actually allocated memory object, and only if the handler does nothing more than set memory protection flags on memory objects owned by the process. You can't use it to make broken code magically work!
So why would one actually do this? One example would be client-side implementations of APIs that allow network distribution and also allow mapping objects into memory, like OpenGL with its API functions glMapBuffer / glUnmapBuffer. To avoid unnecessary round trips and transfers you'd want to transfer only those parts of the buffer that were actually read from and/or modified. For this you have to somehow detect which pages a program touches. Some OSs (like Windows) have a dedicated API for this, but on *nix-es you have to work with mmap + mprotect + SIGSEGV handler tricks to implement it, as sketched below.
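A sketch of that page-touch tracking (simplified; a real implementation would also distinguish reads from writes and handle page size and thread safety properly):

#include <signal.h>
#include <sys/mman.h>

static char  *buf;                    // buffer handed out by the (hypothetical) glMapBuffer
static size_t buf_size;
static size_t page_size;
static bool   touched[1 << 16];       // per-page "dirty" flags, fixed size for the sketch

void track_sigsegv(int, siginfo_t *info, void *)
{
    char *addr = (char *)info->si_addr;
    if (addr >= buf && addr < buf + buf_size) {
        size_t page = (size_t)(addr - buf) / page_size;
        touched[page] = true;                           // remember which page was hit...
        mprotect(buf + page * page_size, page_size,     // ...and let the access proceed
                 PROT_READ | PROT_WRITE);
        return;
    }
    // Not our buffer: an ordinary bug; fall through to default handling instead.
}
// At glUnmapBuffer time, only the pages marked in touched[] need to be transferred.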
I just informed myself about "signals" in C/C++ and played around a bit. But I have a problem understanding the logic of SIGFPE.
I wrote a little program which will run into a division by zero. If this happens, the signal should be triggered and the signal handler executed. But instead my program just crashes. So what is the purpose of SIGFPE if it does not even work on division by zero?
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <iostream>

int signal_status = 0;

void my_handler (int param)
{
    signal_status = 1;
    printf ("DIVISION BY ZERO!");
}

int main ()
{
    signal (SIGFPE, my_handler);
    int result = 0;
    while(1)
    {
        system("cls");
        printf ("signaled is %d.\n", signal_status);
        for(int i=10000; i>-1; i--)
        {
            result = 5000 / i;
        }
    }
    getchar();
    return 0;
}
As I commented, most signals are OS specific. For Linux, read signal(7) carefully. You forgot a \n inside your printf (usually you'll be lucky enough to see something work in your code, but read all of my answer). And in principle you should not call printf from a signal handler (it is not an async-signal-safe function; inside a handler you should only use write(2) directly).
What is probably happening (ignoring the undefined behavior caused by wrongly using printf inside the signal handler) is that:
your stdout buffer is never flushed, since you forgot a \n in the printf inside my_handler (you might add a fflush(NULL); ...);
probably, the SIGFPE handler restarts the machine code instruction that triggered it. (More exactly, after returning from sigreturn(2) your machine is in the same state as it was before SIGFPE was delivered, so the same divide-by-zero condition happens again, etc...)
It is difficult (but painfully possible, if you accept writing hardware-specific and operating-system-specific code) to handle SIGFPE: you would use sigaction(2) with SA_SIGINFO and examine the third argument to the signal handler (a ucontext_t pointer indirectly giving the machine state, including processor registers, which you might change inside your handler; in particular you could change the saved program counter there). You might also consider using sigsetjmp(3) inside your signal handler (but it is in theory forbidden, since it is not async-signal-safe).
(You certainly need to understand the details of your processor's instruction set architecture and your operating system's ABI, and you would probably need a week of coding work after having mastered them.)
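As an illustration only, a minimal sketch of the sigsetjmp(3) escape route (subject to the caveat above that it is not strictly async-signal-safe; the volatile divisor is there to stop the compiler from folding the division away):

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static sigjmp_buf fpe_jmp;

void fpe_handler(int, siginfo_t *, void *)
{
    siglongjmp(fpe_jmp, 1);               // never return into the faulting division
}

int main()
{
    struct sigaction sa = {};
    sa.sa_sigaction = fpe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, nullptr);

    int a = 5000;
    volatile int b = 0;
    if (sigsetjmp(fpe_jmp, 1) == 0)
        printf("5000 / 0 = %d\n", a / b); // raises SIGFPE on x86
    else
        printf("division by zero was caught\n");
    return 0;
}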
In a portable POSIX way, SIGFPE cannot really be handled, as explained in Blue Moon's answer.
Probably, the runtime of the JVM or of SBCL handles SIGFPE in a machine- and operating-system-specific way to report zero divides as divide-by-zero exceptions (to Java programs for the JVM, to Common Lisp programs for SBCL). Alternatively, their JIT or compiler machinery could generate a test before every division.
BTW, a flag set inside a signal handler should be declared volatile sig_atomic_t. See the POSIX specification of <signal.h>.
As a pragmatic rule of thumb, a POSIX-portable and robust signal handler should only set some volatile sig_atomic_t flag and/or perhaps write(2) a few bytes to some pipe(7) (your process could set up a pipe to itself, as recommended by Qt, with another thread and/or some event loop reading it), as sketched below; but this does not work for synchronously generated signals like SIGFPE, SIGBUS, SIGILL, SIGSEGV, etc., which can only be handled with painful machine-specific code.
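For completeness, a sketch of that flag-plus-self-pipe pattern for ordinary asynchronous signals (SIGTERM, SIGINT, ...); the pipe file descriptors are assumed to have been created with pipe(2) at startup:

#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;
static int self_pipe[2];                  // filled in by pipe(self_pipe) during startup

void async_signal_handler(int signum)
{
    got_signal = signum;                  // async-signal-safe: plain flag store
    char byte = 1;
    (void)write(self_pipe[1], &byte, 1);  // write(2) is async-signal-safe; wakes the event loop
}
// The event loop (or a dedicated thread) poll()s self_pipe[0] and then reacts to got_signal.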
See also this answer to a very related question.
Finally, on Linux, signal handling is believed to be not very quick. Even with a lot of machine-specific coding, emulating GNU Hurd external pagers through tricky SIGSEGV handling (which would mmap lazily...) is believed to be quite slow.
Divide by zero is undefined behaviour. So whether you have installed a handler for SIGFPE or not is of little significance when your program invokes undefined behaviour.
POSIX says:
Delivery of the signal shall have no effect on the process. The behavior of a process is undefined after it ignores a SIGFPE, SIGILL, SIGSEGV, or SIGBUS signal that was not generated by kill(), sigqueue(), or raise().
A signal is raised as a result of an event (e.g. sending SIGINT by pressing CTRL+C), which can be handled by the process if said event is non-fatal. SIGFPE signals an erroneous condition in the program and you can't handle that. A similar case would be attempting to handle SIGSEGV, which is equivalent to this (undefined behaviour): when your process attempts to access some memory to which it doesn't have access, it would be silly if you could just ignore it and carry on as if nothing happened.
I am using mprotect to mark some memory pages as write protected. When any write is attempted in that memory region, the program gets a SIGSEGV signal. From the signal handler I know at which memory address the write was attempted, but I don't know how to find out which instruction caused the write-protection violation. So inside the signal handler I am thinking of reading the program counter (PC) register to get the faulting instruction. Is there an easy way to do this?
If you install your signal handler using sigaction with the SA_SIGINFO flag, the third argument to the signal handler has type void * but points to a structure of type ucontext_t, which in turn contains a structure of type mcontext_t. The contents of mcontext_t are implementation-defined and generally cpu-architecture-specific, but this is where you will find the saved program counter.
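For example, on x86-64 Linux with glibc the saved program counter lives in gregs[REG_RIP] (a sketch; the register index differs on other architectures):

#include <signal.h>
#include <ucontext.h>

void segv_handler(int, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = (ucontext_t *)ctx;
    void *fault_addr = info->si_addr;                          // address that was accessed
    void *fault_pc = (void *)uc->uc_mcontext.gregs[REG_RIP];   // instruction that faulted
    // ... record fault_pc / fault_addr using async-signal-safe calls only ...
    (void)fault_addr;
    (void)fault_pc;
}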
It's also possible that the compiler's builtins (__builtin_return_address with a nonzero argument, I think) along with unwinding tables may be able to trace across the signal handler. While this is in some ways more general (it's not visibly cpu-arch-specific), I think it's also more fragile, and whether it actually works may be cpu-arch- and ABI-specific.
I need to print a stack trace from a signal handler of a 64-bit multi-threaded C++ application running on Linux. Although I found several code examples, none of them compiles. My blocking point is getting the address at which the signal was generated from the ucontext_t structure. All of the information I could find points to the EIP register, as either ucontext.gregs[REG_EIP] or ucontext.eip. It looks like both of these are x86-specific. I need 64-bit compliant code for both Intel and AMD CPUs. Can anybody help?
There is a glibc function backtrace. The man page gives an example of its use:
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define SIZE 100

void myfunc3(void)
{
    int j, nptrs;
    void *buffer[SIZE];
    char **strings;

    nptrs = backtrace(buffer, SIZE);
    printf("backtrace() returned %d addresses\n", nptrs);

    /* The call backtrace_symbols_fd(buffer, nptrs, STDOUT_FILENO)
       would produce similar output to the following: */

    strings = backtrace_symbols(buffer, nptrs);
    if (strings == NULL) {
        perror("backtrace_symbols");
        exit(EXIT_FAILURE);
    }

    for (j = 0; j < nptrs; j++)
        printf("%s\n", strings[j]);

    free(strings);
}
See the man page for more context.
It's difficult to tell whether this is really guaranteed to work from a signal handler, since POSIX lists only a few reentrant (async-signal-safe) functions that are guaranteed to work there. Remember: a signal handler may be called while the rest of your process is right in the middle of a malloc call.
My guess is that this usually works, but it may fail from time to time. For debugging that may be good enough.
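If you do want to call it from a SIGSEGV handler anyway, a hedged sketch that at least avoids backtrace_symbols() (which calls malloc) in favour of backtrace_symbols_fd():

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

void crash_handler(int sig)
{
    (void)sig;
    void *buffer[100];
    int nptrs = backtrace(buffer, 100);
    backtrace_symbols_fd(buffer, nptrs, STDERR_FILENO);  // writes straight to fd 2, no malloc
    _exit(1);                                            // async-signal-safe exit
}
// Installed e.g. with: signal(SIGSEGV, crash_handler);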
The usual way of getting a stack trace is to take the address of a local variable, then add some magic number to it, depending on how the compiler generates code (which may depend on the optimization options used to compile the code), and work back from there. All very system dependent, but doable if you know what you're doing.
Whether this works in a signal handler is another question. I don't know about the platform you describe, but a lot of systems install a separate stack for the signal handlers, with no link back to the interrupted stack in user-accessible memory.