Kernel module to intercept system calls causes issues in execution of userspace programs


I've been trying to write a kernel module (using SystemTap) that intercepts system calls, captures their information, and adds it to a kmalloc'd system call buffer. I have implemented an mmap file operation so that a user space process can access this kmalloc'd region and read from it.
Note: For now, the module only intercepts the memfd_create system call. To test this out I have compiled a test application that calls memfd_create twice.
SystemTap script/kernel module code
In addition to the kernel module, I also wrote a user space application that periodically reads the system calls from this buffer, determines whether each system call is legitimate or malicious, and then adds a response to a response buffer region (also part of the kmalloc'd region and accessible via mmap) indicating whether to let the system call proceed or to terminate the calling process.
User space application code
The kernel module also has a timer that fires every few milliseconds to check the response buffer for responses added by the user space application. Depending on the response, the kernel module either terminates the calling process or lets it proceed.
The issue I am facing is that after intercepting and correctly processing a few system calls (I keep executing the test application), normal commands in user space start to fail. For example, a simple command like ls:
[virsec#redhat7 stap-test]$ ls
Segmentation fault
[virsec#redhat7 stap-test]$ strace ls
execve("/usr/bin/ls", ["ls"], [/* 30 vars */]) = -1 EFAULT (Bad address)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
+++ killed by SIGSEGV +++
Segmentation fault
This happens with every terminal command that I run. The dmesg output shows nothing but the printk debug outputs of the kernel module. There is no kernel panic, the kernel module and the userspace application are still running and waiting for the next system call to intercept. What do you think could be the issue? Let me know if you need any more information. I was not able to post any code snippets because I think the code would make much more sense as a whole.

Related

linux driver check process is alive

I have a simple Linux module that handles interrupts.
I send a signal to my process's PID every time an interrupt occurs.
But how can I check whether the PID is still alive?
I tried using find_task_by_vpid in the interrupt handler function.
But after that the kernel sometimes crashes.
NIP [c003ba9c] find_task_by_vpid+0x2c/0x4cfind_task_by_vpi d[ 782.391934] Unable to handle kernel
So now I call find_task_by_vpid only once.
And it works OK.
But when I kill my process with "kill -9", the kernel crashes.
Please help.

PIDs are not a reliable way to identify a process. PIDs may be reused, so if your destination process terminates (for whatever reason), its PID may be assigned to another process without your kernel module being notified. So:
Don't use PIDs for communicating with processes!¹
Use file operations for communication between kernel and user space, either in the form of a character or block device in /dev, or as an entry in procfs (/proc) or sysfs (/sys).
Have the user space process open the file; your kernel module offers a set of file operation handler functions (fops). When the process terminates, all of its file descriptors are closed, and the close handler of your kernel module (the release fop) gets called.
1: As a matter of fact, PIDs are unreliable everywhere and their use for anything should be avoided. The only situation in which a PID is somewhat reliable is inside the parent process that forked off that PID, since the process cannot completely vanish until the parent has dealt with its demise via one of the wait calls. But no one else on the system has that information.
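The "descriptors are closed on termination" behaviour can also be observed from user space. The following is a minimal sketch that uses a pipe as a stand-in for the module's device file (an assumption for illustration; it is not the kernel-side fops code, and peer_death_seen_as_eof is a hypothetical name). When the child is killed with SIGKILL, the kernel closes its descriptors and the other side sees end-of-file, without any PID lookup.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if the parent observes EOF on the pipe after the child,
// which holds the only remaining write end, is killed with SIGKILL.
bool peer_death_seen_as_eof() {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        close(fds[0]);          // child keeps only the write end
        for (;;) pause();       // idle until killed
    }

    close(fds[1]);              // parent keeps only the read end
    kill(child, SIGKILL);       // same effect as "kill -9"

    char byte;
    ssize_t n = read(fds[0], &byte, 1);  // blocks until data or EOF
    close(fds[0]);

    int status = 0;
    waitpid(child, &status, 0);
    return n == 0 && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL;
}
```

In a kernel module the equivalent notification is the release handler being invoked, which is where per-process state would be torn down.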

debug port problem while running Lauterbach CMM script

Currently I'm developing Lauterbach CMM scripts to automate test cases for the SPC58NG84.
As part of each test case:
- I need to reset the target system before and after the test case.
- I need to read and write variable values from the C code.
When I run the test scripts I get the error 'debug port problem', and in the Watch window all variable values show BUS ERROR.
Can you please let me know how to debug this issue?
What are the reasons for 'debug port problem'?
Error message in the AREA window:
CO:2 error: CPU suddenly left debug mode (OSR=0x3C1)
CO:0 JTAGID=0x11110041
Warning: CO:1 Core currently in reset. Stopping core on activation.
CMM Script:
Test Pre condition: Reset target
Break.Delete
WAIT 100.ms
SYStem.Mode Down
SYStem.DETECT.CPU
SYStem.Mode Up
B:: Go
WAIT 500.ms
Test case Execution:
--Read and write Variables in software-----
Test Post condition: Reset target
Break
Break.Delete
WAIT 100.ms
SYStem.Mode Down
SYStem.Mode Up
B:: Go
WAIT 1000.ms

The error 'debug port problem' after the Break command usually means that the target application crashed so badly that the core no longer responds to the debugger's halt command.
In order to debug the problem, make sure that your boot loader sets up the interrupt vector start address (IVPR) as early as possible, and also put branch-to-self instructions at all interrupt handler addresses, unless interrupt handler code already exists.
Once this is done, set program breakpoints on the interrupt handlers typically involved in crashes: machine check, data storage, instruction storage, program interrupt. Doing so should catch the core when the crash occurs, and SRR0 (or CSRR0 or MCSRR0, depending on the interrupt type) will show you at which address the problem occurred.

Remote GDB checkpoint/fork failure

I am trying to debug on a remote target that does not support run or restart without a checkpoint. The only user available is root, so there shouldn't be any permission issues. I tried:
Breakpoint 1, main (argc=4, argv=0x7fffffffe348) at foo.cpp:40
(gdb) checkpoint
checkpoint -1: fork returned pid 6145.
Failed to find new fork
(gdb) i checkpoints
No checkpoints.
Does anyone know how to get run to work? Or how I can check to see what is actually causing the fork to fail and prevent the checkpoint?

After some experimentation, I found that adding the following to your .gdbinit file helps:
target extended-remote <host>:<port>
This should allow you to use the run command, eliminating the need to use restart.

Once you fork, how could you restore a checkpoint? A checkpoint rewinds a process to its saved state at a moment in time. Once the fork occurs, I imagine the checkpoint would only exist for the original process.
From the manual there is this entry:
Finally, there is one bit of internal program state that will be
different when you return to a checkpoint — the program's process id.
Each checkpoint will have a unique process id (or pid), and each will
be different from the program's original pid. If your program has
saved a local copy of its process id, this could potentially pose a
problem.
Regarding the required use of checkpoints to perform restarts in remote sessions: I have never used checkpoints before, but I have restarted many a remote session.

How to log the segmentation faults and run time errors which can crash the program, through a remote logging library?

What is the technique to log the segmentation faults and run time errors which crash the program, through a remote logging library?
The language is C++.

Here is a solution for printing a backtrace when you get a segfault, as an example of what you can do when such an error happens.
That leaves you with the problem of logging the error through the remote library. I would suggest keeping the signal handler as simple as possible and logging to a local file, because you cannot assume that a previously initialized logging library still works correctly once a segmentation fault has occurred.
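As a rough illustration of "keep the handler simple and log locally", here is a minimal POSIX sketch. The function names and the log path are hypothetical, and the handler restricts itself to async-signal-safe calls (write, signal, raise). The file is opened up front so the handler never has to open anything.

```cpp
#include <cassert>
#include <csignal>
#include <cstring>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_crash_fd = -1;  // opened before any crash happens

static void on_segv(int sig) {
    static const char msg[] = "fatal: caught SIGSEGV\n";
    if (g_crash_fd >= 0) {
        // write() is async-signal-safe; printf, malloc and most loggers are not.
        ssize_t ignored = write(g_crash_fd, msg, sizeof(msg) - 1);
        (void)ignored;
    }
    // Restore the default action and re-raise, so the process still dies
    // from SIGSEGV and can still produce a core dump.
    signal(sig, SIG_DFL);
    raise(sig);
}

// Opens the local crash log and installs the handler.
bool install_crash_logger(const char* path) {
    g_crash_fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (g_crash_fd < 0) return false;
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}

// Demo: crash a forked child, then check that the message reached the file.
bool crash_is_logged(const char* path) {
    unlink(path);
    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        if (!install_crash_logger(path)) _exit(2);
        raise(SIGSEGV);  // stand-in for a real invalid memory access
        _exit(3);        // not reached
    }
    int status = 0;
    waitpid(child, &status, 0);
    char buf[64] = {0};
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV &&
           n > 0 && std::strstr(buf, "SIGSEGV") != nullptr;
}
```

A separate process can then pick up the local file and forward it to the remote logging service once the crashed program is gone.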

What is the technique to log the segmentation faults and run time errors which crash the program, through a remote logging library?
From my experience, trying to log debugging messages (remotely or to a file) while the program is crashing might not be very reliable, especially if the app takes the system down along with it:
With a TCP connection you might lose the last several messages while the system is crashing. (TCP maintains packet order and uses error detection with retransmission, AFAIK, but if the app just quits, some data can be lost before being transmitted.)
With a UDP connection you might lose messages, or receive them out of order, because of the nature of UDP.
If you're writing to a file, the OS might discard the most recent changes (buffers not flushed, a journaled filesystem reverting to an earlier state of the file).
Flushing buffers after every write, or sending every message via TCP/UDP, might incur performance penalties for a program that produces thousands of messages per second.
So as far as I know, a good idea is to maintain an in-memory plaintext log and write a core dump once the program has crashed. This way you'll be able to find the contents of the log within the core dump. Writing to an in-memory log will also be significantly faster than writing to a file or sending messages over the network. Alternatively, you could use some kind of "dual logging": write every debug message immediately into the in-memory log, and then send the messages asynchronously (in another thread) to a log file or over the network.
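One possible shape for such an in-memory log is a small fixed-size ring of text lines. This is only a sketch: the class name MemoryLog and its sizes are made up for illustration, and the asynchronous forwarding thread of the "dual logging" variant is left out. Lines are stored as plain characters inside the object, so they remain readable in a core dump.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <string>
#include <vector>

// Fixed-capacity in-memory log: the newest lines overwrite the oldest.
class MemoryLog {
public:
    static constexpr std::size_t kCapacity = 4;    // lines kept (small for the demo)
    static constexpr std::size_t kLineSize = 128;  // bytes per line, including '\0'

    // Stores a line, truncating it to kLineSize - 1 characters.
    void append(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);
        char* slot = lines_[next_ % kCapacity];
        std::strncpy(slot, line.c_str(), kLineSize - 1);
        slot[kLineSize - 1] = '\0';
        ++next_;
    }

    // Returns the retained lines, oldest first.
    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t count = next_;
        if (count > kCapacity) count = kCapacity;
        std::vector<std::string> out;
        for (std::size_t i = next_ - count; i < next_; ++i) {
            out.emplace_back(lines_[i % kCapacity]);
        }
        return out;
    }

private:
    mutable std::mutex mutex_;
    char lines_[kCapacity][kLineSize] = {};
    std::size_t next_ = 0;  // total number of lines ever appended
};
```

A background thread could periodically call snapshot() and ship the lines to a file or socket, while the ring itself stays available for post-mortem inspection.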
Handling of exceptions:
Platform-specific. On the Windows platform you can use _set_se_translator to either generate a backtrace or translate platform (structured) exceptions into C++ exceptions.
On Linux I think you should be able to install a handler for the SIGSEGV signal.
While catching a segfault sounds like a decent idea, instead of trying to handle it from within the program it makes sense to generate a core dump and bail. On Windows you can use MiniDumpWriteDump from within the program, and on Linux the system can be configured from the shell to produce core dumps (ulimit -c, I think?).
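On Linux the core-size limit can also be raised from inside the program, which is the programmatic counterpart of the shell's ulimit -c. A minimal sketch, assuming POSIX getrlimit/setrlimit (the helper names are hypothetical):

```cpp
#include <cassert>
#include <sys/resource.h>

// Raises the soft core-file size limit to the hard limit for this process.
// If the hard limit itself is 0, core dumps remain disabled.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Reports whether the soft limit currently equals the hard limit.
bool core_limit_is_maxed() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    return lim.rlim_cur == lim.rlim_max;
}
```

Where the core file ends up, and whether one is written at all, still depends on system configuration outside the process.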

I'd like to offer some solutions:
Use core dumps, and start a daemon that monitors for and collects core dumps and sends them to your host.
Use GDB (with gdbserver): you can debug remotely and see the backtrace if the program crashes.

To catch the segfault signal and send a log accordingly, read this post:
Is there a point to trapping "segfault"?
If it turns out that you won't be able to send the log from a signal handler (maybe the crash occurred before the logger was initialized), then you may need to write the info to a file and have an external entity send it remotely.
EDIT: Putting back some original info to be able to send the core file remotely too
To be able to send the core file remotely, you'll need an external entity (a different process than the one that crashed) that will "wait" for core files and send them remotely as they appear (possibly using scp). Additionally, the crashing process could catch the segfault signal and notify the monitoring process that a crash has occurred and a core file will be available soon.

Force crash an application

I'm currently testing an application that my company wrote. One of the scenarios is to see what happens to the system state if that application were to crash. Is there an application out there that could force-crash my application? I'd rather not write a crash into the code itself (i.e. a null pointer dereference). Using the task manager to kill the process doesn't yield the same results.

On Windows you can attach WinDbg to a process, corrupt some register or memory and detach. For instance you can set instruction pointer to 0 for some active application thread.
windbg -pn notepad.exe
Right after attaching, the current thread is the debugger's break-in thread, so you need to switch to an application thread before updating the RIP register to make it crash:
0:008> ~0s
0:000> r rip=0
0:000> qd

Assuming Windows, see Application Verifier.
It can do fault injection (Low Resource Simulation) that makes various API calls fail, at configurable rates. E.g. heap allocations, VirtualAlloc, WaitForXxx, registry APIs, filesystem APIs, and more.
You can even specify a grace period (in milliseconds) when no faults will be injected during startup.

The best way is to call RaiseException API from windows.h
RaiseException(0x0000DEAD,0,0,0);
Or you can link at runtime to KeBugCheckEx() from ntoskrnl.exe and call it in your code.
Example:
#include <windows.h>
#include <iostream>
using namespace std;

int main()
{
    // Load the kernel image into this process and look up KeBugCheckEx.
    HMODULE h = LoadLibraryA("ntoskrnl.exe");
    cout << h << endl;
    void* a = (void*) GetProcAddress(h, "KeBugCheckEx");
    cout << a << endl;

    // KeBugCheckEx returns void and takes a bug check code plus four parameters.
    typedef void (*KeBugCheckExFn)(ULONG, ULONG_PTR, ULONG_PTR, ULONG_PTR, ULONG_PTR);
    KeBugCheckExFn KeBugCheckEx = (KeBugCheckExFn) a;
    KeBugCheckEx(0, 0, 0, 0, 0); // a crash in module ntoskrnl.exe means the call succeeded
}

You can use the winapiexec tool for that:
winapiexec64.exe CreateRemoteThread ( OpenProcess 0x1F0FFF 0 1234 ) 0 0 0xDEAD 0 0 0
Replace 1234 with the process id and run the command, the process will crash.

You haven't stated which OS you're running on but, if it's Linux (or another UNIX-like system), you can just kill -9 your process. This signal can't be caught and will result in the rug being pulled out from under your process pretty quickly.
If you're not on a UNIX-like system, I can't help you, sorry, but you may find some useful information here (look for "taskkill").

If the system runs on UNIX/Linux you can send it a signal: SIGQUIT should produce a core dump, and you can also send it SIGSEGV if you want to test it getting a "segmentation fault". Those are signals 3 and 11 respectively.
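From a shell this is kill -QUIT <pid> or kill -SEGV <pid>. The same thing can be done programmatically with kill(2). The sketch below uses a forked child as a stand-in for the application under test (an assumption for the demo; kill_child_with is a hypothetical helper) and reports which signal terminated it.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that idles, sends it `sig`, and returns the signal that
// terminated it, or -1 if it did not die from a signal.
int kill_child_with(int sig) {
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        for (;;) pause();  // stand-in for the running application
    }
    if (kill(child, sig) != 0) return -1;
    int status = 0;
    if (waitpid(child, &status, 0) != child) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Against a real application you would pass its PID to kill() directly and observe the result through whatever supervises that process.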
If the system is Windows I do not know a way to raise a signal in a different application but if you can modify the application to handle a specific Windows message number that will call raise() you can emulate that. raise() causes the signal to be raised without actually having to write code that performs an illegal action. You can then post a message to the application which will have the handler that raises this signal.

You could override the global new operator. Then you can use a counter and, at a specific value, perform a null pointer dereference to force your application to crash. By simply changing the value at which the dereference is performed, you can easily vary the time of the crash.
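A minimal sketch of that idea follows. The counter variables and the demo function are hypothetical names, and the crash is exercised in a forked child only so that the example can report the result. The null write goes through a volatile pointer so the compiler does not remove it; depending on the compiler, the process dies from SIGSEGV or from a trap instruction.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// 0 disables the fault; N > 0 crashes on the N-th allocation after arming.
static std::atomic<long> g_crash_at{0};
static std::atomic<long> g_alloc_count{0};
static int* volatile g_sink = nullptr;  // keeps demo allocations from being optimized out

void* operator new(std::size_t size) {
    long target = g_crash_at.load();
    if (target > 0 && ++g_alloc_count == target) {
        // Deliberate write through a null pointer to force the crash.
        *static_cast<volatile int*>(nullptr) = 0;
    }
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Demo: arm the counter in a forked child and report whether it was
// terminated by a signal.
bool child_crashes_on_allocation(long n) {
    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        g_alloc_count = 0;
        g_crash_at = n;
        for (long i = 0; i < n; ++i) {
            g_sink = new int(0);
            delete g_sink;
        }
        _exit(0);  // reached only if the fault did not trigger
    }
    int status = 0;
    waitpid(child, &status, 0);
    return WIFSIGNALED(status);
}
```

In a real test build the threshold could come from an environment variable or a debug setting, so the crash point can be varied without recompiling.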

Where is this "system state" defined? If this were unix, you could send a signal 9 to the process...
If you really needed to, you could share all the application memory with another process (or thread), and have that thread write random data to some unfortunate memory location at random times. I think NASA did this for some of their space projects, but I really couldn't give a reference.
The real question is why you want to do this - what are you /really/ testing?
If this is, for example, some program that controls some medical service that prescribes drugs... Unit test that service instead, analyse the API, and look for flaws.

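Make a buffer overflow yourself.
#include <string.h>

void doSomething(const char *Overflow)
{
    char Buffer[1];
    strcpy(Buffer, Overflow); // copies 7 bytes into a 1-byte buffer
}

int main()
{
    doSomething("Muhaha");
}
And your program will most likely crash (the overflow is undefined behavior, so a crash is typical but not guaranteed).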

An alternative would be to run the application in a good debugger, set a breakpoint on a particular line of code, and voilà, your application has "crashed". Now, this might not cause all your threads to stop running, depending on the debugger being used. Alternatively, you could run the application in the debugger, and simply "stop" the application after a time.
This doesn't necessarily result in a crash with the kernel killing the application (and possibly dumping core) but it would probably do what you want regardless.

Call the abort() function from your code. Other programs can't reliably "crash" your program: they have their own process context, which is isolated from your program's context. You could use something like TerminateProcess() in the Windows API or another platform-specific function, but that would be more or less the same as using Task Manager.
