Can two processes attach to the same PID via ptrace - c++

So, the title says it all.
Is it possible that one process has two tracers?
I am playing around with ptrace, and I can see that whenever someone attaches to a process, the TracerPid field in /proc/<pid>/status holds the PID of the tracer. However, is it possible to have two tracers?
I have two programs (a tracer and a tracee). I ran the tracee in debug mode, then ran the tracer, and got the error "Operation not permitted" (even with root permissions).
Regards,
golobich

They can't. It is indirectly confirmed in ptrace man page:
EPERM The specified process cannot be traced. This could be because the tracer has insufficient privileges (the required capability is CAP_SYS_PTRACE); unprivileged processes cannot trace processes that they cannot send signals to or those running set-user-ID/set-group-ID programs, for obvious reasons. Alternatively, the process may already be being traced, or (on kernels before 2.6.26) be init(1) (PID 1).
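To check ahead of time whether a process already has a tracer, one option is to read the TracerPid field the question mentions. The following is a minimal sketch, assuming Linux and the usual "TracerPid:<tab><pid>" layout of /proc/<pid>/status; the function names parseTracerPid and tracerOf are made up for this example.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <sys/types.h>

// Scans text in the format of /proc/<pid>/status and returns the value of
// the "TracerPid:" field, or -1 if the field is missing or malformed.
// A value of 0 means that no tracer is attached.
long parseTracerPid(std::istream& status) {
    const std::string key = "TracerPid:";
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, key.size(), key) == 0) {
            std::istringstream fields(line.substr(key.size()));
            long pid = -1;
            if (fields >> pid) {
                return pid;
            }
            return -1;
        }
    }
    return -1;
}

// Convenience wrapper (Linux only): reads the status file of a live process.
// Returns -1 if the file cannot be opened.
long tracerOf(pid_t pid) {
    std::ifstream status("/proc/" + std::to_string(pid) + "/status");
    if (!status) {
        return -1;
    }
    return parseTracerPid(status);
}
```

A would-be tracer could call tracerOf(target) and skip the attach attempt when the result is greater than 0. The check is only advisory, since another tracer can attach between the check and the ptrace call, so the EPERM error still has to be handled.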

Related

Detecting that a child process was killed because the OS is out of memory

I'm working on a large-scale application that spawns numerous processes for dealing with various tasks. In some situations, the OS will kill one of my processes because of memory pressure. That's ok, it's entirely expected, the parent process handles this gracefully.
What I'd like to find out is why a process was killed. If it was killed because of memory pressure, I want to respawn the treatment a little later. If it was killed for any other reason (say, an assertion failure or an out-of-bounds memory access), I want to log and investigate.
So, here's my question: how do you find out that a child process was killed because the OS needed the memory?
Question applies to:
Windows;
MacOS;
Linux;
(for bonus points, I'm also interested in Android, but that's not my priority).
Processes are not running as root/admin.
On Linux, you can find out whether a process was killed by the OS by reading the syslog (/var/log/messages, or /var/log/syslog on some distributions) or by running the dmesg command.
If you spawned the process, you can also detect that it was killed with the SIGKILL (9) signal, as opposed to SIGSEGV (11), which corresponds to the app crashing all by itself, or SIGINT (2)/SIGTERM (15), which mean that the application was asked to terminate gracefully.
Regarding Windows, I only know that this type of monitoring can be enabled via the Application Event Log. There's a GUI Application that can help you set it up.
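On POSIX systems, the signal-based distinction described above can be read from the status that waitpid reports to the parent. Below is a minimal sketch; describeExit and runChildThatRaises are hypothetical helper names, and the mapping from signal to likely cause is a heuristic, not something the kernel guarantees.

```cpp
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turns a waitpid() status into a coarse description of how the child ended.
std::string describeExit(int status) {
    if (WIFEXITED(status)) {
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        switch (WTERMSIG(status)) {
            case SIGKILL: return "killed by SIGKILL (possibly the OOM killer)";
            case SIGSEGV: return "crashed with SIGSEGV";
            case SIGABRT: return "aborted with SIGABRT";
            case SIGINT:
            case SIGTERM: return "asked to terminate";
            default:      return "killed by signal " + std::to_string(WTERMSIG(status));
        }
    }
    return "stopped or continued";
}

// Demo helper: forks a child that sends `sig` to itself, waits for it,
// and returns the raw wait status (or -1 if fork or waitpid failed).
int runChildThatRaises(int sig) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        raise(sig);
        _exit(0);  // only reached if the signal did not terminate the child
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return status;
}
```

SIGKILL on its own does not prove that memory pressure was the cause, because any sufficiently privileged process can send it. To confirm an OOM kill, the kernel log still has to be checked for a matching entry.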
When the OS intervenes in the execution of a process in order to kill it, it does so via signals.
What you can do (on *nix-based or *nix-like platforms) is run dmesg.
It outputs the kernel activity logs.
From there, you can identify the signal that was sent to your process.
For example this code below --
#include <stdio.h>
int main (void)
{
    char *p = NULL;
    printf ("\n%c", *p);
    return 0;
}
produces this entry in the dmesg output --
[8478285.606105] crash.out[16830]: segfault at 0 ip 0000000000400531 sp 00007fffc373b090 error 4 in crash.out[400000+1000]

Rare EXCEPTION_ACCESS_VIOLATION when debugging any process started with CREATE_SUSPENDED

While writing an x86 WinAPI-based debugger, I've encountered a rare condition when the debuggee (which usually works well) suddenly terminates with EXCEPTION_ACCESS_VIOLATION after I attach to it with my native debugger. I can stably reproduce this on any applications it seems (tried on .NET Hello World-styled application and on notepad.exe on multiple Windows 10 machines).
Essentially I've written a simple WaitForDebugEvent loop:
CreateProcessW(L"C:\\Windows\\SYSWOW64\\notepad.exe", […], CREATE_SUSPENDED, […]);
DebugActiveProcess(processId);
DEBUG_EVENT debugEvent = {};
while (WaitForDebugEvent(&debugEvent, INFINITE)) {
    switch (debugEvent.dwDebugEventCode) {
        // log all the events
    }
    ContinueDebugEvent(debugEvent.dwProcessId, debugEvent.dwThreadId, DBG_EXCEPTION_NOT_HANDLED);
}
DebugActiveProcessStop(processId);
(here's the full listing: I won't paste it all here, because there's some additional non-essential boilerplate there; the MCVE is 136 lines long)
For the sake of an example, I'll just log all the debugger events and detect whether the debuggee is ready to "proceed normally" or it will terminate due to an exception.
Most of the time, my debugging session looks like this:
CREATE_PROCESS_DEBUG_EVENT (which reports creation of both the process and its initial thread)
LOAD_DLL_DEBUG_EVENT (I was never able to get the name for this DLL, but this is documented in MSDN)
CREATE_THREAD_DEBUG_EVENT (which, I suspect, is a thread injected by debugger)
LOAD_DLL_DEBUG_EVENT […] — after this, many DLLs get loaded into the target process and everything looks okay, the process works as intended
But sometimes (in about 1.5% of all runs), the event sequence changes:
CREATE_PROCESS_DEBUG_EVENT
LOAD_DLL_DEBUG_EVENT
CREATE_THREAD_DEBUG_EVENT
EXCEPTION_DEBUG_EVENT: EXCEPTION_ACCESS_VIOLATION (which I never was able to gather details for: it reports a DEP violation, and the address is empty)
After that, I cannot proceed with debugging, because my debuggee is in exception state and will terminate soon. I was never able to catch notepad.exe crash without my debugger attached (and I doubt it is that bad and will crash for no reason), so I suspect that my debugger causes these exceptions.
One bizarre detail is that I could "fix" the situation by calling Sleep(1) immediately after WaitForDebugEvent. So, this is possibly some sort of race condition, but race condition between what? Between the debugger thread and other threads in the debuggee? Is it a thing? How are we supposed to debug other applications, then? How could actual debuggers work if it is a thing?
I couldn't reproduce the issue with the same code compiled for x64 CPU (and debugging an x64 process).
What could actually cause this erroneous behavior? I've carefully read the documentation about the API functions I call, and checked some other debugger examples online, but still wasn't able to find what's wrong with my debugger: it looks like I follow all the right conventions.
I have tried to debug my debuggee with WinDBG while it is still paused in my debugger, but had no luck doing that. First of all, it's difficult to attach to the debuggee with another debugger (WinDBG only allows non-intrusive mode, which seems to be less functional), and the call stacks for the process' threads usually aren't meaningful.
Steps to reproduce
Checkout this repository, compile with MSVC and then execute in cmd:
Debug\NetRuntimeWaiter.exe > log.txt
It is important to redirect output to the log file and not show it in the terminal: without that, timings for the log writer get changed, and the issue won't reproduce (due to a possible race condition I mentioned earlier?).
Usually the program will start and terminate 1000 notepads in about 10 seconds, and 10-15 of the 1000 invocations will hit the error condition (i.e. EXCEPTION_ACCESS_VIOLATION).
DebugActiveProcess (and the undocumented DbgUiDebugActiveProcess, which DebugActiveProcess calls internally) has a serious design problem: after calling NtDebugActiveProcess, it creates a remote thread in the target process via a DbgUiIssueRemoteBreakin call. As a result, a new thread, DbgUiRemoteBreakin, is created in the target process; this thread calls DbgBreakPoint and then RtlExitUserThread.
None of this is documented or explained, apart from this note in the DebugActiveProcess documentation:
After all of this is done, the system resumes all threads in the
process. When the first thread in the process resumes, it executes a
breakpoint instruction that causes an EXCEPTION_DEBUG_EVENT
debugging event to be sent to the debugger.
Of course, this is wrong. Why would DbgUiRemoteBreakin be the first (??) thread? Which thread resumes first is undefined. Why not state it exactly: an additional (but not the first) thread is created in the process, and this thread executes the breakpoint?
When the process is already running, creating this additional thread causes no problems. But if we create the process in a suspended state and then just call DebugActiveProcess, DbgUiRemoteBreakin really does become the first thread to execute in the process, and process initialization is done on this thread instead of on the first created thread. On XP this always made process initialization fail at the connect-to-csrss phase (csrss waits for the connection only on the first created thread in the process). On later systems this is fixed and the process can execute as usual. But it also may not, because the thread on which it was initialized has exited. This can cause subtle problems.
The solution here is to use NtDebugActiveProcess in place of DebugActiveProcess.
The debug object can be created either via DbgUiConnectToDbg() and then retrieved via DbgUiGetThreadDebugObject() (the system stores the debug object in the thread's TEB), or directly by calling NtCreateDebugObject.
Also, if we create the debuggee process from another process (B), we can do the following:
duplicate the debug object from the debugger process into process B;
call DbgUiSetThreadDebugObject(hDdg) just before calling CreateProcessW with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS (the system will use DbgUiGetThreadDebugObject() to get the debug object from your thread and pass it to the low-level process creation API);
remove the debug object from your thread via DbgUiSetThreadDebugObject(0).
It really doesn't matter who creates the process with the debug object. What matters is who handles the events posted to this debug object.
All the undocumented API definitions can be taken from ntdbg.h; then link with ntdll.lib or ntdllp.lib.

How can I get GDB to stop tracing a detached process?

I'm debugging a C++ application which creates trees of forks. Using GDB defaults, the child processes will be detached on the fork and as a result I see only one inferior shown afterwards.
I tried to attach to one of the child processes and despite it not being listed as an inferior for the other GDB process, in the new GDB session I get an error that the process is already being traced (by the first GDB session).
Is this expected behavior? What steps can I take to debug the forked process in a separate GDB session? What steps can I take to debug the problem further?

Qt check if external process crashes

I'm building a failsafe application for professional video. The Qt application checks the 4 corners of the 2nd screen and if they are a certain RGB value (I use a special background) the Qt program knows it crashed so it sends a signal to the videomixer to fade to the other input.
Now I also want to add a check to see whether the video program has crashed (it may be that the video program doesn't respond but still shows an output, so I can't see the desktop on the 2nd screen). I know I can use QProcess to start an external process. It's not that easy to hook it up to a process that is already running.
Now the question: how can I check whether the program has crashed (that is, it is "not responding") and detect this as quickly as possible so I can fade to the other video input? And what happens when my Qt program crashes: will the child process exit too?
Thanks!
Using QProcess creates an attached process, so unfortunately it will be killed when your process dies. When you create a detached process using the static method QProcess::startDetached, you don't get the monitoring functionality.
You need to write a little platform-specific monitoring class that can launch a detached process and inform you of changes in its status. You need to use the native APIs in implementing that. QProcess's sources can be a good inspiration for where to start.
@KubaOber is partially correct in his statement. If you start and detach a process, you do indeed lose the Qt way of communicating with it and monitoring what it does. However, your OS offers plenty of solutions to oversee what happens with it.
On Linux you can use:
pgrep to check whether the process is running or not (execute the command as a child process and see if it returns 0 (process is running) or 1 (process is no longer running));
the proc filesystem to see when a process terminates (see here), and then $? or a variable (as described in the link) to check its exit status;
kill, which, along with pipes, gives you a great amount of control.
You should note, however, that especially on Windows there are plenty of programs that do not follow the Unix convention for exit codes (0 = exited normally, anything else = an error has occurred). Also, a crash is just an error state that the process ended up in. The exit code tells you that an error has occurred, but you will probably not be able to tell a crash from any other error just by looking at it.

What is an easy way to test whether any process of a given id is presently running on Linux?

In C++, I have a resource that is tied to a pid. Sometimes the process associated with that pid exits abnormally and leaks the resource.
Therefore, I'm thinking of putting the pid in the file that records the resource as being in use. Then when I go to get a resource, if I see an item as registered as being in use, I would search to see whether a process matching the pid is currently running, and if not, clean up the leaked resource.
I realize there is a very small probability that a new, unrelated process is now using the same pid number, but this is better than the leak with no cleanup that I have now.
Alternatively, perhaps there is a better solution for this, if so, please suggest, otherwise, I'll pursue the pid recording.
Further details: The resource is a port number for communication between a client and a server over tcp. Only one instance of the client may use a given port number on a machine. The port numbers are taken from a range of available port numbers to use. While the client is running, it notes the port number it is using in a special file on disk and then cleans this entry up on exit. For abnormal exit, this does not always get cleaned up and the port number is left annotated as being in use, when it is no longer being used.
To check for existence of process with a given id, use kill(pid,0) (I assume you are on POSIX system). See man 2 kill for details.
Also, you can use waitpid call to be notified when the process finishes.
I would recommend you use some kind of OS resource, not a PID. Mutexes, semaphores, delete-on-close files. All of these are cleaned up by the OS when a process exits.
On Windows, I would recommend a named mutex.
On Linux, I would recommend using flock on a file.
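For the flock suggestion, here is a minimal sketch assuming Linux and one lock file per port; tryAcquireLock, releaseLock, and the lock file path are hypothetical. The point is that the kernel releases the lock when the descriptor is closed, which also happens when the owning process dies abnormally.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Tries to take an exclusive, non-blocking flock() on `path`.
// Returns an open descriptor that holds the lock, or -1 if the lock is
// held elsewhere or the file cannot be opened.
int tryAcquireLock(const char* path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) {
        return -1;
    }
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);  // someone else holds the lock (errno is EWOULDBLOCK)
        return -1;
    }
    return fd;
}

// Releases the lock explicitly; closing the descriptor would do the same.
void releaseLock(int fd) {
    flock(fd, LOCK_UN);
    close(fd);
}
```

A client would call tryAcquireLock for each candidate port's lock file until one succeeds and keep the descriptor open for as long as it uses the port. No cleanup step is needed after a crash, because the stale entry cannot outlive the process.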
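How about a master process that starts your process (the one which terminates abnormally), waits for it to crash (waitpid), and spawns it again when waitpid returns?
while (true) {
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);  // path and argv are placeholders for your program
        _exit(127);         // only reached if exec failed
    }
    waitpid(pid, nullptr, 0);  // blocks until the child exits, then the loop respawns it
}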
Unfortunately, the problem domain isn't clear; you could try re-explaining it in some other way.
But if I understand you correctly, you could create a map like
std::map< ProcessId, boost::shared_ptr<Resource> > map;
// `Resource` here references to some abstract resource type
// and `ProcessId` on Windows system would be basically a DWORD
and in this case you simply have to list every running process (this can be done via EnumProcesses call on Windows) and remove every entry with inappropriate id from your map. After doing this you would have only valid process-resource pairs left. This action can be repeated every YY seconds depending on your needs.
Note that in this case removing an item from your map would basically call the corresponding destructor (because, if your resource is not being used somewhere else in your code, its reference count would drop to zero).
The APIs that achieve this on Windows are OpenProcess, which takes a process ID as input, and GetExitCodeProcess, which returns STILL_ACTIVE when the process is, well, still active. You could also use any Wait function with a zero timeout, but this API seems somewhat cleaner.
As other answers note, however, this doesn't seem a promising road to take. We might be able to give more focused advice if you provide more scenario details. What is your platform? What is the leaked resource exactly? Do you have access to the leaking app code? Can you wrap it in a high-level try-catch with some cleanup? If not, maybe wait on the leaker to finish with a dedicated thread (or dedicated process altogether)? Any detail you provide might help.