What will happen if I single step debug a blocked thread? - c++

What will happen if I use a debugger like gdb to attach to a process and use single step when the current thread is in the middle of a blocked operation?
For example, the current thread is waiting for sigwait to return, and I use single step. Will the execution continue, or will it just stop until sigwait returns?

What will happen if I use a debugger like gdb to attach to a process and use single step
On most modern processors, GDB performs a single step by setting the processor's single-step flag (the trap flag on x86) and resuming the current instruction. Once that instruction finishes, the processor raises a single-step trap, which causes the OS to notify GDB that something happened to the inferior (the process being debugged). GDB knows to expect that event (since it remembers that you just did a stepi).
when the current thread is in the middle of a blocked operation?
In that case, the current instruction is a syscall, and so the single-step trap will not fire until after the system call completes.
For example, the current thread is waiting for sigwait to return, and I use single step. Will the execution continue, or will it just stop until sigwait returns?
The latter.

Related

How to safely terminate a multithreaded process

I am working on a project where we use pthread_create to create several child threads.
The thread creation logic is not under my control, as it's implemented by another part of the project.
Each thread performs some operation which takes more than 30 seconds to complete.
Under normal conditions the program works perfectly fine.
But the problem occurs at termination of the program.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
When I call exit() or return from main, the exit handlers and global objects' destructors are called. I believe these operations race with the running threads, and there are so many race conditions that it is hard to solve all of them.
The way I see it there are two solutions.
Call _exit() and forget all de-allocation of resources.
When SIGINT arrives, close/kill all threads and then call exit() from the main thread, which will release resources.
I think the 1st option will work, but I do not want to terminate the process abruptly.
So I want to know whether it is possible to terminate all child threads as quickly as possible, so that the exit handlers & destructors can perform the required clean-up and terminate the program.
I have gone through this post, let me know if you know other ways: POSIX API call to list all the pthreads running in a process
Also, let me know if there is any other solution to this problem
What is it that you need to do before the program quits? If the answer is 'deallocate resources', then you don't need to worry. If you call _exit then the program will exit immediately and the OS will clean up everything for you.
Be aware also that what you can safely do in a signal handler is extremely limited, so attempting to perform any cleanup yourself is not recommended. If you're interested, there's a list of what you can do here. But you can't flush a file to disk, for example (which is about the only thing I can think of that you might legitimately want to do here). That's off limits.
I need to exit from main as quickly as possible when I receive the SIGINT signal.
How is that defined? Because there's no way to "exit as quickly as possible" when you receive one signal like that.
You can either set flag(s), post to semaphore(s), or similar to set a state that tells other threads it's time to shut down, or you can kill the entire process.
If you elect to set flag(s) or similar to tell the other threads to shut down, you set those flags and return from your signal handler and hope the threads behave and the process shuts down cleanly.
If you elect to kill threads, there's effectively no difference in killing a thread, killing the process, or calling _exit(). You might as well just keep it simple and call _exit().
That's all you can choose between when you have to make your decision in a single signal-handler call. Pick one.
A better solution is to use escalating signals. For example, when you get SIGQUIT or SIGINT, you set flag(s) or otherwise tell threads it's time to clean up and exit the process - or else. Then, say five seconds later whatever is shutting down your process sends SIGTERM and the "or else" happens. When you get SIGTERM, your signal handler simply calls _exit() - those threads had their chance and they messed it up and that's their fault. Or you can call abort() to generate a core file and maybe provide enough evidence to fix the miscreant threads that won't shut down.
And finally, five seconds later the managing process will nuke the process from orbit with SIGKILL just to be sure.

How to find which thread will execute an instruction?

I'm very surprised this hasn't been asked before. I'm trying to put a breakpoint on a specific instruction and read the registers in an already running process (Following this post: Read eax register).
I found the instruction I'm looking for; however, the problem I've been running into is how to find the right thread where the instruction is going to be executed, so I can do SetThreadContext() on it. This is a multithreaded program, so it's not as simple as looking up the single thread associated with the process.
I tried looking through Cheat Engine's source to see how they did it, however I couldn't find much, so I'm wondering how exactly they did it.
One idea that comes to mind is just setting every thread's context to it, however I'd like to avoid that.
EDIT: Forgot to mention I'm trying to do this with hardware breakpoints (using debug registers)
Unless you already know the answer / can predict the future, you need to set a hardware breakpoint in every thread that might run the instruction you care about.
The debug registers are per-core (and thus per-thread with context-switching), so a core will only actually break if the thread it's executing has its debug registers set to break on that instruction.
It might be easier to use a software breakpoint (0xcc byte replacing the first byte of the instruction) because you just have to store that once and every thread will see it. (x86 has coherent instruction caches; you don't have to invalidate them.)
As Margaret points out, once your breakpoint handler runs, you check the EIP / RIP of every thread, and the ones that are currently at that instruction are the one(s) that have reached the breakpoint and will run that instruction if single-stepped or resumed. (Or an address in your handler, if the handler runs in the context of that thread.)

thread 'disappears' when blocking on read() how do i debug it?

I have a multithreaded application in C++, running under Linux (Fedora 27). One of the threads keeps reading data from a file on the local disk using low-level I/O (open, read, etc.) and supplies that data to a buffer that is rotated between other threads.
Now, I suddenly ran into a strange problem where read() would start blocking indefinitely, for no apparent reason, at an arbitrary offset into the file. I added a monitor thread that detects this block (by setting a timestamp before entering read()) and attempts to shut down the program when it occurs.
The weird thing is that at the end of main, when it waits in pthread_join on that read thread, the join returns 0 (success).
I tried again, but replaced the call to read() with a while(1);, and now pthread_join does not finish, as expected.
I then examined the program in gdb, and to my surprise, when I reach the pthread_join, the read thread is GONE!
Looking at info threads when the monitor thread detects a blocking read(), the thread is still there, but at some point it disappears and I can't catch it!
I'm trying to catch this thread exiting and I'm looking for ideas on how to do so. I am using pthread_cleanup_push/pop, but my cleanup function is not being invoked by the read thread (it is for all the other threads).
Any ideas? I'm at my wits' end!
edit ----------------------------------------
It appears to have something to do with syslog being called from a completely unrelated thread.
read is a cancellation point, so if your application calls pthread_cancel to terminate the thread at some point, the thread will cease to exist (after executing the cleanup actions). Joining a canceled thread succeeds and yields the special value PTHREAD_CANCELED for the void * value optionally filled out by pthread_join.
If you replace read with an endless loop, then there is no cancellation point, the cancellation request is not acted upon, and pthread_join will also wait indefinitely.

what signal does GDB use to implement control transfer between tracee and tracer

By control transfer, I mean, after the tracee executing a function and return, which signal is generated so that GDB can wait*() on it and seize control again? It is not SIGTRAP though many people claim that ...
after the tracee executing a function and return, which signal is generated so that GDB can wait*() on it and seize control again?
The tracee is stopped, and control is transferred back to GDB, only when one of "interesting" events happens.
The interesting events are:
A breakpoint fires,
The tracee encounters a signal (e.g. SIGSEGV or SIGFPE as a result of performing invalid memory access or invalid floating-point operation),
The tracee disappears altogether (such as getting SIGKILLed by an outside program),
[There might be other "interesting" events, but I can't think of anything else right now.]
Now, a technically correct answer to "what signal does GDB use ..." is: none at all. The control isn't transferred, unless one of above events happen.
Perhaps your question is: how does control get back to GDB after executing something like finish command (which steps out of the current function)?
The answer to that is: GDB sets a temporary breakpoint on the instruction immediately after the CALL instruction that got us into the current function.
Finally, what causes the kernel to stop the tracee and make waitpid in GDB return upon execution of the breakpoint instruction?
On x86, GDB uses the INT3 (opcode 0xCC) instruction to set breakpoints (there is an alternate mechanism using debug registers, but it is limited to 4 simultaneous breakpoints, and usually reserved for hardware watchpoints instead). When the tracee executes INT3 instruction, SIGTRAP is indeed the signal that the kernel generates (i.e. other answers you've found are correct).
Without knowing what led you to believe it isn't SIGTRAP, it's hard to guess how you convinced yourself that it isn't.
Update:
I tried to manually send a SIGTRAP signal to the tracee, trying to cause a spurious wake-up of GDB, but failed.
Fail in what way?
What I expect you observe is that GDB stops with Program received signal SIGTRAP .... That's because GDB knows where it has placed breakpoints.
When GDB receives SIGTRAP and the tracee's instruction pointer matches one of its breakpoints, then GDB "knows" that it's the breakpoint that has fired, and acts accordingly.
But when GDB receives SIGTRAP and the tracee IP doesn't match any of the breakpoints, then GDB treats it as any other signal: prints a message and waits for you to tell it what to do next.
"GDB sets a temporary breakpoint ..." - that means GDB has to modify the tracee's code area, which may be read-only. So how does GDB cope with that?
You are correct: GDB needs to modify the (typically non-writable) .text section to insert any breakpoint using the INT3 method. Fortunately, that is one of the "superpowers" granted to it by the kernel via ptrace(PTRACE_POKETEXT, ...).
P.S. It's a fun exercise to write a program that checksums the code bytes of one of its own functions. You can then perform the checksum before and after placing a breakpoint on the "to be checksummed" function, and observe that the checksum differs when a breakpoint is present.
P.P.S. If you are curious about what GDB is doing, setting maintenance debug inferior will provide a lot of clues.

Change context of an alertable / waitable thread

I want to inject a piece of code into a running module using thread suspension method.
SuspendThread
GetThreadContext
DoSomething
ResumeThread
My question is: what would happen if the thread I'm injecting into is in an alertable / waitable state (WaitForSingleObject, GetMessage)? What would happen once I hit the ResumeThread call?
The same thing that would have happened otherwise, I assume.
Let's say the target thread is currently in user mode. You save all the registers for later, set RIP to point to your code, and call ResumeThread(). At some point your code starts to execute, does whatever it does, restores all the registers the injection code saved, and lets the program resume its normal operation.
Now let's say the target thread is waiting. Waiting means the thread performed a system call that tells the scheduler not to schedule the thread for execution until something happens (an event is signaled, etc.). You save the registers of the user-mode context (the way they were when sysenter was called), set RIP to point to your code, and call ResumeThread(). That's all well and nice, but the scheduler still won't schedule the thread for execution until the terms of the wait are satisfied.
When the wait finally ends, the thread finishes its business in kernel mode, returns to user mode, and instead of executing the ret instruction following the sysenter, goes on to execute your code. Finally your code restores all the registers, jumps to the saved RIP (inside ntdll!ZwWaitForSingleObject or wherever), and everything continues as normal.
Finally, let's say you were performing an alertable wait. The story goes on pretty much as in the previous two paragraphs (you don't really need me to repeat it a third time, do you? :)), except that before the wait function returns it executes all the user APCs queued for the thread - exactly as would have happened without your intervention - and then goes on to execute your code, etc.
So basically what happens is what you should have expected to happen:
If you called SetThreadContext() the user-mode context is changed and the computer behaves accordingly, regardless of whether the thread was waiting or not.
If the thread was waiting for something it continues waiting for the same thing, regardless of whether you called 'SetThreadContext()' or not.
If the thread was in an alertable wait, before the system call returns it makes sure the user APC queue is empty (either because there were user APCs and it called them or because the queue was empty and the 'regular' wait condition finally happened). This, again, regardless of whether you called SetThreadContext() or not.