I want to inject a piece of code into a running module using the thread suspension method:
SuspendThread
GetThreadContext
DoSomething
ResumeThread
My question is: what would happen if the thread I'm injecting into is in an alertable or waiting state (WaitForSingleObject, GetMessage)? What happens once I call ResumeThread?
The same thing that would have happened otherwise, I assume.
Let's say the target thread is currently in user mode. You save all the registers for later, set RIP to point to your code and call ResumeThread(). At some point your code starts to execute, does whatever it does, restores all the registers the injection code saved, and lets the program resume its normal operation.
Now let's say the target thread is waiting. Waiting means the thread performed a system call that tells the scheduler not to schedule the thread for execution until something happens (an event is signaled, etc.). You save the registers of the user-mode context (the way they were when the system call instruction, sysenter or syscall on x64, was executed), set RIP to point to your code and call ResumeThread(). That's all well and good, but the scheduler still won't schedule the thread for execution until the terms of the wait are satisfied.
When the wait finally ends, the thread finishes its business in kernel mode, returns to user mode, and instead of executing the ret instruction following the sysenter goes on to execute your code. Finally your code restores all the registers and jumps to the saved RIP (in ntdll!ZwWaitForSingleObject or whatever) and everything continues as normal.
Finally, let's say the thread was performing an alertable wait. The story goes on pretty much as in the previous two paragraphs (you don't really need me to repeat that a third time, do you? :)), except that before the wait function returns it executes all the user APCs queued for the thread - exactly as it would have happened without your intervention - and then goes on to execute your code etc.
So basically what happens is what you should have expected to happen:
If you called SetThreadContext() the user-mode context is changed and the computer behaves accordingly, regardless of whether the thread was waiting or not.
If the thread was waiting for something it continues waiting for the same thing, regardless of whether you called 'SetThreadContext()' or not.
If the thread was in an alertable wait, before the system call returns it makes sure the user APC queue is empty (either because there were user APCs and it called them or because the queue was empty and the 'regular' wait condition finally happened). This, again, regardless of whether you called SetThreadContext() or not.
I would like to understand how a 'wait' on a thread actually works.
Is there an endless loop behind the scenes (that does not sound reasonable)?
For example, the MSDN page for the 'WaitForSingleObject' function says:
The WaitForSingleObject function checks the current state of the specified object. If the object's state is nonsignaled, the calling thread enters the wait state until the object is signaled or the time-out interval elapses.
(http://msdn.microsoft.com/en-us/library/windows/desktop/ms687032(v=vs.85).aspx)
What is this "wait state"?
How does the thread 'wake up', i.e. how does raising an event or signaling an object cause the thread to run again?
Who checks the synchronization object and how often?
Thank you
This is handled by the OS thread scheduler.
When a thread waits on something, the OS creates a link from the object it's waiting on back to the waiting thread. When the state of the object being waited on changes, the scheduler looks through the threads that are waiting on it. If the state change un-blocks any of those, then it marks them as un-blocked, so they become eligible for scheduling.
The scheduler then has algorithms to choose which threads that are eligible for scheduling will actually be scheduled to run. The exact details change between OSes (and even between versions of the same OS), but based on what you asked, I'd guess you probably don't care much about that right now.
The bottom line is that once a thread blocks like this, (virtually) no CPU time is expended on seeing whether it can run again. Rather than going through all blocked threads and checking whether the situation has changed so that any of them can run, the scheduler looks only at the changes in situation, and when one happens it figures out which threads that change allows to run.
Of course, it's also possible that at least in theory some OS could work differently from this--but Windows definitely does work pretty much as described above, and most other typical systems (e.g., Linux, *BSD, MacOS) are pretty similar in this respect.
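To make the "link from the object back to the waiters" idea concrete, here is a toy model in C++. It is only a sketch: ToyThread, ToyScheduler and ToyWaitable are names made up for illustration, and a real kernel's data structures are far more involved. The point is just that the waiters are touched only when the object's state changes, never by polling.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model, not real kernel code: every name here is hypothetical.
struct ToyThread {
    explicit ToyThread(std::string n) : name(std::move(n)) {}
    std::string name;
    bool blocked = false;
};

struct ToyScheduler {
    std::deque<ToyThread*> ready;  // threads eligible to be scheduled
};

struct ToyWaitable {
    bool signaled = false;
    std::vector<ToyThread*> waiters;  // link from the object back to its waiting threads

    // Returns true if the thread had to block. A blocked thread is simply
    // parked in `waiters`; nothing polls it afterwards.
    bool wait(ToyThread& t) {
        assert(!t.blocked);  // a thread cannot start a second wait while blocked
        if (signaled) return false;
        t.blocked = true;
        waiters.push_back(&t);
        return true;
    }

    // The state change is the only moment the waiters are looked at.
    void signal(ToyScheduler& s) {
        signaled = true;
        for (ToyThread* t : waiters) {
            t->blocked = false;
            s.ready.push_back(t);
        }
        waiters.clear();
    }
};

// Two threads wait, then the object is signaled once.
// Returns how many threads were ready before and after the signal.
std::pair<std::size_t, std::size_t> demo_wake_up() {
    ToyScheduler sched;
    ToyWaitable event;
    ToyThread a("a"), b("b");
    event.wait(a);
    event.wait(b);
    const std::size_t before = sched.ready.size();
    event.signal(sched);
    return {before, sched.ready.size()};
}
```

In this model demo_wake_up() reports no ready threads before the signal and both of them after it, which mirrors the "look only at the changes" behaviour described above.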
I'm new to multithreading and I need to understand the whole idea of "join": do I need to join every thread in my application, and how does that work with multithreading?
No, you can detach a thread if you want to leave it running on its own.
If you start a thread, you must either detach it or join it before the program ends; otherwise you are in trouble (with C++11 std::thread, destroying a thread object that is still joinable calls std::terminate).
To know whether a thread needs to be detached, ask yourself this question: "do I want the thread to run after the program's main function is finished?". Here are some examples:
When you do File/New you create a new thread and you detach it: the thread will be closed when the user closes the document. Here you don't need to join the thread.
When you do a Monte Carlo simulation, some distributed computing, or any Divide and Conquer type algorithm, you launch all the threads and you need to wait for all the results so that you can combine them. Here you explicitly need to join the threads before combining the results.
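The divide-and-conquer case might look roughly like this with std::thread. This is a minimal sketch, assuming C++11; parallel_sum and sum_1_to_n are hypothetical helper names, not anything from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` with `workers` threads. Each thread writes only its own slot
// of `partial`, so no locking is needed.
long long parallel_sum(const std::vector<int>& data, std::size_t workers) {
    assert(workers > 0);
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> threads;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(w * chunk, data.size());
        const std::size_t end = std::min(begin + chunk, data.size());
        threads.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(
                data.begin() + static_cast<std::ptrdiff_t>(begin),
                data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }

    // The partial results are only guaranteed complete after join().
    for (std::thread& t : threads) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}

// Convenience wrapper: sums the integers 1..n using `workers` threads.
long long sum_1_to_n(int n, std::size_t workers) {
    std::vector<int> data(static_cast<std::size_t>(n));
    std::iota(data.begin(), data.end(), 1);
    return parallel_sum(data, workers);
}
```

Reading `partial` before the joins would be a data race, which is exactly why joining is not optional in this kind of design.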
Not joining a thread is like not deleting all the memory you new. It can be harmless, or it can be a bad habit.
A thread you have not synchronized with is in an unknown state of execution. If it is a file writing thread, it could be half way through writing a file and then the app finishes. If it is a network communications thread, it could be half way through a handshake.
The downside to joining every thread is if one of them has gotten into a bad state and has blocked, your app can hang.
In general you should try to send a message to your outstanding threads to tell them to exit and clean up. Then you should wait a modest amount of time for them to finish or otherwise respond that they are good to die, and then shut down the app. Now prior to this you should signify your program is no longer open for business -- shut down GUI windows, respond to requests from other processes that you are shutting down, etc -- so if this takes longer than anticipated the user is not bothered. Finally if things go imperfectly -- if threads refuse to respond to your request that they shut down and you give up on them -- then you should log errors as well, so you can fix what may be a symptom of a bigger problem.
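A rough sketch of that "ask, wait a modest amount of time, then log and give up" sequence, assuming C++11. WorkerState and request_shutdown are hypothetical names, and the sleep stands in for real work. The state lives in a shared_ptr so that a worker we give up on can never touch freed memory.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <iostream>
#include <memory>
#include <thread>

// Shared between the owner and the worker (hypothetical type).
struct WorkerState {
    std::atomic<bool> stop_requested{false};
    std::promise<void> exited;
};

// Starts a worker, asks it to stop, and waits at most `grace` for it.
// Returns true if the worker shut down cleanly within the grace period.
bool request_shutdown(std::chrono::milliseconds unit_of_work,
                      std::chrono::milliseconds grace) {
    assert(grace.count() >= 0);
    auto state = std::make_shared<WorkerState>();
    std::future<void> exited = state->exited.get_future();

    std::thread worker([state, unit_of_work] {
        do {
            std::this_thread::sleep_for(unit_of_work);  // stand-in for real work
        } while (!state->stop_requested.load());
        state->exited.set_value();  // "I am good to die"
    });

    state->stop_requested.store(true);  // the polite request
    if (exited.wait_for(grace) == std::future_status::ready) {
        worker.join();
        return true;
    }

    // The worker did not respond in time: log it and give up on it.
    std::cerr << "worker ignored the shutdown request; giving up on it\n";
    worker.detach();
    return false;
}
```

With a short unit of work and a generous grace period the function returns true; with a unit of work much longer than the grace period it logs and returns false instead of hanging the app.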
The last time a worker thread unexpectedly hung, I initially thought it was a problem with a network outage and a bug in the timeout code. Upon deeper inspection it was because one of the objects in use was deleted prior to the shutdown synchronization: the undefined behaviour that resulted just looked like a hang in my reproduction cases. Had we not carefully joined, that bug would have been harder to track down (now, the right thing to do would have been to use a shared resource that we could not delete: but mistakes happen).
The pthread_join() function suspends execution of the calling thread
until the target thread terminates, unless the target thread has
already terminated. On return from a successful pthread_join() call
with a non-NULL value_ptr argument, the value passed to pthread_exit()
by the terminating thread is made available in the location referenced
by value_ptr. When a pthread_join() returns successfully, the target
thread has been terminated. The results of multiple simultaneous calls
to pthread_join() specifying the same target thread are undefined. If
the thread calling pthread_join() is canceled, then the target thread
will not be detached.
So pthread_join does two things:
(1) Wait for the thread to finish.
(2) Clean up any resources associated with the thread.
This means that if you exit the process without calling pthread_join, then (2) will be done for you by the OS (although it won't do thread cancellation cleanup), and (1) will not be done.
So whether you need to call pthread_join depends on whether you need (1) to happen.
Detached thread
If you don't need to wait for the thread to finish, then you may as well pthread_detach it. A detached thread cannot be joined (so you can't wait on its completion), but its resources are freed automatically when it completes.
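For reference, the two options might look like this. It is a small sketch assuming a POSIX system; square, no_op, join_for_square and fire_and_forget are hypothetical names. The first function needs the thread's result and so joins; the second does not care and detaches.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Thread function: squares the integer smuggled in through the void* argument.
static void* square(void* arg) {
    const std::intptr_t n = reinterpret_cast<std::intptr_t>(arg);
    return reinterpret_cast<void*>(n * n);  // returning is equivalent to pthread_exit(value)
}

static void* no_op(void*) { return nullptr; }

// Joinable thread: pthread_join waits for it and hands back its return value.
long join_for_square(long n) {
    pthread_t tid;
    void* arg = reinterpret_cast<void*>(static_cast<std::intptr_t>(n));
    if (pthread_create(&tid, nullptr, square, arg) != 0) return -1;
    void* result = nullptr;
    if (pthread_join(tid, &result) != 0) return -1;
    return static_cast<long>(reinterpret_cast<std::intptr_t>(result));
}

// Detached thread: nobody waits for it; its resources are released when it ends.
// Returns 0 on success, like pthread_detach itself.
int fire_and_forget() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, no_op, nullptr) != 0) return -1;
    return pthread_detach(tid);
}
```

Passing the integer through the void* only works for values that fit in a pointer; for anything bigger you would pass a pointer to a struct instead.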
do I need to join every thread in my application ?
Not necessarily - it depends on your design and OS. Join() is actively hazardous in GUI apps. If you don't need to know, or don't care, whether one thread has terminated from another thread, you don't need to join it.
I try very hard to not join/WaitFor any threads at all. Pool threads, app-lifetime threads and the like often do not require any explicit termination - depends on OS and whether the thread/s own, or are explicitly bound to, any resources that need explicit termination/close/whatever.
Threads can be either joinable or detached. Detached threads should not be joined. On the other hand, if you don't join a joinable thread, your app leaks some memory and some thread structures. A C++11 std::thread calls std::terminate if the thread object goes out of scope while still joinable, i.e. without .join() or .detach() having been called. See pthread_detach and pthread_create. This is much like processes: when a child exits, it stays around as a zombie until its creator calls waitpid. The reason for this behavior is that the creator of a thread or process might want to know its exit code.
Update: if pthread_create is called with a NULL attribute argument (default attributes are used), a joinable thread is created. To create a detached thread, you can use attributes:
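pthread_attr_t attrs;
pthread_attr_init(&attrs);
pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);
pthread_create(thread, &attrs, callback, arg); /* thread is a pthread_t *; attrs must be passed by address */
pthread_attr_destroy(&attrs); /* the attribute object is not needed once the thread exists */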
Also, you can detach an already created thread by calling pthread_detach on it. If you try to join a detached thread, pthread_join fails, typically with the EINVAL error code. glibc has a non-portable extension, pthread_getattr_np, that retrieves the attributes of a running thread, so you can check whether a thread is detached with pthread_attr_getdetachstate.
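Since a detached thread cannot be joined, you need some other way to learn that it finished. Here is one possible sketch, assuming POSIX threads plus C++11 futures; run_detached and notify_and_exit are hypothetical names. It sticks to the portable pthread_attr_getdetachstate call on the attribute object rather than the glibc-only pthread_getattr_np.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <pthread.h>

// Thread function: takes ownership of the promise, fulfils it, and exits.
static void* notify_and_exit(void* arg) {
    std::unique_ptr<std::promise<void>> done(static_cast<std::promise<void>*>(arg));
    done->set_value();
    return nullptr;
}

// Creates a detached thread and waits (up to 2 seconds) for its notification.
bool run_detached() {
    pthread_attr_t attrs;
    if (pthread_attr_init(&attrs) != 0) return false;
    pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);

    // Read the setting back from the attribute object (portable POSIX call).
    int state = PTHREAD_CREATE_JOINABLE;
    pthread_attr_getdetachstate(&attrs, &state);
    assert(state == PTHREAD_CREATE_DETACHED);

    auto* done = new std::promise<void>();  // owned by the thread once it starts
    std::future<void> finished = done->get_future();

    pthread_t tid;
    const int rc = pthread_create(&tid, &attrs, notify_and_exit, done);
    pthread_attr_destroy(&attrs);
    if (rc != 0) {
        delete done;  // the thread never started, so we still own the promise
        return false;
    }
    return finished.wait_for(std::chrono::seconds(2)) == std::future_status::ready;
}
```

The future's shared state is reference counted, so it stays valid even if the caller gives up waiting before the thread gets to run.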
The deal is:
I want to create a thread that works similarly to executing a new .exe in Windows, so that if that program (the new thread) crashes or goes into an infinite loop, it will be killed gracefully (after the time limit is exceeded or when it crashes) and all resources will be freed properly.
And when that thread has succeeded, I would like to be able to modify some global variable which could have some data in it, such as a list of files for example. That is why I can't just execute an external executable from Windows, since I can't access the variables inside the function that got executed in the new thread.
Edit: Clarified the problem a lot more.
The thread will already run after calling CreateThread.
WaitForSingleObject is not necessary (unless you really want to wait for the thread to finish); but it will not "force-quit" the thread; in fact, force-quitting - even if it might be possible - is never such a good idea; you might e.g. leave resources opened or otherwise leave your application in a state which is no good.
A thread is not some sort of magical object that can be made to do things. It is a separate path of execution through your code. Your code cannot be made to jump arbitrarily around its codebase unless you specifically program it to do so. And even then, it can only be done within the rules of C++ (ie: calling functions).
You cannot kill a thread because killing a thread would utterly wreck some of the most fundamental assumptions a programmer makes. You would now have to take into account the possibility that the next line doesn't execute for reasons that you can neither predict nor prevent.
This isn't like exception handling, where C++ specifically requires destructors to be called, and you have the ability to catch exceptions and do special cleanup. You're talking about executing one piece of code, then suddenly ending the execution of that entire call-stack. That's not going to work.
The reason that web browsers moved from a "thread-per-tab" to "process-per-tab" model is exactly this: because processes can be terminated without leaving the other processes in an unknown state. What you need is to use processes instead of threads.
When the process finishes and sets its data, you need to use some inter-process communication system to read that data (I like Boost.Interprocess myself). It won't look like a regular C++ global variable, but you shouldn't have a problem with reading it. This way, you can effectively kill the process if it's taking too long, and your program will remain in a reasonable state.
Well, that's what WaitForSingleObject does. It blocks until the object does something (in case of a thread it waits until the thread exits or the timeout elapses). What you need is
HANDLE thread = CreateThread(NULL, 0, do_stuff, NULL, 0, NULL);
// ... rest of the code, which runs in parallel with the new thread ...
WaitForSingleObject(thread, 4000); // wait for the other thread to exit, or for at most 4 seconds
CloseHandle(thread); // release the handle once it is no longer needed
If you want your worker thread to shut down after a period of time has elapsed, the best way to do that is to have the thread itself monitor the elapsed time in some way and then exit when the time is up.
Another way to do this is to monitor the elapsed time in the main thread or even a third, monitor type thread. When the time has elapsed, set an event. Your worker thread could wait for this event in its main loop, and then exit when it has been raised. These kinds of events, which are used to signal the thread to kill itself, are sometimes called "death events." (Or at least, I call them that.)
Yet another way to do this is to queue a user APC to the worker thread, which needs to be in an alertable wait state. The APC can then set some internal state variable which will trigger the death sequence in the thread when it resumes.
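A portable sketch of the "death event" idea, using a mutex and condition variable in place of a Win32 event handle. StopEvent, worker and run_worker_for are hypothetical names, and the 5 ms period is arbitrary. The worker does one unit of work per iteration and then waits on the event with a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Stand-in for a manual-reset event: once set, it stays set.
class StopEvent {
public:
    void set() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            set_ = true;
        }
        cv_.notify_all();
    }

    // Returns true if the event was set before `timeout` expired.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return set_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool set_ = false;
};

// Worker main loop: one unit of work, then wait up to `period` for the death event.
int worker(StopEvent& stop, std::chrono::milliseconds period) {
    int iterations = 0;
    do {
        ++iterations;  // stand-in for a unit of real work
    } while (!stop.wait_for(period));
    return iterations;
}

// The "monitor": lets the worker run for `run_time`, then raises the death event.
int run_worker_for(std::chrono::milliseconds run_time) {
    assert(run_time.count() >= 0);
    StopEvent stop;
    int iterations = 0;
    std::thread t([&] { iterations = worker(stop, std::chrono::milliseconds(5)); });
    std::this_thread::sleep_for(run_time);
    stop.set();
    t.join();  // join makes reading `iterations` safe
    return iterations;
}
```

Because the worker blocks in wait_for rather than sleeping, it reacts to the event as soon as it is set instead of finishing a full sleep first.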
There is another method which I hesitate even mentioning, because it should only be used in extremely dire circumstances. You can kill the thread. This is a very dangerous method akin to turning off your sink by detonating an atomic bomb. You get the sink turned off, but there could be other unintended consequences as well. Please don't do this unless you know exactly what you're doing and why.
Remove the call to WaitForSingleObject. That causes your parent thread to wait.
Remove the WaitForSingleObject call?
I understand the problem with just killing the thread directly (via AfxEndThread or other means), and I've seen the examples using CEvent objects to signal the thread and then having the thread clean itself up. The problem I have is that using CEvent to signal the thread seems to require a loop where you check to see if the thread is signaled at the end of the loop. The problem is, my thread doesn't loop. It just runs, and the processing could take a while (which is why I'd like to be able to stop it).
Also, if I were to just kill the thread, I realize that anything I've allocated will not have a chance to clean itself up. It seems to me like any locals I've been using that happen to have put stuff on the heap will also not be able to clean themselves up. Is this the case?
There is no secret magic knowledge here.
Just check the event object periodically throughout the function code, where you deem it is safe to exit.
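For a function that does not loop, "check periodically" can simply mean a checkpoint between stages. Below is a minimal sketch, assuming an atomic flag in place of a CEvent; long_running_job, CleanupCounter and run_job are made-up names. Because the function leaves through a normal return, destructors of locals still run, which is the part you lose when a thread is killed from outside.

```cpp
#include <atomic>
#include <cassert>

// Counts how many times cleanup ran; stands in for any RAII-managed resource.
struct CleanupCounter {
    explicit CleanupCounter(int& n) : count(n) {}
    ~CleanupCounter() { ++count; }
    int& count;
};

// A non-looping job with a checkpoint wherever it is safe to stop.
// Returns the number of stages that completed.
int long_running_job(const std::atomic<bool>& cancel, int& cleanups) {
    CleanupCounter guard(cleanups);  // destructor runs on every return path
    int done = 0;

    if (cancel.load()) return done;  // checkpoint before stage 1
    ++done;                          // stage 1 (stand-in for real work)

    if (cancel.load()) return done;  // checkpoint before stage 2
    ++done;                          // stage 2

    if (cancel.load()) return done;  // checkpoint before stage 3
    ++done;                          // stage 3
    return done;
}

struct Outcome {
    int stages;
    int cleanups;
};

// Runs the job with the cancel flag either already raised or never raised.
Outcome run_job(bool cancel_before_start) {
    std::atomic<bool> cancel{cancel_before_start};
    int cleanups = 0;
    const int stages = long_running_job(cancel, cleanups);
    assert(cleanups == 1);  // cleanup ran exactly once, cancelled or not
    return Outcome{stages, cleanups};
}
```

In real code another thread would raise the flag while the job is running; the fixed flag here only keeps the example deterministic.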
Does your thread ever exit? If so, you could set an event in the thread at exit and have the main process wait for that event via WaitForSingleObject. This is best done with a timeout so the main process doesn't appear to lock up when it's closing. If the wait times out, kill the thread via TerminateThread. You'll have to determine what a reasonable timeout is, though.
Since you don't loop in the thread this seems to me to be the only way to do this. Of course, you could do something like set a boolean flag in the main process and have the thread periodically check this flag, but then your thread code will be littered with "if(!canRun) return;" type code.
If the thread never exits, then TerminateThread is the only way to stop the thread.
Locals would be placed on the stack and, hence, WOULD be freed on forcing the thread shut (I think). Destructors won't get called though and any critical sections the thread holds will not get released.
If the thread is ONLY doing things with simple data types on the stack, however, it IS a safe thing to be doing.
In C++, on the Windows platform, I want to execute a set of function calls atomically so that execution doesn't switch to other threads in my process. How do I go about doing that? Any ideas, hints?
EDIT: I have a piece of code like:
someObject->Restart();
WaitForSingleObject(handle, INFINITE);
The Restart() function does its work asynchronously, so it returns quickly. When someObject has restarted, it sends me an event from another thread, where I signal the event handle I'm waiting on, and processing continues. The problem is that sometimes I receive the restart completion event and signal the handle before the code reaches WaitForSingleObject(); after that, WaitForSingleObject() never returns because the handle is not signaled again. That's why I want to execute both Restart() and WaitForSingleObject() atomically.
This is generally not possible. You can't force the OS to not switch to other threads.
What you can do is one of the following:
Use locks, mutexes, critical sections or semaphores to synchronize a handful of threads that touch the same data.
Use basic operations that are atomic, such as compare-and-exchange or atomic-add, in the form of Win32 API calls such as InterlockedIncrement() and InterlockedCompareExchange().
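As a portable illustration of the second option, std::atomic offers the same kind of operations: fetch_add plays the role of InterlockedIncrement and compare_exchange_strong the role of InterlockedCompareExchange. This is only a sketch, and the function names are made up.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Atomic add: every increment lands even though nothing locks the counter.
int count_with_fetch_add(int threads, int increments_each) {
    assert(threads >= 0 && increments_each >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, increments_each] {
            for (int i = 0; i < increments_each; ++i) counter.fetch_add(1);
        });
    }
    for (std::thread& th : pool) th.join();
    return counter.load();
}

// Compare-and-exchange: many threads try to claim ownership, exactly one succeeds.
int count_winners(int threads) {
    std::atomic<int> owner{-1};  // -1 means "unclaimed"
    std::atomic<int> winners{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&owner, &winners, t] {
            int expected = -1;
            // Succeeds only if owner is still -1; otherwise leaves owner untouched.
            if (owner.compare_exchange_strong(expected, t)) winners.fetch_add(1);
        });
    }
    for (std::thread& th : pool) th.join();
    return winners.load();
}
```

Note that these make individual operations atomic; they do not stop the OS from switching threads between two separate calls.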
You don't want all threads to wait, you just want to wait for the new thread to be done, without the risk of missing the signal. This can be done using a semaphore.
Create a semaphore known by both this code and the code eventually executed by Restart, using CreateSemaphore(NULL,0,1,NULL).
In the code you've shown, you'll still use WaitForSingleObject to wait for your semaphore. When the thread executing the restart code is done with its work, have it call ReleaseSemaphore.
If ReleaseSemaphore is called first, WaitForSingleObject will let you pass immediately. If WaitForSingleObject is called first, it will wait for ReleaseSemaphore.
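The reason this works is that a semaphore remembers the release. Here is a portable sketch of that behaviour, built from a mutex and condition variable rather than CreateSemaphore. Semaphore and restart_then_wait are hypothetical names, and unlike CreateSemaphore(NULL,0,1,NULL) this toy version has no maximum count.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal counting semaphore; a release made before the wait is not lost.
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) { assert(initial >= 0); }

    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

    // Returns true if a unit was acquired before `timeout` expired.
    bool try_acquire_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [this] { return count_ > 0; })) return false;
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Stand-in for Restart() followed by the wait. When `complete_before_wait` is
// true, the completion is forced to arrive before the waiter starts waiting,
// which is exactly the ordering that loses the signal in the question.
bool restart_then_wait(bool complete_before_wait) {
    Semaphore done(0);
    std::thread restarter([&done] { done.release(); });  // the asynchronous completion
    if (complete_before_wait) restarter.join();
    const bool ok = done.try_acquire_for(std::chrono::seconds(2));
    if (restarter.joinable()) restarter.join();
    return ok;
}
```

Both orderings end with the waiter getting through, so no atomicity between Restart() and the wait is required.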
MSDN should also help you.
A general solution to the lost-event race is a counting semaphore.
Are you using PulseEvent() to signal your handle? If so, that's the problem.
According to MSDN,
If no threads are waiting, or if no thread can be released immediately, PulseEvent simply sets the event object's state to nonsignaled and returns.
So if the handle is signaled before you wait on it, the handle is placed immediately in the nonsignaled state by PulseEvent(). That would appear to be why you are "missing" the event. To correct this, replace PulseEvent() with SetEvent().
With this scenario, though, you may need to reset the event after the wait is complete. This of course depends on whether this code is executed more than once during the lifetime of your application. Assuming your waiting thread is the only thread that is waiting on the handle, use CreateEvent() to create an auto-reset event. This will automatically reset the handle after your waiting thread is released, making it automatically available for the next time through.
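The difference can be seen in a simplified model. ToyAutoResetEvent below is a made-up class built on a mutex and condition variable; it only approximates the Win32 semantics (set latches the signal, pulse affects only threads that are already waiting), so treat it as an illustration rather than a faithful reimplementation.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Simplified model of an auto-reset event (hypothetical class).
class ToyAutoResetEvent {
public:
    // Like SetEvent: the signal is latched until one waiter consumes it.
    void set() {
        std::lock_guard<std::mutex> lock(mutex_);
        signaled_ = true;
        cv_.notify_one();
    }

    // Like PulseEvent: only a thread that is already waiting can be released.
    // With no waiters the event just stays nonsignaled.
    void pulse() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (waiters_ > 0) {
            signaled_ = true;
            cv_.notify_one();
        }
    }

    // Returns true if the event was signaled within `timeout`, and resets it.
    bool wait_for(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        ++waiters_;
        const bool ok = cv_.wait_for(lock, timeout, [this] { return signaled_; });
        --waiters_;
        if (ok) signaled_ = false;  // auto-reset
        return ok;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
    int waiters_ = 0;
};

// Signals first, waits afterwards: the ordering from the question.
bool signal_before_wait(bool use_pulse) {
    ToyAutoResetEvent event;
    if (use_pulse) event.pulse(); else event.set();
    return event.wait_for(std::chrono::milliseconds(50));
}

// After one successful wait the event is nonsignaled again.
bool second_wait_succeeds() {
    ToyAutoResetEvent event;
    event.set();
    event.wait_for(std::chrono::milliseconds(50));
    return event.wait_for(std::chrono::milliseconds(50));
}
```

In this model a set made before the wait is still seen by the waiter, a pulse made before the wait is lost, and a second wait after a single set times out because the event reset itself.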
Well, you could suspend (using SuspendThread) all other threads in the process, but I suppose you should rethink design of your program.
This is very easy to fix. Just make sure that the event is an auto-reset event (see the parameters of CreateEvent) and only ever call SetEvent on the event handle; never call ResetEvent, PulseEvent or anything else on it. Then WaitForSingleObject will always return properly. If the event has already been set, WaitForSingleObject will return immediately and reset the event.
Although I worry about your design in general (ie you are making concurrent tasks sequential, thus losing all the benefits of the hard work to make it concurrent), I think I see the simple solution.
Change your event handle to be MANUAL RESET instead of AUTORESET. (see CreateEvent).
Then you won't miss the signal.
After WaitForSingleObject(...), call ResetEvent().
EDIT:
Forget what I just said. That won't work. See comments below.