I know there is no Windows API call that does this directly, but if a thread is hung and holds an open file handle, how do we determine the thread ID and terminate that thread within our own process?
I'm not talking about releasing file locks held by other processes, only those within my own process.
It is also possible that the thread has crashed or terminated without closing the handle.
You cannot determine which thread holds an open handle to a file. Nearly all kernel handles, including file handles, are not associated with a thread but only with a process (mutexes are an exception - they have a concept of an owning thread.)
Suppose I have the following code. Which thread "owns" the file handle?
void FuncCalledOnThread1()
{
    HANDLE file = CreateFile(...);
    // Hand off to a background thread.
    PostWorkItemToOtherThread(FuncCalledOnThread2, file);
}
void FuncCalledOnThread2(HANDLE file)
{
    DoSomethingWithFile(file);
    CloseHandle(file);
}
Use Process Explorer - http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
[edit, or Handle - http://technet.microsoft.com/en-us/sysinternals/bb896655.aspx ]
You could attach to the process with the debugger when this happens, pause the program, search for the thread that caused this, and, traversing the stack, find out what it's doing, which code it executes, and which variables are on the stack.
If you used RAII for locking, this should be enough, as the lock must be on the stack.
I haven't seen anything while browsing MSDN, although there may well be something, perhaps undocumented, that can give you the information you need.
If your threads are creating these resources, and there's an expectation that one of these threads might go out to lunch, would it make more sense for them to request resources from a utility thread whose sole job is to create and dispose of resources? Such a thread would be unlikely to crash, and you would always know, in case of a crash, that the resource thread is in fact the owner of these handles.
Related
I have code that creates a thread and closes it with CloseHandle when the program finishes.
int main()
{
    ....
    HANDLE hth1;
    unsigned uiThread1ID;
    hth1 = (HANDLE)_beginthreadex( NULL, // security
                                   0, // stack size
                                   ThreadX::ThreadStaticEntryPoint,
                                   o1, // arg list
                                   CREATE_SUSPENDED, // so we can later call ResumeThread()
                                   &uiThread1ID );
    ....
    CloseHandle( hth1 );
    ....
}
But why do I need to close the handle at all? What will happen if I will not do so?
But why do I need to close handle at all?
Handles are a limited resource that occupy both kernel and userspace memory. Keeping a handle alive not only takes an integer worth of storage, but also means that the kernel has to keep the thread information (such as user time, kernel time, thread ID, exit code) around, and it cannot recycle the thread ID since you might query it using that handle.
Therefore, it is best practice to close handles when they are no longer needed.
This is what you are to do per the API contract (but of course you can break that contract).
What will happen if I will not do so?
Well, to be honest... nothing. You will leak that handle, but for one handle the impact will not be measurable. When your process exits, Windows will close the handle.
Still, in the normal case, you should close handles that are not needed any more just like you would free memory that you have allocated (even though the operating system will release it as well when your process exits).
Although it may even be considered an "optimization" not to explicitly free resources, in order to have a correct program, this should always be done. Correctness first, optimization second.
Also, the sooner you release a resource (no matter how small it is) the sooner it is available again to the system for reuse.
Here is a topic that answers your question:
Can I call CloseHandle() immediately after _beginthreadex() succeeded?
Greetings.
Yes, you need to close the handle at some point, otherwise you will be leaking a finite OS resource.
May I suggest creating an RAII wrapper for your handles? Basically you write a wrapper that stores the handle you created and calls CloseHandle in its destructor. That way you never have to worry about remembering to close it (it will automatically close when it goes out of scope) or leaking a handle if an exception happens between opening the handle and closing it.
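A minimal sketch of such a wrapper, written generically so that it compiles on any platform. The names ScopedHandle, fake_close and demo_scoped_handle are hypothetical and not part of any Windows API; on Windows you would presumably instantiate the template with HANDLE and a closer that calls CloseHandle.

```cpp
// Generic RAII owner for a handle-like value. The closer is whatever
// function releases the handle (CloseHandle on Windows, for example).
template <typename Handle, typename Closer>
class ScopedHandle {
public:
    ScopedHandle(Handle h, Handle invalid, Closer closer)
        : handle_(h), invalid_(invalid), closer_(closer) {}

    // Runs on normal scope exit and when an exception unwinds the stack.
    ~ScopedHandle() { reset(); }

    // Non-copyable: exactly one owner is responsible for closing.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    Handle get() const { return handle_; }

    // Close early if the handle is still valid; safe to call twice.
    void reset() {
        if (handle_ != invalid_) {
            closer_(handle_);
            handle_ = invalid_;
        }
    }

private:
    Handle handle_;
    Handle invalid_;
    Closer closer_;
};

// Hypothetical closer that only counts calls, standing in for CloseHandle.
static int g_close_calls = 0;
static void fake_close(int) { ++g_close_calls; }

// Lets a wrapper go out of scope and reports how often the closer ran.
static int demo_scoped_handle() {
    g_close_calls = 0;
    {
        ScopedHandle<int, void (*)(int)> h(42, -1, fake_close);
        (void)h.get();
    }  // destructor closes the handle here
    return g_close_calls;
}
```

Deleting the copy operations is a deliberate choice here: if two wrappers owned the same handle, it would be closed twice.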
If you don't close the handle, then it will remain open until your process terminates. Depending on what's behind the handle, this can be bad. Resources are often associated with handles and these won't be cleaned up until your program terminates; if you only use a few handles and these happen to be 'lightweight' then it doesn't really matter. Other handles, such as file handles, have other side effects when kept open, for instance locking an opened file until your process exits. This can be very annoying to the user or to other applications.
In general, it's best to clean-up all handles, but at the end of the process, all handles are closed by Windows.
Assume that I have code something like this:
void *my_thread(void *data)
{
    while (1) { }
}
void foo_init(struct my_resource *res)
{
    pthread_create(&res->tid, NULL, my_thread, res);
    /* Some init code */
}
void foo_exit(void)
{
    /* Some exit code */
}
The scenario is this. When the process is initialized, the function foo_init() is called with a pointer to my allocated resources (the allocation is done automatically by some other function, which isn't under my control). Within the function I create a pthread, which runs in an infinite loop.
After a while when the process is about to terminate, the function foo_exit() is called, but this time without the pointer to my resources, and hence I am unable to call pthread_join(), as my tid is contained within my_resource structure.
Now my question is: are the resources pertaining to the pthreads destroyed by the OS upon termination of the process or not? If yes, how can I make sure of that?
Also, is it safe to terminate the process without calling pthread_join()?
Thanks in advance.
If you're talking about allocated memory, yes. When a process exits all virtual memory pages allocated to that process are returned to the system, which will clean up all memory allocated within your process.
Generally the OS is supposed to clean up all resources associated with a process on exit. It will handle closing file handles (which can include sockets and RPC mechanisms), wiping away the stack, and cleaning up kernel resources for the task.
Short answer: if the OS doesn't clean up after a process, it is a bug in the OS. But none of us write buggy software, right?
All "regular" resources needed by a process are released automatically by the OS when the process terminates (e.g. memory, sockets, file handles). The most important exception is shared memory but also other resources can be problematic if they're managed not by OS but by other processes.
For example if your process talks to a daemon or to another process like a window manager and allocates resources, whether or not those are released in case the process terminates without releasing them depends on the implementation.
I think the question can be answered another way: pthreads do not own any resources, resources are owned by the process. (A pthread may be the "custodian" of resources, such as memory it has malloc'ed, but it is not the owner.) When the process terminates, any still running pthreads suddenly stop and then the usual process clean-up happens.
POSIX says (for _Exit()):
• Threads terminated by a call to _Exit() or _exit() shall not invoke their cancellation cleanup handlers or per-thread data destructors.
For exit() POSIX specifies a little more clean-up -- in particular running all atexit() things and flushing streams and such -- before proceeding as if by _Exit(). Note that this does not invoke any pthread cancellation cleanup for any pthread -- the system cannot tell what state any pthread is in, and cannot be sure of being able to pthread_cancel() all pthreads, so does the only thing it can do, which is to stop them all dead.
I can recommend the Single UNIX® Specification (POSIX) -- like any standard, it's not an easy read, but worth getting to know.
I have thread A that creates another thread B, then thread A uses WaitForSingleObject to wait until thread B dies.
The problem is that even though thread B returns from the thread's "thread_func", thread A does not get signaled!.
I know that because I added traces (OutputDebugString) to the end of the thread_func (thread B's main function) and I can see that thread B finishes its execution, but thread A never comes out of the WaitForSingleObject.
Now, I must also add that this code is in a COM object, and the scenario described above happens when I'm calling regsvr32.exe (it gets stuck!), so I believe that thread A is coming from DllMain.
Any ideas why thread A does not get signaled ?!?!
You could be hitting a problem with the loader lock. Windows has an internal critical section that gets locked whenever a DLL is loaded/unloaded or when a thread is started/stopped (DllMain is always called inside that lock). If your waiting thread A has that critical section locked (i.e. you are waiting somewhere from DllMain), while another thread B tries to shut down and attempts to acquire that loader critical section, you have your deadlock.
To see where the deadlock happens, just run your app from the VS IDE debugger and after it gets stuck, break execution. Then look at all running threads, and note the stack of each one. You should be able to follow each stack and see what each thread is waiting on.
I think @DXM is right. The documentation on exactly what you can or can't do inside of DllMain is sparse and hard to find, but the bottom line is that you should generally keep it to a bare minimum -- initialize internal variables and such, but that's about it.
The other point I'd make is that you generally should not "call" regsvr32.exe -- ever.
RegSvr32 is basically just a wrapper that loads a DLL into its address space with LoadLibrary, calls GetProcAddress to get the address of a function named DllRegisterServer, then calls that function. It's much cleaner (and ultimately, easier) to do the job on your own, something like this:
HMODULE mod = LoadLibrary(your_COM_file);
// DllRegisterServer takes no arguments and returns an HRESULT.
typedef HRESULT (STDAPICALLTYPE *register_fn)(void);
register_fn register_DLL = (register_fn)GetProcAddress(mod, "DllRegisterServer");
if (register_DLL == NULL) {
    // must not really be a COM object...
}
else if (S_OK != register_DLL()) {
    // registration failed.
}
I have a TCP Server application that serves each client in a new thread using POSIX Threads and C++.
The server calls "listen" on its socket and when a client connects, it makes a new object of class Client. The new object runs in its own thread and processes the client's requests.
When a client disconnects, I want some way to tell my main() thread that this thread is done, so that main() can delete this object and log something like "Client disconnected".
My question is, how do I tell the main thread that a thread is done?
The most straightforward way that I can see, is to join the threads. See here. The idea is that on a join call, a command thread will then wait until worker threads exit, and then resume.
Alternatively, you could roll something up with some shared variables and mutexes.
If the child thread is really exiting when it is done (rather than waiting for more work), the parent thread can call pthread_join on it which will block until the child thread exits.
Obviously, if the parent thread is doing other things, it can't constantly be blocking on pthread_join, so you need a way to send a message to the main thread to tell it to call pthread_join. There are a number of IPC mechanisms that you could use for this, but in your particular case (a TCP server), I suspect the main thread is probably a select loop, right? If that's the case, I would recommend using pipe to create a logical pipe, and have the read descriptor for the pipe be one of the descriptors that the main thread selects from.
When a child thread is done, it would then write some sort of message to the pipe saying "I'm Done!" and then the server would know to call pthread_join on that thread and then do whatever else it needs to do when a connection finishes.
Note that the return value is not the only reason to call pthread_join on a finished child thread: a joinable thread that is never joined (or detached) keeps its resources allocated after it exits, so join every thread you do not detach. Joining is also a good idea if the child thread has any access to shared resources, since when pthread_join returns without error, it assures you that the child thread is really gone and not in some intermediate state between having sent the "I'm Done!" message and actually having exited.
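The pipe-plus-select idea could be sketched roughly as follows. The Worker struct, worker_main and run_and_reap are hypothetical names, the limit of 16 workers is arbitrary, and a real server would also add its listening and client sockets to the descriptor set.

```cpp
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical per-worker record.
struct Worker {
    pthread_t tid;
    int notify_fd;   // write end of the pipe shared with the main thread
    int index;       // slot of this worker in the table
};

// Worker body: the client handling is omitted. On the way out it
// reports completion by writing one byte holding its slot index.
static void* worker_main(void* arg) {
    Worker* w = static_cast<Worker*>(arg);
    unsigned char id = static_cast<unsigned char>(w->index);
    ssize_t n = write(w->notify_fd, &id, 1);
    (void)n;
    return nullptr;
}

// Starts up to 16 workers, then waits in select() and joins each one
// as its completion byte arrives. Returns the number of workers joined.
static int run_and_reap(int count) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    if (count > 16) count = 16;
    Worker workers[16];
    int started = 0;
    for (int i = 0; i < count; ++i) {
        workers[i].notify_fd = fds[1];
        workers[i].index = i;
        if (pthread_create(&workers[i].tid, nullptr, worker_main,
                           &workers[i]) != 0) break;
        ++started;
    }
    int joined = 0;
    while (joined < started) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fds[0], &readable);   // a real server adds its sockets too
        if (select(fds[0] + 1, &readable, nullptr, nullptr, nullptr) <= 0) break;
        unsigned char id;
        if (read(fds[0], &id, 1) != 1) break;
        pthread_join(workers[id].tid, nullptr);   // returns almost at once
        ++joined;
    }
    close(fds[0]);
    close(fds[1]);
    return joined;
}
```

Because the byte is written just before the thread returns, the join that follows should block only briefly, so the select loop stays responsive.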
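pthread functions return 0 if everything went okay, or an error number if something didn't work (they return the code directly rather than setting errno).
int ret, joined;
ret = pthread_create(&thread, NULL, connect, (void*) args);
joined = pthread_join(thread, NULL); // takes the pthread_t itself, not a pointer; blocks until the thread exits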
If joined is zero, the thread is done. Clean up that thread's object.
While it is possible to implement IPC mechanisms to notify a main thread when other threads are about to terminate, if you want to do something when a thread terminates you should try to let the terminating thread do it itself.
You might look into using pthread_cleanup_push() to establish a routine to be called when the thread is cancelled or exits. Another option might be to use pthread_key_create() to create a thread-specific data key and associated destructor function.
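A minimal sketch of the pthread_key_create() variant, assuming a hypothetical ClientState struct as the per-thread data. The destructor registered with the key runs in the exiting thread itself, so the main thread does not have to take part in the cleanup.

```cpp
#include <pthread.h>

static pthread_key_t g_client_key;
static int g_cleanups = 0;
static pthread_mutex_t g_cleanups_lock = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical per-client state owned by the worker thread.
struct ClientState { int socket_fd; };

// Called in the exiting thread because the key's value is non-NULL.
static void destroy_client(void* p) {
    delete static_cast<ClientState*>(p);
    pthread_mutex_lock(&g_cleanups_lock);
    ++g_cleanups;
    pthread_mutex_unlock(&g_cleanups_lock);
}

static void* client_thread(void*) {
    // Attach the state to this thread; from here on the destructor
    // is responsible for freeing it.
    pthread_setspecific(g_client_key, new ClientState{-1});
    // ... serve the client ...
    return nullptr;   // destroy_client runs after this return
}

// Runs one client thread and reports how many cleanups happened.
static int demo_key_destructor() {
    g_cleanups = 0;
    if (pthread_key_create(&g_client_key, destroy_client) != 0) return -1;
    pthread_t t;
    if (pthread_create(&t, nullptr, client_thread, nullptr) != 0) return -1;
    pthread_join(t, nullptr);   // joined here only so the demo can observe the result
    pthread_key_delete(g_client_key);
    return g_cleanups;
}
```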
If you don't want to call pthread_join() from the main thread due to blocking, you should detach the client threads by either setting it as option when creating the thread or calling pthread_detach().
You could use a queue of "thread objects to be deleted", protect access to the queue with a mutex, and then signal a pthread condition variable to indicate that something was available on the queue.
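Such a queue might look roughly like this. The Client struct and the function names are hypothetical, and the main thread here does nothing but collect finished clients, which a real server would interleave with accepting connections.

```cpp
#include <pthread.h>
#include <deque>

// Hypothetical client object; the main thread deletes it after joining.
struct Client {
    pthread_t tid;
    int id;
};

static std::deque<Client*> g_finished;
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t g_ready = PTHREAD_COND_INITIALIZER;

static void* client_main(void* arg) {
    Client* c = static_cast<Client*>(arg);
    // ... serve the client ...
    pthread_mutex_lock(&g_lock);
    g_finished.push_back(c);        // hand the object back for deletion
    pthread_cond_signal(&g_ready);
    pthread_mutex_unlock(&g_lock);
    return nullptr;
}

// Starts `count` clients and collects each one as it finishes.
// Returns the number of clients joined and deleted.
static int serve_and_collect(int count) {
    int started = 0;
    for (int i = 0; i < count; ++i) {
        Client* c = new Client;
        c->id = i;
        if (pthread_create(&c->tid, nullptr, client_main, c) != 0) {
            delete c;
            continue;
        }
        ++started;
    }
    int collected = 0;
    while (collected < started) {
        pthread_mutex_lock(&g_lock);
        while (g_finished.empty())   // loop guards against spurious wakeups
            pthread_cond_wait(&g_ready, &g_lock);
        Client* c = g_finished.front();
        g_finished.pop_front();
        pthread_mutex_unlock(&g_lock);
        pthread_join(c->tid, nullptr);   // thread is at its last statement
        delete c;                        // "Client disconnected" would be logged here
        ++collected;
    }
    return collected;
}
```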
But do you really want to do that? A better model is for each thread to just clean up after itself, and not worry about synchronizing with the main thread in the first place.
Calling pthread_join will block execution of the main thread. Given the description of the problem I don't think it will provide the desired solution.
My preferred solution, in most cases, would be to have the thread perform its own cleanup. If that isn't possible you'll either have to use some kind of polling scheme with shared variables (just remember to make them thread safe, hint: protect them with a mutex or use atomics; volatile alone is not enough), or perhaps some sort of OS-dependent callback mechanism. Remember, you want to be blocked on the call to accept, so really consider having the thread clean itself up.
As others have mentioned, it's easy to handle termination of a given thread with pthread_join. But a weak spot of pthreads is funneling information from several sources into a synchronous stream. (Alternately, you could say its strong spot is performance.)
By far the easiest solution for you would be to handle cleanup in the worker thread. Log the disconnection (add a mutex to the log), delete resources as appropriate, and exit the worker thread without signaling the parent.
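As a rough sketch of that approach: each worker is created detached, logs its own disconnection under a mutex, and deletes its own Client object. The in-memory log, the Client struct and the polling demo function are hypothetical and exist only to make the example observable.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <string>
#include <vector>

static std::vector<std::string> g_log;
static pthread_mutex_t g_log_lock = PTHREAD_MUTEX_INITIALIZER;

// All log access goes through the mutex.
static void log_line(const std::string& s) {
    pthread_mutex_lock(&g_log_lock);
    g_log.push_back(s);
    pthread_mutex_unlock(&g_log_lock);
}

static int log_size() {
    pthread_mutex_lock(&g_log_lock);
    int n = static_cast<int>(g_log.size());
    pthread_mutex_unlock(&g_log_lock);
    return n;
}

// Hypothetical client object, owned entirely by its worker thread.
struct Client { int id; };

static void* client_main(void* arg) {
    Client* c = static_cast<Client*>(arg);
    // ... serve the client ...
    log_line("Client " + std::to_string(c->id) + " disconnected");
    delete c;          // the worker cleans up after itself
    return nullptr;    // detached: no join needed, resources are reclaimed
}

// Starts a detached worker; returns false if the thread could not start.
static bool start_client(int id) {
    Client* c = new Client{id};
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_t tid;
    int rc = pthread_create(&tid, &attr, client_main, c);
    pthread_attr_destroy(&attr);
    if (rc != 0) { delete c; return false; }
    return true;
}

// Demo only: starts `count` clients and polls the log for up to about
// one second. A real server would simply go back to accepting connections.
static int demo_detached_clients(int count) {
    int started = 0;
    for (int i = 0; i < count; ++i)
        if (start_client(i)) ++started;
    for (int tries = 0; tries < 1000 && log_size() < started; ++tries)
        usleep(1000);
    return log_size();
}
```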
Adding mutexes to allow manipulation of shared resources is a tough problem, so be flexible and creative. Always err on the side of caution when synchronizing, and profile before optimizing.
I had exactly the same problem as you described. After ~300 opened client connections my Linux application was not able to create a new thread because pthread_join was never called. For me, using pthread_tryjoin_np helped.
Briefly:
keep a map that holds all opened thread descriptors
from the main thread, before a new client thread is opened, I iterate through the map and call pthread_tryjoin_np for each thread recorded in it. If the thread is done, the call returns zero, meaning that I can clean up resources from that thread. At the same time pthread_tryjoin_np takes care of releasing the thread's resources. If pthread_tryjoin_np returns a nonzero value, the thread is still running and I simply do nothing.
A potential problem is that pthread_tryjoin_np is not part of the official POSIX standard (it is a GNU extension), so this solution might not be portable.
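The reaping pass described above could be sketched like this on Linux with glibc. The map keyed by an integer client id, the function names and the polling demo are assumptions made for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // pthread_tryjoin_np is a GNU extension
#endif
#include <pthread.h>
#include <unistd.h>
#include <map>

// Stand-in for a client thread that finishes right away.
static void* short_lived_client(void*) { return nullptr; }

// Tries to join every recorded thread without blocking and erases the
// ones that have finished. Returns the number of threads reaped.
static int reap_finished(std::map<int, pthread_t>& threads) {
    int reaped = 0;
    for (auto it = threads.begin(); it != threads.end();) {
        if (pthread_tryjoin_np(it->second, nullptr) == 0) {
            it = threads.erase(it);   // joined: thread resources released
            ++reaped;
        } else {
            ++it;                     // nonzero (EBUSY): still running
        }
    }
    return reaped;
}

// Demo only: starts `count` threads, then calls reap_finished() repeatedly
// (a server would do this before accepting each new client instead).
static int demo_reap(int count) {
    std::map<int, pthread_t> threads;
    for (int i = 0; i < count; ++i) {
        pthread_t t;
        if (pthread_create(&t, nullptr, short_lived_client, nullptr) == 0)
            threads[i] = t;
    }
    int total = 0;
    for (int tries = 0; tries < 1000 && !threads.empty(); ++tries) {
        total += reap_finished(threads);
        if (!threads.empty()) usleep(1000);
    }
    return total;
}
```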
How can I wait for a detached thread to finish in C++?
I don't care about an exit status, I just want to know whether or not the thread has finished.
I'm trying to provide a synchronous wrapper around an asynchronous third-party tool. The problem is a weird race condition crash involving a callback. The progression is:
1. I call the thirdparty, and register a callback
2. when the thirdparty finishes, it notifies me using the callback -- in a detached thread I have no real control over.
3. I want the thread from (1) to wait until (2) is called.
I want to wrap this in a mechanism that provides a blocking call. So far, I have:
class Wait {
public:
    void callback() {
        pthread_mutex_lock(&m_mutex);
        m_done = true;
        pthread_cond_broadcast(&m_cond);
        pthread_mutex_unlock(&m_mutex);
    }
    void wait() {
        pthread_mutex_lock(&m_mutex);
        while (!m_done) {
            pthread_cond_wait(&m_cond, &m_mutex);
        }
        pthread_mutex_unlock(&m_mutex);
    }
private:
    pthread_mutex_t m_mutex;
    pthread_cond_t m_cond;
    bool m_done;
};
// elsewhere...
Wait waiter;
thirdparty_utility(&waiter);
waiter.wait();
As far as I can tell, this should work, and it usually does, but sometimes it crashes. As far as I can determine from the corefile, my guess as to the problem is this:
When the callback broadcasts the end of m_done, the wait thread wakes up
The wait thread is now done here, and Wait is destroyed. All of Wait's members are destroyed, including the mutex and cond.
The callback thread tries to continue from the broadcast point, but is now using memory that's been released, which results in memory corruption.
When the callback thread tries to return (above the level of my poor callback method), the program crashes (usually with a SIGSEGV, but I've seen SIGILL a couple of times).
I've tried a lot of different mechanisms to try to fix this, but none of them solve the problem. I still see occasional crashes.
EDIT: More details:
This is part of a massively multithreaded application, so creating a static Wait isn't practical.
I ran a test, creating Wait on the heap, and deliberately leaking the memory (i.e. the Wait objects are never deallocated), and that resulted in no crashes. So I'm sure it's a problem of Wait being deallocated too soon.
I've also tried a test with a sleep(5) after the unlock in wait, and that also produced no crashes. I hate to rely on a kludge like that though.
EDIT: ThirdParty details:
I didn't think this was relevant at first, but the more I think about it, the more I think it's the real problem:
The thirdparty stuff I mentioned, and why I have no control over the thread: this is using CORBA.
So, it's possible that CORBA is holding onto a reference to my object longer than intended.
Yes, I believe that what you're describing is happening (race condition on deallocate). One quick way to fix this is to create a static instance of Wait, one that won't get destroyed. This will work as long as you don't need to have more than one waiter at the same time.
You will also permanently use that memory, it will not deallocate. But it doesn't look like that's too bad.
The main issue is that it's hard to coordinate lifetimes of your thread communication constructs between threads: you will always need at least one leftover communication construct to communicate when it is safe to destroy (at least in languages without garbage collection, like C++).
EDIT:
See comments for some ideas about refcounting with a global mutex.
To the best of my knowledge there's no portable way to directly ask a thread if it's done running (i.e. no pthread_ function). What you are doing is the right way to do it, at least as far as having a condition that you signal. If you are seeing crashes that you are sure are due to the Wait object being deallocated when the thread that creates it quits (and not some other subtle locking issue -- all too common), then you need to make sure the Wait isn't deallocated while the notifying thread may still touch it, by managing its lifetime from a thread other than the one that does the notification. Put it in global memory or dynamically allocate it and share it with that thread. Most simply, don't have the thread being waited on own the memory for the Wait; have the thread doing the waiting own it.
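One way to share a dynamically allocated waiter is reference counting, sketched here with std::shared_ptr and the standard library primitives rather than raw pthreads. This assumes the callback side can hold its own reference; whether the CORBA-based third party allows that is not known from the question. SharedWait and fake_thirdparty are hypothetical names.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

class SharedWait {
public:
    void callback() {
        std::lock_guard<std::mutex> guard(mutex_);
        done_ = true;
        cond_.notify_all();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return done_; });
    }
    bool done() {
        std::lock_guard<std::mutex> guard(mutex_);
        return done_;
    }
private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool done_ = false;
};

// Hypothetical stand-in for the third-party tool: it invokes the
// callback from a detached thread that the caller does not control.
static void fake_thirdparty(std::shared_ptr<SharedWait> w) {
    std::thread([w] { w->callback(); }).detach();
}

// Blocking wrapper. The waiter is destroyed only when the last
// reference goes away, so whichever thread finishes last (the waiting
// thread or the callback thread) ends up releasing it.
static bool blocking_call() {
    auto waiter = std::make_shared<SharedWait>();
    fake_thirdparty(waiter);
    waiter->wait();
    return waiter->done();
}
```

The point of the design is that the waiting thread returning early no longer frees the mutex and condition variable out from under the callback thread, since the callback thread's copy of the pointer keeps them alive.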
Are you initializing and destroying the mutex and condition var properly?
Wait::Wait()
{
    pthread_mutex_init(&m_mutex, NULL);
    pthread_cond_init(&m_cond, NULL);
    m_done = false;
}
Wait::~Wait()
{
    assert(m_done);
    pthread_mutex_destroy(&m_mutex);
    pthread_cond_destroy(&m_cond);
}
Make sure that you aren't prematurely destroying the Wait object -- if it gets destroyed in one thread while the other thread still needs it, you'll get a race condition that will likely result in a segfault. I'd recommend making it a global static variable that gets constructed on program initialization (before main()) and gets destroyed on program exit.
If your assumption is correct then third party module appears to be buggy and you need to come up with some kind of hack to make your application work.
A static Wait is not feasible. How about a pool of Wait objects (it could even grow on demand)? Is your application using a thread pool to run?
There will still be a chance that the same Wait is reused while the third-party module is still using it, but you can minimize that chance by properly queuing vacant Waits in your pool.
Disclaimer: I am in no way an expert in thread safety, so consider this post as a suggestion from a layman.