I have a third-party library that exposes a function named int foo(). The function is thread based and I cannot change its contents. (It does not belong to me.)
When I call the function, it sometimes locks up and never returns a value. Is there any way to kill this thread-based function when it is locked? For example, when the function doesn't return a value within 5 seconds, I want to kill it without any memory leak.
Since it's a third-party library that you have no control over, you cannot portably terminate the thread that runs that code. You can ask for the thread's native_handle and use the platform's thread-termination facilities, but you will most likely introduce leaks.
Note that threads live in the same address space, so corruption or a leak caused by one thread affects your entire program.
The option I can think of is to spawn a new process to run that code; if it doesn't complete within 5 seconds, you can ask the OS to kill it. The OS then frees the process's memory and resources, so nothing leaks into your program. This is your best choice.
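The process-based approach can be sketched as follows. This is a minimal sketch that assumes a POSIX system (fork, pipe, waitpid, kill); the helper name run_in_child and the polling interval are made up for illustration, and the int result is passed back through a pipe.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs fn() in a forked child process.
// Returns true and stores fn()'s value in `result` if the child finished in
// time; returns false if it was killed after `timeout` (or on any error).
bool run_in_child(int (*fn)(), std::chrono::milliseconds timeout, int& result) {
    assert(fn != nullptr);
    int fds[2];
    if (pipe(fds) != 0) return false;
    const pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return false; }
    if (pid == 0) {                       // child: call fn and report its value
        close(fds[0]);
        const int value = fn();
        const ssize_t written = write(fds[1], &value, sizeof value);
        _exit(written == static_cast<ssize_t>(sizeof value) ? 0 : 1);
    }
    close(fds[1]);                        // parent keeps only the read end
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    for (;;) {
        const pid_t w = waitpid(pid, &status, WNOHANG);
        if (w == pid) break;              // child has exited
        if (w < 0 || std::chrono::steady_clock::now() >= deadline) {
            kill(pid, SIGKILL);           // the OS reclaims the child's resources
            waitpid(pid, &status, 0);     // reap it so no zombie is left behind
            close(fds[0]);
            return false;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    const bool ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        read(fds[0], &result, sizeof result) == static_cast<ssize_t>(sizeof result);
    close(fds[0]);
    return ok;
}
```

A call such as run_in_child(&foo, std::chrono::milliseconds(5000), value) would then give foo() five seconds. Forking from a program that already has several threads needs extra care, so this is best done early or from a single-threaded launcher.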
A possible solution, as suggested by StoryTeller, is to call foo() in a different thread, which you control. When the timeout happens, you leave the thread running in the background. This means foo() continues executing, but your program can continue too. This method is portable, so you don't need to write any operating-system-dependent code.
Leaving foo() running can have unwanted side effects, and foo() will continue to use resources in the background, so you have to test whether this works in your situation.
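#include <boost/atomic.hpp>
#include <boost/chrono.hpp>
#include <boost/make_shared.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <ctime>
#include <stdexcept>
int foo(); // the third-party function
// Shared between caller and worker. The shared_ptr keeps it alive so the
// worker can still write to it after the caller has timed out and returned.
struct FooState {
    boost::atomic<bool> hasResult;
    int result;
    FooState() : hasResult(false), result(0) {}
};
void FooWrapper(boost::shared_ptr<FooState> state){
    state->result = foo();
    state->hasResult.store(true); // published only after result is written
}
void AnotherFunction(){
    boost::shared_ptr<FooState> state = boost::make_shared<FooState>();
    boost::thread worker(&FooWrapper, state);
    worker.detach(); // on timeout, foo() keeps running in the background
    // Wait until result, or until timeout
    std::time_t startTime = std::time(0);
    while(!state->hasResult.load() && std::time(0) < startTime + 5){
        boost::this_thread::sleep_for(boost::chrono::milliseconds(10));
    }
    if(!state->hasResult.load()){
        throw std::runtime_error("timeout");
    }
    // Use state->result
}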
I am using boost thread here, but you can convert it to use any thread library you want.
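As an illustration of converting the wrapper to another thread library, here is a minimal sketch using only the C++11 standard library. The name call_with_timeout is hypothetical; the idea (run the call on a detached thread and wait on a future with a deadline) is the same as above, and the abandoned call keeps running after a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical helper: runs fn on a detached thread and waits at most
// `timeout` for its result. Throws std::runtime_error on timeout.
int call_with_timeout(std::function<int()> fn, std::chrono::milliseconds timeout) {
    assert(fn);
    std::packaged_task<int()> task(std::move(fn));
    std::future<int> result = task.get_future();
    // The shared state is owned jointly by the task and the future, so it
    // stays valid even if this function returns before fn finishes.
    std::thread(std::move(task)).detach();
    if (result.wait_for(timeout) != std::future_status::ready) {
        throw std::runtime_error("timeout");
    }
    return result.get();
}
```

Usage would look like call_with_timeout(&foo, std::chrono::seconds(5)). A future obtained from a packaged_task does not block in its destructor, which is why this sketch avoids std::async.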
No, there is no way to do so without memory leaks, because the thread running foo may allocate heap data, or might put some private data inside values owned by your main program or some other thread.
Notice that data liveness (and virtual address space) is a whole-program property: some heap data does not belong to a (particular) thread, but to the whole process. The library could (and probably should) use smart pointers as a convention.
Related
In Microsoft Visual C++ I can call CreateThread() to create a thread by starting a function with one void * parameter. I pass a pointer to a struct as that parameter, and I see a lot of other people do that as well.
My question is if I am passing a pointer to my struct how do I know if the structure members have been actually written to memory before CreateThread() was called? Is there any guarantee they won't be just cached? For example:
struct bigapple { string color; int count; } apple;
apple.count = 1;
apple.color = "red";
hThread = CreateThread( NULL, 0, myfunction, &apple, 0, NULL );
DWORD WINAPI myfunction( void *param )
{
struct bigapple *myapple = (struct bigapple *)param;
// how do I know that apple's struct was actually written to memory before CreateThread?
cout << "Apple count: " << myapple->count << endl;
return 0;
}
This afternoon while I was reading I saw a lot of Windows code on this website and others that passes in data that is not volatile to a thread, and there doesn't seem to be any memory barrier or anything else. I know C++ or at least older revisions are not "thread aware" so I'm wondering if maybe there's some other reason. My guess would be the compiler sees that I've passed a pointer &apple in a call to CreateThread() so it knows to write out members of apple before the call.
Thanks
No. The relevant Win32 thread functions all take care of the necessary memory barriers. All writes prior to CreateThread are visible to the new thread. Obviously the reads in that newly created thread cannot be reordered before the call to CreateThread.
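Standard C++ gives the same kind of guarantee for std::thread: the completion of the std::thread constructor synchronizes with the start of the thread function. The sketch below uses std::thread as a portable stand-in for CreateThread (it is not the Win32 API itself), and the function name read_from_new_thread is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <thread>

struct BigApple { std::string color; int count; };

// Fills in a struct, starts a thread on it, and returns what the thread read.
BigApple read_from_new_thread(const std::string& color, int count) {
    BigApple apple;
    apple.color = color;   // plain, non-volatile writes made before the thread exists
    apple.count = count;
    BigApple seen;
    std::thread reader([&apple, &seen] { seen = apple; });
    reader.join();         // join() makes the thread's write to `seen` visible here
    // The new thread is guaranteed to have observed the writes above.
    assert(seen.count == apple.count && seen.color == apple.color);
    return seen;
}
```

No volatile and no explicit barrier is needed; thread creation and join provide the ordering.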
volatile would not add any extra useful constraints on the compiler, and would merely slow down the code. In practice this wouldn't be noticeable compared to the cost of creating a new thread, though.
No, it should not be volatile. At the same time, you are pointing at a valid issue. The detailed operation of the cache is described in the architecture manuals from Intel, ARM, etc.
Nevertheless, you can safely assume that the data WILL BE WRITTEN. Otherwise too many things would break; several decades of experience show that this holds.
If the scheduler starts the new thread on the same core, the state of the cache is already fine; if it starts on another core, the system makes sure the writes are visible there before the thread runs. Otherwise, nothing would work.
Never use volatile for interaction between threads. It is an instruction on how to handle data inside the thread only (use a register copy or always reread, etc).
First, I think the optimizer cannot change the order at the expense of correctness. CreateThread() is a function, and parameter binding for function calls happens before the call is made.
Secondly, volatile is not very helpful for the purpose you intend. Check out this article.
You're struggling with a non-problem, and are creating at least two others...
Don't worry about the parameters given to CreateThread: if they exist at the time the thread is created, they exist until CreateThread returns. And since the thread that creates them does not destroy them, they are also available to the other thread.
The problem now becomes who destroys them, and when: you create them with new, so they will exist until delete is called (or until the process terminates: a nice memory leak!).
The process terminates when its main thread terminates (and all other threads will then be terminated by the OS as well!). And there is nothing in your main that makes it wait for the other thread to complete.
Beware when using a low-level API like CreateThread from languages that have their own library that also interfaces with threads. The C runtime has _beginthreadex. It calls CreateThread and also performs other initialization tasks for the C/C++ runtime library that you would otherwise miss. Some C (and C++) library functions may not work properly without that initialization, which is also required to properly free the runtime resources at termination. Using CreateThread is like using malloc in a context where delete is then used to clean up.
The proper main thread behavior should be
// create the data
// create the other thread
// perform other tasks
// wait for the other thread to terminate
// destroy the data
What the Win32 API documentation doesn't make obvious is that a thread HANDLE is waitable, and becomes signaled when the thread terminates.
To wait for the other thread's termination, your main thread just has to call
WaitForSingleObject(hthread,INFINITE);
So the main thread will be more properly:
{
data* pdata = new data;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, pdata,0,0);
WaitForSingleObject(hthread,INFINITE);
CloseHandle(hthread); // handles returned by _beginthreadex must be closed by the caller
delete pdata;
}
or even
{
data d;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, &d,0,0);
WaitForSingleObject(hthread,INFINITE);
CloseHandle(hthread); // handles returned by _beginthreadex must be closed by the caller
}
I think the question is valid in another context.
As others have pointed out, using a struct and its contents is safe (although access to the data should be synchronized).
However, I think the question is valid if you have an atomic variable (or a pointer to one) that can be changed outside the thread. In my opinion, volatile should be used in that case.
Edit:
I think the examples on the wiki page are a good explanation http://en.wikipedia.org/wiki/Volatile_variable
I have a third-party function which I use in my program. I can't replace it; it's in a dynamic library, so I also can't edit it. The problem is that it sometimes runs for too long.
So, can I do anything to stop this function from running if it runs more than 10 seconds for example? (It's OK to close program in this scenario.)
PS. I have Linux, and this program won't have to be ported anywhere else.
What I want is something like this:
#include <stdio.h>
#include <stdlib.h>
void func1 (void) // I can not change contents of this.
{
int i; // random
while (i % 2 == 0);
}
int main ()
{
setTryTime(10000);
timeTry{
func1();
} catchTime {
puts("function executed too long, aborting..");
}
return 0;
}
Sure. And you'd do it just the way you suggested in your title: "signals".
Specifically, an "alarm" signal:
http://linux.die.net/man/2/alarm
http://beej.us/guide/bgipc/output/html/multipage/signals.html
If you really have to do this, you probably want to spawn a process that does nothing but invoke the function and return its result to the caller. If it runs too long, you can kill that process.
By putting it into its own process, you stand a decent (not great, but decent) chance of cleaning up at least most of what it was doing, so when it dies unexpectedly it probably won't make a complete mess of things that leads to later problems.
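The two suggestions above (an alarm signal and a separate process) can be combined. The following is a minimal sketch for Linux/POSIX; the helper name timed_out_under_alarm is made up for illustration. The child arms alarm() before calling the function, and because the default action of SIGALRM is to terminate the process, the parent only has to inspect how the child ended.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs fn() in a forked child that arms alarm(seconds).
// Returns true if the child was terminated by SIGALRM, i.e. fn() ran too long.
bool timed_out_under_alarm(void (*fn)(), unsigned seconds) {
    assert(fn != nullptr);
    assert(seconds > 0);              // alarm(0) would cancel the timer instead
    const pid_t pid = fork();
    if (pid < 0) return false;        // fork failed; reported as "no timeout" in this sketch
    if (pid == 0) {
        signal(SIGALRM, SIG_DFL);     // make sure the default (terminate) action applies
        alarm(seconds);               // the kernel delivers SIGALRM after `seconds`
        fn();
        _exit(0);                     // finished in time
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGALRM;
}
```

With the func1 from the question, timed_out_under_alarm(func1, 10) would return true once the ten seconds have elapsed, and the parent can then print its "function executed too long" message. Calling alarm() directly in the main program also works when, as the question allows, it is acceptable for the whole program to be terminated.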
The potential problem with forcefully cancelling a running function is that it may "own" resources that it intended to return later. The kind of resources that can be problems include:
heap memory allocations (free store)
shared memory segments
threads
sockets
file handles
locks
Some of these resources are managed on a per-process basis, so letting the function run in a different process (perhaps using fork) makes it easier to kill cleanly. Other resources can outlive a process, and really must be cleaned up explicitly. Depending on your operating system, it's also possible that the function may be part-way through interacting with some hardware driver or device, and killing it unexpectedly may leave that driver or device in a bizarre state such that it won't work until after a restart.
If you happen to know that the function doesn't use any of these kind of resources, then you can kill it confidently. But, it's hard to guarantee that: in a large system with many such decisions - which the compiler can't check - evolution of code in functions like func1() is likely to introduce dependencies on such resources.
If you must do this, I'd suggest running it in a different process or thread, and using kill() for processes, pthread_kill if func1() has some support for terminating when a flag is set asynchronously, or the non-portable pthread_cancel if there's really no other choice.
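For the last-resort pthread_cancel route, here is a minimal sketch assuming Linux with glibc. The names func1_standin, cancellable_func1 and cancel_after are hypothetical; the stand-in spins without reaching any cancellation point, so the wrapper switches the thread to asynchronous cancellation before calling it. This is only reasonable when the function is known to hold none of the resources listed above.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Stand-in for the third-party function: spins forever and never reaches
// a cancellation point. (Hypothetical; replace with the real call.)
void func1_standin() {
    volatile unsigned long counter = 0;
    for (;;) { counter = counter + 1; }
}

// Thread entry point: allow cancellation at any instruction, then call in.
void* cancellable_func1(void*) {
    int old_type = 0;
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    func1_standin();
    return nullptr;
}

// Starts fn on a new thread, waits timeout_seconds, then cancels and joins it.
// Returns true if the thread ended because it was cancelled.
bool cancel_after(void* (*fn)(void*), unsigned timeout_seconds) {
    assert(fn != nullptr);
    pthread_t tid;
    if (pthread_create(&tid, nullptr, fn, nullptr) != 0) return false;
    sleep(timeout_seconds);
    pthread_cancel(tid);
    void* exit_value = nullptr;
    pthread_join(tid, &exit_value);   // waits until the thread is really gone
    return exit_value == PTHREAD_CANCELED;
}
```

This sketch always waits for the full timeout, even if the function returns early; a real version would wait on a condition or use a timed join instead of sleep().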
I am using ZThreads to illustrate the question but my question applies to PThreads, Boost Threads and other such threading libraries in C++.
class MyClass: public Runnable
{
public:
void run()
{
while(1)
{
}
}
};
I now launch this as follows:
MyClass *myClass = new MyClass();
Thread t1(myClass);
Is it now possible to kill (violently if necessary) this thread? I know I can do this if, instead of the infinite loop, the thread is blocked in something like Thread::Sleep(100000). But can I kill a spinning thread (one doing computation)? If yes, how? If not, why not?
As far as Windows goes (from MSDN):
TerminateThread is a dangerous function that should only be used in
the most extreme cases. You should call TerminateThread only if you
know exactly what the target thread is doing, and you control all of
the code that the target thread could possibly be running at the time
of the termination. For example, TerminateThread can result in the
following problems:
If the target thread owns a critical section, the critical section will not be released.
If the target thread is allocating memory from the heap, the heap lock will not be released.
If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.
Boost certainly doesn't have a thread-killing function.
A general solution to the kind of question posted can be found in Herb Sutter's article:
Prefer Using Active Objects Instead of Naked Threads
This permits you to have something like this (excerpt from article):
class Active {
public:
typedef function<void()> Message;
private:
Active( const Active& ); // no copying
void operator=( const Active& ); // no copying
bool done; // le flag
message_queue<Message> mq; // le queue
unique_ptr<thread> thd; // le thread
void Run() {
while( !done ) {
Message msg = mq.receive();
msg(); // execute message
} // note: last message sets done to true
}
In the active object destructor you can have then:
~Active() {
Send( [&]{ done = true; } );
thd->join();
}
This solution promotes a clean exit from the thread function and avoids the issues related to unclean thread termination.
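The excerpt relies on a message_queue type that is not shown. A self-contained sketch of the same pattern, assuming a plain mutex-and-condition-variable queue in its place (this is an illustration, not the article's code), could look like this:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

class Active {
public:
    using Message = std::function<void()>;

    Active() : thd_([this] { run(); }) {}
    ~Active() {
        send([this] { done_ = true; });   // last message: tells run() to stop
        thd_.join();                      // clean exit, nothing is killed
    }
    Active(const Active&) = delete;
    Active& operator=(const Active&) = delete;

    // Queues a message for execution on the worker thread.
    void send(Message m) {
        assert(m);
        {
            std::lock_guard<std::mutex> lock(mtx_);
            queue_.push(std::move(m));
        }
        cv_.notify_one();
    }

private:
    void run() {
        while (!done_) {
            Message msg;
            {
                std::unique_lock<std::mutex> lock(mtx_);
                cv_.wait(lock, [this] { return !queue_.empty(); });
                msg = std::move(queue_.front());
                queue_.pop();
            }
            msg();                        // execute outside the lock
        }
    }

    bool done_ = false;                   // only read and written on the worker thread
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<Message> queue_;
    std::thread thd_;                     // declared last: starts after the other members exist
};
```

Because the destructor queues the stop message behind everything already sent, all pending messages run before the thread exits.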
It is possible to terminate a thread forcefully, but the call to do it is going to be platform specific. For example, under Windows you could do it with the TerminateThread function.
Keep in mind that if you use TerminateThread, the thread will not get a chance to release any resources it is using until the program terminates.
If you need to kill a thread, consider using a process instead.
Especially if you tell us that your "thread" is a while (true) loop that may sleep for a long period of time, performing operations that are necessarily blocking. To me, that indicates process-like behavior.
Processes can be terminated in a number of ways at almost any time, and always in a clean way. They may also offer more reliability in case of a crash.
Modern operating systems offer an array of interprocess communications facilities: sockets, pipes, shared memory, memory mapped files ... They may even exchange file descriptors.
Good OSes have a copy-on-write mechanism, so processes are cheap to fork.
Note that if your operations can be made in a non-blocking way, then you should use a poll-like mechanism instead. Boost::asio may help there.
You can, with the TerminateThread() API, but it is not recommended.
More details at:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686717(v=vs.85).aspx
As people already said, there is no portable way to kill a thread, and in some cases not possible at all. If you have control over the code (i.e. can modify it) one of the simplest ways is to have a boolean variable that the thread checks in regular intervals, and if set then terminate the thread as soon as possible.
Can't you add something like the below?
do {
//stuff here
} while (!abort);
And check the flag once in a while between computations if they are small and not too long (as in the loop above) or in the middle and abort the computation if it is long?
I'm not sure about the other libraries, but the pthread library provides the pthread_kill function, which sends a signal to a given thread.
Yes,
Define a keepAlive variable as an int.
Initially set keepAlive=1.
class MyClass: public Runnable
{
public:
void run()
{
while(keepAlive)
{
}
}
};
Now, whenever you want to kill the thread, just set keepAlive=0.
Q. How does this work?
A. The thread stays alive only as long as its function keeps executing, so ending the function is enough. Set the variable to 0 and the loop breaks, which ends the thread. [This is the safest way I have found to date.]
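One caveat with a plain int flag: since C++11, writing it from one thread while another reads it is a data race, and the compiler may hoist the read out of the loop. A minimal sketch of the same idea with std::atomic<bool> and std::thread (the class name Worker and its iteration counter are made up for illustration):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class Worker {
public:
    Worker() : thread_(&Worker::run, this) {}
    ~Worker() { stop(); }

    // Asks the loop to finish and waits until the thread has really ended.
    void stop() {
        assert(std::this_thread::get_id() != thread_.get_id()); // never join yourself
        keepAlive_.store(false);
        if (thread_.joinable()) thread_.join();
    }

    unsigned long iterations() const { return iterations_.load(); }

private:
    void run() {
        while (keepAlive_.load()) {       // re-read from memory on every pass
            ++iterations_;                // stands in for one slice of real work
            std::this_thread::yield();
        }
    }

    std::atomic<bool> keepAlive_{true};
    std::atomic<unsigned long> iterations_{0};
    std::thread thread_;                  // declared last so the flags exist before run() starts
};
```

This only works when you control the loop; it cannot stop code that never checks the flag.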
I'm making a dll that has to respond to an application's requests. One of the application's requirements is that a call should not take long to complete.
Say, I have a function foo(), which is called by the host application:
int foo(arg){
// some code i need to execute, say,
LengthyRoutine();
return 0;
}
Let's say foo has to perform a task (or call a function) that is certain to take a long time. The application allows me to set a wait variable; if this variable is non-zero when foo returns, it calls foo again and again (resetting the wait variable before each call) until wait is returned 0.
What's the best approach to this?
Do I go:
int foo(arg){
if (inRoutine == TRUE) {
wait = 1;
return 0;
} else {
if (doRoutine == TRUE) {
LengthyRoutine();
return 0;
}
}
return 0;
}
This doesn't really solve the problem that LengthyRoutine is gonna take a long time to complete. Should I spawn a thread of some sort that updates inRoutine depending on whether or not it has finished its task?
Thanks..
Spawning another thread is pretty much the best way to do it, just make sure you set the result variables before you set the variable that says you're finished to avoid race conditions. If this is called often you might want to spawn a worker thread ahead of time and reuse it to avoid thread start overhead.
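The ordering described above (write the result first, then publish the finished flag) can be sketched with a release store and an acquire load. The class name LengthyJob and its poll() method are hypothetical; the host-facing foo() would call poll() and set wait to 1 whenever it returns false.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

class LengthyJob {
public:
    // Starts `work` on its own thread right away.
    explicit LengthyJob(std::function<int()> work) {
        assert(work);
        worker_ = std::thread([this, work] {
            result_ = work();                                  // 1. write the result
            finished_.store(true, std::memory_order_release);  // 2. then publish it
        });
    }
    ~LengthyJob() { worker_.join(); }

    // Non-blocking check: returns true and fills `out` once the job is done.
    bool poll(int& out) const {
        if (!finished_.load(std::memory_order_acquire)) return false;
        out = result_;   // safe: the acquire load pairs with the release store
        return true;
    }

private:
    int result_ = 0;
    std::atomic<bool> finished_{false};
    std::thread worker_;
};
```

Each call from the host then returns immediately, and the result is never read before the worker has finished writing it.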
There is another possible solution: do part of the work each time the function is called. However, this spends more time in the DLL and probably isn't optimal, and for most algorithms the worker code is more complex to implement.
If C programming, use callback - pass the callback to foo. You have to agree on the callback signature and do some housekeeping to trigger it when the work in LengthyRoutine is done.
typedef void (*callbackFunction)(void);
static callbackFunction registeredCallback; /* set by foo(), invoked by LengthyRoutine() */
int foo(int arg, callbackFunction callback)
{
// some code I need to execute, say,
// register callback and return right away
registeredCallback = callback;
// Trigger the LengthyRoutine to run after this function returns
return 0;
}
void LengthyRoutine(void)
{
// do lengthy routine
// now inform the caller with their supplied callback
if (registeredCallback) registeredCallback();
}
Essentially Observer Pattern in C. C++ makes the work a lot easier/cleaner in my opinion
In the incredibly rare situation where LengthyRoutine() isn't third-party code, you have all the source, and it's possible to split it up, you can consider using coroutines.
Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
Where finished_flag is a boolean member of my class. When the thread is finished it sets the value and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread)
So is this okay, or am I missing something, and need to use locks and mutexes, etc
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache the reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes are happening as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads half way through a read / write. That might not be such a problem here where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand, it's a thread-safe type, like a CEvent in MFC (or the equivalent in boost), then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that it has finished, and the main thread can check during execution whether the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general rule I would never assume that a resource will only be modified by one thread. You might know what it is for, but someone else might not, causing no end of grief as the main thread thinks that the work is done and tries to access data that is not correct! It might even delete it while the worker thread is still using it, causing the app to crash. Using a condition will help with this.
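A condition-based "finished" signal can be sketched as below. This uses the standard library's std::condition_variable as a stand-in for the Boost condition mentioned above; the names Completion and start_worker are made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

class Completion {
public:
    // Called by the worker when its work is done.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cond_.notify_all();
    }

    // Waits up to `timeout`; returns true if signal() has been called.
    bool wait_for(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        return cond_.wait_for(lock, timeout, [this] { return done_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    bool done_ = false;   // protected by mutex_
};

// Runs `work` on a new thread and signals `done` when it has finished.
std::thread start_worker(std::function<void()> work, Completion& done) {
    return std::thread([work, &done] { work(); done.signal(); });
}
```

The main loop can call wait_for with a zero or short timeout on each pass instead of reading a raw bool. The returned std::thread still has to be joined before the Completion object is destroyed.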
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount of time for the thread to 'join' (join means that the thread has finished).
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
// ... do other things maybe ...
// wait for the thread to complete
thrd->join();
If you really want to get into the details of communication between threads via shared memory, even declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order (x86 usually doesn't, but PPC definitely does), and there is nothing in C++98/03 that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
while(!finished_flag) { // ignore the polling loop, it's beside the point
sleep();
}
delete thrd;
}
void myclass::mymethod() {
// do stuff
finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.
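One way to make sure the join is never forgotten is to tie it to scope. A minimal sketch using std::thread instead of boost::thread (the wrapper name JoiningThread is made up for illustration):

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>

// Owns a thread and joins it on destruction, so the enclosing function
// cannot return while the thread is still executing.
class JoiningThread {
public:
    explicit JoiningThread(std::function<void()> body) {
        assert(body);
        thread_ = std::thread(std::move(body));
    }
    ~JoiningThread() {
        if (thread_.joinable()) thread_.join();
    }
    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

private:
    std::thread thread_;
};
```

If someFunction() holds the thread in such an object, the join happens on every exit path (including exceptions), so the library cannot be unloaded while the thread is still running its code.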