I have a class which calls an asynchronous task using std::async in his constructor for loading its content. ( I want the loading of the object done asynchronously )
The code looks like this:
void loadObject(Object* object)
{
// ... load object
}
Object::Object():
{
auto future = std::async(std::launch::async, loadObject, this);
}
I have several instances of these objects getting created and deleted on my main thread, they can get deleted any time, even before their loading has finished.
I'd like to know if it is dangerous to having object getting destroyed when it is still getting handled on another thread. And how can I stop the thread if the object gets destroyed ?
EDIT: The std::future destructor does not block my code with the VS2013's compiler that I am using due to a bug.
As MikeMB already mentioned, your constructor doesn't finish until the load has been completed. Check this question for how to overcome that: Can I use std::async without waiting for the future limitation?
I'd like to know if it is dangerous to having object getting destroyed when it is still getting handled on another thread.
Accessing object's memory after deletion is certainly dangerous, yes. The behaviour will be undefined.
how can I stop the thread if the object gets destroyed ?
What I recommend you to take care of first, is to make sure that the object doesn't get destroyed while it's still being pointed at by something that is going to use it.
One approach is to use a member flag signifying completed load that is updated in the async task and checked in the destructor and synchronize the access with a condition variable. That will allow the destructor to block until the async task is complete.
Once you've managed to prevent the object from being destroyed, you can use another synchronized member flag to signify that the object is being destroyed and skip the loading if it's set. That'll add synchronization overhead but may be worth it if loading is expensive.
Another approach which avoids blocking destructor is to pass a std::shared_ptr to the async task and require all Object instances to be owned by a shared pointer. That limitation may not be very desireably and you'll need to inherit std::enable_shared_from_this to get the shared pointer in the constructor.
There is nothing asynchronous happening in your code, because the constructor blocks until loadObject() returns (The destructor of a future returned by std::async implicitly joins).
If it would not, it would depend on how you have written your code (and especially your destructor), but most probably, your code would incur undefined behavior.
Yes it is dangerous to having object getting destroyed when it is still getting handled on another thread
You can implement a lot of strategies actually depending on requirements and desired behaviour.
I would implement sort of pimpl strategy here, that means that all actual data will be stored in the pointer that your object holds. You will load all the data to the data-pointer-object and store it in the public-object atomically.
Techincally speaking object should be fully constrcuted and ready to use by the time the constrcutor is finished. In your case data-pointer-object will still probably be not ready to use. And you should make your class to handle correctly that state.
So here we go:
class Object
{
std::shared_ptr<Object_data> d;
Object::Object():
d(std::make_shared<Object_data>())
{
some_futures_matser.add_future(std::async(std::launch::async, loadObject, d));
}
}
Then you make atomic flag in your data-object that will signal that loading is complete and object is ready to use.
class Object_data
{
// ...
std::atomic<bool> loaded {false};
};
loadObject(std::shared_ptr<Object_data> d)
{
/// some load code here
d->loaded = true;
}
You have to check if your object is constrcuted every time when you acces it (with thread safe way) through loaded flag
Related
I'm using boost to start a thread and the thread function is a member function of my class just like this:
class MyClass {
public:
void ThreadFunc();
void StartThread() {
worker_thread_ = boost::shared_ptr<boost::thread>(
new boost::thread(boost::bind(&MyClass::ThreadFunc, this)));
}
};
I will access some member variables in ThreadFunc:
while (!stop) {
Sleep(1000); // here will be some block operations
auto it = this->list.begin();
if (it != this->list.end())
...
}
I can not wait forever for thread return, so I set timeout:
stop = true;
worker_thread_->interrupt();
worker_thread_->timed_join(boost::posix_time::milliseconds(timeout_ms));
After timeout, I will delete this MyClass pointer. Here will be a problem, the ThreadFunc hasn't return, it will have chances to access this and its member variables. In my case, the iterator will be invalid and it != this->list.end() will be true, so my program will crash if using invalid iterator.
My question is how to avoid it ? Or how to check whether this is valid or member variables is valid ? Or could I set some flags to tell ThreadFunc the destructor has been called ?
There are a lot of possible solutions. One is to use a shared_ptr to the class and let the thread hold its own shared_ptr to the class. That way, the object will automatically get destroyed only when both threads are done with it.
How about you create a stopProcessing flag (make it atomic) as a member of MyClass and in your ThreadFunc method check at each cycle if this flag is set?
[EDIT: making clearer the answer]
There a two orthogonal problems:
stopping the processing (I lost my patience, stop now please). This can be arranged by setting a flag into MyClass and make ThreadFunc checking it as often as reasonable possible
deallocation of resources. This is best by using RAII - one example being the use of shared_ptr
Better keep them as separate concerns.
Combining them in a single operation may be possible, but risky.
E.g. if using shared_ptr, the once the joining thread decided "I had enough", it simply goes out of the block which keeps its "copy" of shared_ptr, thus the shared_ptr::use_count gets decremented. The thread function may notice this and decide to interpret it as an "caller had enough" and cut short the processing.However, this implies that, in the future releases, nobody else (but the two threads) may acquire a shared_ptr, otherwise the "contract" of 'decremented use_count means abort' is broken.
(a use_count==1 condition may still be usable - interpretation "Only me, the processing thread, seems to be interested in the results; no consumer for them, better abort the work").
I have a class object whose functions can be called from different threads.
It is possible to get into a situation where Thread1(T1) is calling destructor,
whereas Thread(T2) is executing some other function on the same object.
Let's say T1 is able to call the destructor first, then what happens to the code running in T2?
Will it produce a crash or since the object is already destroyed, the member function will stop running?
Will taking a mutex lock at entry and unlocking at exit of all class functions ensure that there is no crash in any kind of race that will happen between destructor and member functions?
Appreciate your help!
No. Because the logic is flawed. If it is possible at all that T1 is calling the destructor, a mutex will not prevent it from still trying to do that.
There is shared data between T1 and T2 and as such you must make sure that neither will try to call methods on a non-existing shared object.
A mutex alone will not help.
What you'll get is unknown and unpredictable. You're describing a classic multithreading bug / poor design.
An instance should be properly locked before use, and that includes destruction. Once you do that, it's impossible to call a member function on an instance that's being destroyed in another thread.
Locking inside the functions won't help since if 2 separate threads hold references to the same instance and 1 of them causes it to be destroyed, the other thread is holding a bad reference and doesn't know that. A typical solution is to use refcounts, where destruction is at the responsibility of the object itself once no references to it exist. In C++, you'd usually use shared_ptr<> to do this.
I need to have a class with one activity that is performed once per 5 seconds in its own thread. It is a web service one, so it needs an endpoint to be specified. During the object runtime the main thread can change the endpoint. This is my class:
class Worker
{
public:
void setEndpoint(const std::string& endpoint);
private:
void activity (void);
mutex endpoint_mutex;
volatile std::auto_ptr<std::string> newEndpoint;
WebServiceClient client;
}
Does the newEndpoint object need to be declared volatile? I would certainly do it if the read was in some loop (to make the complier not optimize it out), but here I don't know.
In each run the activity() function checks for a new endpoint (if a new one is there, then passes it to the client and perform some reconnection steps) and do its work.
void Worker::activity(void)
{
endpoint_mutex.lock(); //don't consider exceptions
std::auto_ptr<std::string>& ep = const_cast<std::auto_ptr<string> >(newEndpoint);
if (NULL != ep.get())
{
client.setEndpoint(*ep);
ep.reset(NULL);
endpoint_mutex.unlock();
client.doReconnectionStuff();
client.doReconnectionStuff2();
}
else
{
endpoint_mutex.unlock();
}
client.doSomeStuff();
client.doAnotherStuff();
.....
}
I lock the mutex, which means that the newEndpoint object cannot change anymore, so I remove the volatile class specification to be able to invoke const methods.
The setEndpoint method (called from another threads):
void Worker::setEndpoint(const std::string& endpoint)
{
endpoint_mutex.lock(); //again - don't consider exceptions
std::auto_ptr<std::string>& ep = const_cast<std::auto_ptr<string> >(newEndpoint);
ep.reset(new std::string(endpoint);
endpoint_mutex.unlock();
}
Is this thing thread safe? If not, what is the problem? Do I need the newEndpoint object to be volatile?
volatile is used in the following cases per MSDN:
The volatile keyword is a type qualifier used to declare that an
object can be modified in the program by something such as the
operating system, the hardware, or a concurrently executing thread.
Objects declared as volatile are not used in certain optimizations
because their values can change at any time. The system always reads
the current value of a volatile object at the point it is requested,
even if a previous instruction asked for a value from the same object.
Also, the value of the object is written immediately on assignment.
The question in your case is, how often does your NewEndPoint actually change? You create a connection in thread A, and then you do some work. While this is going on, nothing else can fiddle with your endpoint, as it is locked by a mutex. So, per my analysis, and from what I can see in your code, this variable doesn't necessarily change enough.
I cannot see the call site of your class, so I don't know if you are using the same class instance 100 times or more, or if you are creating new objects.
This is the kind of analysis you need to make when asking whether something should be volatile or not.
Also, on your thread-safety, what happens in these functions:
client.doReconnectionStuff();
client.doReconnectionStuff2();
Are they using any of the shared state from your Worker class? Are they sharing and modifying any other state use by another thread? If yes, you need to do the appropriate synchronization.
If not, then you're ok.
Threading requires some thinking, you need to ask yourself these questions. You need to look at all state and wonder whether or not you're sharing. If you're dealing with pointers, then you need wonder who own's the pointer, and whether you're ever sharing it amongst threads, accidentally or not, and act accordingly. If you pass a pointer to a function that is run in a different thread, then you're sharing the object that the pointer points to. If you then alter what it points to in this new thread, you are sharing and need to synchronize.
Suppose you have an object which can be accesed by many threads. A critical section is used to protect the sensitive areas. But what about the destructor? Even if I enter a critical section as soon as I enter the destructor, once the destructor has been called, is the object already invalidated?
My train of thought: Say I enter the destructor, and I have to wait on the critical section because some other thread is still using it. Once he is done, I can finish destroying the object. Does this make sense?
In general, you should not destroy an object until you know that no other thread is using it. Period.
Consider this scenario, based on your 'train of thought':
Thread A: Get object X reference
Thread A: Lock object X
Thread B: Get object X reference
Thread B: Block on object X lock
Thread A: Unlock object X
Thread B: Lock object X; unlock object X; destroy object X
Now consider what happens if the timing is slightly different:
Thread A: Get object X reference
Thread B: Get object X reference
Thread B: Lock object X; unlock object X; destroy object X
Thread A: Lock object X - crash
In short, object destruction must be synchronized somewhere other than the object itself. One common option is to use reference counting. Thread A will take a lock on the object reference itself, preventing the reference from being removed and the object being destroyed, until it manages to increment the reference count (keeping the object alive). Then thread B merely clears the reference and decrements the reference count. You can't predict which thread will actually call the destructor, but it will be safe either way.
The reference counting model can be implemented easily by using boost::shared_ptr or std::shared_ptr; the destructor will not run unless all shared_ptrs in all threads have been destroyed (or made to point elsewhere), so at the moment of destruction you know that the only pointer to the object remaining is the this pointer of the destructor itself.
Note that when using shared_ptr, it's important to prevent the original object reference from changing until you can capture a copy of it. Eg:
std::shared_ptr<SomeObject> objref;
Mutex objlock;
void ok1() {
objlock.lock();
objref->dosomething(); // ok; reference is locked
objlock.unlock();
}
void ok2() {
std::shared_ptr<SomeObject> localref;
objlock.lock();
localref = objref;
objlock.unlock();
localref->dosomething(); // ok; local reference
}
void notok1() {
objref->dosomething(); // not ok; reference may be modified
}
void notok2() {
std::shared_ptr<SomeObject> localref = objref; // not ok; objref may be modified
localref->dosomething();
}
Note that simultaneous reads on a shared_ptr is safe, so you can choose to use a read-write lock if it makes sense for your application.
If a object is in use then you should make sure that the destructor of the object is not being called before the use of the object ends. If this is the behavior you have then its a potential problem and it really needs to be fixed.
You should make sure that if one thread is destroying your objects then another thread should not be calling functions on that object or the first thread should wait till second thread completes the function calling.
Yes, even destructors might need critical sections to protect updating some global data which is not related to the class itself.
It's possible that while one thread is waiting for CS in destructor the other is destroying the object and if CS belongs to object it will be destroyed as well. So that's not a good design.
You absolutely, positively need to make sure your object lifetime is less than the consumer threads, or you are in for some serious headaches. Either:
Make the consumers of the object children so it's impossible for them to exist outside of your object, or
use message passing/broker.
If you go the latter route, I highly recommend 0mq http://www.zeromq.org/.
Yes while you are in destructor, the object is already invalidated.
I used Destroy() method that enters critical section and then destroys it self.
Lifetime of object is over before destructor is called?
Yes, it is fine to do that. If a class supports such use, clients don't need to synchronize destruction; i.e. they don't need to make sure that all other methods on the object have finished before invoking the destructor.
I would recommend that clients not assume they can do this unless it is explicitly documented. Clients do have this burden, by default, with standard library objects in particular(§17.6.4.10/2).
There are cases where it is fine, though; std::condition_variable's destructor, for example, specifically allows ongoing condition_variable::wait() method invocations when ~condition_variable() starts. It only requires that clients not initiate calls to wait() after ~condition_variable() starts.
It might be cleaner to require that the client synchronize access to the destructor – and constructor for that matter – like most of the rest of the standard library does. I would recommend doing that if feasible.
However, there are certain patterns where it might make sense to relieve clients of the burden of fully synchronizing destruction. condition_variable's overall pattern seems like one: consider use of an object that handles possibly long-running requests. The user does the following:
Construct the object
Cause the object to receive requests from other threads.
Cause the object to stop receiving requests: at this point, some outstanding requests might be ongoing, but no new ones can be invoked.
Destroy the object. The destructor will block until all requests are done, otherwise the ongoing requests might have a bad time.
An alternative would be to require that clients do need to synchronize access. You could imagine step 3.5 above where the client calls a shutdown() method on the object that does the blocking, after which it is safe for the client to destroy the object. However, this design has some downsides; it complicates the API and introduces an additional state for the object of shutdown-but-valid.
Consider instead perhaps getting step (3) to block until all requests are done. There are tradeoffs...
In general, if you have a class that inherits from a Thread class, and you want instances of that class to automatically deallocate after they are finished running, is it okay to delete this?
Specific Example:
In my application I have a Timer class with one static method called schedule. Users call it like so:
Timer::schedule((void*)obj, &callbackFunction, 15); // call callbackFunction(obj) in 15 seconds
The schedule method creates a Task object (which is similar in purpose to a Java TimerTask object). The Task class is private to the Timer class and inherits from the Thread class (which is implemented with pthreads). So the schedule method does this:
Task *task = new Task(obj, callback, seconds);
task->start(); // fork a thread, and call the task's run method
The Task constructor saves the arguments for use in the new thread. In the new thread, the task's run method is called, which looks like this:
void Timer::Task::run() {
Thread::sleep(this->seconds);
this->callback(this->obj);
delete this;
}
Note that I can't make the task object a stack allocated object because the new thread needs it. Also, I've made the Task class private to the Timer class to prevent others from using it.
I am particularly worried because deleting the Task object means deleting the underlying Thread object. The only state in the Thread object is a pthread_t variable. Is there any way this could come back to bite me? Keep in mind that I do not use the pthread_t variable after the run method finishes.
I could bypass calling delete this by introducing some sort of state (either through an argument to the Thread::start method or something in the Thread constructor) signifying that the method that is forked to should delete the object that it is calling the run method on. However, the code seems to work as is.
Any thoughts?
I think the 'delete this' is safe, as long as you don't do anything else afterwards in the run() method (because all of the Task's object's member variables, etc, will be freed memory at that point).
I do wonder about your design though... do you really want to be spawning a new thread every time someone schedules a timer callback? That seems rather inefficient to me. You might look into using a thread pool (or even just a single persistent timer thread, which is really just a thread pool of size one), at least as an optimization for later. (or better yet, implement the timer functionality without spawning extra threads at all... if you're using an event loop with a timeout feature (like select() or WaitForMultipleObjects()) it is possible to multiplex an arbitrary number of independent timer events inside a single thread's event loop)
There's nothing particularly horrible about delete this; as long as you assure that:the object is always dynamically allocated, andno member of the object is ever used after it's deleted.
The first of these is the difficult one. There are steps you can take (e.g. making the ctor private) that help, but nearly anything you do can be bypassed if somebody tries hard enough.
That said, you'd probably be better off with some sort of thread pool. It tends to be more efficient and scalable.
Edit: When I talked about being bypassed, I was thinking of code like this:
class HeapOnly {
private:
HeapOnly () {} // Private Constructor.
~HeapOnly () {} // A Private, non-virtual destructor.
public:
static HeapOnly * instance () { return new HeapOnly(); }
void destroy () { delete this; } // Reclaim memory.
};
That's about as good of protection as we can provide, but getting around it is trivial:
int main() {
char buffer[sizeof(HeapOnly)];
HeapOnly *h = reinterpret_cast<HeapOnly *>(buffer);
h->destroy(); // undefined behavior...
return 0;
}
When it's direct like this, this situation's pretty obvious. When it's spread out over a larger system, with (for example) an object factory actually producing the objects, and code somewhere else entirely allocating the memory, etc., it can become much more difficult to track down.
I originally said "there's nothing particularly horrible about delete this;", and I stand by that -- I'm not going back on that and saying it shouldn't be used. I am trying to warn about the kind of problem that can arise with it if other code "Doesn't play well with others."
delete this frees the memory you have explicitly allocated for the thread to use, but what about the resources allocated by the OS or pthreads library, such as the thread's call stack and kernel thread/process structure (if applicable)? If you never call pthread_join() or pthread_detach() and you never set the detachstate, I think you still have a memory leak.
It also depends on how your Thread class is designed to be used. If it calls pthread_join() in its destructor, that's a problem.
If you use pthread_detach() (which your Thread object might already be doing), and you're careful not to dereference this after deleting this, I think this approach should be workable, but others' suggestions to use a longer-lived thread (or thread pool) are well worth considering.
If all you ever do with a Task object is new it, start it, and then delete it, why would you need an object for it anyway? Why not simply implement a function which does what start does (minus object creation and deletion)?