I've already seen similar question: Destructor vs member function race
.. but didn't find the answer for the following. Suppose we have a class owning some worker thread. Destructor of the class can look like:
~OurClass
{
ask_the_thread_to_terminate;
wait_for_the_thread_to_terminate;
....
do_other_things;
}
The question is: may we call OurClass's member functions in the worker thread as we are sure all these calls will be done before do_other_things in the destructor?
Yes you can. Destruction of member variables willonly start after do_other_things completes its execution. It is safe to call member functions before object gets destructed.
Well, you kinda can do that, but it's easy to make a mistake this way. For example, if you do something like:
~OurClass
{
delete *memberPointer_;
ask_the_thread_to_terminate; // the thread calls OurClass::foo();
wait_for_the_thread_to_terminate;
....
do_other_things
}
void OurClass::foo()
{
memberPointer->doSomething();
}
you will be in trouble. It also makes it harder to read through the code. So you will have to be careful with the order of operations and stuff like that. In general if it's possible to change the architecture to avoid such complicated constructions, I would suggest to do it.
The implicit destruction of any data members occurs after execution of the last explicit line of your destructor, so all those data members are still fully functional whilst you wait for the thread. Hence, you must only avoid any premature explicit destruction of any data members (as exemplified in SingerOfTheFall's pathological code).
Related
I have a class object whose functions can be called from different threads.
It is possible to get into a situation where Thread1(T1) is calling destructor,
whereas Thread(T2) is executing some other function on the same object.
Let's say T1 is able to call the destructor first, then what happens to the code running in T2?
Will it produce a crash or since the object is already destroyed, the member function will stop running?
Will taking a mutex lock at entry and unlocking at exit of all class functions ensure that there is no crash in any kind of race that will happen between destructor and member functions?
Appreciate your help!
No. Because the logic is flawed. If it is possible at all that T1 is calling the destructor, a mutex will not prevent it from still trying to do that.
There is shared data between T1 and T2 and as such you must make sure that neither will try to call methods on a non-existing shared object.
A mutex alone will not help.
What you'll get is unknown and unpredictable. You're describing a classic multithreading bug / poor design.
An instance should be properly locked before use, and that includes destruction. Once you do that, it's impossible to call a member function on an instance that's being destroyed in another thread.
Locking inside the functions won't help since if 2 separate threads hold references to the same instance and 1 of them causes it to be destroyed, the other thread is holding a bad reference and doesn't know that. A typical solution is to use refcounts, where destruction is at the responsibility of the object itself once no references to it exist. In C++, you'd usually use shared_ptr<> to do this.
Recently I am following the tutorials on rastertek and find that they suggest use a Shutdown() method for cleaning up instead of the class own destructor.The reason they mention is that the destructor is not guaranteed to be executed when calling some unsafe function like ExitThread().
However, I doubt if that method would get executed when even the destructor cannot be called. Indeed you can always call Shutdown() before you call ExitThread() but why not the same for the destructor? If I can do something before calling ExitThread(), I can certainly call the destructor as well.
Isn't placing the clean up code in the destructor more or less safer than using another method to do the trick? I know that releasing some vital resources like closing a file may need this separate method to do the trick. But are there any reasons other than that since this does not seem to be the case in the tutorials?
For the record, I know there is a similar question out there. But that one got no answer.
Isn't placing the clean up code in the destructor more or less safer than using another method to do the trick?
The problem here is that while ExitThread (and other functions like it) is a perfect API for C, with C++ code, it breaks stack unwinding.
The correct solution for C++ is to make sure you do not call ExitThread (and such) in code using anything with destructors.
Problem:
void thread_function()
{
raii_resource r { acquire_resource() };
ExitThread();
// problem: r.~raii_resource() not called
}
The shutdown solution:
void thread_function()
{
raii_resource r { acquire_resource() };
r.shutdown(); // release resources here
ExitThread();
// r.~raii_resource() still not called
}
The shutdown solution is not obvious at all in client code. As #stefan said, kill it with fire.
Better solution (than the Shutdown thing):
void thread_function()
{
{ // artificial scope where RAII objects live
raii_resource r { acquire_resource() };
}
// this space does not support RAII life
ExitThread();
}
RAII works fine here, but the artificial scope is not very elegant. On top, it's as inelegant as the shutdown solution (it requires a non-obvious artifice in client code).
Better (cleaner) solution:
template<typename F>
void run_thread(F functor)
{
functor(); // all RAII resources inside functor; this is simple and
// obvious from client code
ExitThread();
}
The only advantage to moving initialization out of the constructor, and for removing cleanup out of the destructor is when you've got a base class framework where you want to reliably call virtual methods during these stages.
Since the vtable is changing during construction/destruction calls to virtual functions don't resolve to the most derived instance. By having explicit Initialize/Shutdown methods you can be sure the virtual functions dispatch correctly.
Please note, this isn't an answer that advocates this approach, just one that is trying to work out why they've suggested it!
destructors are guaranteed to be called when an object is destroyed however thread clean up can require a tad more than just object destruction. Generally speaking cleanup methods are added when you need to handle releasing of shared resources, dealing with ancient libraries, etc.
Specifically you're dealing with the Win32 API which qualifies as an ancient c-style library considering ExitThread has been around longer than I have ...
With such approach you'll need to call Shutdown in all cases where the object should be destroyed - each time when it leaves the scope. In the case of exceptions (if they are used) you'll cannot call it. Destructor will be called automatically in these cases.
"I can certainly call the destructor as well" - calling the destructor explicitly is highly not recommended because it will be called automatically in any case. This should be avoided except only in special cases. If you mean this code from the tutorial:
System->Shutdown();
delete System;
then I don't see the difference because delete System; will call the destructor anyway.
In any case I would prefer
{
// work with System
...
}
mentioned in #utnapistim answer. I do not see any minuses in such way of coding, and also it is common way to specify scope. Even not going deep in the details of legacy ::ExitThread you gain auto cleanup. It is possible to work with WinApi with C++ RAII technique, look for instance at my code: multithreading sample project. Initial commit was bare WinAPI, and next commit introduced resource wrappers. You can compare both variants - second is much clearer, imho.
I would like to create a thread pool. I have a class called ServerThread.cpp, whose constructor should do something like this:
ServerThread::ServerThread()
{
for( int i=0 ; i<init_thr_num ; i++ )
{
//create a pool of threads
//suspend them, they will wake up when requests arrive for them to process
}
}
I was wondering if creating pthreads inside a constructor can cause any undefined behavior that one should avoid running into.
Thanks
You can certainly do that in a constructor but should be aware of a problem that is clearly explained by Scott Meyers ins his Effective/More Effective C++ books.
In short his point is that if any kind of exception is raised within a constructor, then your half-backed object will not be destroyed. This leads to memory leaks. So Meyers' suggestion is to have "light" constructors and then do the "heavy" work in an init method called after the object has been fully created.
This argument is not strictly related to creating a pool of pthreads within a constructor (whereby you might argue that no exception will be raised if you simply create them and then immediately suspend them), but is a general consideration about what to do in a constructor (read: good practices).
Another considerations to be done is that a constructor has no return value. While it is true that (if no exceptions are thrown) you can leave the object is a consistent state even if the thread creation fails, it would be possibly better to manage a return value from a kind of init or start method.
You could also read this thread on S.O. about the topic, and this one.
From a strictly formal point of view, a constructor is really just a
function like any other, and there shouldn't be any problem.
Practically, there could be an issue: the threads may actually start
running before the constructor has finished. If the threads need a
fully constructed ServerThread to operate, then you're in
trouble—this is often the case when ServerThread is a base
class, and the threads need to interact with the derived class. (This
is a very difficult problem to spot, because with the most frequently
used thread scheduling algorithms, the new thread will usually not
start executing immediately.)
I am not very good in multithreading programming so I would like to ask for some help/advice.
In my application I have two threads trying to access a shared object.
One can think about two tasks trying to call functions from within another object. For clarity I will show some parts of the program which may not be very relevant but hopefully can help to get my problem better.
Please take a look at the sample code below:
//DataLinkLayer.h
class DataLinkLayer: public iDataLinkLayer {
public:
DataLinkLayer(void);
~DataLinkLayer(void);
};
Where iDataLinkLayer is an interface (abstract class without any implementation) containing pure virtual functions and a reference (pointer) declaration to the isntance of DataLinkLayer object (dataLinkLayer).
// DataLinkLayer.cpp
#include "DataLinkLayer.h"
DataLinkLayer::DataLinkLayer(void) {
/* In reality task constructors takes bunch of other parameters
but they are not relevant (I believe) at this stage. */
dll_task_1* task1 = new dll_task_1(this);
dll_task_2* task2 = new dll_task_2(this);
/* Start multithreading */
task1->start(); // task1 extends thread class
task2->start(); // task2 also extends thread class
}
/* sample stub functions for testing */
void DataLinkLayer::from_task_1() {
printf("Test data Task 1");
}
void DataLinkLayer::from_task_2() {
printf("Test data Task 2");
}
Implementation of task 1 is below. The dataLinLayer interface (iDataLinkLayer) pointer is passed to the class cosntructor in order to be able to access necessary functions from within the dataLinkLayer isntance.
//data_task_1.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_1.h"
dll_task_1::dll_task_1(iDataLinkLayer* pDataLinkLayer) {
this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_1.h
}
// Run method - executes the thread
void dll_task_1::run() {
// program reaches this point and prints the stuff
this->datalinkLayer->from_task_1();
}
// more stuff following - not relevant to the problem
...
And task 2 looks simialrly:
//data_task_2.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_2.h"
dll_task_2::dll_task_2(iDataLinkLayer* pDataLinkLayer){
this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_2.h
}
// // Run method - executes the thread
void dll_task_2::run() {
// ERROR: 'Access violation reading location 0xcdcdcdd9' is signalled at this point
this->datalinkLayer->from_task_2();
}
// more stuff following - not relevant to the problem
...
So as I understand correctly I access the shared pointer from two different threads (tasks) and it is not allowed.
Frankly I thought that I will be able to access the object nevertheless however the results might be unexpected.
It seems that something goes terribly wrong at the point when dll_task_2 tries to call the function using pointer to the DataLinkLayer. dll_task_2 has lower priority hence it is started afterwards. I don't understand why i still cannot at least access the object...
I can use the mutex to lock the variable but I thought that the primary reason for this is to protect the variable/object.
I am using Microsoft Visual C++ 2010 Express.
I don't know much about multithreading so maybe you can suggest a better solution to this problem as well as explain the reason of the problem.
The address of the access violation is a very small positive offset from 0xcdcdcdcd
Wikipedia says:
CDCDCDCD Used by Microsoft's C++ debugging runtime library to mark uninitialised heap memory
Here is the relevant MSDN page.
The corresponding value after free is 0xdddddddd, so it's likely to be incomplete initialization rather than use-after-free.
EDIT: James asked how optimization could mess up virtual function calls. Basically, it's because the currently standardized C++ memory model makes no guarantees about threading. The C++ standard defines that virtual calls made from within a constructor will use the declaring type of the constructor currently being run, not the final dynamic type of the object. So this means that, from the perspective of the C++ sequential execution memory model, the virtual call mechanism (practically speaking, a v-table pointer) must be set up before the constructor starts running (I believe the specific point is after base subobject construction in the ctor-initializer-list and before member subobject construction).
Now, two things can happen to make the observable behavior different in a threaded scenario:
First, the compiler is free to perform any optimization that would, in the C++ sequential execution model, act as-if the rules were being followed. For example, if the compiler can prove that no virtual calls are made inside the constructor, it could wait and set the v-table pointer at the end of the constructor body instead of the beginning. If the constructor doesn't give out the this pointer, since the caller of the constructor also hasn't received its copy of the pointer yet, then none of the functions called by the constructor can call back (virtually or statically) to the object under construction. But the constructor DOES give away the this pointer.
We have to look closer. If the function to which the this pointer is given is visible to the compiler (i.e. included in the current compilation unit), the the compiler can include its behavior in the analysis. We weren't given that function in this question (the constructor and member functions of class task), but it seems likely that the only thing that happens is that said pointer is stored in a subobject which is also not reachable from outside the constructor.
"Foul!", you cry, "I passed the address of that task subobject to a library CreateThread function, therefore it is reachable and through it, the main object is reachable." Ah, but you do not comprehend the mysteries of the "strict aliasing rules". That library function does not accept a parameter of type task *, now does it? And being a parameter whose type is perhaps intptr_t, but definitely neither task * nor char *, the compiler is permitted to assume, for purposes of as-if optimization, that it does not point to a task object (even if it clearly does). And if it does not point to a task object, and the only place our this pointer got stored is in a task member subobject, then it cannot be used to make virtual calls to this, so the compiler may legitimately delay setting up the virtual call mechanism.
But that's not all. Even if the compiler does set up the virtual call mechanism on schedule, the CPU memory model only guarantees that the change is visible to the current CPU core. Writes may become visible to other CPU cores in a completely different order. Now, the library create thread function ought to introduce a memory barrier that constrains CPU write reordering, but that fact that Koz's answer introducing a critical section (which definitely includes a memory barrier) changes the behavior suggests that perhaps no memory barrier was present in the original code.
And, CPU write reordering can not only delay the v-table pointer, but the storage of the this pointer into the task subobject.
I hope you have enjoyed this guided tour of one small corner of the cave of "multithreaded programming is hard".
printf is not, afaik, thread safe. Try surrounding the printf with a critical section.
To do this you InitializeCriticalSection inside iDataLinkLayer class. Then around the printfs you need an EnterCriticalSection and a LeaveCriticalSection. This will prevent both functions entering the printf simultaneously.
Edit: Try changing this code:
dll_task_1* task1 = new task(this);
dll_task_2* task2 = new task(this);
to
dll_task_1* task1 = new dll_task_1(this);
dll_task_2* task2 = new dll_task_2(this);
Im guessing that task is in fact the base class of dll_task_1 and dll_task_2 ... so, more than anything, im surprised it compiles ....
I think it's not always safe to use 'this' (i.e. to call a member function) before the end of the constructor. It could be that task are calling member function of DataLinkLayer before the end of DataLinkLayer constructor. Especially if this member function is virtual:
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.7
I wanted to comment on the creation of the DataLinkLayer.
When I call the DataLinkLayer constructor from main:
int main () {
DataLinkLayer* dataLinkLayer = new DataLinkLayer();
while(true); // to keep the main thread running
}
I, of coruse, do not destruct the object, this is first. Now, inside the DataLinkLayer cosntructor I initialize many (not only these two tasks) other objects isntances and pass to most of them dataLinkLayer pointer (using this). This is legal, as far as I am concerned. Put it further - it compiles and runs as expected.
What I became curious about is the overall design idea that I am following (if any :) ).
The DataLinkLayer is a parent class that is accessed by several tasks which try to modify it parameters or perform some other processing. Since I want that everything remain as decoupled as possible I provide only interfaces for the accessors and encapsulate the data so that I don't have any global variables, friend functions etc.
It would have been a pretty easy task to do if only multithreading would not be there. I beleive I will encounter many other pitfalls on my way.
Feel free to discuss it please and merci for your generous comments!
UPD:
Speaking of passing the iDataLinkLayer interface pointer to the tasks - is this a good way to do it? In Java it would be pretty usual thing to realize a containment or so called strategy pattern to make things decoupled and stuff. However I am not 100% sure whether it is a good solution in c++... Any suggestions/commnets on it?
In C++, during a class constructor, I started a new thread with this pointer as a parameter which will be used in the thread extensively (say, calling member functions). Is that a bad thing to do? Why and what are the consequences?
My thread start process is at the end of the constructor.
The consequence is that the thread can start and code will start executing a not yet fully initialized object. Which is bad enough in itself.
If you are considering that 'well, it will be the last sentence in the constructor, it will be just about as constructed as it gets...' think again: you might derive from that class, and the derived object will not be constructed.
The compiler may want to play with your code around and decide that it will reorder instructions and it might actually pass the this pointer before executing any other part of the code... multithreading is tricky
Main consequence is that the thread might start running (and using your pointer) before the constructor has completed, so the object may not be in a defined/usable state. Likewise, depending how the thread is stopped it might continue running after the destructor has started and so the object again may not be in a usable state.
This is especially problematic if your class is a base class, since the derived class constructor won't even start running until after your constructor exits, and the derived class destructor will have completed before yours starts. Also, virtual function calls don't do what you might think before derived classes are constructed and after they're destructed: virtual calls "ignore" classes whose part of the object doesn't exist.
Example:
struct BaseThread {
MyThread() {
pthread_create(thread, attr, pthread_fn, static_cast<void*>(this));
}
virtual ~MyThread() {
maybe stop thread somehow, reap it;
}
virtual void id() { std::cout << "base\n"; }
};
struct DerivedThread : BaseThread {
virtual void id() { std::cout << "derived\n"; }
};
void* thread_fn(void* input) {
(static_cast<BaseThread*>(input))->id();
return 0;
}
Now if you create a DerivedThread, it's a best a race between the thread that constructs it and the new thread, to determine which version of id() gets called. It could be that something worse can happen, you'd need to look quite closely at your threading API and compiler.
The usual way to not have to worry about this is just to give your thread class a start() function, which the user calls after constructing it.
Depends on what you do after starting the thread. If you perform initialization work after the thread has started, then it could use data that is not properly initialized.
You can reduce the risks by using a factory method that first creates an object, then starts the thread.
But I think the greatest flaw in the design is that, for me at least, a constructor that does more than "construction" seems quite confusing.
It can be potentially dangerous.
During construction of a base class any calls to virtual functions will not despatch to overrides in more derived classes that haven't yet been completely constructed; once the construction of the more derived classes change this changes.
If the thread that you kick-off calls a virtual function and it is indeterminate where this happens in relation to the completion of the construction of the class then you are likely to get unpredictable behaviour; perhaps a crash.
Without virtual functions, if the thread only uses methods and data of the parts of the class that have been constructed completely the behaviour is likely to be predictable.
I'd say that, as a general rule, you should avoid doing this. But you can certainly get away with it in many circumstances. I think there are basically two things that can go wrong:
The new thread might try to access the object before the constructor finishes initializing it. You can work around this by making sure all initialization is complete before you start the thread. But what if someone inherits from your class? You have no control over what their constructor will do.
What happens if your thread fails to start? There isn't really a clean way to handle errors in a constructor. You can throw an exception, but this is perilous since it means that your object's destructor will not get called. If you elect not to throw an exception, then you're stuck writing code in your various methods to check if things were initialized properly.
Generally speaking, if you have complex, error-prone initialization to perform, then it's best to do it in a method rather than the constructor.
Basically, what you need is two-phase construction: You want to start your thread only after the object is fully constructed. John Dibling answered a similar (not a duplicate) question yesterday exhaustively discussing two-phase construction. You might want to have a look at it.
Note, however, that this still leaves the problem that the thread might be started before a derived class' constructor is done. (Derived classes' constructors are called after those of their base classes.)
So in the end the safest thing is probably to manually start the thread:
class Thread {
public:
Thread();
virtual ~Thread();
void start();
// ...
};
class MyThread : public Thread {
public:
MyThread() : Thread() {}
// ...
};
void f()
{
MyThread thrd;
thrd.start();
// ...
}
It's fine, as long as you can start using that pointer right away. If you require the rest of the constructor to complete initialization before the new thread can use the pointer, then you need to do some synchronization.