Given:
class Foo {
public:
Foo() {}
};
class Bar {
public:
static int counter;
Bar() { ++counter; } // unsynchronized write to shared state
};
It's clear that Foo::Foo is thread safe whereas Bar::Bar is not.
Furthermore, if a function is written in a way that is not thread-safe, then calling it from a constructor makes that constructor not thread safe either.
However, are there extra gotchas that I need to worry about with constructors? I.e., is there a piece of code using mutexes/locks that would be thread safe in an ordinary function body but, because of the complexity of C++'s constructors, stops being thread safe once I move it into a constructor?
Thanks!
Edit: you can assume I'm using g++.
I would avoid any static values in an object used by a thread.
Why not pass in the value needed as a parameter for the constructor?
Or actually, put a mutex around the constructor in your thread. I wouldn't let the other classes be responsible for that.
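If the shared counter has to stay, one option besides a mutex is to make it atomic. The following is only a minimal sketch assuming C++11 or later; CountedBar and construct_concurrently are hypothetical names used for illustration, not part of the question's code.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Variant of Bar whose shared counter is a std::atomic, so concurrent
// constructor calls do not race on it.
class CountedBar {
public:
    CountedBar() { counter.fetch_add(1, std::memory_order_relaxed); }
    static std::atomic<int> counter;
};

std::atomic<int> CountedBar::counter{0};

// Hypothetical helper: constructs `per_thread` objects on each of `threads`
// threads and returns the final counter value.
int construct_concurrently(int threads, int per_thread) {
    CountedBar::counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                CountedBar b;  // each construction bumps the shared counter once
                (void)b;
            }
        });
    }
    for (auto& w : workers) w.join();
    return CountedBar::counter.load();
}
```

With a plain int counter the same loop could lose increments; with the atomic, every construction is counted.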
I want to have a base class which has the purpose of registering it for a callback (and de-registering in destructor), the callback is a pure virtual function. Like this.
struct autoregister {
autoregister() { callback_manager.register(this); }
~autoregister() { callback_manager.deregister(this); }
virtual void call_me()=0;
};
But this seems unreliable to me, I suspect there are several race conditions in there. 1) When the callback_manager sees the pointer, the call_me is still uncallable, and it could take an arbitrary amount of time until the object finishes construction, 2) by the time deregister is called, the derived object destructor was called, so the callbacks should not be called.
One thing I considered was to check, inside callback_manager, whether the pointer's call_me is valid yet, but I can't find a standard-compliant way to get the address of call_me or anything similar. I also thought of comparing typeid(pointer) to typeid(autoregister*), but there might be an abstract class in between, which makes this unreliable. For example, with derived : public middle {}; and middle : public autoregister {};, middle's constructor can spend an hour (e.g. loading SQL or searching Google), and the callback manager sees that the object is not the base class, thinks the callback can be invoked, and boom. Can this be done?
Q1: are there other race conditions?
Q2: how to do this right (without race conditions, undefined behavior and other errors) without asking the derived class to call register manually?
Q3: how to check if a virtual function can be called on a pointer?
You should separate the callback from the registration handle. Nobody besides callback_manager needs the call_me method, so why keep it visible from outside? Use std::function as the callback type because it is very convenient: any callable can be converted to it, and lambdas are very handy. Return a Handle object from the callback registration method. The only thing Handle needs is a destructor, which removes the callback.
class Handle {
public:
explicit Handle(std::function<void()> deleter)
: deleter_(std::move(deleter))
{}
~Handle()
{
deleter_();
}
private:
std::function<void()> deleter_;
};
class Manager {
public:
typedef std::function<void()> Callback;
Handle subscribe(Callback callback) {
// NOTE: use mutex here if this method is accessed from multiple threads
callbacks_.push_back(std::move(callback));
// std::list iterators are bidirectional, so end() - 1 does not compile; use std::prev (from <iterator>)
auto itr = std::prev(callbacks_.end());
// NOTE: if Handle lifetime can exceed Manager lifetime, store callbacks_ in std::shared_ptr and capture a std::weak_ptr in lambda.
return Handle([this, itr]{
// NOTE: use mutex here if this method is accessed from multiple threads
callbacks_.erase(itr);
});
}
private:
std::list<Callback> callbacks_;
};
Q1: are there other race conditions?
Callback/Handle may outlive callback_manager and would then try to unsubscribe itself from a deleted object. This can be fixed either by policy (always unsubscribe everything before deleting the manager) or by using weak pointers.
There is also an obvious race if callback_manager is accessed from multiple threads: you need to guard the callback storage with a mutex.
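Both fixes (a mutex around the storage, and a weak pointer so a handle can outlive the manager) could be combined roughly as follows. This is a sketch under assumptions, not a drop-in implementation: SafeManager, SafeHandle, State and the demo functions are hypothetical names, and C++14 is assumed for std::make_unique.

```cpp
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>
#include <memory>
#include <mutex>
#include <utility>

// Move-only handle: runs its deleter once, when the owning handle is destroyed.
class SafeHandle {
public:
    explicit SafeHandle(std::function<void()> deleter) : deleter_(std::move(deleter)) {}
    SafeHandle(SafeHandle&& other) noexcept : deleter_(std::move(other.deleter_)) {
        other.deleter_ = nullptr;  // make sure the moved-from handle does nothing
    }
    SafeHandle(const SafeHandle&) = delete;
    SafeHandle& operator=(const SafeHandle&) = delete;
    ~SafeHandle() { if (deleter_) deleter_(); }
private:
    std::function<void()> deleter_;
};

class SafeManager {
public:
    using Callback = std::function<void()>;

    SafeHandle subscribe(Callback cb) {
        std::lock_guard<std::mutex> lock(state_->mutex);
        state_->callbacks.push_back(std::move(cb));
        auto it = std::prev(state_->callbacks.end());
        std::weak_ptr<State> weak = state_;
        return SafeHandle([weak, it] {
            if (auto state = weak.lock()) {  // skip if the manager is already gone
                std::lock_guard<std::mutex> lock(state->mutex);
                state->callbacks.erase(it);
            }
        });
    }

    // Caveat: callbacks run under the lock, so they must not destroy handles.
    void notify() {
        std::lock_guard<std::mutex> lock(state_->mutex);
        for (auto& cb : state_->callbacks) cb();
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(state_->mutex);
        return state_->callbacks.size();
    }

private:
    struct State {
        std::mutex mutex;
        std::list<Callback> callbacks;
    };
    std::shared_ptr<State> state_ = std::make_shared<State>();
};

// Demo: the callback is removed when its handle goes out of scope.
std::size_t demo_unsubscribe() {
    SafeManager manager;
    {
        SafeHandle handle = manager.subscribe([] {});
    }
    return manager.size();
}

// Demo: a handle that outlives its manager is destroyed without touching freed state.
bool demo_handle_outlives_manager() {
    auto manager = std::make_unique<SafeManager>();
    SafeHandle handle = manager->subscribe([] {});
    manager.reset();
    return true;
}
```

Keeping the list and its mutex in a shared State object is what lets the weak pointer detect that the manager has been destroyed.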
Q2: how to do this right (without race conditions, undefined behavior and other errors) without asking the derived class to call register manually?
See above.
Q3: how to check if a virtual function can be called on a pointer?
This is not possible.
In the constructor of your autoregister, since the object has not fully been constructed yet, the 'this' pointer is dangerous to pass out to the callback_manager. I would recommend a slightly different design.
struct callback {
virtual void call_me() = 0;
};
struct autoregister {
callback*const callback_;
autoregister(callback*const _callback)
: callback_(_callback) {
callback_manager.register(callback_);
}
~autoregister() {
callback_manager.deregister(callback_);
}
};
Q1: are there other race conditions?
Maybe. Your callback_manager must be synchronized, if multiple threads may use it. But my version of autoregister itself does not have race condition.
Q2: how to do this right (without race conditions, undefined behavior and other errors) without asking the derived class to call register manually?
The code above is how I think this can be done right.
Q3: how to check if a virtual function can be called on a pointer?
Not necessary in my code. But in general you could keep in the class a flag which is set false in the initialization list and set true when ready to be called.
A race condition is when two threads are trying to do something at the same time, and the outcome depends on the precise timing. I don't think there is a race condition in this snippet because this is only accessible to the thread executing the constructor. There may however be a race condition in callback_manager, but you haven't posted the code for that so I can't tell.
There is another issue here: Objects are constructed from the base to the most derived, so at the time autoregister's constructor is running, the virtual call_me cannot be called. See this FAQ entry. There is no way to check if a virtual function call will work apart from ensuring the class is fully constructed.
Any solution to this problem that works by inheritance can't ensure that the class being registered is fully constructed before the callback is registered, so the registration must be done externally to the class being registered. The best you can do is have some RAII wrapper which registers the object on construction and deregisters it on destruction, and perhaps force the objects to be created through a factory that handles registration.
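The RAII wrapper idea could look roughly like this. It is a sketch with assumed names: CallbackManager (with add/remove) stands in for the question's callback_manager, and Registered, Counter and demo_registered are hypothetical. The manager here is not synchronized.

```cpp
#include <cstddef>
#include <set>
#include <utility>

struct Callback {
    virtual ~Callback() = default;
    virtual void call_me() = 0;
};

// Hypothetical stand-in for callback_manager; not thread-safe as written.
class CallbackManager {
public:
    void add(Callback* cb) { callbacks_.insert(cb); }
    void remove(Callback* cb) { callbacks_.erase(cb); }
    void call_all() { for (Callback* cb : callbacks_) cb->call_me(); }
    std::size_t size() const { return callbacks_.size(); }
private:
    std::set<Callback*> callbacks_;
};

// Owns a T and registers it only after T's constructor has finished.
template <class T>
class Registered {
public:
    template <class... Args>
    explicit Registered(CallbackManager& manager, Args&&... args)
        : manager_(manager), object_(std::forward<Args>(args)...) {
        manager_.add(&object_);  // object_ is fully constructed at this point
    }
    // The destructor body runs before object_ is destroyed, so the callback
    // is deregistered while the object is still complete.
    ~Registered() { manager_.remove(&object_); }
    Registered(const Registered&) = delete;
    Registered& operator=(const Registered&) = delete;
    T& get() { return object_; }
private:
    CallbackManager& manager_;
    T object_;
};

struct Counter : Callback {
    int hits = 0;
    void call_me() override { ++hits; }
};

// Demo: the callback fires while registered and is gone after the wrapper dies.
int demo_registered() {
    CallbackManager manager;
    int hits = 0;
    {
        Registered<Counter> counter(manager);
        manager.call_all();
        hits = counter.get().hits;
    }
    manager.call_all();  // nothing is registered any more
    return hits + static_cast<int>(manager.size());
}
```

Because registration happens in the wrapper rather than in a base class, no derived-class constructor or destructor can be running while the pointer is visible to the manager.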
I think @Donghui Zhang is on the right track, but not quite there yet. Unfortunately, what he's done introduces its own set of pitfalls. For example, if you pass the address of a local object to autoregister's ctor, you can still register a callback object that immediately goes out of scope (but doesn't necessarily immediately get deregistered).
I also think it's questionable (at best) to define a callback interface using call_me as the member function to invoke when calling back. If you need to define a type that can be invoked like a function, C++ already defines a name for that function: operator(). I'm going to enforce that instead of call_me being present.
To do all this, I think you really want to use a template instead of inheritance:
template <class T>
class autoregister {
T t;
public:
template <class...Args>
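autoregister(Args && ... args) : t(std::forward<Args>(args)...) {
// std::is_callable was only a pre-C++17 draft name; the standardized trait is std::is_invocable (<type_traits>)
static_assert(std::is_invocable<T&>::value, "Error: callback must be callable");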
callback_manager.register(t);
}
~autoregister() { callback_manager.deregister(t); }
};
You'd use this something like this:
class f {
public:
virtual void operator()() { /* ... */ }
};
autoregister<f> a;
The static_assert ensures that the type you pass as the template parameter can be called like a function.
This also supports passing arguments through autoregister to the constructor for the object it contains, so you might have something like:
class F {
public:
F(int a, int b) { ... }
void operator()() {}
};
autoregister<F> f(1,2);
...and the 1, 2 will be passed through from autoregister to F when it's constructed. Also note that this doesn't attempt to enforce a specific signature for the callback function. If you were, for example, to modify your callback manager to do the callback as int r = callback(1);, then the code would only compile if the callback objects you registered could be invoked with an int argument, and returned an int (or something that could be implicitly converted to an int, anyway). The compiler will enforce the callback having a signature compatible with how it's called. The only big shortcoming here is that if you pass a type that can be called, but (for example) can't be called with the parameter(s) that the callback manager tries to pass, the error message(s) you get may not be as readable as you'd like.
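One way to make that error more readable is to check the expected signature explicitly. The sketch below assumes C++17 and the hypothetical int r = callback(1); call style mentioned above; has_int_callback_signature, invoke_checked, Good and NoArg are illustrative names.

```cpp
#include <type_traits>

// True if an lvalue of type T can be called with an int and the result
// converts to int, matching a call such as `int r = callback(1);`.
template <class T>
constexpr bool has_int_callback_signature =
    std::is_invocable_r<int, T&, int>::value;

struct Good {
    int operator()(int x) { return x + 1; }
};

struct NoArg {
    void operator()() {}
};

// Invokes the callback, failing with a clear message if the signature is wrong.
template <class T>
int invoke_checked(T& callback) {
    static_assert(has_int_callback_signature<T>,
                  "callback must be callable as int(int)");
    return callback(1);
}

// Demo: Good satisfies the check, so this compiles and returns 2.
int demo_invoke() {
    Good good;
    return invoke_checked(good);
}
```

Calling invoke_checked with a NoArg object would stop at the static_assert message instead of at an error deep inside the manager.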
There is a global variable called "pTrackerArray", which is used in Loki's SetLongevity function.
Declaration of pTrackerArray:
typedef std::list<LifetimeTracker*> TrackerArray;
extern LOKI_EXPORT TrackerArray* pTrackerArray;
Definition of SetLongevity:
template <typename T, typename Destroyer>
void SetLongevity(T* pDynObject, unsigned int longevity, Destroyer d)
{
using namespace Private;
// manage lifetime of stack manually
if(pTrackerArray==0)
pTrackerArray = new TrackerArray;
// For simplicity, the rest of code is omitted
...
}
Is it thread safe to use pTrackerArray as such in SetLongevity?
As shown, obviously not. However, if I am reading the rest of that file correctly, SetLongevity is, ultimately, only ever called from within a function that is itself properly wrapped in a mutex [provided you requested that the singleton be thread-safe, obviously]. So while that particular function has issues, its use is still perfectly safe.
However, the mutex they create in that base function is parameterized on the type of singleton you are creating, while that global pointer is shared between all singletons. So yes, it does appear as though two different Singleton objects in two different threads could both access that function at once, causing havoc.
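For illustration only, here is how a shared tracker list could be guarded by a single mutex that does not depend on the singleton type. This is not Loki's code: LifetimeTracker is reduced to a stub, and tracker_mutex, tracker_array, register_tracker and demo_register are hypothetical names. It assumes C++11, where function-local statics are initialized thread-safely.

```cpp
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

// Stub standing in for Loki's LifetimeTracker.
struct LifetimeTracker {
    unsigned int longevity = 0;
};
using TrackerArray = std::list<LifetimeTracker*>;

// One mutex shared by every caller, regardless of which singleton type it serves.
std::mutex& tracker_mutex() {
    static std::mutex m;
    return m;
}

// Function-local static replaces the "if (p == 0) p = new ..." check.
TrackerArray& tracker_array() {
    static TrackerArray trackers;
    return trackers;
}

void register_tracker(LifetimeTracker* tracker) {
    std::lock_guard<std::mutex> lock(tracker_mutex());
    tracker_array().push_back(tracker);
}

// Demo: registers one tracker per thread and returns how many were recorded.
std::size_t demo_register(int threads) {
    std::vector<LifetimeTracker> trackers(threads);
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&trackers, i] { register_tracker(&trackers[i]); });
    }
    for (auto& w : workers) w.join();
    std::lock_guard<std::mutex> lock(tracker_mutex());
    std::size_t count = tracker_array().size();
    tracker_array().clear();  // do not keep pointers to the local trackers
    return count;
}
```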
When writing C programs that need to share a file scope variable between the application and an interrupt routine/a thread/a callback routine, it is well-known that the variable must be declared volatile, or else the compiler may do incorrect optimizations. This is an example of what I mean:
int flag;
void some_interrupt (void)
{
flag = 1;
}
int main()
{
flag = 0;
...
/* <-- interrupt occurs here */
x = flag; /* BUG: the compiler doesn't realize that "flag" was changed
and sets x to 0 even though flag==1 */
}
To prevent the above bug, "flag" should have been declared as volatile.
My question is: how does this apply to C++ when creating a class containing a thread?
I have a class looking something like this:
class My_thread
{
private:
int flag;
static void thread_func (void* some_arg) // thread callback function
{
My_thread* this_ptr= (My_thread*)some_arg;
}
};
"some_arg" will contain a pointer to an instance of the class, so that each object of "My_thread" has its own thread. Through this pointer it will access member variables.
Does this mean that "this_ptr" must be declared as pointer-to-volatile data? Must "flag" be volatile as well? And if so, must I make all member functions that modify "flag" volatile?
I'm not interested in how a particular OS or compiler behaves, I am looking for a generic, completely portable solution.
EDIT: This question has nothing to do with thread-safety whatsoever!
The real code will have semaphores etc.
To clarify, I wish to avoid bugs caused by the compiler being unaware that a callback function may be called from sources outside the program itself, and therefore drawing incorrect conclusions about whether certain variables have been used or not. I know how to do this in C, as illustrated with the first example, but not in C++.
Well, that edit makes all the difference in the world. Semaphores introduce memory barriers. Those make volatile redundant. The compiler will always reload int flag after any operation on a semaphore.
Fred Larson already predicted this. volatile is insufficient in the absence of locks, and redundant in the presence of locks. That makes it useless for thread-safe programming.
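Where the toolchain supports C++11, the portable replacement for a volatile flag shared between threads is std::atomic. A minimal sketch, with demo_flag as a hypothetical name:

```cpp
#include <atomic>
#include <thread>

// One thread publishes a value and raises a flag; the other waits for the
// flag and then reads the value. No volatile is involved.
int demo_flag() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the shared flag

    std::thread producer([&] {
        payload = 42;
        // release: the write to payload becomes visible before the flag does
        ready.store(true, std::memory_order_release);
    });

    // acquire: once the flag reads true, the payload write is visible too
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = payload;

    producer.join();
    return seen;
}
```

The atomic both stops the compiler from caching the flag and provides the ordering that makes reading payload safe, which volatile alone does not promise.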
From the function pointer signature I guess you are using the posix thread implementation for threads. I assume you want to know how to start off a thread using this API. First consider using boost thread instead. If not an option, I usually go for something like the following to get somewhat of that cosy Java readability.
class Runnable {
public:
virtual void run() = 0;
};
class Thread : public Runnable {
public:
Thread();
Thread(Runnable *r);
void start();
void join();
pthread_t getPthread() const;
private:
static void *start_routine(void *object);
Runnable *runner;
pthread_t thread;
};
And then something like this in the start_routine function:
void* Thread::start_routine(void *object) {
Runnable *o = (Runnable *)object;
o->run();
pthread_exit(NULL);
return NULL;
}
Now access to fields of classes extending the Runnable or Thread class need not be volatile, as long as each object is only touched by the thread that runs it.
That said, sharing data between threads is more complex than using a volatile data member unfortunately if that is what you asked...
Read this article by Andrei Alexandrescu over at Dr. Dobbs, it might be relevant:
volatile - Multithreaded Programmer's Best Friend
From the intro to the article:
The volatile keyword was devised to prevent compiler optimizations that might render code incorrect in the presence of certain asynchronous events. For example, if you declare a primitive variable as volatile, the compiler is not permitted to cache it in a register -- a common optimization that would be disastrous if that variable were shared among multiple threads. So the general rule is, if you have variables of primitive type that must be shared among multiple threads, declare those variables volatile. But you can actually do a lot more with this keyword: you can use it to catch code that is not thread safe, and you can do so at compile time. This article shows how it is done; the solution involves a simple smart pointer that also makes it easy to serialize critical sections of code.
Some implementation of the fallback mechanism is given here for both Windows and Linux. Try this example:
typeReadFileCallback varCallback;
I was able to implement using that.
Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain English: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?
From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results, as in the first call, the argument would be initialized and the state would probably be different between method invocations.
Thread safety also has a great impact on this decision. Automatic variables inside a function are created on the stack of each thread, and as such are inherently thread safe. Any optimization that hoists those local variables so that they can be reused between invocations will complicate thread safety, and could even end up with a performance penalty due to contention that worsens overall performance.
Finally, if you want to keep the object between method invocations, I would still not make it a private member of the class (it is not part of the class's state) but rather an implementation detail: a static function variable, a global in an unnamed namespace in the compilation unit where doOperation is implemented, or a member of a PIMPL. The first two share the data across all objects, while the last shares it only across invocations on the same object. Users of your class do not care how you solve things, as long as you do it safely and document that the class is not thread safe.
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// else foo.cpp
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}
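If the reused object also has to stay safe when several threads call the method, a variation on the static function variable is to declare it thread_local (C++11), so each thread reuses its own instance. This is a different technique from the ones listed above, shown only as a sketch: the Bar definition, the doOperation signature and demo_thread_local are assumptions made for the example.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical Bar: scratch storage that is expensive to allocate repeatedly.
struct Bar {
    std::vector<int> scratch;
};

class Foo {
public:
    std::size_t doOperation(int n) {
        thread_local Bar reusable;   // one instance per thread, built on first use
        reusable.scratch.clear();    // reset state; the capacity is kept
        doIt(reusable, n);
        return reusable.scratch.size();
    }
private:
    void doIt(Bar& data, int n) {
        for (int i = 0; i < n; ++i) data.scratch.push_back(i);
    }
};

// Demo: two threads call the same Foo without interfering with each other.
bool demo_thread_local() {
    Foo foo;
    std::size_t a = 0, b = 0;
    std::thread t1([&] { for (int i = 0; i < 100; ++i) a = foo.doOperation(10); });
    std::thread t2([&] { for (int i = 0; i < 100; ++i) b = foo.doOperation(20); });
    t1.join();
    t2.join();
    return a == 10 && b == 20;
}
```

The explicit clear() is the reset cost mentioned earlier; whether reuse still wins after paying it is something to measure rather than assume.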
First of all, it depends on the problem being solved. If you need to persist the values of temporary objects between calls, you need a member variable. If you need to reinitialize them on each invocation, use local temporary variables. It is a question of the task at hand, not of being right or wrong.
Temporary variables construction and destruction will take some extra time (compared to just persisting a member variable) depending on how complex the temporary variables classes are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling, don't try to optimize it "just in case".
I'd declare _data as temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try Prototype pattern if constructing and destructing are really performance killers.
If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member: every Foo then has-a Bar.
There are multiple scenarios where it might not be correct, e.g.
1. If you have multiple threads performing doSomething, would they all need separate Bar instances, or could they share a single one?
2. Would it be bad if state from one computation carried over to the next computation?
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.
Like a lot of coding answers it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single threaded environment, if the cost of creating Bar is low, then I think I'd go for Solution 1.
You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.
Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething(Bar()); // pass the data in; the declared function takes a Bar argument
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.
If it's really that temporary that costs you the time, then I would say there is nothing wrong with including it in your class as a member. But note that this may make your function thread-unsafe if used without proper synchronization. Once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition with a member being mutable, you can immediately assume that it doesn't account for the value of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.
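A small sketch of that last point, using a hypothetical Bar and a hypothetical const member function sumUpTo; neither comes from the question:

```cpp
#include <vector>

// Hypothetical scratch type standing in for Bar.
struct Bar {
    std::vector<int> scratch;
};

class Foo {
public:
    // Logically const: the result depends only on n, yet the function is
    // allowed to modify the mutable scratch member.
    int sumUpTo(int n) const {
        _data.scratch.clear();
        for (int i = 1; i <= n; ++i) _data.scratch.push_back(i);
        int total = 0;
        for (int value : _data.scratch) total += value;
        return total;
    }
private:
    mutable Bar _data;  // scratch space, not part of Foo's observable value
};

// Demo: the function can be called on a const object.
int demo_mutable() {
    const Foo foo{};
    return foo.sumUpTo(4);
}
```

Note that a const member function that writes to a mutable member is no longer safe to call concurrently on the same object without synchronization.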
If you want Bar to be initialised only once (due to its cost in this case), then I'd move it to a singleton pattern.
Is there a pattern that I may use for calling the required initialization and cleanup routines of an underlying (C) library? In my case, I would like to create the wrapper class so that it can be composed into other objects. The problem is that, when I destroy the wrapper class, the cleanup routines of the underlying library are called. That's fine until I instantiate multiple objects of my wrapper class. My question is what is the best way to really handle this situation? A static reference counter comes to mind, but I wanted to know if there were other potentially better options and the trades involved.
If the initialization can be called before main starts, and cleanup called after main ends, this little trick (hack?) might work for you:
#include <iostream>
// C library initialization routine
void init() {std::cout << "init\n";}
// C library cleanup routine
void fini() {std::cout << "fini\n";}
// Put this in only one of your *.cpp files
namespace // anonymous
{
struct Cleaner
{
Cleaner() {init();}
~Cleaner() {fini();}
};
Cleaner cleaner;
} // anonymous namespace (no trailing semicolon needed)
int main()
{
std::cout << "using library\n";
}
Output:
init
using library
fini
It uses (abuses?) the fact that constructors for static objects are called before main, and that destructors are called after main. It's like RAII for the whole program.
Not everything has to be a class. The Singleton pattern would let you turn this into a class, but it's really not buying you anything over global functions:
bool my_library_init();
void my_library_shutdown();
The first call returns true if the library was successfully initialized, the second just quietly does whatever needs to be done and exits. You can add whatever reference counting or thread tracking type stuff behind these interfaces you like.
Also, don't neglect the possibility that your library may be able to do all of this transparently. When the first library function is called, could it detect that it is not initialized yet and set everything up before doing the work? For shutdown, just register the resources to be destroyed with a global object, so they are destroyed when the program exits. Doing it this way is certainly trickier, but may be worth the usability benefit to your library's callers.
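The transparent approach might be sketched like this, assuming C++11. lib_init, lib_fini and lib_do_work are hypothetical stand-ins for the real library routines, and the counter exists only so the behaviour can be observed.

```cpp
#include <cstdlib>

// Stand-ins for the C library's init and cleanup routines.
int g_init_calls = 0;
void lib_init() { ++g_init_calls; }
void lib_fini() {}

// Runs lib_init() exactly once, on first use, and schedules lib_fini() to
// run at program exit. Initialization of a function-local static is
// thread-safe since C++11.
void ensure_initialized() {
    static const bool initialized = [] {
        lib_init();
        std::atexit(lib_fini);
        return true;
    }();
    (void)initialized;
}

// Every public entry point calls ensure_initialized() before doing its work.
int lib_do_work(int x) {
    ensure_initialized();
    return x * 2;
}

int init_call_count() { return g_init_calls; }
```

Callers never see an init or shutdown function; the cost is one cheap check per library call.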
I have seen a lot of Singleton talk, so I can only recommend a look at Alexandrescu's work.
However, I am not sure that you really need a Singleton there. If you do, you assume that all your calls are going to share the state. Is that the case? When you call the library through a different instance of Wrapper, do you really want to get the state in which the last call left it?
If not, you need to serialize the access, and reinitialize the data each time.
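class Wrapper
{
public:
// Mutex, lock() and unlock() are placeholders for whatever locking primitives you use
Wrapper() { lock(mutex()); do_init_c_library(); }
~Wrapper() { do_clean_up_c_library(); unlock(mutex()); }
private:
// Renamed from Mutex(): a member function named like its return type hides the type name
static Mutex& mutex() { static Mutex MMutex; return MMutex; }
}; // class Wrapper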
Quite simple... though you need to make sure that Mutex is initialized correctly (once) and live until it's not needed any longer.
Boost offers facilities for the once issue, and since we use a stack based approach with MMutex it should not go awry... I think (hum).
If you can change the library implementation, you could have each call to one of the library's functions access a singleton which is created on first use.
Or you put a global/static variable into the library which initializes it during construction and shuts it down during destruction. (That might become annoying if the library uses global variables itself and the order of initialization/shutdown conflicts with them. Also, linkers might decide to eliminate unreferenced globals...)
Otherwise, I don't see how you want to avoid reference counting. (Note, however, that it has the drawback of possibly creating multiple init/shutdown cycles during the program's lifetime.)
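A reference-counted wrapper along those lines could look like the following sketch. lib_init and lib_fini are hypothetical stand-ins for the C library's routines (they only count calls here), and LibraryUser and demo_refcount are illustrative names.

```cpp
#include <mutex>

// Stand-ins for the C library's init and cleanup routines.
int g_init_calls = 0;
int g_fini_calls = 0;
void lib_init() { ++g_init_calls; }
void lib_fini() { ++g_fini_calls; }

// Each live LibraryUser holds one reference to the library. The first one
// initializes it and the last one shuts it down.
class LibraryUser {
public:
    LibraryUser() {
        std::lock_guard<std::mutex> lock(mutex());
        if (count()++ == 0) lib_init();
    }
    LibraryUser(const LibraryUser&) : LibraryUser() {}            // a copy is one more user
    LibraryUser& operator=(const LibraryUser&) { return *this; }  // user count unchanged
    ~LibraryUser() {
        std::lock_guard<std::mutex> lock(mutex());
        if (--count() == 0) lib_fini();
    }
private:
    static std::mutex& mutex() { static std::mutex m; return m; }
    static int& count() { static int c = 0; return c; }
};

// Demo: three overlapping users cause one init/fini pair; a later user
// causes a second pair, which is the drawback noted above.
bool demo_refcount() {
    const int init_before = g_init_calls;
    const int fini_before = g_fini_calls;
    {
        LibraryUser a;
        LibraryUser b;
        LibraryUser c(a);
    }
    const bool one_cycle = (g_init_calls - init_before == 1) &&
                           (g_fini_calls - fini_before == 1);
    {
        LibraryUser d;
    }
    return one_cycle && (g_init_calls - init_before == 2) &&
           (g_fini_calls - fini_before == 2);
}
```

Because the wrapper has no other state, it can be composed into other objects freely; only the count decides when the library routines run.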
If your set of C library routines is not too large, you can try combining the Singleton and Facade patterns so that C library routines are only invoked via the Facade. The Facade ensures the initialization and cleanup of the C library. The Singleton ensures that there is only one instance of the Facade.
#include <iostream>
// C library initialization and cleanup routines
void init() {std::cout << "init\n";}
void fini() {std::cout << "fini\n";}
// C library routines
void foo() {std::cout << "foo\n";}
void bar() {std::cout << "bar\n";}
class Facade // Singleton
{
public:
void foo() {::foo();}
void bar() {::bar();}
static Facade& instance() {static Facade instance; return instance;}
private:
Facade() {init();}
~Facade() {fini();}
};
// Shorthand for Facade::instance()
inline Facade& facade() {return Facade::instance();}
int main()
{
facade().foo();
facade().bar();
}
Output:
init
foo
bar
fini