When writing C programs that need to share a file scope variable between the application and an interrupt routine/a thread/a callback routine, it is well-known that the variable must be declared volatile, or else the compiler may do incorrect optimizations. This is an example of what I mean:
int flag;
void some_interrupt (void)
{
    flag = 1;
}
int main()
{
    int x;
    flag = 0;
    ...
    /* <-- interrupt occurs here */
    x = flag; /* BUG: the compiler doesn't realize that "flag" was changed
                 and sets x to 0 even though flag==1 */
}
To prevent the above bug, "flag" should have been declared as volatile.
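As a rough illustration of that fix, here is a minimal sketch that uses a signal handler as a stand-in for the interrupt routine. The names on_signal, read_flag_after_signal and demo_signal_flag are made up for this example; volatile std::sig_atomic_t is the type the standard guarantees can be safely written from a signal handler.

```cpp
#include <cassert>
#include <csignal>

// volatile forces the compiler to re-read flag on every access;
// sig_atomic_t guarantees the write in the handler is not torn.
volatile std::sig_atomic_t flag = 0;

// Stand-in for some_interrupt(): runs outside main's normal control flow.
extern "C" void on_signal(int)
{
    flag = 1;
}

// Hypothetical helper: installs the handler, triggers it, reads the flag.
int read_flag_after_signal()
{
    flag = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);            // the handler runs before raise() returns
    int x = flag;                  // re-read from memory because of volatile
    std::signal(SIGINT, SIG_DFL);  // restore the default disposition
    return x;
}

void demo_signal_flag()
{
    assert(read_flag_after_signal() == 1);
}
```

This only covers the single-threaded interrupt/signal case from the C example; it says nothing about sharing data between threads.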
My question is: how does this apply to C++ when creating a class containing a thread?
I have a class looking something like this:
class My_thread
{
private:
int flag;
static void thread_func (void* some_arg) // thread callback function
{
My_thread* this_ptr= (My_thread*)some_arg;
}
};
"some_arg" will contain a pointer to an instance of the class, so that each object of "My_thread" has its own thread. Through this pointer it will access member variables.
Does this mean that "this_ptr" must be declared as pointer-to-volatile data? Must "flag" be volatile as well? And if so, must I make all member functions that modify "flag" volatile?
I'm not interested in how a particular OS or compiler behaves, I am looking for a generic, completely portable solution.
EDIT: This question has nothing to do with thread-safety whatsoever!
The real code will have semaphores etc.
To clarify, I want to avoid bugs caused by the compiler not knowing that a callback function may be called from outside the program itself, and therefore drawing incorrect conclusions about whether certain variables have been used. I know how to do this in C, as illustrated by the first example, but not in C++.
Well, that edit makes all the difference in the world. Semaphores introduce memory barriers. Those make volatile redundant. The compiler will always reload int flag after any operation on a semaphore.
Fred Larson already predicted this. volatile is insufficient in the absence of locks, and redundant in the presence of locks. That makes it useless for thread-safe programming.
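To make the "redundant in the presence of locks" point concrete, here is a minimal sketch. It assumes a C++11 compiler and uses std::thread and std::mutex instead of the asker's OS thread API and semaphores, and the member names are invented for the example. The flag is a plain int; every access goes through the lock, which is what provides the ordering.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

class My_thread
{
public:
    void start() { thread_ = std::thread(&My_thread::thread_func, this); }
    void join()  { thread_.join(); }

    int flag()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return flag_;
    }

private:
    void thread_func()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        flag_ = 1;
    }

    std::mutex  mutex_;
    int         flag_ = 0;  // deliberately not volatile
    std::thread thread_;
};

void demo_locked_flag()
{
    My_thread t;
    t.start();
    t.join();
    assert(t.flag() == 1);
}
```

Neither the member, the this pointer, nor the member functions need a volatile qualifier here.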
From the function pointer signature I guess you are using the POSIX thread implementation for threads. I assume you want to know how to start off a thread using this API. First consider using Boost.Thread instead. If that is not an option, I usually go for something like the following to get somewhat of that cosy Java readability.
class Runnable {
public:
virtual void run() = 0;
};
class Thread : public Runnable {
public:
Thread();
Thread(Runnable *r);
void start();
void join();
pthread_t getPthread() const;
private:
static void *start_routine(void *object);
Runnable *runner;
pthread_t thread;
};
And then something like this in the start_routine function:
void* Thread::start_routine(void *object) {
Runnable *o = (Runnable *)object;
o->run();
pthread_exit(NULL);
return NULL;
}
Now access to fields of classes extending the Runnable or Thread class need not be volatile since they are thread-local.
That said, sharing data between threads is more complex than using a volatile data member unfortunately if that is what you asked...
Read this article by Andrei Alexandrescu over at Dr. Dobbs, it might be relevant:
volatile - Multithreaded Programmer's Best Friend
From the intro to the article:
The volatile keyword was devised to prevent compiler optimizations that might render code incorrect in the presence of certain asynchronous events. For example, if you declare a primitive variable as volatile, the compiler is not permitted to cache it in a register -- a common optimization that would be disastrous if that variable were shared among multiple threads. So the general rule is, if you have variables of primitive type that must be shared among multiple threads, declare those variables volatile. But you can actually do a lot more with this keyword: you can use it to catch code that is not thread safe, and you can do so at compile time. This article shows how it is done; the solution involves a simple smart pointer that also makes it easy to serialize critical sections of code.
Some implementation of the fallback mechanism is given here for both Windows and Linux. Try this example:
typeReadFileCallback varCallback;
I was able to implement using that.
I have a pre-existing source code in C similar to below.
bool getFlag(int param)
{
static bool flag = false;
if(param == 1)
flag = true;
return flag;
}
I have written the C++ version of the same as below.
class MyClass
{
public:
static bool getFlag(int param)
{
if(param == 1)
flag = true;
return flag;
}
private:
static bool flag;
};
What is the difference between the above two code snippets?
Does the C++ code above have any advantage over the C code?
It's somewhat similar. Anyone who interacts with any instance of your class MyClass will interact with the same variable flag.
The same is true of your function. Any caller will interact with the same static variable.
However, there is some ambiguity in how they behave in a multi-threaded environment, depending on your compiler (are you compiling pure C functions, or mixed C/C++ with a new compiler?).
Basically, initialization wasn't thread safe before C++11, and you'd get data races if two threads reached the initialization (or subsequent modification) of the local static variable. This existed all the way up until Visual Studio 2015 on the Microsoft side.
As such, on modern compilers, C++ behaves differently.
https://stackoverflow.com/a/11711991/128581
If control enters the declaration concurrently while the variable is
being initialized, the concurrent execution shall wait for completion
of the initialization.
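A small sketch of what that guarantee means in practice, assuming a C++11 (or later) compiler. Expensive, instance and count_constructions are names invented for the example; the counter only exists to make the single initialization observable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> constructions(0);

struct Expensive
{
    Expensive() { ++constructions; }
};

Expensive& instance()
{
    // C++11: initialized exactly once, even if several threads arrive here together
    static Expensive e;
    return e;
}

// Calls instance() from several threads and reports how often the
// constructor ran.
int count_constructions(int thread_count)
{
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i)
        threads.emplace_back([] { instance(); });
    for (std::thread& t : threads)
        t.join();
    return constructions.load();
}

void demo_static_init()
{
    assert(count_constructions(8) == 1);
}
```

Note that the guarantee covers the initialization only. A later write such as flag = true in getFlag() is still a data race if several threads call it concurrently without a lock or an atomic.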
Without the whole context it's hard to say which is better; it's just a matter of abstraction. Even at the assembly level you can't tell the difference: the static variable goes to the .bss segment in both cases and the logic is exactly the same. Because your method is static (and assuming it's the only method you have and call), it doesn't use the hidden "this" argument, and no constructor is called at any point.
You can only tell the difference by compiling the code in debug mode and interpreting the generated mangled names.
I'm trying to develop a thread abstraction (POSIX thread and thread from the Windows API), and I would very much like it to be able to start them with a method pointer, and not a function pointer.
What I would like to do is an abstraction of a thread as a class with a pure virtual method "runThread", which would be implemented later in each derived thread class.
I don't know yet about the Windows thread, but to start a POSIX thread, you need a function pointer, and not a method pointer.
And I can't manage to find a way to associate a method with an instance so it could work as a function.
I probably just can't find the keywords (and I've been searching a lot), I think it's pretty much what Boost::Bind() does, so it must exist.
Can you help me?
Don't do this. Use boost::thread.
With boost::thread you can start threads with any functor of signature void(), so you can pass a function object directly, or bind a member function to an instance with boost::bind, like in
struct MyAwesomeThread
{
    void operator()()
    {
        // Do something with the data
    }
    // Add constructors, and perhaps a way to get
    // a result back
private:
    // some data here
};
MyAwesomeThread t(parameters);
// bind the member function to the existing object t (no copy is made)
boost::thread worker(boost::bind(&MyAwesomeThread::operator(), &t));
EDIT: If you really want to abstract POSIX threads (it is not hard), you can do (I leave you the initialization of the pthread_attr)
class thread
{
    virtual void run() = 0; // private method
    // pthread_create expects a function of type void* (*)(void*)
    static void* run_thread_(void* ptr)
    {
        static_cast<thread*>(ptr)->run();
        return NULL;
    }
    pthread_t thread_;
    pthread_attr_t attr_; // must be initialized with pthread_attr_init before launch()
public:
    void launch()
    {
        pthread_create(&thread_, &attr_, &thread::run_thread_, static_cast<void*>(this));
    }
};
but boost::thread is portable, flexible and very simple to use.
You really should use Boost.Thread. But if you can't and the call to start a thread allows you to pass a parameter to your thread function, a common idiom is to have a stand-alone or static member function which casts the parameter to an object pointer. e.g.
class Thread {
public:
    void start() { start_thread(_work, this); } // whatever call starts a thread
    void work() {} // does thread work
private:
    // static trampoline: recovers the object from the opaque parameter
    static void _work(void* param) {
        static_cast<Thread*>(param)->work();
    }
};
You need a C++ API for threading. Something like boost::thread (which is pretty much the same API that will be in the new C++). The OS thread APIs are generally in C and you simply CAN'T pass non-static member function pointers to them, nor functors (which is what boost::bind creates).
Well, it's actually an exercise, and I don't have the right to use anything (not even Boost).
Of course I don't HAVE to do it that way (I only have to develop a thread abstraction), I would just like to: I used the SFML thread abstraction once and I just loved it. It's such a sexy way to deal with it.
Ferruccio's way seems good!
If the work() method is pure virtual, it can be implemented in any class implementing the thread abstraction... and it would work just fine, right? (I'm not quite sure if it will, but according to my basic knowledge of C++, I guess it should?)
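For what it's worth, here is a sketch of how that might look with plain POSIX threads and no Boost. ThreadBase, trampoline, Counter and run_counter are names made up for the example, and default thread attributes (NULL) are assumed. Strictly speaking pthread_create expects a function with C linkage; passing a static member function is not guaranteed by the standard but is accepted by the mainstream compilers.

```cpp
#include <cassert>
#include <pthread.h>

class ThreadBase
{
public:
    virtual ~ThreadBase() {}

    // Returns true if the thread was created.
    bool start()
    {
        return pthread_create(&handle_, NULL, &ThreadBase::trampoline, this) == 0;
    }

    void join() { pthread_join(handle_, NULL); }

protected:
    virtual void runThread() = 0;  // implemented by each derived class

private:
    // Static member with the signature pthread_create expects; it
    // recovers the object and dispatches to the virtual method.
    static void* trampoline(void* self)
    {
        static_cast<ThreadBase*>(self)->runThread();
        return NULL;
    }

    pthread_t handle_;
};

class Counter : public ThreadBase
{
public:
    Counter() : total(0) {}
    int total;

protected:
    virtual void runThread()
    {
        for (int i = 1; i <= 10; ++i)
            total += i;
    }
};

int run_counter()
{
    Counter c;
    if (!c.start())
        return -1;
    c.join();
    return c.total;  // safe to read once join() has returned
}

void demo_thread_base()
{
    assert(run_counter() == 55);
}
```

The virtual call works because the object is fully constructed by the time start() is called; starting the thread from the base class constructor would not be safe for that reason.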
Given:
class Foo {
Foo() {};
};
class Bar {
static int counter;
Bar() { ++counter; }
};
It's clear that Foo::Foo is thread safe whereas Bar::Bar is not.
Furthermore, if a function is written in such a way that it's not thread-safe, then putting it in a constructor clearly makes that constructor not thread-safe.
However, are there extra gotchas that I need to worry about constructors? I.e. a piece of code with mutex/locks such that if it was in a function body, it would be thread safe, but if I stuck it in a constructor, based on the complexity of C++'s constructors, weird things happen and it's no longer thread safe?
Thanks!
Edit: you can assume I'm using g++.
I would avoid any static values in an object used by a thread.
Why not pass in the value needed as a parameter for the constructor?
Or actually, put a mutex around the constructor in your thread. I wouldn't let the other classes be responsible for that.
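If the static counter has to stay, one way to make the constructor safe is sketched below. Note that it swaps the suggested mutex for std::atomic<int>, which is enough when the shared state is a single counter; it assumes C++11, and construct_in_parallel is a helper invented for the demonstration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class Bar
{
public:
    Bar() { ++counter; }  // atomic increment: no lost updates
    static std::atomic<int> counter;
};

std::atomic<int> Bar::counter(0);

// Constructs per_thread objects on each of thread_count threads and
// returns the final value of the counter.
int construct_in_parallel(int thread_count, int per_thread)
{
    Bar::counter = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                Bar b;
                (void)b;  // only the constructor's side effect matters
            }
        });
    }
    for (std::thread& th : threads)
        th.join();
    return Bar::counter.load();
}

void demo_atomic_counter()
{
    assert(construct_in_parallel(4, 1000) == 4000);
}
```

With a plain static int the final count could come out lower than 4000, because concurrent increments can overwrite each other.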
Suppose you have the following code:
int main(int argc, char** argv) {
Foo f;
while (true) {
f.doSomething();
}
}
Which of the following two implementations of Foo are preferred?
Solution 1:
class Foo {
private:
void doIt(Bar& data);
public:
void doSomething() {
Bar _data;
doIt(_data);
}
};
Solution 2:
class Foo {
private:
Bar _data;
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
In plain English: if I have a class with a method that gets called very often, and this method defines a considerable amount of temporary data (either one object of a complex class, or a large number of simple objects), should I declare this data as private members of the class?
On the one hand, this would save the time spent on constructing, initializing and destructing the data on each call, improving performance. On the other hand, it tramples on the "private member = state of the object" principle, and may make the code harder to understand.
Does the answer depend on the size/complexity of class Bar? What about the number of objects declared? At what point would the benefits outweigh the drawbacks?
From a design point of view, using temporaries is cleaner if that data is not part of the object state, and should be preferred.
Never make design choices on performance grounds before actually profiling the application. You might just discover that you end up with a worse design that is actually not any better than the original design performance wise.
To all the answers that recommend to reuse objects if construction/destruction cost is high, it is important to remark that if you must reuse the object from one invocation to another, in many cases the object must be reset to a valid state between method invocations and that also has a cost. In many such cases, the cost of resetting can be comparable to construction/destruction.
If you do not reset the object state between invocations, the two solutions could yield different results: with a local variable the argument is freshly initialized on every call, whereas a reused object carries over whatever state the previous invocation left behind.
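A tiny sketch of that difference, with std::vector<int> standing in for Bar and a free function doIt that returns the element count so the effect is visible (both are assumptions for the example; the original doIt is a private member).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Bar;  // stand-in for the real Bar

// Appends three values and reports how many elements data holds.
std::size_t doIt(Bar& data)
{
    for (int i = 0; i < 3; ++i)
        data.push_back(i);
    return data.size();
}

class FooLocal  // Solution 1: fresh Bar on every call
{
public:
    std::size_t doSomething() { Bar data; return doIt(data); }
};

class FooMember  // Solution 2: Bar reused across calls, never reset
{
public:
    std::size_t doSomething() { return doIt(_data); }
private:
    Bar _data;
};

class FooMemberReset  // Solution 2 with an explicit reset on each call
{
public:
    std::size_t doSomething() { _data.clear(); return doIt(_data); }
private:
    Bar _data;
};

void demo_state_carry_over()
{
    FooLocal a;
    FooMember b;
    FooMemberReset c;
    assert(a.doSomething() == 3 && a.doSomething() == 3);
    assert(b.doSomething() == 3 && b.doSomething() == 6);  // state leaked
    assert(c.doSomething() == 3 && c.doSomething() == 3);
}
```

The reset in FooMemberReset is the extra cost mentioned above: it has to be paid on every call to get the same behaviour as the local variable.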
Thread safety also has a great impact on this decision. Automatic variables inside a function are created on the stack of each thread, and as such are inherently thread safe. Any optimization that hoists those local variables so that they can be reused between invocations will complicate thread safety, and could even end up with a performance penalty due to contention that worsens overall performance.
Finally, if you want to keep the object between method invocations I would still not make it a private member of the class (it is not part of the class's state) but rather an implementation detail: a static function variable, a global in an unnamed namespace in the compilation unit where doOperation is implemented, or a member of a PIMPL. The first two share the data among all objects, while the last shares it only among invocations on the same object. Users of your class do not care how you solve things, as long as you do it safely and document that the class is not thread safe.
// foo.h
class Foo {
public:
void doOperation();
private:
void doIt( Bar& data );
};
// foo.cpp
void Foo::doOperation()
{
static Bar reusable_data;
doIt( reusable_data );
}
// else foo.cpp
namespace {
Bar reusable_global_data;
}
void Foo::doOperation()
{
doIt( reusable_global_data );
}
// pimpl foo.h
class Foo {
public:
void doOperation();
private:
class impl_t;
boost::scoped_ptr<impl_t> impl;
};
// foo.cpp
class Foo::impl_t {
private:
Bar reusable;
public:
void doIt(); // uses this->reusable instead of argument
};
void Foo::doOperation() {
impl->doIt();
}
First of all it depends on the problem being solved. If you need to persist the values of temporary objects between calls you need a member variable. If you need to reinitialize them on each invocation, use local temporary variables. It's a question of the task at hand, not of being right or wrong.
Temporary variables construction and destruction will take some extra time (compared to just persisting a member variable) depending on how complex the temporary variables classes are and what their constructors and destructors have to do. Deciding whether the cost is significant should only be done after profiling, don't try to optimize it "just in case".
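A rough sketch of how such a measurement could be set up, assuming C++11 and std::chrono. The Bar stand-in, the workload in doIt and the helper names are all invented for the example; a real profile should of course use the actual Bar and a proper profiler or benchmark harness.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

typedef std::vector<int> Bar;  // stand-in for an expensive-to-build Bar

// Fills the buffer and sums it; the same work in both variants.
long long doIt(Bar& data)
{
    data.assign(1024, 1);
    long long sum = 0;
    for (std::size_t i = 0; i < data.size(); ++i)
        sum += data[i];
    return sum;
}

// Solution 1: construct and destroy a Bar on every call.
long long run_with_local(int calls)
{
    long long total = 0;
    for (int i = 0; i < calls; ++i) {
        Bar data;
        total += doIt(data);
    }
    return total;
}

// Solution 2: reuse one Bar across calls.
long long run_with_reused(int calls)
{
    Bar data;
    long long total = 0;
    for (int i = 0; i < calls; ++i)
        total += doIt(data);
    return total;
}

// Wall-clock seconds taken by f().
template <typename F>
double seconds(F f)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    f();
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

void demo_timing()
{
    assert(run_with_local(1000) == run_with_reused(1000));
    double local  = seconds([] { run_with_local(100000); });
    double reused = seconds([] { run_with_reused(100000); });
    assert(local >= 0.0 && reused >= 0.0);  // compare the two numbers yourself
}
```

The timings depend on the compiler, the optimization level and the machine, so no particular result is implied here; an optimizer may also remove work whose result is unused.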
I'd declare _data as temporary variable in most cases. The only drawback is performance, but you'll get way more benefits. You may want to try Prototype pattern if constructing and destructing are really performance killers.
If it is semantically correct to preserve a value of Bar inside Foo, then there is nothing wrong with making it a member: in that case every Foo has-a Bar.
There are multiple scenarios where it might not be correct, e.g.
1. If you have multiple threads performing doSomething, would they all need separate Bar instances, or could they accept a single one?
2. Would it be bad if state from one computation carried over to the next computation?
Most of the time, issue 2 is the reason to create local variables: you want to be sure to start from a clean state.
Like a lot of coding answers it depends.
Solution 1 is a lot more thread-safe. So if doSomething were being called by many threads I'd go for Solution 1.
If you're working in a single threaded environment and the cost of creating the Bar object is high, then I'd go for Solution 2.
In a single-threaded environment, if the cost of creating Bar is low, then I think I'd go for Solution 1.
You have already considered "private member=state of the object" principle, so there is no point in repeating that, however, look at it in another way.
A bunch of methods, say a, b, and c take the data "d" and work on it again and again. No other methods of the class care about this data. In this case, are you sure a, b and c are in the right class?
Would it be better to create another smaller class and delegate, where d can be a member variable? Such abstractions are difficult to think of, but often lead to great code.
Just my 2 cents.
Is that an extremely simplified example? If not, what's wrong with doing it this
void doSomething(Bar data);
int main() {
while (true) {
doSomething(Bar());
}
}
way? If doSomething() is a pure algorithm that needs some data (Bar) to work with, why would you need to wrap it in a class? A class is for wrapping a state (data) and the ways (member functions) to change it.
If you just need a piece of data then use just that: a piece of data. If you just need an algorithm, then use a function. Only if you need to keep a state (data values) between invocations of several algorithms (functions) working on them, a class might be the right choice.
I admit that the borderlines between these are blurred, but IME they make a good rule of thumb.
If it's really that temporary that costs you the time, then I would say there is nothing wrong with including it in your class as a member. But note that this may make your function thread-unsafe if used without proper synchronization. Once again, this depends on the use of _data.
I would, however, mark such a variable as mutable. If you read a class definition with a member being mutable, you can immediately assume that it doesn't account for the value of its parent object.
class Foo {
private:
mutable Bar _data;
private:
void doIt(Bar& data);
public:
void doSomething() {
doIt(_data);
}
};
This will also make it possible to use _data as a mutable entity inside a const function - just like you could use it as a mutable entity if it was a local variable inside such a function.
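A short sketch of that last point, with std::vector<int> standing in for Bar and a made-up summing workload inside doSomething (both are assumptions for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<int> Bar;  // stand-in for the real Bar

class Foo
{
public:
    // const: the logical value of Foo does not change
    int doSomething(int n) const
    {
        _data.clear();  // allowed only because _data is mutable
        for (int i = 1; i <= n; ++i)
            _data.push_back(i);
        int sum = 0;
        for (std::size_t i = 0; i < _data.size(); ++i)
            sum += _data[i];
        return sum;
    }

private:
    mutable Bar _data;  // scratch buffer, not part of Foo's value
};

void demo_mutable_scratch()
{
    Foo f;
    const Foo& cf = f;  // only const access from here on
    assert(cf.doSomething(4) == 10);
    assert(cf.doSomething(3) == 6);
}
```

Keep in mind that a const member function that writes to a mutable member is not safe to call concurrently on the same object without synchronization.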
If you want Bar to be initialised only once (due to cost in this case), then I'd move it to a singleton pattern.
Is there a pattern that I may use for calling the required initialization and cleanup routines of an underlying (C) library? In my case, I would like to create the wrapper class so that it can be composed into other objects. The problem is that, when I destroy the wrapper class, the cleanup routines of the underlying library are called. That's fine until I instantiate multiple objects of my wrapper class. My question is: what is the best way to handle this situation? A static reference counter comes to mind, but I wanted to know if there were other, potentially better, options and the trade-offs involved.
If the initialization can be called before main starts, and cleanup called after main ends, this little trick (hack?) might work for you:
#include <iostream>
// C library initialization routine
void init() {std::cout << "init\n";}
// C library cleanup routine
void fini() {std::cout << "fini\n";}
// Put this in only one of your *.cpp files
namespace // anonymous
{
struct Cleaner
{
Cleaner() {init();}
~Cleaner() {fini();}
};
Cleaner cleaner;
} // anonymous namespace
int main()
{
std::cout << "using library\n";
}
Output:
init
using library
fini
It uses (abuses?) the fact that constructors for static objects are called before main, and that destructors are called after main. It's like RAII for the whole program.
Not everything has to be a class. The Singleton pattern would let you turn this into a class, but it's really not buying you anything over global functions:
bool my_library_init();
void my_library_shutdown();
The first call returns true if the library was successfully initialized, the second just quietly does whatever needs to be done and exits. You can add whatever reference counting or thread tracking type stuff behind these interfaces you like.
Also, don't neglect the possibility that your library may be able to do all of this transparently. When the first library function is called, could it detect that it is not initialized yet and set everything up before doing the work? For shutdown, just register the resources to be destroyed with a global object, so they are destroyed when the program exits. Doing it this way is certainly trickier, but may be worth the usability benefit to your library's callers.
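A minimal sketch of that transparent approach, assuming C++11 so that the function-local static is initialized safely. lib_init, lib_fini and lib_compute are hypothetical stand-ins for the real C library routines, and the counters only exist to make the behaviour observable.

```cpp
#include <cassert>

// Hypothetical stand-ins for the C library's routines.
int init_calls = 0;
int fini_calls = 0;
void lib_init() { ++init_calls; }
void lib_fini() { ++fini_calls; }
int  lib_compute(int x) { return x * 2; }

namespace
{
    // Ties the library's lifetime to one object with static storage.
    struct LibraryGuard
    {
        LibraryGuard()  { lib_init(); }
        ~LibraryGuard() { lib_fini(); }
    };

    void ensure_initialized()
    {
        // Constructed on the first call only; destroyed at program exit.
        static LibraryGuard guard;
        (void)guard;
    }
}

// Public entry point: callers never see init or cleanup.
int compute(int x)
{
    ensure_initialized();
    return lib_compute(x);
}

void demo_lazy_init()
{
    assert(compute(2) == 4);
    assert(compute(5) == 10);
    assert(init_calls == 1);  // initialized once, on first use
    assert(fini_calls == 0);  // cleanup is deferred to program exit
}
```

Every public wrapper function has to call ensure_initialized() first, which is the extra work this design costs the library author.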
I have seen a lot of Singleton talk, so I can only recommend a look at Alexandrescu's work.
However I am not sure that you really need a Singleton there. Because if you do, you assume that all your calls are going to share the state... is that the case? Do you really want a call to the library through a different instance of Wrapper to see the state in which the last call left it?
If not, you need to serialize the access, and reinitialize the data each time.
class Wrapper
{
public:
    Wrapper() { lock(mutex()); do_init_c_library(); }
    ~Wrapper() { do_clean_up_c_library(); unlock(mutex()); }
private:
    // named mutex() so that the accessor does not hide the Mutex type
    static Mutex& mutex() { static Mutex MMutex; return MMutex; }
}; // class Wrapper
Quite simple... though you need to make sure that Mutex is initialized correctly (once) and live until it's not needed any longer.
Boost offers facilities for the once issue, and since we use a stack based approach with MMutex it should not go awry... I think (hum).
If you can change the library implementation, you could have each call to one of the library's functions access a singleton which is created on first use.
Or you put a global/static variable into the library which initializes it during construction and shuts it down during destruction. (That might become annoying if the library uses global variables itself and the order of initialization/shutdown conflicts with them. Also, linkers might decide to eliminate unreferenced globals...)
Otherwise, I don't see how you want to avoid reference counting. (Note, however, that it has the drawback of possibly creating multiple init/shutdown cycles during the program's lifetime.)
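For reference, a sketch of the reference-counting variant, assuming C++11 for std::mutex. init and fini stand in for the C library's routines, and the call counters are only there to show the behaviour, including the repeated init/shutdown cycle mentioned above.

```cpp
#include <cassert>
#include <mutex>

// Stand-ins for the C library's routines, instrumented for the demo.
int init_calls = 0;
int fini_calls = 0;
void init() { ++init_calls; }
void fini() { ++fini_calls; }

class Wrapper
{
public:
    Wrapper()
    {
        std::lock_guard<std::mutex> lock(mutex());
        if (count()++ == 0)
            init();  // first live wrapper initializes the library
    }

    ~Wrapper()
    {
        std::lock_guard<std::mutex> lock(mutex());
        if (--count() == 0)
            fini();  // last live wrapper shuts it down
    }

private:
    // Non-copyable: an implicit copy would skip the count increment.
    Wrapper(const Wrapper&);
    Wrapper& operator=(const Wrapper&);

    static std::mutex& mutex() { static std::mutex m; return m; }
    static int& count() { static int n = 0; return n; }
};

void demo_ref_counted_wrapper()
{
    {
        Wrapper a;
        {
            Wrapper b;
            assert(init_calls == 1);  // second wrapper did not re-init
        }
        assert(fini_calls == 0);      // a is still alive
    }
    assert(fini_calls == 1);
    {
        Wrapper c;                    // count dropped to zero: a new cycle
    }
    assert(init_calls == 2 && fini_calls == 2);
}
```

Whether a second init/shutdown cycle is acceptable depends on the library; some C libraries do not support being re-initialized.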
If your set of C library routines is not too large, you can try combining the Singleton and Facade patterns so that C library routines are only invoked via the Facade. The Facade ensures the initialization and cleanup of the C library. The Singleton ensures that there is only one instance of the Facade.
#include <iostream>
// C library initialization and cleanup routines
void init() {std::cout << "init\n";}
void fini() {std::cout << "fini\n";}
// C library routines
void foo() {std::cout << "foo\n";}
void bar() {std::cout << "bar\n";}
class Facade // Singleton
{
public:
void foo() {::foo();}
void bar() {::bar();}
static Facade& instance() {static Facade instance; return instance;}
private:
Facade() {init();}
~Facade() {fini();}
};
// Shorthand for Facade::instance()
inline Facade& facade() {return Facade::instance();}
int main()
{
facade().foo();
facade().bar();
}
Output:
init
foo
bar
fini