Say I have the following classes:
class Item {
public:
    CString name;
    int id;
    UINT type;
    bool valid;
    void invalidate(){
        valid = false;
    }
    ...
};
class itemPool {
public:
    static std::vector<Item*> items ;
    void invalidateOfType(UINT type){
        for( auto iter : items )
            if ( iter->type == type )
                iter->invalidate();
    }
    ...
};
Can I call the "invalidateOfType(UINT type)" method from different threads?
Is there any possibility of "undefined behaviour"? In other words, can I use static resources from multiple threads (make parallel calls to that resource)?
Static resources are no different than any shared resources. Unless their methods are thread-safe, they should not be called from multiple threads simultaneously. In your particular case, it boils down to the question of invalidate() being thread safe. Iterating over vector itself is thread-safe.
Quite unexpectedly (to me!) the question turned into something very interesting and educational. The following are points of interest to remember here. In explaining those, I will take the code at face value. I will also operate under the assumption (actually clarified by the OP in some of the comments) that no code is READING while the invalidation takes place.
The code as written would iterate over the same vector at the same time. Since iterating the vector which is not modified during iteration is thread safe, this part is thread safe and needs no further discussion.
The second question is 'can two or more threads execute invalidateOfType for the same type at the same time'? If the answer is NO, because every thread has its own type, then again the code is 100% thread safe, since the same objects are not accessed from more than one thread, and no further discussion is necessary.
If the answer to the above question is 'YES', then we have a conundrum. Effectively it boils down to the question 'when two or more threads set the same memory location to the same value at the same time, is it going to produce unexpected results'? A precise reading of the standards does not give a straight answer.
No, you cannot. This could result in two threads executing valid = false; at the same time on the same valid. It is not permissible to modify an object in one thread while another thread is, or might be, accessing it. (To be sure, check the docs for the particular threading model or library you are using, but most have this rule.)
I would consider this okay on Windows, because everyone does it. It's unlikely that some subsequent change to the platform will break everyone's code. I wouldn't do this on POSIX platforms because the documentation is pretty clear that it's not allowed and it's not commonly done.
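One way to sidestep the question of concurrent writes to valid is to make the flag itself atomic, so that racing stores are well-defined under the C++11 memory model. The following is only a minimal sketch under that assumption: the trimmed-down Item, the use of plain unsigned in place of UINT, the unique_ptr storage and the helper invalidateFromTwoThreads are hypothetical stand-ins, not the original poster's code.

```cpp
#include <atomic>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical, trimmed-down Item: only the fields the loop touches.
struct Item {
    unsigned type;
    std::atomic<bool> valid;
    explicit Item(unsigned t) : type(t), valid(true) {}
    void invalidate() { valid.store(false); }  // atomic store, safe to race
};

// The vector is only read while the threads run; nobody resizes it.
void invalidateOfType(const std::vector<std::unique_ptr<Item>>& items, unsigned type) {
    for (const auto& item : items)
        if (item->type == type)
            item->invalidate();
}

// Runs the same invalidation from two threads and reports how many items stay valid.
int invalidateFromTwoThreads(unsigned type) {
    std::vector<std::unique_ptr<Item>> items;
    for (unsigned i = 0; i < 100; ++i)
        items.emplace_back(new Item(i % 2));  // types alternate 0, 1, 0, 1, ...

    std::thread a([&] { invalidateOfType(items, type); });
    std::thread b([&] { invalidateOfType(items, type); });
    a.join();
    b.join();

    int stillValid = 0;
    for (const auto& item : items)
        if (item->valid.load()) ++stillValid;
    return stillValid;
}
```

With this change both threads may hit the same item; the stores simply overwrite each other with the same value. Readers of valid in other threads would still need to use load() on the atomic.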
If your question is whether calling invalidateOfType() simultaneously from different threads can lead to data corruption (one thread reading while the other writes to the same object), then the answer is yes.
But you can protect the resource, in this example the items vector, with a std::mutex and std::lock_guard like:
class itemPool {
public:
    static std::vector<Item*> items;
    static std::mutex items_mutex;
    void invalidateOfType(UINT type) {
        // Held until the end of the function, released automatically.
        std::lock_guard< std::mutex > scoped_lock(items_mutex);
        for (auto iter : items)
            if (iter->type == type)
            {
                iter->invalidate();
            }
    }
    ...
};
If thread1 is just executing invalidateOfType and thread2 does a call to invalidateOfType, then thread2 has to wait until thread1 has left the function and the items_mutex is unlocked.
Do this with every resource you share across threads to prevent corruption.
Can I call the "invalidateOfType(UINT type)" - method from different threads?
Dear god, no! Simply touching that array from another thread while the invalidateOfType function is running is sufficient to crash your program instantly. There's no locks anywhere.
At the very least you should lock (i.e. with a mutex) access to the array itself.
Related
I'm in the design phase of a multithreading problem I might implement in C++. It will be the first time I implement something multithreaded in C++. The question is quite simple: If I have a function with a const parameter as input, is it just the function under consideration that is not allowed to alter it? Or does C++ guarantee that the parameter will not change (even if another thread tries to access it mid-function)?
void someFunction(const SomeObject& mustNotChange){
    bool check = false;
    if(mustNotChange.getNumber()==0) check = true; //sleep for 10s
    if(check && mustNotChange.getNumber()!=0) { /* CRASH!!! */ }
}
In your example const doesn't make any difference for other threads as mustNotChange is in your current function stack space (or even a register) which should not be accessible by other threads.
I assume you are more interested in the case where other threads can access the memory, something like:
void someFunction(const int& mustNotChange)
{
//...
}
void someOtherFunction(int& mayCHange)
{
//...
}
int main()
{
int i = 0;
std::thread t0([&i](){someFunction(i);});
std::thread t1([&i](){someOtherFunction(i);});
t0.join();
t1.join();
return 0;
}
In this case the const ensures that someFunction can't change the value of mustNotChange, and if it does it's undefined behavior, but this doesn't offer any guarantees about other functions that can access the same memory.
As a conclusion:
if you don't share data (as in the same memory location) between threads, as in your example, you don't have to worry about data races
if you share data, but no function can change the shared data (all functions receive the data as const), you don't have to worry about data races. Please note that in the current example, if you change i before both threads are joined, it's still a data race!
if you share data and at least one function can change the data, you must use synchronization mechanisms.
It is not possible to say by just looking at the code if the referred-to object can change or not. This is why it's important to design the application with concurrency in mind, to, for example, minimize explicit sharing of writable data.
It doesn't matter that an object is accessed inside a function or through a const reference, the rules are the same. In the C++ memory model, accessing the same (non-atomic) value concurrently is possible only for reading. As soon as at least one writer is involved (in any thread), no other reads or writes may happen concurrently.
Furthermore, reads and writes in different threads must be synchronized; this is known as the "happens-before" semantics; locking and releasing a mutex or waiting on an atomic are examples of synchronization events which "release" writes to other threads, which subsequently "acquire" those writes.
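To make the release/acquire pairing concrete, here is a minimal sketch that publishes a plain int through an atomic flag. The names payload, ready, producer, consumer and runOnce are made up for illustration; the only library facilities assumed are std::atomic and std::thread from C++11.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                    // plain, non-atomic data
std::atomic<bool> ready(false);     // synchronization flag

void producer() {
    payload = 42;                                   // (1) write the data
    ready.store(true, std::memory_order_release);   // (2) "release" the write
}

int consumer() {
    while (!ready.load(std::memory_order_acquire))  // (3) "acquire" the release
        std::this_thread::yield();
    return payload;                                 // (4) (1) happens-before this read
}

// Runs one producer and one consumer and returns what the consumer saw.
int runOnce() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread c([&] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Without the flag (or a mutex), reading payload in one thread while another writes it would be a data race even though the reader only reads.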
For more details on C++ concurrency there is a very good book "C++ Concurrency in Action". Herb Sutter also has a nice atomic<> weapons talk.
Basically the answer is yes: you are passing the variable by const reference, and hence the function to which this variable is scoped is not allowed to alter it.
It is best to think of a const parameter in C++ as a promise by the function not to change the value. It doesn't mean someone else doesn't.
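A small single-threaded sketch of that promise, using a hypothetical function observeTwice: the function never writes through its const reference, yet the value it sees changes because another name refers to the same object. With a second thread doing the write instead, the effect is the same, except that it is then also a data race unless synchronized.

```cpp
// The function promises not to modify `value` through this reference...
int observeTwice(const int& value, int& alias) {
    int before = value;
    alias = before + 1;     // ...but someone else may still change the object
    return value - before;  // non-zero if `alias` refers to the same int
}
```

Calling observeTwice(x, x) passes the same int under both names, so the second read differs from the first; with two distinct ints the difference is zero.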
I'm using boost to start a thread and the thread function is a member function of my class just like this:
class MyClass {
public:
void ThreadFunc();
void StartThread() {
worker_thread_ = boost::shared_ptr<boost::thread>(
new boost::thread(boost::bind(&MyClass::ThreadFunc, this)));
}
};
I will access some member variables in ThreadFunc:
while (!stop) {
Sleep(1000); // here will be some block operations
auto it = this->list.begin();
if (it != this->list.end())
...
}
I cannot wait forever for the thread to return, so I set a timeout:
stop = true;
worker_thread_->interrupt();
worker_thread_->timed_join(boost::posix_time::milliseconds(timeout_ms));
After the timeout, I delete this MyClass pointer. Here is the problem: ThreadFunc hasn't returned yet, so it may still access this and its member variables. In my case, the iterator will be invalid and it != this->list.end() will be true, so my program crashes when it uses the invalid iterator.
My question is how to avoid this. How can I check whether this or its member variables are still valid? Or could I set some flag to tell ThreadFunc that the destructor has been called?
There are a lot of possible solutions. One is to use a shared_ptr to the class and let the thread hold its own shared_ptr to the class. That way, the object will automatically get destroyed only when both threads are done with it.
How about you create a stopProcessing flag (make it atomic) as a member of MyClass and in your ThreadFunc method check at each cycle if this flag is set?
[EDIT: making clearer the answer]
There are two orthogonal problems:
stopping the processing (I lost my patience, stop now please). This can be arranged by setting a flag in MyClass and making ThreadFunc check it as often as reasonably possible
deallocation of resources. This is best done with RAII, one example being the use of shared_ptr
Better keep them as separate concerns.
Combining them in a single operation may be possible, but risky.
E.g. if using shared_ptr, once the joining thread decides "I had enough", it simply leaves the block which keeps its "copy" of the shared_ptr, so shared_ptr::use_count gets decremented. The thread function may notice this, interpret it as "the caller had enough", and cut the processing short. However, this implies that in future releases nobody else (but the two threads) may acquire a shared_ptr, otherwise the "contract" of 'decremented use_count means abort' is broken.
(a use_count==1 condition may still be usable - interpretation "Only me, the processing thread, seems to be interested in the results; no consumer for them, better abort the work").
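The two separate concerns can be sketched as follows. This is an illustration only: it uses std::thread, std::shared_ptr and std::atomic in place of the Boost equivalents from the question, and the class name Worker and its members are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <list>
#include <memory>
#include <thread>

// Hypothetical worker class standing in for MyClass.
class Worker : public std::enable_shared_from_this<Worker> {
public:
    // Must be called on an object that is already owned by a shared_ptr.
    void start() {
        auto self = shared_from_this();            // the thread co-owns the object
        std::thread([self] { self->run(); }).detach();
    }
    void requestStop() { stop_.store(true); }      // concern 1: stop the processing

private:
    void run() {
        while (!stop_.load()) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            auto it = list_.begin();               // members stay alive while `self` exists
            if (it != list_.end()) { /* use *it */ }
        }
    }                                              // concern 2: `self` is released after run() returns

    std::list<int> list_;
    std::atomic<bool> stop_{false};
};
```

The caller sets the flag and drops its own shared_ptr instead of deleting the object; destruction happens when the last owner, possibly the worker thread, lets go. Note that the thread is detached here, so the process should not exit while a worker is still running.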
I would like to ask about thread safety in C++ (using POSIX threads with a C++ wrapper for ex.) when a single instance/object of a class is shared between different threads. For example the member methods of this single object of class A would be called within different threads. What should/can I do about thread safety?
class A {
private:
int n;
public:
void increment()
{
++n;
}
void decrement()
{
--n;
}
};
Should I protect class member n within the increment/decrement methods with a lock or something else? Do static members (class variables) also need a lock?
If a member is immutable, I do not have to worry about it, right?
Is there anything that I cannot foresee now?
In addition to the scenario of a single object shared between multiple threads, what about multiple objects with multiple threads, where each thread owns an instance of a class? Is there anything special to consider other than static members (class variables)?
These are the things in my mind, but I believe this is a large topic and I would be glad if you have good resources and refer previous discussions about that.
Regards
Suggestion: don't try do it by hand. Use a good multithread library like the one from Boost: http://www.boost.org/doc/libs/1_47_0/doc/html/thread.html
This article from Intel will give you a good overview: http://software.intel.com/en-us/articles/multiple-approaches-to-multithreaded-applications/
It's a really large topic and probably it's impossible to complete the topic in this thread.
The golden rule is "You can't read while somebody else is writing."
So if you have an object that shares a variable, you have to put a lock in the functions that access the shared variable.
There are very few cases where this is not true.
The first case is integers: you can use the atomic functions as shown by c-smile. In this case the CPU uses a hardware lock on the cache, so other cores can't modify the variable at the same time.
The second case is lock-free queues, special queues that use the compare-and-exchange function to ensure the atomicity of the operation.
All the other cases MUST be locked...
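Applied to the class A from the question, the atomic case and the locked case might look like this. It is a sketch assuming C++11's std::atomic and std::mutex rather than pthreads or the Windows Interlocked functions; the class names and the helper incrementFromTwoThreads are invented for the example.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Variant 1: an atomic integer, no explicit lock needed.
class AtomicCounter {
    std::atomic<int> n{0};
public:
    void increment() { ++n; }   // atomic read-modify-write
    void decrement() { --n; }
    int value() const { return n.load(); }
};

// Variant 2: a plain int guarded by a mutex.
class LockedCounter {
    int n = 0;
    mutable std::mutex m;
public:
    void increment() { std::lock_guard<std::mutex> lock(m); ++n; }
    void decrement() { std::lock_guard<std::mutex> lock(m); --n; }
    int value() const { std::lock_guard<std::mutex> lock(m); return n; }
};

// Hammers one counter from two threads; the result is exact for both variants.
template <typename Counter>
int incrementFromTwoThreads(int perThread) {
    Counter c;
    auto work = [&] { for (int i = 0; i < perThread; ++i) c.increment(); };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return c.value();
}
```

With an unprotected int in place of either counter, the same test would contain a data race and could lose increments.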
The first approach is to lock everything. This can lead to a lot of problems when more objects are involved (ObjA tries to read from ObjB, but ObjB is using the variable and is also waiting for ObjC, which waits for ObjA), where a circular lock can lead to indefinite waiting (deadlock).
A better approach is to minimize the points where threads share variables.
For example, if you have an array of data and you want to parallelize the computation on it, you can launch two threads: thread one works only on the even indices while thread two works on the odd ones. The threads are working on the same set of data, but as long as the elements they touch don't overlap you don't have to use a lock. (This is called data parallelization.)
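The even/odd split can be sketched in a few lines. The function names are hypothetical and the "computation" is just doubling each element; std::thread is assumed instead of pthreads.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Doubles every second element, beginning at index `start` (0 = even, 1 = odd).
void doubleEveryOther(std::vector<int>& data, std::size_t start) {
    for (std::size_t i = start; i < data.size(); i += 2)
        data[i] *= 2;
}

// Two threads share the vector but never touch the same element, so no lock is needed.
void doubleAll(std::vector<int>& data) {
    std::thread even(doubleEveryOther, std::ref(data), 0);
    std::thread odd(doubleEveryOther, std::ref(data), 1);
    even.join();
    odd.join();
}
```

This relies on the vector not being resized while the threads run and on each element being a separate memory location, which holds for std::vector<int> but not for std::vector<bool>.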
The other approach is to organize the application as a set of "works" (functions that run on a thread and produce a result) and make the works communicate only through messages. You only have to implement a thread-safe message system and a work scheduler and you are done. Or you can use a library like Intel TBB.
Neither approach solves the deadlock problem, but both let you isolate the problem and find bugs more easily. Bugs in multithreaded code are really hard to debug and sometimes also difficult to find.
So, if you are studying, I suggest starting with the theory and with pthreads; then, when you have learned the basics, move to a more user-friendly library like Boost or, if you are using GCC 4.6 as your compiler, the C++0x std::thread.
yes, you should protect the functions with a lock if they are used in a multithreading environment. You can use boost libraries
and yes, immutable members should not be a concern, since such a member cannot be changed once it has been initialized.
Concerning "multiple object with multiple threads".. that depends very much of what you want to do, in some cases you could use a thread pool which is a mechanism that has a defined number of threads standing by for jobs to come in. But there's no thread concurrency there since each thread does one job.
You have to protect counters. No other options.
On Windows you can do this using these functions:
#if defined(PLATFORM_WIN32_GNU)
typedef long counter_t;
inline long _inc(counter_t& v) { return InterlockedIncrement(&v); }
inline long _dec(counter_t& v) { return InterlockedDecrement(&v); }
inline long _set(counter_t &v, long nv) { return InterlockedExchange(&v, nv); }
#elif defined(WINDOWS) && !defined(_WIN32_WCE) // lets try to keep things for wince simple as much as we can
typedef volatile long counter_t;
inline long _inc(counter_t& v) { return InterlockedIncrement((LPLONG)&v); }
inline long _dec(counter_t& v) { return InterlockedDecrement((LPLONG)&v); }
inline long _set(counter_t& v, long nv) { return InterlockedExchange((LPLONG)&v, nv); }
#endif
I want to use a library developed by someone else, of which I only have the library file, not the source code. My question is this: the library provides a class with a number of functionalities. The class itself is not thread safe. I wanted to make it thread safe and I was wondering if this code works
// suppose libobj is the class provided by the library
class my_libobj : public libobj {
// code
};
This only inherits from libobj, which may or may not "work" depending on whether the class was designed for inheritance (has at least a virtual destructor).
In any case, it won't buy you thread-safety for free. The easiest way to get that is to add mutexes to the class and lock those when entering a critical section:
class my_obj {
    libobj obj; // inheritance *might* work too
    boost::mutex mtx;
public:
    void critical_op()
    {
        // Serialize every call that reaches the wrapped object.
        boost::unique_lock<boost::mutex> lck(mtx);
        obj.critical_op();
    }
};
(This is a very coarse-grained design with a single mutex; you may be able to make it more fine-grained if you know the behavior of the various operations. It's also not fool-proof, as @dribeas explains.)
Retrofitting thread safety (and BTW, there are different levels of it) into a library which hasn't been designed for it is probably impossible without knowing how it has been implemented, unless you are content with just serializing all calls to it. Even then it can be problematic if the interface is bad enough; see strtok for instance.
This is impossible to answer without knowledge of at least the actual interface of the class. In general the answer would be no.
From the practical C++ point of view, if the class was not designed to be extended, its non-virtual methods cannot be overridden, and as such you might end up with a mixture of some thread-safe and some non-thread-safe methods.
Even if you decide to wrap (without inheritance) and force delegation only while holding a lock, the approach is still not valid in all cases. Thread safety requires not only locking, but an interface that can be made thread safe.
Consider a stack implementation as in the STL. By just adding a layer of locking (i.e. making every method thread safe), you will not guarantee thread safety on the container. Consider a few threads adding elements to the stack and two threads pulling information:
if ( !stack.empty() ) { // 1
x = stack.top(); // 2
stack.pop(); // 3
// operate on data
}
There are a number of things that can go wrong here. Both threads might perform test [1] when the stack has a single element and then enter sequentially, in which case the second thread will fail in [2] (or obtain the same value and fail in [3]). Even when there are multiple objects in the container, both threads could execute [1] and [2] before either of them executes [3], in which case both threads would consume the same object, and the second element in the stack would be discarded without being processed...
Thread safety requires (in most cases) changes to the API. In the example above, that means an interface that provides bool try_pop( T& v ); implemented as:
bool stack::try_pop( T& v ) { // argument by reference, to provide exception safety
std::lock_guard<std::mutex> l(m); // hold the mutex for the whole operation
if ( s.empty() ) return false; // someone consumed all data, return failure
v = s.top(); // top and pop are executed "atomically"
s.pop();
return true; // we did consume one datum
}
Of course there are other approaches. Instead of returning failure, you might wait in a pop operation that is guaranteed to block until a datum is ready, by making use of a condition variable or something similar...
The simplest solution is to create a single thread uniquely for that library, and only access the library from that thread, using message queues to pass requests and return parameters.
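As a rough sketch of that idea: one worker thread owns the library object, and other threads post requests to a queue and receive results through futures. Everything here is an assumption made for illustration, including the stand-in libobj with a single critical_op returning int and the wrapper name LibThread.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Stand-in for the non-thread-safe library class (hypothetical).
struct libobj {
    int calls = 0;
    int critical_op() { return ++calls; }
};

// Owns the library object and the only thread that ever touches it.
class LibThread {
    libobj obj;
    std::queue<std::function<void()>> requests;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    std::thread worker;   // declared last so the other members exist before it starts

    void loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l(m);
                cv.wait(l, [this] { return done || !requests.empty(); });
                if (requests.empty()) return;       // shutting down and queue drained
                job = std::move(requests.front());
                requests.pop();
            }
            job();                                  // runs on the worker thread only
        }
    }
public:
    LibThread() : worker([this] { loop(); }) {}
    ~LibThread() {
        { std::lock_guard<std::mutex> l(m); done = true; }
        cv.notify_one();
        worker.join();
    }
    // Posts a request message; the future carries the return value back.
    std::future<int> critical_op() {
        auto task = std::make_shared<std::packaged_task<int()>>(
            [this] { return obj.critical_op(); });
        std::future<int> result = task->get_future();
        { std::lock_guard<std::mutex> l(m); requests.push([task] { (*task)(); }); }
        cv.notify_one();
        return result;
    }
};
```

Callers on any thread use lib.critical_op().get(); requests are served in the order they were queued, so the library never sees two calls at once.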
Let's say we have a C++ class like:
class MyClass
{
public:
    void processArray( const int (&values)[255] ) // an array of 255 integers
    {
        int i ;
        for (i=0;i<255;i++)
        {
            // do something with values in the array
        }
    }
};
and one instance of the class like:
MyClass myInstance ;
and 2 threads which call the processArray method of that instance (depending on how system executes threads, probably in a completely irregular order). There is no mutex lock used in that scope so both threads can enter.
My question is: what happens to i? Does each thread have its own "i", or would each entering thread modify the same i in the for loop, causing i to change weirdly all the time?
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
Be careful. In the example provided the method processArray seems to be reentrant (it's not clear what happens in // do something with values in the array). If so, no race occurs while two or more threads invoke it simultaneously and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
Andrei Alexandrescu has published an interesting article about the volatile qualifier and how it can be used to write correct multithreaded classes. The article is published here:
http://www.ddj.com/cpp/184403766
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in, so this is safe. You have to be careful and apply mutexes or other synchronization mechanisms when you access shared member variables in the same instance of the class, or global variables in the program (even scoped statics).
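A small sketch of the situation from the question, with two threads calling processArray on one shared instance. The std::array parameter, the summing loop body and the helper bothThreadsAgree are assumptions added for the example; the original only says "do something with values in the array".

```cpp
#include <array>
#include <thread>

class MyClass {
public:
    // `i` and `sum` are locals: each calling thread gets its own copies on its own stack.
    int processArray(const std::array<int, 255>& values) const {
        int sum = 0;
        for (int i = 0; i < 255; i++)
            sum += values[i];
        return sum;
    }
};

// Calls processArray on one shared instance from two threads at once.
bool bothThreadsAgree() {
    MyClass myInstance;
    std::array<int, 255> values;
    values.fill(1);
    int r1 = 0, r2 = 0;
    std::thread t1([&] { r1 = myInstance.processArray(values); });
    std::thread t2([&] { r2 = myInstance.processArray(values); });
    t1.join();
    t2.join();
    return r1 == 255 && r2 == 255;
}
```

Both threads only read the shared array and write to their own locals, so neither loop counter disturbs the other. If the loop body wrote to a member variable of MyClass instead, it would need protection.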