My concerns is, what will be the impact on the global pointers when accessed between the threads. My global pointer were a thread safe class. From the code what will the impact on the global pointer, when updatethread() method updating the pointer with new pointer and workerthread() accessing the pointer. What synchronization i should work with?
SomeCache* ptrCache = NULL;
//worker thread
void Workerthread(std::string strFileToWork)
{
while (std::getline(fileStream, strCurrent))
{
//worker thread accessing the global pointer
if (ptrCache->SearchValue(strCurrent))
{
iCounter++;
}
}
}
void updatethread()
{
//replace old cache with a new fresh updated.
SomeCache* ptrOldCache = ptrCache;
ptrCache = ptrNewCache;
}
One possible solution with mutexes:
std::mutex ptrCacheMutex;
SomeCache* ptrCache = null_ptr;
void Workerthread(...)
{
...
bool valueFound;
{
std::scoped_lock lk(ptrCacheMutex);
valueFound = ptrCache && ptrCache->SearchValue(...);
}
if (valueFound)
iCounter++;
...
}
void updatethread()
{
...
{
std::scoped_lock lk(ptrCacheMutex);
auto ptrOldCache = std::exchange(ptrCache, ptrNewCache);
}
...
}
If your compiler doesn't support template argument deduction, you should specify mutex type explicitly: std::scoped_lock<std::mutex> ....
The behave is probably undefined, you can refer to this answer regarding the volatile keyword in C++.
https://stackoverflow.com/a/72617/10443813
If the volatile keyword is used, before the following line is executed, the old value is used, and after, the new value is used. Otherwise the behave might be compiler or compiling flag dependent.
ptrCache = ptrNewCache;
Related
I implemented code such that multiple instances running on different threads reads other instances' data using reader-writer lock and shared_ptr. It seemed fine, but I am not 100% sure about that and I came up with some questions about usage of those.
Detail
I have multiple instances of a class called Chunk and each instance does some calculations in a dedicated thread. A chunk needs to read neighbour chunks' data as well as its own data, but it doesn't write neighbours' data, so reader-writer lock is used. Also, neighbours can be set at runtime. For example, I might want o set a different neighbour chunk at runtime, sometimes just nullptr. It is possible to delete a chunk at runtime, too. Raw pointers can be used but I thought shared_ptr and weak_ptr are better for this, in order to keep track of the lifetime. Own data in shared_ptr and neighbours' data in weak_ptr.
I provided a simpler version of my code below. ChunkData has data and a mutex for it. I use InitData for data initialization and DoWork function is called in a dedicated thread after that. other functions can be called from main thread.
This seems to work, but I am not so confident. Especially, about use of shared_ptr across multiple threads.
What happens if a thread calls shared_ptr's reset() (in ctor and InitData) and other uses it with weak_ptr's lock (in DoWork)? Does this need a lock dataMutex or chunkMutex?
How about copy(in SetNeighbour)? Do I need locks for this as well?
I think other parts are ok, but please let me know if you find anything dangerous. Appreciate that.
By the way, I considered about storing shared_ptr of Chunk instead of ChunkData, but decided not to use this method because internal code, which I don't manage, has GC system and it can delete a pointer to Chunk when I don't expect it.
class Chunk
{
public:
class ChunkData
{
public:
shared_mutex dataMutex; // mutex to read/write data
int* data;
int size;
ChunkData() : data(nullptr) { }
~ChunkData()
{
if (data)
{
delete[] data;
data = nullptr;
}
}
};
private:
mutex chunkMutex; // mutex to read/write member variables
shared_ptr<ChunkData> chunkData;
weak_ptr<ChunkData> neighbourChunkData;
string result;
public:
Chunk(string _name)
: chunkData(make_shared<ChunkData>())
{
}
~Chunk()
{
EndProcess();
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset();
}
void InitData(int size)
{
ChunkData* NewData = new ChunkData();
NewData->size = size;
NewData->data = new int[size];
{
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset(NewData);
cout << "init chunk " << name << endl;
}
}
// This is executed in other thread. e.g. thread t(&Chunk::DoWork, this);
void DoWork()
{
lock_guard lock(chunkMutex); // we modify some members such as result(string) reading chunk data, so need this.
if (chunkData)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data)
{
// read chunkData->data[i] and modify some members such as result(string)
for (int i = 0; i < chunkData->size; ++i)
{
// Is this fine, or should I write data result outside of readLock scope?
result += to_string(chunkData->data[i]) + " ";
}
}
}
// does this work?
if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock())
{
shared_lock readLock(neighbour->dataMutex);
if (neighbour->data)
{
// read neighbour->data[i] and modify some members as above
}
}
}
shared_ptr<ChunkData> GetChunkData()
{
unique_lock lock(chunkMutex);
return chunkData;
}
void SetNeighbour(Chunk* neighbourChunk)
{
if (neighbourChunk)
{
// safe?
shared_ptr<ChunkData> newNeighbourData = neighbourChunk->GetChunkData();
unique_lock lock(chunkMutex); // lock for chunk properties
{
shared_lock readLock(newNeighbourData->dataMutex); // not sure if this is needed.
neighbourChunkData = newNeighbourData;
}
}
}
int GetDataAt(int index)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
};
Edit 1
I added more detail for DoWork function. Chunk data is read and chunk's member variables are edited in the function.
After Homer512's anwer, I came up with other questions.
A) In DoWork function I write a member variable inside a read lock. Should I only read data in a read lock scope and if I need to modify other data based on read data, do I have to do outside of the read lock? For example, copy the whole array to a local variable in a read lock, and modify other members outside of the read lock using the local.
B) I followed Homer512 and modifed GetDataAt/SetDataAt as below. I do read/write lock chunkData->dataMutex before unlocking chunkMutex. I also do this in DoWork function. Should I instead do locks separately? For example, make a local variable shared_ptr and set chunkData to it in a chunkMutex lock, unlock it, then lastly read/write lock that local variable's dataMutex and read/write data.
int GetDataAt(int index)
{
lock_guard chunkLock(chunkMutex);
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
lock_guard chunkLock(chunkMutex);
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
I have several remarks:
~ChunkData: You could change your data member from int* to unique_ptr<int[]> to get the same result without an explicit destructor. Your code is correct though, just less convenient.
~Chunk: I don't think you need a lock or call the reset method. By the time the destructor runs, by definition, no one should have a reference to the Chunk object. So the lock can never be contested. And reset is unnecessary because the shared_ptr destructor will handle that.
InitData: Yes, the lock is needed because InitData can race with DoWork. You could avoid this by moving InitData to the constructor but I assume there are reasons for this division. You could also change the shared_ptr to std::atomic<std::shared_ptr<ChunkData> > to avoid the lock.
It is more efficient to write InitData like this:
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
// deletes old chunkData outside locked region if it was initialized before
}
make_shared avoids an additional memory allocation for the reference counter. This also moves all allocations and deallocations out of the critical section.
DoWork: Your comment "ready chunkData->data[i] and modify some members". You only take a shared_lock but say that you modify members. Well, which is it, reading or writing? Or do you mean to say that you modify Chunk but not ChunkData, with Chunk being protected by its own mutex?
SetNeighbour: You need to lock both your own chunkMutex and the neighbour's. You should not lock both at the same time to avoid the dining philosopher's problem (though std::lock solves this).
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
std::shared_ptr<ChunkData> newNeighbourData;
{
std::lock_guard<std::mutex> lock(neighbourChunk->chunkMutex);
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> lock(this->chunkMutex);
this->neighbourChunkData = newNeighbourData;
}
GetDataAt and SetDataAt: You need to lock chunkMutex. Otherwise you might race with InitData. There is no need to use std::lock because the order of locks is never swapped around.
EDIT 1:
DoWork: The line if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock()) doesn't keep the neighbur alive. Move the variable declaration out of the if to keep the reference.
EDIT: Alternative design proposal
What I'm bothered with is that your DoWork may be unable to proceed if InitData is still running or waiting to run. How do you want to deal with this? I suggest you make it possible to wait until the work can be done. Something like this:
class Chunk
{
std::mutex chunkMutex;
std::shared_ptr<ChunkData> chunkData;
std::weak_ptr<ChunkData> neighbourChunkData;
std::condition_variable chunkSet;
void waitForChunk(std::unique_lock<std::mutex>& lock)
{
while(! chunkData)
chunkSet.wait(lock);
}
public:
// modified version of my code above
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
chunkSet.notify_all();
}
void DoWork()
{
std::unique_lock<std::mutex> ownLock(chunkMutex);
waitForChunk(lock); // blocks until other thread finishes InitData
{
shared_lock readLock(chunkData->dataMutex);
...
}
shared_ptr<ChunkData> neighbour = neighbourChunkData.lock();
if(! neighbour)
return;
shared_lock readLock(neighbour->dataMutex);
...
}
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
shared_ptr<ChunkData> newNeighbourData;
{
std::unique_lock<std::mutex> lock(neighbourChunk->chunkMutex);
neighbourChunk->waitForChunk(lock); // wait until neighbor has finished InitData
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> ownLock(this->chunkMutex);
this->neighbourChunkData = std::move(newNeighbourData);
}
};
The downside to this is that you could deadlock if InitData is never called or if it failed with an exception. There are ways around this, like using an std::shared_future which knows that it is valid (set when InitData is scheduled) and whether it failed (records exception of associated promise or packaged_task).
Is this code thread safe?
static PFN_SOMEFUNC pfnSomeFunc = nullptr;
PFN_SOMEFUNC SomeFuncGetter();
void CalledFromManyThreads() {
if (pfnSomeFunc == nullptr)
pfnSomeFunc = SomeFuncGetter();
pfnSomeFunc();
}
This is not atomic pointer operation question. There are some special conditions.
SomeFuncGetter() always returns same address.
SomeFuncGetter() is thread safe.
It doesn't look thread-safe, because the global variable can be modified by any thread without synchronization. Even an assignment is not guaranteed to be atomic.
What you could do is leverage a language feature that guarantees thread-safe atomic initialization by moving the static into your function (a very common solution to the Static Initialization Fiasco):
void CalledFromManyThreads() {
static PFN_SOMEFUNC pfnSomeFunc = SomeFuncGetter();
pfnSomeFunc();
}
Now it's thread-safe. And if you need that cached result in multiple places, you can wrap it:
PFN_SOMEFUNC GetSomeFunc()
{
static PFN_SOMEFUNC pfnSomeFunc = SomeFuncGetter();
return pfnSomeFunc;
}
void CalledFromManyThreads() {
PFN_SOMEFUNC pfnSomeFunc = GetSomeFunc();
pfnSomeFunc();
}
For the following code, will the Lock object be initialized after or before the first statement?
int* data = NULL;
int* example() {
if (data) return data;
Lock lock;
data = new int[10];
return data;
}
I know it should work as expected in this way:
int* data = NULL;
int* example() {
if (data) return data;
{
Lock lock;
if (!data)
data = new int[10];
}
return data;
}
But do both methods above work in multiple threads on a weak memory order machine? I tried to search but couldn't find an answer.
Local variables are initialized when execution passes through the point of their definition. That is, your first implementation of example() is equivalent to (this even incorrect use of double-checked locking - it omits the second check in addition to having a data-race...):
int* example() {
if (data)
return data;
{
Lock lock; // (this really needs to reference some already initialized mutex
data = new int[10];
return data;
}
}
Assuming your Lock class acquires a suitably initialized mutex, the code still has a data-race according to the C++ standard: reading data in the if-statement and the return-statement is not synchronized with setting data later. As a result, all of these functions result in undefined behavior according to the C++ standard. A simple fix avoiding all of the problems is to rely on the guarantees for initializating function-local static variables:
int* example() {
static int* rc = new int[10];
return rc;
}
The initialization of rc has to be done in a thread-safe way. If multiple threads try to initialize rc concurrently, only one thread will manage to do so while all other threads are blocked until the initializaton is complete.
I have a scenario where:
I launch a new thread from within a dll that does some work.
The dlls destructor could be called before the new thread finishes its work.
If so I want to set a boolean flag in the destructor to tell the thread to return and not continue.
If I try the following then I find that because the destructor is called and MyDll goes out of scope then m_cancel is deleted and its value is unreliable (Sometimes false, sometimes true) so I cannot use this method.
Method 1
//member variable declared in header file
bool m_cancel = false;
MyDll:~MyDll()
{
m_cancel = true;
}
//Function to start receiving data asynchronously
void MyDll::GetDataSync()
{
std::thread([&]()
{
SomeFunctionThatCouldTakeAWhile();
if( m_cancel == true )
return;
SomeFunctionThatDoesSomethingElse();
}
}
So I have looked at this example Replacing std::async with own version but where should std::promise live? where a shared pointer is used which can be accessed from both threads.
So I was thinking that I should:
Create a shared pointer to a bool and pass it to the new thread that I have kicked off.
In the destructor, change the value of this shared pointer and check it in the new thread.
Here is what I have come up with but I'm not sure if this is the proper way to solve this problem.
Method 2
//member variable declared in header file
std::shared_ptr<bool> m_Cancel;
//Constructor
MyDll:MyDll()
{
m_Cancel = make_shared<bool>(false);
}
//Destructor
MyDll:~MyDll()
{
std::shared_ptr<bool> m_cancelTrue = make_shared<bool>(true);
m_Cancel = std::move(m_cancelTrue);
}
//Function to start receiving data asynchronously
void MyDll::GetDataSync()
{
std::thread([&]()
{
SomeFunctionThatCouldTakeAWhile();
if( *m_Cancel.get() == true )
return;
SomeFunctionThatDoesSomethingElse();
}
}
If I do the above then the if( *m_Cancel.get() == true ) causes a crash (Access violation)
Do I pass the shared pointer by value or by reference to the std::thread??
Because its a shared pointer, will the copy that the std::thread had still be valid even MyDll goes out of scope??
How can I do this??
Method 3
//Members declared in header file
std::shared_ptr<std::atomic<bool>> m_Cancel;
//Constructor
MyDll:MyDll()
{
//Initialise m_Cancel to false
m_Cancel = make_shared<std::atomic<bool>>(false);
}
//Destructor
MyDll:~MyDll()
{
//Set m_Cancel to true
std::shared_ptr<std::atomic<bool>> m_cancelTrue = make_shared<std::atomic<bool>>(true);
m_Cancel = std::move(m_cancelTrue);
}
//Function to start receiving data asynchronously
void MyDll::GetDataSync()
{
std::thread([=]() //Pass variables by value
{
SomeFunctionThatCouldTakeAWhile();
if( *m_Cancel.get() == true )
return;
SomeFunctionThatDoesSomethingElse();
}
}
What I fund is that when the destructor gets called and then if( *m_Cancel.get() == true ) is called it always crashes.
Am I doing something wrong??
Solution
I have added in a mutex to protect against the dtor returning after cancel has been checked in the new thread.
//Members declared in header file
std::shared_ptr<std::atomic<bool>> m_Cancel;
std::shared_ptr<std::mutex> m_sharedMutex;
//Constructor
MyDll:MyDll()
{
//Initialise m_Cancel to false
m_Cancel = make_shared<std::atomic<bool>>(false);
m_sharedMutex = make_shared<std::mutex>();
}
//Destructor
MyDll:~MyDll()
{
//Set m_Cancel to true
std::shared_ptr<std::atomic<bool>> m_cancelTrue = make_shared<std::atomic<bool>>(true);
std::lock_guard<std::mutex> lock(*m_sharedMutex);//lock access to m_Cancel
{
*m_Cancel = std::move(cancelTrue);
}
}
//Function to start receiving data asynchronously
void MyDll::GetDataSync()
{
auto cancel = this->m_Cancel;
auto mutex = this->m_sharedMutex;
std::thread([=]() //Pass variables by value
{
SomeFunctionThatCouldTakeAWhile();
std::lock_guard<std::mutex> lock(*mutex);//lock access to cancel
{
if( *cancel.get() == true )
return;
SomeFunctionThatDoesSomethingElse();
}
}
}
Step 2 is just wrong. That's a design fault.
Your first mechanism doesn't work for a simple reason. m_cancel==false may be optimized out by the compiler. When the destructor returns, m_cancel ceases to exist, and no statement in the destructor depends on that write. After the destructor returns, it would be Undefined Behavior to access the memory which previously held m_cancel.
The second mechanism (global) fails for a more complex reason. There's the obvious problem that you have only one global m_Cancel (BTW, m_ is a really misleading prefix for something that's not a member). But assuming you only have one MyDll, it can still fail for threading reasons. What you wanted was not a shared_ptr but a std::atomic<bool>. That is safe for access from multiple threads
[edit]
And your third mechanism fails because [=] captures names from the enclosing scope. m_Cancel isn't in that scope, but this is. You don't want a copy of this for the thread though, because this will be destroyed. Solution: auto cancel = this->m_Cancel; std::thread([cancel](...
[edit 2]
I think you really should read up on basics. In the dtor of version 3, you indeed change the value of m_cancel. That is to say, you change the pointer. You should have changed *m_cancel, i.e. what it points to. As I pointed out above, the thread has a copy of the pointer. If you change the original pointer, the thread will continue to point to the old value. (This is unrelated to smart pointers, dumb pointers behave the same).
Anything wrong with the following Singleton implementation?
Foo& Instance() {
if (foo) {
return *foo;
}
else {
scoped_lock lock(mutex);
if (foo) {
return *foo;
}
else {
// Don't do foo = new Foo;
// because that line *may* be a 2-step
// process comprising (not necessarily in order)
// 1) allocating memory, and
// 2) actually constructing foo at that mem location.
// If 1) happens before 2) and another thread
// checks the foo pointer just before 2) happens, that
// thread will see that foo is non-null, and may assume
// that it is already pointing to a a valid object.
//
// So, to fix the above problem, what about doing the following?
Foo* p = new Foo;
foo = p; // Assuming no compiler optimisation, can pointer
// assignment be safely assumed to be atomic?
// If so, on compilers that you know of, are there ways to
// suppress optimisation for this line so that the compiler
// doesn't optimise it back to foo = new Foo;?
}
}
return *foo;
}
No, you cannot even assume that foo = p; is atomic. It's possible that it might load 16 bits of a 32-bit pointer, then be swapped out before loading the rest.
If another thread sneaks in at that point and calls Instance(), you're toasted because your foo pointer is invalid.
For true security, you will have to protect the entire test-and-set mechanism, even though that means using mutexes even after the pointer is built. In other words (and I'm assuming that scoped_lock() will release the lock when it goes out of scope here (I have little experience with Boost)), something like:
Foo& Instance() {
scoped_lock lock(mutex);
if (foo != 0)
foo = new Foo();
return *foo;
}
If you don't want a mutex (for performance reasons, presumably), an option I've used in the past is to build all singletons before threading starts.
In other words, assuming you have that control (you may not), simply create an instance of each singleton in main before kicking off the other threads. Then don't use a mutex at all. You won't have threading problems at that point and you can just use the canonical don't-care-about-threads-at-all version:
Foo& Instance() {
if (foo != 0)
foo = new Foo();
return *foo;
}
And, yes, this does make your code more dangerous to people who couldn't be bothered to read your API docs but (IMNSHO) they deserve everything they get :-)
Why not keep it simple?
Foo& Instance()
{
scoped_lock lock(mutex);
static Foo instance;
return instance;
}
Edit: In C++11 where threads is introduced into the language. The following is thread safe. The language guarantees that instance is only initialized once and in a thread safe manor.
Foo& Instance()
{
static Foo instance;
return instance;
}
So its lazily evaluated. Its thread safe. Its very simple. Win/Win/Win.
This depends on what threading library you're using. If you're using C++0x you can use atomic compare-and-swap operations and write barriers to guarantee that double-checked locking works. If you're working with POSIX threads or Windows threads, you can probably find a way to do it. The bigger question is why? Singletons, it turns out, are usually unnecessary.
the new operator in c++ always invovle 2-steps process :
1.) allocating memory identical to simple malloc
2.) invoke constructor for given data type
Foo* p = new Foo;
foo = p;
the code above will make the singleton creation into 3 step, which is even vulnerable to problem you trying to solve.
Why don't you just use a real mutex ensuring that only one thread will attempt to create foo?
Foo& Instance() {
if (!foo) {
pthread_mutex_lock(&lock);
if (!foo) {
Foo *p = new Foo;
foo = p;
}
pthread_mutex_unlock(&lock);
}
return *foo;
}
This is a test-and-test-and-set lock with free readers. Replace the above with a reader-writer lock if you want reads to be guaranteed safe in a non-atomic-replacement environment.
edit: if you really want free readers, you can write foo first, and then write a flag variable fooCreated = 1. Checking fooCreated != 0 is safe; if fooCreated != 0, then foo is initialized.
Foo& Instance() {
if (!fooCreated) {
pthread_mutex_lock(&lock);
if (!fooCreated) {
foo = new Foo;
fooCreated = 1;
}
pthread_mutex_unlock(&lock);
}
return *foo;
}
It has nothing wrong with your code. After the scoped_lock, there will be only one thread in that section, so the first thread that enters will initialize foo and return, and then second thread(if any) enters, it will return immediately because foo is not null anymore.
EDIT: Pasted the simplified code.
Foo& Instance() {
if (!foo) {
scoped_lock lock(mutex);
// only one thread can enter here
if (!foo)
foo = new Foo;
}
return *foo;
}
Thanks for all your input. After consulting Joe Duffy's excellent book, "Concurrent Programming on Windows", I am now thinking that I should be using the code below. It's largely the code from his book, except for some renames and the InterlockedXXX line. The following implementation uses:
volatile keyword on both the temp and "actual" pointers to protect against re-ordering from the compiler.
InterlockedCompareExchangePointer to protect against reordering from
the CPU.
So, that should be pretty safe (... right?):
template <typename T>
class LazyInit {
public:
typedef T* (*Factory)();
LazyInit(Factory f = 0)
: factory_(f)
, singleton_(0)
{
::InitializeCriticalSection(&cs_);
}
T& get() {
if (!singleton_) {
::EnterCriticalSection(&cs_);
if (!singleton_) {
T* volatile p = factory_();
// Joe uses _WriterBarrier(); then singleton_ = p;
// But I thought better to make singleton_ = p atomic (as I understand,
// on Windows, pointer assignments are atomic ONLY if they are aligned)
// In addition, the MSDN docs say that InterlockedCompareExchangePointer
// sets up a full memory barrier.
::InterlockedCompareExchangePointer((PVOID volatile*)&singleton_, p, 0);
}
::LeaveCriticalSection(&cs_);
}
#if SUPPORT_IA64
_ReadBarrier();
#endif
return *singleton_;
}
virtual ~LazyInit() {
::DeleteCriticalSection(&cs_);
}
private:
CRITICAL_SECTION cs_;
Factory factory_;
T* volatile singleton_;
};