Derived from this question and related to this question:
If I construct an object in one thread and then convey a reference/pointer to it to another thread, is it thread-unsafe for that other thread to access the object without explicit locking/memory barriers?
// thread 1
Obj obj;
anyLegalTransferDevice.Send(&obj);
while(1); // never let obj go out of scope
// thread 2
anyLegalTransferDevice.Get()->SomeFn();
Alternatively: is there any legal way to convey data between threads that doesn't enforce memory ordering with regards to everything else the thread has touched? From a hardware standpoint I don't see any reason it shouldn't be possible.
To clarify: the question is with regard to cache coherency, memory ordering, and whatnot. Can Thread 2 get and use the pointer before Thread 2's view of memory includes the writes involved in constructing obj? To misquote Alexandrescu(?): "Could a malicious CPU designer and compiler writer collude to build a standard-conforming system that makes that break?"
Reasoning about thread safety can be difficult, and I am no expert on the C++11 memory model. Fortunately, however, your example is very simple. I have rewritten the example, because the constructor is irrelevant.
Simplified Example
Question: Is the following code correct? Or can the execution result in undefined behavior?
// Legal transfer of pointer to int without data race.
// The receive function blocks until send is called.
void send(int*);
int* receive();
// --- thread A ---
/* A1 */ int* pointer = receive();
/* A2 */ int answer = *pointer;
// --- thread B ---
int answer;
/* B1 */ answer = 42;
/* B2 */ send(&answer);
// wait forever
Answer: There may be a data race on the memory location of answer, and thus the execution results in undefined behavior. See below for details.
Implementation of Data Transfer
Of course, the answer depends on the possible and legal implementations of the functions send and receive. I use the following data-race-free implementation. Note that only a single atomic variable is used, and all memory operations use std::memory_order_relaxed. Basically, this means that these functions do not restrict memory reorderings.
std::atomic<int*> transfer{nullptr};
void send(int* pointer) {
transfer.store(pointer, std::memory_order_relaxed);
}
int* receive() {
while (transfer.load(std::memory_order_relaxed) == nullptr) { }
return transfer.load(std::memory_order_relaxed);
}
Order of Memory Operations
On multicore systems, a thread can see memory changes in a different order than other threads do. In addition, both compilers and CPUs may reorder memory operations within a single thread for efficiency, and they do this all the time. Atomic operations with std::memory_order_relaxed do not participate in any synchronization and do not impose any ordering.
In the above example, the compiler is allowed to reorder the operations of thread B, and execute B2 before B1, because the reordering has no effect on the thread itself.
// --- valid execution of operations in thread B ---
int answer;
/* B2 */ send(&answer);
/* B1 */ answer = 42;
// wait forever
Data Race
C++11 defines a data race as follows (N3290 C++11 Draft): "The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior." And the term happens before is defined earlier in the same document.
In the above example, B1 and A2 are conflicting, non-atomic operations, and neither happens before the other. This is obvious, because I have shown in the previous section that both can happen at the same time.
That's the only thing that matters in C++11. In contrast, the Java Memory Model also tries to define the behavior if there are data races, and it took them almost a decade to come up with a reasonable specification. C++11 didn't make the same mistake.
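For contrast, here is a minimal sketch (my addition, not part of the original answer) of how the same transfer becomes race-free: upgrading the store to release and the loads to acquire makes the write B1 happen before the read A2.
#include <atomic>

std::atomic<int*> transfer{nullptr};

void send(int* pointer) {
    // Release: every write sequenced before this store (e.g. B1)
    // becomes visible to a thread that acquire-loads the stored value.
    transfer.store(pointer, std::memory_order_release);
}

int* receive() {
    int* result = nullptr;
    // Acquire: synchronizes with the release store in send().
    while ((result = transfer.load(std::memory_order_acquire)) == nullptr) { }
    return result;
}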
Further Information
I'm a bit surprised that these basics are not well known. The definitive source of information is the section Multi-threaded executions and data races in the C++11 standard. However, the specification is difficult to understand.
A good starting point are Hans Boehm's talks - e.g. available as online videos:
Threads and Shared Variables in C++11
Getting C++ Threads Right
There are also a lot of other good resources I have mentioned elsewhere, e.g.:
std::memory_order - cppreference.com
There is no parallel access to the same data, so there is no problem:
Thread 1 starts execution of Obj::Obj().
Thread 1 finishes execution of Obj::Obj().
Thread 1 passes reference to the memory occupied by obj to thread 2.
Thread 1 never does anything else with that memory (soon after, it falls into infinite loop).
Thread 2 picks-up the reference to memory occupied by obj.
Thread 2 presumably does something with it, undisturbed by thread 1 which is still infinitely looping.
The only potential problem is if Send doesn't act as a memory barrier, but then it wouldn't really be a "legal transfer device".
As others have alluded to, the only way in which a constructor is not thread-safe is if something somehow gets a pointer or reference to it before the constructor is finished, and the only way that would occur is if the constructor itself has code that registers the this pointer to some type of container which is shared across threads.
Now in your specific example, Branko Dimitrijevic gave a good, complete explanation of how your case is fine. But in the general case, I'd say not to use an object until the constructor is finished, though I don't think there's anything "special" that happens only once the constructor finishes. By the time execution enters the (last) constructor in an inheritance chain, the object is pretty much fully "good to go", with all of its member variables initialized, etc. So it's no worse than any other critical-section work, but another thread would need to know about the object first, and the only way that happens is if you share this in the constructor itself somehow. So if you do that at all, only do it as the "last thing" in the constructor.
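To make that concrete, here is a hypothetical sketch of the anti-pattern being described (all names are mine, purely illustrative): the constructor publishes this to a shared container before the object is fully initialized.
#include <mutex>
#include <vector>

struct Widget;

// Shared across threads (assumed for illustration).
std::mutex g_registry_mutex;
std::vector<Widget*> g_registry;

struct Widget {
    int value;
    Widget() {
        {
            std::lock_guard<std::mutex> lock(g_registry_mutex);
            g_registry.push_back(this);  // BAD: another thread can now reach
                                         // this Widget before it is constructed
        }
        value = 42;  // initialization completes only after publication
    }
};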
It is only safe (sort of) if you wrote both threads and know that the first thread is not accessing the object while the second thread is. For example, if the thread constructing it never accesses it after passing the reference/pointer, you would be OK. Otherwise it is thread-unsafe. You could change that by making all methods that access data members (read or write) lock a mutex.
I have only just read this question... Still, I will post my comments:
Static Local Variable
There is a reliable way to construct objects in a multi-threaded environment, and that is using a static local variable (static local variable - CppCoreGuidelines).
From the above reference: "This is one of the most effective solutions to problems related to initialization order. In a multi-threaded environment the initialization of the static object does not introduce a race condition (unless you carelessly access a shared object from within its constructor)."
Also note from the reference: if the destruction of X involves an operation that needs to be synchronized, you can create the object on the heap and synchronize when to call the destructor.
Below is an example I wrote to show the Construct On First Use Idiom, which is basically what the reference talks about.
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>
class ThreadConstruct
{
public:
ThreadConstruct(int a, float b) : _a{a}, _b{b}
{
std::cout << "ThreadConstruct construct start" << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "ThreadConstruct construct end" << std::endl;
}
void get()
{
std::cout << _a << " " << _b << std::endl;
}
private:
int _a;
float _b;
};
struct Factory
{
template<class T, typename ...ARGS>
static T& get(ARGS&&... args)
{
//thread safe object instantiation
static T instance(std::forward<ARGS>(args)...);
return instance;
}
};
//thread pool
class Threads
{
public:
Threads()
{
for (size_t num_threads = 0; num_threads < 5; ++num_threads) {
thread_pool.emplace_back(&Threads::run, this);
}
}
void run()
{
//thread safe constructor call
ThreadConstruct& thread_construct = Factory::get<ThreadConstruct>(5, 10.1);
thread_construct.get();
}
~Threads()
{
for(auto& x : thread_pool) {
if(x.joinable()) {
x.join();
}
}
}
private:
std::vector<std::thread> thread_pool;
};
int main()
{
Threads thread;
return 0;
}
Output:
ThreadConstruct construct start
ThreadConstruct construct end
5 10.1
5 10.1
5 10.1
5 10.1
5 10.1
Related
Is a shared pointer (std::shared_ptr) safe to use in a multi-threaded program?
I am not considering read/write accesses to the data owned by the shared pointer but rather the shared pointer itself.
I am aware that certain implementations (such as Microsoft's, as documented on MSDN) do provide this extra guarantee; but I want to understand whether this is guaranteed by the standard and as such is portable.
#include <thread>
#include <memory>
#include <iostream>
void function_to_run_thread(std::shared_ptr<int> x)
{
std::cout << x << "\n";
}
// Shared pointer goes out of scope.
// Is its destruction here guaranteed to happen only once?
// Or is this a "Data Race" situation that is UB?
int main()
{
std::thread threads[2];
{
// A new scope
// So that the shared_ptr in this scope has the
// potential to go out of scope before the threads have executed.
// So leaving the shared_ptr in the scope of the threads only.
std::shared_ptr<int> data = std::make_shared<int>(5);
// Create workers.
threads[0] = std::thread(function_to_run_thread, data);
threads[1] = std::thread(function_to_run_thread, data);
}
threads[0].join();
threads[1].join();
}
Any links to sections in the standard are most welcome.
I would be happy with references to the major implementations, so we could consider it portable for most normal developers.
MSDN: Check. Thread Safe.
G++: ?
clang: ?
I would consider those the major implementations but happy to consider others.
I don't have links to the standard. I did check this a long time ago; std::shared_ptr is thread-safe under certain conditions, which summarize to: every thread should have its own copy.
As documented on cppreference:
All member functions (including copy constructor and copy assignment) can be called by multiple threads on different instances of shared_ptr without additional synchronization even if these instances are copies and share ownership of the same object. If multiple threads of execution access the same shared_ptr without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur.
So, just like for any other class in the standard library, reading from the same instance from multiple threads is allowed. Writing to the same instance while any other thread accesses it is not.
int main()
{
std::vector<std::thread> threads;
{
// A new scope
// So that the shared_ptr in this scope has the
// potential to go out of scope before the threads have executed.
// So leaving the shared_ptr in the scope of the threads only.
std::shared_ptr<int> data = std::make_shared<int>(5);
// Perfectly legal to read access the shared_ptr
threads.emplace_back(std::thread([&data]{ std::cout << data.get() << '\n'; }));
threads.emplace_back(std::thread([&data]{ std::cout << data.get() << '\n'; }));
// This line will result in a race condition as you now have read and write on the same instance
threads.emplace_back(std::thread([&data]{ data = std::make_shared<int>(42); }));
for (auto &thread : threads)
thread.join();
}
}
Once we are dealing with multiple copies of the shared_ptr, everything is fine:
int main()
{
std::vector<std::thread> threads;
{
// A new scope
// So that the shared_ptr in this scope has the
// potential to go out of scope before the threads have executed.
// So leaving the shared_ptr in the scope of the threads only.
std::shared_ptr<int> data = std::make_shared<int>(5);
// Perfectly legal to read access the shared_ptr copy
threads.emplace_back(std::thread([data]{ std::cout << data.get() << '\n'; }));
threads.emplace_back(std::thread([data]{ std::cout << data.get() << '\n'; }));
// This line no longer results in a race condition, as the other threads are using a copy
threads.emplace_back(std::thread([&data]{ data = std::make_shared<int>(42); }));
for (auto &thread : threads)
thread.join();
}
}
Also, destruction of the shared_ptr will be fine, as every thread will call the destructor of its local shared_ptr and the last one will clean up the data. There are atomic operations on the reference count to ensure this happens correctly.
int main()
{
std::vector<std::thread> threads;
{
// A new scope
// So that the shared_ptr in this scope has the
// potential to go out of scope before the threads have executed.
// So leaving the shared_ptr in the scope of the threads only.
std::shared_ptr<int> data = std::make_shared<int>(5);
// Perfectly legal to read access the shared_ptr copy
threads.emplace_back(std::thread([data]{ std::cout << data.get() << '\n'; }));
threads.emplace_back(std::thread([data]{ std::cout << data.get() << '\n'; }));
// Sleep to ensure we have some delay
threads.emplace_back(std::thread([data]{ std::this_thread::sleep_for(std::chrono::seconds{2}); }));
}
for (auto &thread : threads)
thread.join();
}
As you already indicated, the access to the data inside the shared_ptr isn't protected. So, similar to the first case, if you have one thread reading and one thread writing, you still have a problem. This can be solved with atomics or mutexes, or by guaranteeing that the pointed-to objects are read-only.
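As an illustration of the mutex option, here is a minimal sketch of mine (not from the quoted material): each thread holds its own copy of the shared_ptr, and a separate mutex guards the pointee.
#include <iostream>
#include <memory>
#include <mutex>
#include <thread>

int main()
{
    std::shared_ptr<int> data = std::make_shared<int>(5);
    std::mutex data_mutex;  // guards *data, not the shared_ptr itself

    // Each lambda captures its own copy of the shared_ptr by value.
    std::thread writer([data, &data_mutex] {
        std::lock_guard<std::mutex> lock(data_mutex);
        *data = 42;  // write to the pointee is now synchronized
    });
    std::thread reader([data, &data_mutex] {
        std::lock_guard<std::mutex> lock(data_mutex);
        std::cout << *data << '\n';  // prints 5 or 42, but with no data race
    });

    writer.join();
    reader.join();
}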
Quoting the latest draft:
For purposes of determining the presence of a data race, member functions shall access and modify only the shared_ptr and weak_ptr objects themselves and not objects they refer to. Changes in use_count() do not reflect modifications that can introduce data races.
So, this is a lot to take in. The first sentence talks about member functions not accessing the pointee, i.e. that accessing the pointee is not thread-safe.
However, then there is the second sentence. Effectively, this forces any operation that would change use_count() (e.g. copy construction, assignment, destruction, calling reset) to be thread-safe - but only as far as they are affecting use_count().
Which makes sense: Different threads copying the same std::shared_ptr (or destroying the same std::shared_ptr) must not cause a data race regarding ownership of the pointee. The internal value of use_count() must be synchronized.
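A small sketch of my own showing what that guarantee buys you: many threads copying and destroying distinct shared_ptr instances that share one control block, with no synchronization beyond the atomic reference count itself.
#include <memory>
#include <thread>
#include <vector>

int main()
{
    auto master = std::make_shared<int>(7);
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i) {
        threads.emplace_back([master] {  // each thread owns its own copy
            for (int k = 0; k < 1000; ++k) {
                std::shared_ptr<int> copy = master;  // atomic increment of use_count
            }                                        // atomic decrement on destruction
        });
    }
    for (auto& t : threads) t.join();  // no data race on the control block
}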
I checked, and this exact wording was also present in N3337, Section 20.7.2.2 Paragraph 4, so it should be safe to say that this requirement has been there since the introduction of std::shared_ptr in C++11 (and was not something introduced later on).
shared_ptr (and also weak_ptr) uses an atomic integer to keep the use count, so sharing between threads is safe; but of course, access to the data still requires mutexes or some other synchronization.
If I create a thread in a constructor, and that thread accesses the object, do I need to introduce a release barrier before the thread accesses the object? Specifically, if I have the code below (wandbox link), do I need to lock the mutex in the constructor (the commented-out line)? I need to make sure that worker_thread_ sees the write to run_worker_thread_ so that it doesn't immediately exit. I realize using an atomic boolean is better here, but I'm interested in understanding the memory-ordering implications. Based on my understanding, I think I do need to lock the mutex in the constructor, to ensure that the release operation provided by unlocking the mutex in the constructor synchronizes with the acquire operation provided by locking the mutex in threadLoop() via the call to shouldRun().
#include <atomic>
#include <chrono>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

using std::cout;

class ThreadLooper {
public:
ThreadLooper(std::string thread_name)
: thread_name_{std::move(thread_name)}, loop_counter_{0} {
//std::lock_guard<std::mutex> lock(mutex_);
run_worker_thread_ = true;
worker_thread_ = std::thread([this]() { threadLoop(); });
// mutex unlock provides release semantics
}
~ThreadLooper() {
{
std::lock_guard<std::mutex> lock(mutex_);
run_worker_thread_ = false;
}
if (worker_thread_.joinable()) {
worker_thread_.join();
}
cout << thread_name_ << ": destroyed and counter is " << loop_counter_
<< std::endl;
}
private:
bool shouldRun() {
std::lock_guard<std::mutex> lock(mutex_);
return run_worker_thread_;
}
void threadLoop() {
cout << thread_name_ << ": threadLoop() started running"
<< std::endl;
while (shouldRun()) {
using namespace std::literals::chrono_literals;
std::this_thread::sleep_for(2s);
++loop_counter_;
cout << thread_name_ << ": counter is " << loop_counter_ << std::endl;
}
cout << thread_name_
<< ": exiting threadLoop() because flag is false" << std::endl;
}
const std::string thread_name_;
std::atomic_uint64_t loop_counter_;
bool run_worker_thread_;
std::mutex mutex_;
std::thread worker_thread_;
};
This also got me thinking more generally: if I were to initialize a bunch of regular int (not atomic) member variables in the constructor, which were then read from other threads via some public methods, would I need to similarly lock the mutex in the constructor in addition to in the methods that read these variables? This seems slightly different to me from the case above, since I know that the object would be fully constructed before any other thread could access it; but that doesn't seem to ensure that the initialization of the object would be visible to the other threads without a release operation in the constructor.
You do not need any barriers because it is guaranteed that the thread constructor synchronizes with the invocation of the function passed to it.
In Standardese:
The completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f.
Somewhat formal proof:
run_worker_thread_ = true; (A) is sequenced before the thread object creation (B), according to the evaluation order of full expressions. The thread object construction synchronizes with the closure object execution (C), according to the rule cited above. Hence, A inter-thread happens before C.
A is sequenced before B, B synchronizes with C, therefore A happens before C: this is a formal proof in Standard terms.
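A reduced sketch of exactly this chain (illustrative code of mine): the plain write A is guaranteed to be visible in the thread body C, with no atomics or mutexes, purely because of the thread constructor's synchronizes-with rule.
#include <cassert>
#include <thread>

int main()
{
    bool flag = false;          // plain bool: no atomics, no mutex
    flag = true;                // A: sequenced before the thread creation (B)
    std::thread t([&flag] {     // B: synchronizes with the start of the lambda (C)
        assert(flag);           // C: A inter-thread happens before C, so this holds
    });
    t.join();
}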
And when analyzing programs in the C++11+ era, you should stick to the C++ model of memory and execution, and forget about the barriers and reorderings the compiler might or might not do. These are just implementation details. The only thing that matters is the formal proof in C++ terms. The compiler must obey, and do (and not do) whatever it takes to adhere to the rules.
But for the sake of completeness, let's look at the code with the compiler's eyes and try to understand why it can't reorder anything in this case. We all know the "as-if" rule, under which the compiler may reorder some instructions as long as you can't tell they have been reordered. So suppose we have some bool flags being set:
flag1 = true; // A
flag2 = false;// B
It is allowed to execute these lines as follows:
flag2 = false;// B
flag1 = true;// A
Despite the fact that A is sequenced before B. It can do this because we can't tell the difference; we can't catch it reordering our instructions just by observing the program's behavior, because apart from "sequenced before" there is no relation between these lines. But let's get back to our case:
run_worker_thread_ = true; // A
worker_thread_ = std::thread(...); // B
It might look like this case is the same as with the bool variables above. And that would be true if we didn't know that the thread object (besides being sequenced after the A expression) synchronizes with something (for simplicity, let's ignore what that something is). But as we found out, if something is sequenced before another thing which in turn synchronizes with yet another thing, then it happens before that thing. So the Standard requires the A expression to happen before that something our B expression synchronizes with.
And this fact forbids the compiler from reordering our A and B expressions, because suddenly we could tell the difference if it did. If it did, the C expression (the something) might not see the visible side effects produced by A. So just by observing the program's execution we might catch the cheating compiler! Hence, it has to use some barriers. It doesn't matter whether it is just a compiler barrier or a hardware one; it has to be there to guarantee that these instructions are not reordered. So you might think of it as using a release fence upon the construction's completion and an acquire fence upon the closure object's execution. That would roughly describe what happens under the hood.
It also looks like you treat the mutex as some kind of magic thing which always works and does not require any proofs. So for some reason you believe in the mutex and not in the thread. But the thing is that it has no magic, and the only guarantee it gives is that a lock synchronizes with the prior unlock and vice versa. So it provides the same kind of guarantee that the thread constructor provides.
This article by Jeff Preshing states that the double-checked locking pattern (DCLP) is fixed in C++11. The classical example used for this pattern is the singleton pattern but I happen to have a different use case and I am still lacking experience in handling "atomic<> weapons" - maybe someone over here can help me out.
Is the following piece of code a correct DCLP implementation as described by Jeff under "Using C++11 Sequentially Consistent Atomics"?
class Foo {
std::shared_ptr<B> data;
std::mutex mutex;
void detach()
{
if (data.use_count() > 1)
{
std::lock_guard<std::mutex> lock{mutex};
if (data.use_count() > 1)
{
data = std::make_shared<B>(*data);
}
}
}
public:
// public interface
};
No, this is not a correct implementation of DCLP.
The thing is that your outer check data.use_count() > 1 accesses the object (of type B with its reference count), which can be deleted (unreferenced) in the mutex-protected part. No sort of memory fence can help there.
Why data.use_count() accesses the object:
Assume these operations have been executed:
shared_ptr<B> data1 = make_shared<B>(...);
shared_ptr<B> data = data1;
Then you have the following layout (weak_ptr support is not shown here):
    data1                                              data
      |                                                  |
[pointer type] ref --> +--------------------------+ <-- [pointer type] ref
                       | atomic<int> m_use_count; |
                       | B obj;                   |
                       +--------------------------+
                        [allocated with B::new()]
Each shared_ptr object is just a pointer which points to an allocated memory region. This memory region embeds the object of type B plus an atomic counter, reflecting the number of shared_ptr's pointing to the given object. When this counter becomes zero, the memory region is freed (and the B object is destroyed). Exactly this counter is returned by shared_ptr::use_count().
UPDATE: Here is an execution which can lead to accessing memory that is already freed (initially, two shared_ptr's point to the same object, and .use_count() is 2):
/* Thread 1 */               /* Thread 2 */                /* Thread 3 */
Enter detach()               Enter detach()
                             Found `data.use_count()` > 1
                             Enter critical section
                             Found `data.use_count()` > 1
Dereference `data`,
found old object.
                             Unreference old `data`,
                             `use_count` becomes 1
                                                           Delete other shared_ptr,
                                                           old object is deleted
                             Assign new object to `data`
Access old object
(for check `use_count`)
!! But object is freed !!
The outer check should only read a pointer to the object to decide whether the lock needs to be acquired.
By the way, even if your implementation were correct, it would make little sense:
If data (and detach) can be accessed from several threads at the same time, the object's uniqueness gives no advantage, since the object can be accessed from several threads anyway. If you want to change the object, all accesses to data should be protected by an outer mutex, and in that case detach() cannot be executed concurrently.
If data (and detach) can be accessed by only a single thread at a time, the detach implementation doesn't need any locking at all.
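For reference, here is a minimal sketch of the pointer-based DCLP from the cited Preshing article, using sequentially consistent atomics (my rendering, shown as the classic singleton rather than the questioner's detach). Note that the outer check reads only an atomic pointer, which is exactly what the advice above calls for.
#include <atomic>
#include <mutex>

class Singleton {
public:
    static Singleton* instance() {
        Singleton* p = s_instance.load();              // seq_cst load
        if (p == nullptr) {                            // outer check: pointer only
            std::lock_guard<std::mutex> lock(s_mutex);
            p = s_instance.load();                     // inner check, under the lock
            if (p == nullptr) {
                p = new Singleton;
                s_instance.store(p);                   // seq_cst store publishes it
            }
        }
        return p;
    }
private:
    static std::atomic<Singleton*> s_instance;
    static std::mutex s_mutex;
};

std::atomic<Singleton*> Singleton::s_instance{nullptr};
std::mutex Singleton::s_mutex;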
This constitutes a data race if two threads invoke detach on the same instance of Foo concurrently, because std::shared_ptr<B>::use_count() (a read-only operation) would run concurrently with the std::shared_ptr<B> move-assignment operator (a modifying operation), which is a data race and hence a cause of undefined behavior. If Foo instances are never accessed concurrently, on the other hand, there is no data race, but then the std::mutex would be useless in your example. The question is: how does data's pointee become shared in the first place? Without this crucial bit of information, it is hard to tell if the code is safe even if a Foo is never used concurrently.
According to your source, I think you still need to add thread fences before the first test and after the second test.
std::shared_ptr<B> data;
std::mutex mutex;
void detach()
{
std::atomic_thread_fence(std::memory_order_acquire);
if (data.use_count() > 1)
{
auto lock = std::lock_guard<std::mutex>{mutex};
if (data.use_count() > 1)
{
std::atomic_thread_fence(std::memory_order_release);
data = std::make_shared<B>(*data);
}
}
}
I am writing a unit test for a class, to test for insertion when no memory is available. The test relies on the fact that nbElementInserted is incremented AFTER insert_edge has returned.
void test()
{
adjacency_list a(true);
MemoryVacuum no_memory_after_this_line;
bool signalReceived = false;
size_t nbElementInserted = 0;
do
{
try
{
a.insert_edge( 0, 1, true ); // this should throw
nbElementInserted++;
}
catch(std::bad_alloc &)
{
signalReceived = true;
}
}
while (!signalReceived); // this loop is necessary because the
// memory vacuum only prevents new memory
// pages from being mapped. so the first
// allocations may succeed.
CHECK_EQUAL( nbElementInserted, a.nb_edges() );
}
Now I am wondering which of the two statements below is true:
Reordering can happen, in which case nbElementInserted can be incremented before insert_edge throws an exception, and that invalidates my case. Reordering can happen because the visible result for the user is the same if the two lines are permuted.
Reordering cannot happen because insert_edge is a function and all the side effects of the function should be completed before going to the next line. Throwing is a side effect.
Bonus point: if the correct answer is "yes, reordering can happen", is a memory barrier between the two lines sufficient to fix it?
No. Reordering only comes into play in multithreaded or multiprocessing scenarios. In a single thread the compiler cannot reorder instructions in a way that would change the behavior of the program. Exceptions are not an exception to this rule.
Reordering becomes visible when two threads read and write to shared state. If thread A makes modifications to shared variables thread B can see those modifications out-of-order, or even not at all if it has the shared state cached. This can be due to optimizations in either thread A or thread B or both.
Thread A will always see its own modifications in-order, though. Each sequence point must happen in order, at least as far as the local thread is aware.
Let's say thread A executed this code:
a = foo() + bar();
b = baz;
Each ; introduces a sequence point. The compiler is allowed to call either foo() or bar() first, whichever it likes, since + does not introduce a sequence point. If you put printouts you might see foo() called first, or you might see bar() called first. Either one would be correct. It must call them before it assigns baz to b, though. If either foo() or bar() throws an exception b must retain its existing value.
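A tiny illustration of that latitude (functions and values are mine, purely illustrative); either output order is conforming, but b is assigned only after both calls, because their output is an observable side effect:
#include <cstdio>

int foo() { std::puts("foo"); return 1; }
int bar() { std::puts("bar"); return 2; }
int baz = 40;

int main()
{
    int a = foo() + bar();  // may print "foo" then "bar", or "bar" then "foo"
    int b = baz;            // happens after both calls at this sequence point
    return a + b;           // always 43
}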
However, if the compiler knew that foo() and bar() never throw, and their execution in no way depends on the value of b, it could reorder the two statements. It'd be a valid optimization. There would be no way for the thread A to know that statements had been reordered.
Thread B, on the other hand, would know. The problem in multithreaded programming is that sequence points don't apply to other threads. That's where memory barriers come in. Memory barriers are cross-thread sequence points, in a sense.
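To illustrate that "cross-thread sequence point" idea, here is a minimal sketch (names are mine): without the release/acquire pair, thread B could legally observe ready as true while payload still reads 0.
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain shared data
std::atomic<bool> ready{false};

void thread_a() {
    payload = 42;                                   // write 1
    ready.store(true, std::memory_order_release);   // write 2: acts as the barrier
}

void thread_b() {
    while (!ready.load(std::memory_order_acquire)) { }  // pairs with the release
    assert(payload == 42);  // guaranteed: the release/acquire pair orders the
                            // payload write before this read
}

int main()
{
    std::thread a(thread_a);
    std::thread b(thread_b);
    a.join();
    b.join();
}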
Is unique_ptr thread-safe? Is it impossible for the code below to print the same number twice?
#include <memory>
#include <string>
#include <thread>
#include <cstdio>
using namespace std;
int main()
{
unique_ptr<int> work;
thread t1([&] {
while (true) {
const unique_ptr<int> localWork = move(work);
if (localWork)
printf("thread1: %d\n", *localWork);
this_thread::yield();
}
});
thread t2([&] {
while (true) {
const unique_ptr<int> localWork = move(work);
if (localWork)
printf("thread2: %d\n", *localWork);
this_thread::yield();
}
});
for (int i = 0; ; i++) {
work.reset(new int(i));
while (work)
this_thread::yield();
}
return 0;
}
unique_ptr is thread safe when used correctly. You broke the unwritten rule: Thou shalt never pass unique_ptr between threads by reference.
The philosophy behind unique_ptr is that it has a single (unique) owner at all times. Because of that, you can always pass it safely between threads without synchronization -- but you have to pass it by value, not by reference. Once you create aliases to a unique_ptr, you lose the uniqueness property and all bets are off. Unfortunately C++ can't guarantee uniqueness, so you are left with a convention that you have to follow religiously. Don't create aliases to a unique_ptr!
No, it isn't thread-safe.
Both threads can potentially move the work pointer with no explicit synchronization, so it's possible for both threads to get the same value, or both to get some invalid pointer ... it's undefined behaviour.
If you want to do something like this correctly, you probably need to use something like std::atomic_exchange so both threads can read/modify the shared work pointer with the right semantics.
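One possible shape of that fix, as a sketch of my own: since std::unique_ptr itself has no atomic operations, the handoff slot is a raw std::atomic<int*>, and exchange transfers ownership into a local unique_ptr. The -1 stop sentinel (one per consumer) is an illustrative convention, not part of the original code.
#include <atomic>
#include <cstdio>
#include <memory>
#include <thread>

std::atomic<int*> work{nullptr};  // handoff slot

void consumer(int id) {
    while (true) {
        // Atomically take ownership of whatever is in the slot.
        std::unique_ptr<int> localWork(work.exchange(nullptr));
        if (localWork) {
            if (*localWork == -1) return;  // sentinel: stop this consumer
            std::printf("thread%d: %d\n", id, *localWork);
        }
        std::this_thread::yield();
    }
}

int main() {
    std::thread t1(consumer, 1), t2(consumer, 2);
    for (int i = 0; i < 100; ++i) {
        work.store(new int(i));        // publish; exactly one exchange takes it
        while (work.load() != nullptr)
            std::this_thread::yield();
    }
    for (int k = 0; k < 2; ++k) {      // one stop sentinel per consumer
        work.store(new int(-1));
        while (work.load() != nullptr)
            std::this_thread::yield();
    }
    t1.join();
    t2.join();
}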
According to MSDN:
The following thread safety rules apply to all classes in the Standard C++ Library (except shared_ptr and iostream classes, as described below).
A single object is thread safe for reading from multiple threads. For example, given an object A, it is safe to read A from thread 1 and from thread 2 simultaneously.
If a single object is being written to by one thread, then all reads and writes to that object on the same or other threads must be protected. For example, given an object A, if thread 1 is writing to A, then thread 2 must be prevented from reading from or writing to A.
It is safe to read and write to one instance of a type even if another thread is reading or writing to a different instance of the same type. For example, given objects A and B of the same type, it is safe if A is being written in thread 1 and B is being read in thread 2.