Concurrent deletion and field access - c++

Is it legal for a second thread to delete an object while a first thread is still potentially inside a member function of that object, if the deletion occurs after the last (explicit) field access inside that function?
This will probably be clearer with an example:
#include <thread>
#include <atomic>
struct handoff {
std::atomic<int> flag{};
void signal() { flag = 1; }
};
int main() {
auto h = new handoff;
auto t2 = std::thread([=]{
while (!h->flag) {
}
delete h;
});
h->signal();
t2.join();
}
Here, the handoff h object is used to communicate between T1 (the main thread) and T2. The flag field is initially zero, and T2 waits until it becomes non-zero then immediately deletes the h object. T1 calls h->signal() which sets the flag to non-zero. So the object will be deleted (by T2) while T1 is still potentially "inside" the signal() call. However, there are no further accesses to any fields of the handoff object in signal().
Is it defined behavior? I think it is clear that calling a member function (even an empty one) is UB after an object has been deleted, but how about returning from one?
TSAN thinks this is OK and I agree in practice this will work fine on sane architectures.

Your code is safe. There is no such concept as being "inside" an object. You either access (that is, read or write) the object or you don't. Method or not, it's just a function. However, in your concrete case, accessing this after the flag = 1 line in the signal method would not be safe. So it is a matter of what the concrete code does, not of being inside or not.
Also, you are correct that calling a member function after delete is UB. But that's not what happens here: the call already happened before the delete, and that is not UB.
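To make the distinction concrete, here is a minimal sketch of the same handoff, assuming the answer's reasoning holds. The helper run_handoff and the deleted flag are hypothetical names added for illustration; the commented-out signal_then_read shows the kind of access after the store that would not be safe.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct handoff {
    std::atomic<int> flag{0};

    // Nothing reads or writes *this after the store, so the waiter may
    // delete the object as soon as it observes the flag.
    void signal() { flag.store(1); }

    // Unsafe variant (illustration only, deliberately not compiled):
    // the load after the store could touch memory that is already freed.
    // int signal_then_read() { flag.store(1); return flag.load(); }
};

// Hypothetical helper: returns true once the waiter thread has deleted h.
bool run_handoff() {
    std::atomic<bool> deleted{false};  // lives outside the handoff object
    auto* h = new handoff;
    std::thread waiter([h, &deleted] {
        while (h->flag.load() == 0) {
            std::this_thread::yield();
        }
        delete h;
        deleted.store(true);
    });
    h->signal();    // last use of h in this thread
    waiter.join();  // h must not be touched from here on
    assert(deleted.load());
    return deleted.load();
}
```

The result is reported through a separate atomic rather than through the handoff object, because the object may already be gone by the time signal() returns.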

Related

Do objects captured by a lambda exist for as long as the lambda?

I have always assumed lambdas were just function pointers, but I've never thought to use capture statements seriously...
If I create a lambda that captures by copy, and then move that lambda to a completely different thread and make no attempt to save the original objects used in the lambda, will it retain those copies for me?
std::thread createThread() {
std::string str("Success");
auto func = [=](){
printf("%s", str.c_str());
};
str = "Failure";
return std::thread(func);
}
int main() {
std::thread thread = createThread();
thread.join();
// assuming the thread doesn't execute anything until here...
// would it print "Success", "Failure", or dereference a dangling pointer?
return 0;
}
It is guaranteed to print Success. Capture-by-copy does exactly what it says. It makes a copy of the object right there and stores this copy as part of the closure object. The member of the closure object created from the capture lives as long as the closure object itself.
A lambda is not a function pointer. Lambdas are general function objects that can have internal state, which a function pointer can't have. In fact, only capture-less lambdas can be converted to function pointers and so may behave like one sometimes.
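A small sketch of that difference, using hypothetical helper names (apply, call_captureless, call_with_state): a capture-less lambda converts to a plain function pointer, while a lambda with captures carries state and does not.

```cpp
#include <cassert>

// Takes an ordinary function pointer, which has no room for state.
int apply(int (*fn)(int), int x) { return fn(x); }

int call_captureless() {
    // No captures: implicitly convertible to int (*)(int).
    int (*doubler)(int) = [](int v) { return v * 2; };
    assert(doubler != nullptr);
    return apply(doubler, 21);
}

int call_with_state() {
    int k = 3;
    auto add_k = [k](int v) { return v + k; };  // k is stored in the closure
    // apply(add_k, 1);  // would not compile: no conversion to a function pointer
    return add_k(1);
}
```

Here call_captureless() returns 42 and call_with_state() returns 4.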
The lambda expression produces a closure type that basically looks something like this:
struct /*unnamed1*/ {
/*unnamed1*/(const /*unnamed1*/&) = default;
/*unnamed1*/(/*unnamed1*/&&) = default;
/*unnamed1*/& operator=(const /*unnamed1*/&) = delete;
void operator()() const {
printf("%s", /*unnamed2*/.c_str());
};
std::string /*unnamed2*/;
};
and the lambda expression produces an object of this type, with /*unnamed2*/ direct-initialized to the current value of str. (Direct-initialized meaning as if by std::string /*unnamed2*/(str);)
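The following sketch is a testable variation of the question's code, not the original. Instead of printing, the closure writes its captured copy into a caller-provided string; make_thread and run_captured_copy are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Builds a thread whose closure holds its own copy of str.
std::thread make_thread(std::string& out) {
    std::string str("Success");
    auto func = [str, &out] { out = str; };  // str is copied here
    str = "Failure";                         // does not affect the copy
    return std::thread(func);  // the closure, with its copy, moves into the thread
}   // the local str is destroyed here; the closure's copy is not

std::string run_captured_copy() {
    std::string out;
    std::thread t = make_thread(out);
    t.join();  // out is only read after the thread has finished
    assert(!t.joinable());
    return out;
}
```

run_captured_copy() returns "Success", because the copy is taken when the lambda expression is evaluated, before str is reassigned.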
You have 3 situations:
1. You can guarantee by design that the variables live longer than the thread, because you synchronize with the end of the thread before the variables go out of scope.
2. You know your thread may outlive the scope/life cycle of the variables, but you don't need access to the variables anymore from any other thread.
3. You can't say which thread lives longest, you have multiple threads accessing your data, and you want to extend the lifetime of your variables.
In case 1, capture by reference.
In case 2, capture the variables by value (or even move them into the lambda).
In case 3, make the data shared with std::shared_ptr and capture that by value.
Case 3 will extend the lifetime of the data to the lifetime of the longest living thread.
Note: I prefer using std::async over std::thread, since it returns an RAII object (a future). The destructor of that future synchronizes with the thread, so you can store it as a member of an object that owns a thread and be sure that the object's destruction waits for the thread to finish.
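A minimal sketch of cases 2 and 3 with std::async, assuming the std::launch::async policy. The function names and string contents are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <string>
#include <utility>

// Case 2: the task becomes the only owner of the data (moved into the closure).
std::string moved_into_task() {
    std::string text = "moved";
    auto f = std::async(std::launch::async,
                        [t = std::move(text)] { return t + "!"; });
    return f.get();
}

// Case 3: shared ownership; the data lives until the last owner is gone.
std::size_t shared_length() {
    std::future<std::size_t> result;
    {
        auto data = std::make_shared<std::string>("shared state");
        result = std::async(std::launch::async,
                            [data] { return data->size(); });
        assert(result.valid());
    }  // the local shared_ptr is gone; the closure's copy keeps the string alive
    return result.get();
}
```

moved_into_task() returns "moved!" and shared_length() returns 12, even though the scope that created the data ended before the result was read.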

std::atomic to what extent?

In the C++ Seasoning video by Sean Parent https://youtu.be/W2tWOdzgXHA at 33:41 when starting to talk about “no raw synchronization primitives”, he brings an example to show that with raw synchronization primitives we will get it wrong. The example is a bad copy on write class:
template <typename T>
class bad_cow {
struct object_t {
explicit object_t(const T& x) : data_m(x) { ++count_m; }
atomic<int> count_m;
T data_m;
};
object_t* object_m;
public:
explicit bad_cow(const T& x) : object_m(new object_t(x)) { }
~bad_cow() { if (0 == --object_m->count_m) delete object_m; }
bad_cow(const bad_cow& x) : object_m(x.object_m) { ++object_m->count_m; }
bad_cow& operator=(const T& x) {
if (object_m->count_m == 1) {
// label #2
object_m->data_m = x;
} else {
object_t* tmp = new object_t(x);
--object_m->count_m; // bug #1
// this solves bug #1:
// if (0 == --object_m->count_m) delete object_m;
object_m = tmp;
}
return *this;
}
};
He then asks the audience to find the bug, which is the bug #1 as he confirms.
But a more obvious bug I guess, is when some thread is about to proceed to execute a line of code that I have denoted with label #2, while all of a sudden, some other thread just destroys the object and the destructor is called, which deletes object_m. So, the first thread will encounter a deleted memory location.
Am I right? I don’t seem so!
some other thread just destroys the object and the destructor is
called, which deletes object_m. So, the first thread will encounter a
deleted memory location.
Am I right? I don’t seem so!
Assuming the rest of the program isn't buggy, that shouldn't happen, because each thread should have its own reference-count object referencing the data_m object. Therefore, if thread B has a bad_cow object that references the data-object, then thread A cannot (or at least should not) ever delete that object, because the count_m field can never drop to zero as long as there remains at least one reference-count object pointing to it.
Of course, a buggy program might encounter the race condition you suggest -- for example, a thread might be holding only a raw pointer to the data-object, rather than a bad_cow that increments its reference count; or a buggy thread might call delete on the object explicitly rather than relying on the bad_cow class to handle deletion properly.
Your objection doesn't hold because *this at that moment is pointing to the object and the count is 1. The counter cannot get to 0 unless someone is not playing this game correctly (but in that case anything can happen anyway).
Another similar objection could be that, while you're assigning to *this and the code being executed is inside the #2 branch, another thread makes a copy of *this; even if this second thread is just reading, it may see the pointed-to object suddenly mutate because of the assignment. The problem in this case is that count was 1 when the mutating thread entered the if, but increased immediately after.
This is, however, also a bad objection, because this code handles concurrent access to the pointed-to object (as std::shared_ptr does, for example), but you are not allowed to mutate and read a single instance of the bad_cow class from different threads. In other words, a single instance of bad_cow cannot be used from multiple threads if some of them are writers, unless you add synchronization. Distinct instances of bad_cow pointing to the same storage are, in contrast, safe to use from different threads (after the fix for bug #1, of course).
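For reference, here is a sketch of the class with the fix for bug #1 applied. It is an illustration under assumptions, not Sean Parent's code: the class is renamed cow, the count is initialized to 1 in the constructor, copy assignment is deleted to keep the sketch short, and get() and copies_stay_independent() are hypothetical additions.

```cpp
#include <atomic>
#include <cassert>

template <typename T>
class cow {
    struct object_t {
        explicit object_t(const T& x) : count_m(1), data_m(x) {}
        std::atomic<int> count_m;
        T data_m;
    };
    object_t* object_m;

public:
    explicit cow(const T& x) : object_m(new object_t(x)) {}
    cow(const cow& x) : object_m(x.object_m) { ++object_m->count_m; }
    cow& operator=(const cow&) = delete;  // left out of this sketch
    ~cow() { if (0 == --object_m->count_m) delete object_m; }

    cow& operator=(const T& x) {
        if (object_m->count_m == 1) {
            object_m->data_m = x;  // sole owner: mutate in place
        } else {
            object_t* tmp = new object_t(x);
            // Fix for bug #1: another owner may have released its reference
            // since the check above, making this the last one.
            if (0 == --object_m->count_m) delete object_m;
            object_m = tmp;
        }
        return *this;
    }

    const T& get() const { return object_m->data_m; }
};

// Each instance is used by one thread only, as the answers require.
bool copies_stay_independent() {
    cow<int> a(1);
    cow<int> b(a);  // shares storage with a
    b = 2;          // count is 2, so b detaches into new storage
    assert(a.get() == 1);
    return a.get() == 1 && b.get() == 2;
}
```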

Re-assigning an std::function object while inside its execution

I have an std::function object I'm using as a callback to some event. I'm assigning a lambda to this object, within which, I assign the object to a different lambda mid execution. I get a segfault when I do this. Is this not something I'm allowed to do? If so, why? And how would I go about achieving this?
declaration:
std::function<void(Data *)> doCallback;
calling:
//
// This gets called after a sendDataRequest call returns with data
//
void onIncomingData(Data *data)
{
if ( doCallback )
{
doCallback(data);
}
}
assignment:
doCallback =
[=](Data *data)
{
//
// Change the callback within itself because we want to do
// something else after getting one request
//
doCallback =
[=](Data *data2)
{
... do some work ...
};
sendDataRequest();
};
sendDataRequest();
The standard does not specify at what point during std::function::operator() the function uses its internal state object. In practice, some implementations use it after the call returns.
So what you did was undefined behaviour, and in this particular case it crashes.
struct bob {
std::function<void()> task;
std::function<void()> next_task;
void operator()(){
next_task=task;
task();
task=std::move(next_task);
}
};
Now, if you want to change what happens the next time bob is invoked from within bob(), simply set next_task.
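A usage sketch of this pattern, with the struct renamed to the hypothetical deferred_callback and a string log standing in for real work:

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

struct deferred_callback {
    std::function<void()> task;
    std::function<void()> next_task;
    void operator()() {
        next_task = task;              // default: run the same task next time
        task();                        // may overwrite next_task, never task
        task = std::move(next_task);   // swap in the replacement after the call
    }
};

std::string run_twice() {
    std::string log;
    deferred_callback cb;
    cb.task = [&log, &cb] {
        log += "first;";
        // Safe: this assigns to next_task, not to the function being executed.
        cb.next_task = [&log] { log += "second;"; };
    };
    cb();  // runs the first task and schedules the second
    assert(cb.task);
    cb();  // runs the second task
    return log;
}
```

run_twice() returns "first;second;". The running std::function is only replaced after its call has returned.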
Short answer
It depends on whether, after the (re)assignment, the lambda being called accesses any of its non-static data members. If it does, then you get undefined behavior. Otherwise, I believe nothing bad should happen.
Long answer
In the OP's example, a lambda object -- denoted here by l_1 -- held by a std::function object is invoked and, during its execution, the std::function object is assigned to another lambda -- denoted here by l_2.
The assignment calls template<class F> function& operator=(F&& f); which, by 20.8.11.2.1/18, has the effects of
function(std::forward<F>(f)).swap(*this);
where f binds to l_2 and *this is the std::function object being assigned to. At this time, the temporary std::function holds l_2 and *this holds l_1. After the swap the temporary holds l_1 and *this holds l_2 (*). Then the temporary is destroyed and so is l_1.
In summary, while running operator() on l_1 this object gets destroyed. Then according to 12.7/1
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
A lambda's non-static data members correspond to its captures. So if you don't access them after the reassignment, then it should be fine.
There's one more important point raised by Yakk's answer. As far as I understand, the concern was whether std::function::operator(), after having forwarded the call to l_1, tries to access l_1 (which is now dead) or not? I don't think this is the case because the effects of std::function::operator() don't imply that. Indeed, 20.8.11.2.4 says that the effect of this call is
INVOKE(f, std::forward<ArgTypes>(args)..., R) (20.8.2), where f is the target object (20.8.1) of *this.
which basically says that std::function::operator() calls l_1.operator() and does nothing else (at least, nothing that is detectable).
(*) I'm putting details on how the interchange happens under the carpet but the idea remains valid. (E.g. what if the temporary holds a copy of l_1 and not a pointer to it?)
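One way to sidestep the question entirely, which is not taken from either answer, is to invoke a copy of the std::function. The copy owns its own target, so reassigning the original during the call cannot destroy the lambda that is running. This is a sketch with hypothetical names, using a std::string& log in place of the OP's Data*.

```cpp
#include <cassert>
#include <functional>
#include <string>

std::function<void(std::string&)> do_callback;

void on_incoming_data(std::string& log) {
    if (do_callback) {
        auto running = do_callback;  // the copy keeps the current target alive
        running(log);                // do_callback may be reassigned in here
    }
}

std::string reassign_during_call() {
    std::string log;
    std::string tag = "one";
    do_callback = [tag](std::string& out) {
        do_callback = [](std::string& o) { o += "two;"; };
        // The capture is still valid: we are executing the copy's target,
        // not the one that the assignment above just destroyed.
        out += tag + ";";
    };
    on_incoming_data(log);  // runs the first callback, installs the second
    assert(do_callback);
    on_incoming_data(log);  // runs the second callback
    do_callback = nullptr;
    return log;
}
```

reassign_during_call() returns "one;two;". The cost is one copy of the callable (and its captures) per invocation.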

What happens when object is destroyed when it is stuck in an infinite loop?

I want to know what would happen when destructor gets called on an object when the object is stuck in an infinite while loop in a different thread.
// Main thread creates the object
MyClass* _obj = new MyClass();
// doing some stuff
delete _obj;
Where,
MyClass::MyClass()
{
// Start a thread which calls MyClass::MyPollingFn()
}
void MyClass::MyPollingFn()
{
// runs in new child thread
while(true)
{
// doing some work
// sleep(5 seconds)
}
}
Explanation:
There is a class object of MyClass which creates a thread and runs MyPollingFn method in an infinite loop. Every iteration of this method can change some class variables. Is it ok to destroy the object from parent thread which holds the object? Is there any possibility of this giving an issue?
If MyPollingFn ever touches this, explicitly or implicitly (e.g. by accessing non-static member variables), then this code would exhibit undefined behavior, as this would become a dangling pointer.
And if it doesn't touch this, then why make it a non-static member function?
There are several possible issues, including
1. Either you will try to join the thread in your destructor, in which case it will block.
Edit
i.e. if you add
MyClass::~MyClass()
{
myThread.join();
}
and leave MyPollingFn as it is, it will never finish, so the join will block.
End Edit
This code doesn't have a destructor, but perhaps it should.
2. Or the thread will try to "change some class variables" after the class has gone away.
Which is obviously bad.
It might be better to change the
while(true)
to
while(!finished)
where finished is some kind of thread-safe flag (e.g. an atomic) that is set in the (currently non-existent) destructor.
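Putting the two suggestions together, a minimal sketch could look like this. The class name Poller, the iterations counter, and the 1 ms sleep are assumptions made to keep the example short; the OP's loop sleeps for 5 seconds.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

class Poller {
    std::atomic<bool> finished{false};
    std::atomic<int> iterations{0};
    std::thread worker;  // declared last, so the flags exist before the thread starts

    void poll() {
        while (!finished.load()) {
            ++iterations;  // stands in for "doing some work"
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

public:
    Poller() : worker([this] { poll(); }) {}

    // Ask the loop to stop, then wait for it, so that the thread never
    // touches the members after they have been destroyed.
    ~Poller() {
        finished.store(true);
        worker.join();
    }

    int count() const { return iterations.load(); }
};

// Hypothetical check: the object can be destroyed while the loop is running.
bool poller_stops_cleanly() {
    int seen = 0;
    {
        Poller p;
        while (p.count() == 0) std::this_thread::yield();
        seen = p.count();
    }  // the destructor sets the flag and joins; no dangling this
    assert(seen > 0);
    return seen > 0;
}
```

With a long sleep inside the loop the destructor would block for up to one sleep interval; a condition variable with a timed wait is one way to avoid that.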

Is unique_ptr thread safe?

Is unique_ptr thread safe? Is it impossible for the code below to print the same number twice?
#include <memory>
#include <string>
#include <thread>
#include <cstdio>
using namespace std;
int main()
{
unique_ptr<int> work;
thread t1([&] {
while (true) {
const unique_ptr<int> localWork = move(work);
if (localWork)
printf("thread1: %d\n", *localWork);
this_thread::yield();
}
});
thread t2([&] {
while (true) {
const unique_ptr<int> localWork = move(work);
if (localWork)
printf("thread2: %d\n", *localWork);
this_thread::yield();
}
});
for (int i = 0; ; i++) {
work.reset(new int(i));
while (work)
this_thread::yield();
}
return 0;
}
unique_ptr is thread safe when used correctly. You broke the unwritten rule: Thou shalt never pass unique_ptr between threads by reference.
The philosophy behind unique_ptr is that it has a single (unique) owner at all times. Because of that, you can always pass it safely between threads without synchronization -- but you have to pass it by value, not by reference. Once you create aliases to a unique_ptr, you lose the uniqueness property and all bets are off. Unfortunately C++ can't guarantee uniqueness, so you are left with a convention that you have to follow religiously. Don't create aliases to a unique_ptr!
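A short sketch of passing ownership by value, as opposed to sharing a reference. The function name consume_in_thread and the value 42 are made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

int consume_in_thread() {
    auto p = std::make_unique<int>(42);
    int result = 0;
    // Ownership is moved into the thread; no other thread keeps an alias.
    std::thread t([&result](std::unique_ptr<int> owned) { result = *owned; },
                  std::move(p));
    assert(p == nullptr);  // a moved-from unique_ptr is empty
    t.join();              // result is only read after the thread has finished
    return result;
}
```

At every point exactly one thread owns the int, so no synchronization is needed for the pointee itself.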
No, it isn't thread-safe.
Both threads can potentially move the work pointer with no explicit synchronization, so it's possible for both threads to get the same value, or both to get some invalid pointer ... it's undefined behaviour.
If you want to do something like this correctly, you probably need to use something like std::atomic_exchange so both threads can read/modify the shared work pointer with the right semantics.
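As a rough sketch of that idea, the shared slot below is a std::atomic<int*> and each consumer takes ownership with exchange, so a given pointer can be claimed by only one thread. This is a bounded variation of the question's program written so it terminates; each_item_consumed_once and the done flag are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

// Returns true if each of the n produced values was consumed exactly once.
bool each_item_consumed_once(int n) {
    std::atomic<int*> work{nullptr};
    std::atomic<bool> done{false};

    auto consumer = [&](std::vector<int>& got) {
        while (!done.load()) {
            // Atomically take whatever is in the slot and leave nullptr behind.
            std::unique_ptr<int> local(work.exchange(nullptr));
            if (local) got.push_back(*local);
            else std::this_thread::yield();
        }
    };

    std::vector<int> got1, got2;
    std::thread t1(consumer, std::ref(got1));
    std::thread t2(consumer, std::ref(got2));

    for (int i = 0; i < n; ++i) {
        work.store(new int(i));
        while (work.load() != nullptr) std::this_thread::yield();
    }
    done.store(true);
    t1.join();
    t2.join();
    assert(work.load() == nullptr);  // every item was claimed

    std::vector<int> counts(n, 0);
    for (int v : got1) ++counts[v];
    for (int v : got2) ++counts[v];
    for (int c : counts) {
        if (c != 1) return false;
    }
    return true;
}
```

Once a consumer has exchanged the pointer out, it is the sole owner, which is why wrapping it in a local unique_ptr is fine.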
According to MSDN:
The following thread safety rules apply to all classes in the Standard
C++ Library (except shared_ptr and iostream classes, as described
below).
A single object is thread safe for reading from multiple threads. For
example, given an object A, it is safe to read A from thread 1 and
from thread 2 simultaneously.
If a single object is being written to by one thread, then all reads
and writes to that object on the same or other threads must be
protected. For example, given an object A, if thread 1 is writing to
A, then thread 2 must be prevented from reading from or writing to A.
It is safe to read and write to one instance of a type even if another
thread is reading or writing to a different instance of the same type.
For example, given objects A and B of the same type, it is safe if A
is being written in thread 1 and B is being read in thread 2.