std::move operation in C++

Lines from Anthony Williams' book:
The following example shows the use of std::move to transfer ownership
of a dynamic object into a thread:
void process_big_object(std::unique_ptr<big_object>);
std::unique_ptr<big_object> p(new big_object);
p->prepare_data(42);
std::thread t(process_big_object,std::move(p));
By specifying std::move(p) in the std::thread constructor, the
ownership of the big_object is transferred first into internal
storage for the newly created thread and then into
process_big_object.
I understand the stack and the heap; any idea what this internal storage actually is?
Why can't they transfer the ownership directly to process_big_object?

It means that the object will temporarily belong to the std::thread object until the thread actually starts.
Internal storage here refers to the memory associated with the std::thread object. It could be a member variable, or it could just be held on the stack during the constructor. Since this is implementation-dependent, the general, non-committal term "internal storage" is used.

All arguments to a thread are copied into some internal memory held by the std::thread object, so that they can be passed to the thread function.
That internal memory is owned by the std::thread object until ownership is passed on to the actual thread function.

Why can't they transfer the ownership directly to process_big_object?
Because there is no line in the code snippet where process_big_object is called as a function. The last line of the snippet calls the std::thread constructor. It will set in motion a chain of events that eventually will cause process_big_object(p) to be called in the new thread; but that call is not visible here.


Is it safe to pass stack variables by reference to multithreaded code?

As an example in pseudocode:
MultiThreadedWorker worker;
Foo()
{
    const Vector position = CreatePosition();
    worker.StartWorker(position);
}

MultiThreadedWorker::StartWorker(const Vector& myPosition)
{
    ... Do a bunch of async work that keeps referencing myPosition ...
}
This seems to be working for now, but I don't understand why because it seems that myPosition would end up pointing to nothing long before StartWorker completed.
Assuming this isn't safe, is there any solution other than just passing around everything by value or ensuring it's all on the heap?
std::async copies const references
So yes, it is safe. For a discussion of why it does, see Why does std::async copy its const & arguments?
It is the programmer's responsibility to ensure that a variable lives long enough that it is not destroyed before any access through pointers or references. This can be achieved in at least one of the following ways:
Ensure the thread ends before the variable is destroyed. You can call .join() on the thread before leaving the scope.
Create the object on the heap. Create it using make_shared and pass a shared_ptr. This ensures the object lives until the last reference is destroyed.
Note that there is another problem with threads and shared objects. If one thread writes to an object while another thread reads from it, that is a data race, which is undefined behavior. Thread synchronization mechanisms such as std::mutex can be used to avoid this.

callback to be called before any objects with automatic storage duration are destroyed

Does there exist a standard (any C++ standard) way to register a callback, that gets called shortly before any objects with automatic storage duration are destroyed as part of normal program termination?
EDIT:
To make this clearer: it's a multi-threaded application. Some objects may have pushed functors that reference them (each functor accesses its "originator" object) into a thread pool. The thread pool object is static, so it gets destroyed after main() returns, and therefore after all the objects that pushed those functors have already been destroyed. The thread pool is flushed upon termination, so the functors contain dangling references to their "originator" objects.
std::atexit should do what you want:
Registers the function pointed to by func to be called on normal program termination (via std::exit() or returning from the main function)
http://en.cppreference.com/w/cpp/utility/program/atexit
You may want to install a std::terminate_handler (via std::set_terminate) as well.
http://en.cppreference.com/w/cpp/error/terminate_handler

C++: Can one thread see a newly allocated object as uninitialized if passed through boost lockfree queue?

I'm building a multiple-producer single-consumer mechanism.
I want to do something like this. Suppose I have access to an instance of a boost lockfree queue, available to both threads, and a synchronizing condition variable:
Thread 1 (producer):
Object * myObj = new Object();
lockfree_queue.push(myObj);
condition_variable.notify();
Thread 2 (consumer):
condition_variable.wait();
Object * myObj = lockfree_queue.pop();
...
delete myObj;
Is there a chance that on a multi-core system Thread 2 will see myObj as pointing to uninitialized memory, or to a partially initialized object (suppose it has some member variables)?
Once new returns and gives you a pointer, the object is fully constructed.
If there are uninitialized members in the object, then it's the fault of the constructor for not initializing them.
And it wouldn't be a problem even if the queue contained object instances instead of pointers, since the push call completes fully before you notify the condition variable, so the other thread will not pop the queue until the object has been pushed.

Memory sharing between C++ threads

I'm new to threading in C++, and I'm trying to get a clear picture about how memory is shared/not shared between threads. I'm using std::thread with C++11.
From what I've read on other SO questions, stack memory is owned by only one thread and heap memory is shared between threads. So from what I think I understand about the stack vs. the heap, the following should be true:
#include <cassert>
#include <thread>
using namespace std;

class Obj {
public:
    int x;
    Obj() { x = 0; }
};

int main() {
    Obj stackObj;
    Obj *heapObj = new Obj();

    thread t([&]{
        stackObj.x++;
        heapObj->x++;
    });
    t.join();

    assert(heapObj->x == 1);
    assert(stackObj.x == 0);
}
forgive me if I screwed up a bunch of stuff, lambda syntax is very new to me. But hopefully what I'm trying to do is coherent.
Would this perform as I expect? And if not, what am I misunderstanding?
Memory is memory. An object in C++ occupies some location in memory; that location may be on a stack or on the heap, or it may have been statically allocated. It doesn't matter where the object is located: any thread that has a reference or pointer to the object may access the object. If two threads have a reference or a pointer to the object, then both threads may access it.
In your program, you create a worker thread (by constructing a std::thread) that executes the lambda expression you provide it. Because you capture both stackObj and heapObj by reference (using the [&] capture default), that lambda has references to both of those objects.
Those objects are both located on the main thread's stack (note that heapObj is a pointer-type object that is located on the main thread's stack and points to a dynamically allocated object that is located on the heap). No copies of these objects are made; rather, your lambda expression has references to the objects. It modifies the stackObj directly and modifies the object pointed to by heapObj indirectly.
After the main thread joins with the worker thread, both heapObj->x and stackObj.x have a value of 1.
If you had used the value capture default ([=]), your lambda expression would have copied both stackObj and heapObj. The expression stackObj.x++ in the lambda expression would increment the copy, and the stackObj that you declare in main() would be left unchanged.
If you capture the heapObj by value, only the pointer itself is copied, so while a copy of the pointer is used, it still points to the same dynamically allocated object. The expression heapObj->x++ would dereference that pointer, yielding the Obj you created via new Obj(), and increment its value. You would then observe at the end of main() that heapObj->x has been incremented.
(Note that in order to modify an object captured by value, the lambda expression must be declared mutable.)
I agree with James McNellis that heapObj->x and stackObj.x will be 1.
Furthermore, this code only works because you join immediately after spawning the thread. If you had started the thread and then done more work while it ran, an exception could unwind the stack, and the new thread's reference to stackObj would suddenly be dangling. That is why sharing stack memory between threads is a bad idea even when it's technically possible.

C++: Concurrency and destructors

Suppose you have an object which can be accessed by many threads. A critical section is used to protect the sensitive areas. But what about the destructor? Even if I enter a critical section as soon as I enter the destructor, isn't the object already invalidated once the destructor has been called?
My train of thought: say I enter the destructor, and I have to wait on the critical section because some other thread is still using the object. Once that thread is done, I can finish destroying the object. Does this make sense?
In general, you should not destroy an object until you know that no other thread is using it. Period.
Consider this scenario, based on your 'train of thought':
Thread A: Get object X reference
Thread A: Lock object X
Thread B: Get object X reference
Thread B: Block on object X lock
Thread A: Unlock object X
Thread B: Lock object X; unlock object X; destroy object X
Now consider what happens if the timing is slightly different:
Thread A: Get object X reference
Thread B: Get object X reference
Thread B: Lock object X; unlock object X; destroy object X
Thread A: Lock object X - crash
In short, object destruction must be synchronized somewhere other than the object itself. One common option is to use reference counting. Thread A will take a lock on the object reference itself, preventing the reference from being removed and the object being destroyed, until it manages to increment the reference count (keeping the object alive). Then thread B merely clears the reference and decrements the reference count. You can't predict which thread will actually call the destructor, but it will be safe either way.
The reference counting model can be implemented easily by using boost::shared_ptr or std::shared_ptr; the destructor will not run unless all shared_ptrs in all threads have been destroyed (or made to point elsewhere), so at the moment of destruction you know that the only pointer to the object remaining is the this pointer of the destructor itself.
Note that when using shared_ptr, it's important to prevent the original object reference from changing until you can capture a copy of it. Eg:
std::shared_ptr<SomeObject> objref;
Mutex objlock;

void ok1() {
    objlock.lock();
    objref->dosomething(); // ok; reference is locked
    objlock.unlock();
}

void ok2() {
    std::shared_ptr<SomeObject> localref;
    objlock.lock();
    localref = objref;
    objlock.unlock();
    localref->dosomething(); // ok; local reference
}

void notok1() {
    objref->dosomething(); // not ok; reference may be modified
}

void notok2() {
    std::shared_ptr<SomeObject> localref = objref; // not ok; objref may be modified
    localref->dosomething();
}
Note that simultaneous reads of a shared_ptr are safe, so you can choose to use a read-write lock if that makes sense for your application.
If an object is in use, then you should make sure that the destructor of the object is not called before the use of the object ends. If this is the behavior you have, it is a potential problem and it really needs to be fixed.
You should make sure that if one thread is destroying the object, another thread is not calling functions on it; or else the first thread should wait until the second thread completes its function call.
Yes, even destructors might need critical sections to protect updates to some global data which is not related to the class itself.
It's possible that while one thread is waiting for the critical section in the destructor, another thread is destroying the object; and if the critical section belongs to the object, it will be destroyed as well. So that's not a good design.
You absolutely, positively need to make sure your object outlives the consumer threads, or you are in for some serious headaches. Either:
Make the consumers of the object children so it's impossible for them to exist outside of your object, or
use message passing/broker.
If you go the latter route, I highly recommend 0mq http://www.zeromq.org/.
Yes, while you are in the destructor, the object is already invalidated.
I used a Destroy() method that enters the critical section and then destroys the object itself.
The lifetime of the object is over before the destructor is called?
Yes, it is fine to do that. If a class supports such use, clients don't need to synchronize destruction; i.e. they don't need to make sure that all other methods on the object have finished before invoking the destructor.
I would recommend that clients not assume they can do this unless it is explicitly documented. Clients do have this burden, by default, with standard library objects in particular(§17.6.4.10/2).
There are cases where it is fine, though; std::condition_variable's destructor, for example, specifically allows ongoing condition_variable::wait() method invocations when ~condition_variable() starts. It only requires that clients not initiate calls to wait() after ~condition_variable() starts.
It might be cleaner to require that the client synchronize access to the destructor – and constructor for that matter – like most of the rest of the standard library does. I would recommend doing that if feasible.
However, there are certain patterns where it might make sense to relieve clients of the burden of fully synchronizing destruction. condition_variable's overall pattern seems like one: consider use of an object that handles possibly long-running requests. The user does the following:
Construct the object
Cause the object to receive requests from other threads.
Cause the object to stop receiving requests: at this point, some outstanding requests might be ongoing, but no new ones can be invoked.
Destroy the object. The destructor will block until all requests are done, otherwise the ongoing requests might have a bad time.
An alternative would be to require that clients synchronize access. You could imagine a step 3.5 above, where the client calls a shutdown() method on the object that does the blocking, after which it is safe for the client to destroy the object. However, this design has some downsides: it complicates the API and introduces an additional shutdown-but-valid state for the object.
Consider instead perhaps getting step (3) to block until all requests are done. There are tradeoffs...