In this program, why is the destructor on line 14 is called twice for the same instance of mystruct_t?
I'm assuming that all pointer manipulation in this program is thread safe. I think the atomic updates do not work on my system or compiler.
I tried this on MSVC 2017, MSVC 2019 and on clang
/* This crashes for me (line 19) */
#include <iostream>
#include <vector>
#include <thread>
#include <memory>
#include <chrono>
#include <assert.h>
struct mystruct_t {
int32_t nInvocation = 0;
~mystruct_t();
mystruct_t() = default;
};
mystruct_t::~mystruct_t() {
nInvocation++;
int nInvoke = nInvocation;
if (nInvoke > 1) {
/* destructor was invoked twice */
assert(0);
}
/* sleep is not necessary for crash */
//std::this_thread::sleep_for(std::chrono::microseconds(525));
}
std::shared_ptr<mystruct_t> globalPtr;
void thread1() {
for (;;) {
std::this_thread::sleep_for(std::chrono::microseconds(1000));
std::shared_ptr<mystruct_t> ptrNewInstance = std::make_shared<mystruct_t>();
globalPtr = ptrNewInstance;
}
}
void thread2() {
for (;;) {
std::shared_ptr<mystruct_t> pointerCopy = globalPtr;
}
}
int main()
{
std::thread t1;
t1 = std::thread([]() {
thread1();
});
std::thread t2;
t2 = std::thread([]() {
thread2();
});
for (int i = 0;; ++i) {
std::this_thread::sleep_for(std::chrono::microseconds(1000));
std::shared_ptr<mystruct_t> pointerCopy = globalPtr;
globalPtr = nullptr;
}
return 0;
}
As several users here already mentioned, you're running into undefined behavior since you globally (or foreign threaded) alter your referred object while thread-locally, you try to copy assign it. A drawback of the sharedPtr especially for newcomers is the quite hidden danger in the suggestion, you're always thread safe in copying them. As you do not use references, this can become even harder to see in doubt. Always try to see the shared_ptr as a regular class in the first place with a common ('trivial') member-wise copy assigment where interferences are always possible in non-protected threading environments.
If you're going to encounter similar situations with that or similar code in future, try to use a robust channeled broadcasting/event based scheme instead of locally placed locks! The channels (buffered or single data based) themselves care about the proper data lifetime then, ensuring the sharedPtr's underlying data 'rescue'.
Related
https://en.cppreference.com/w/cpp/memory/shared_ptr/use_count states:
In multithreaded environment, the value returned by use_count is approximate (typical implementations use a memory_order_relaxed load)
But does this mean that use_count() is totally useless in a multi-threaded environment?
Consider the following example, where the Circular class implements a circular buffer of std::shared_ptr<int>.
One method is supplied to users - get(), which checks whether the reference count of the next element in the std::array<std::shared_ptr<int>> is greater than 1 (which we don't want, since it means that it's being held by a user which previously called get()).
If it's <= 1, a copy of the std::shared_ptr<int> is returned to the user.
In this case, the users are two threads which do nothing at all except love to call get() on the circular buffer - that's their purpose in life.
What happens in practice when I execute the program is that it runs for a few cycles (tested by adding a counter to the circular buffer class), after which it throws the exception, complaining that the reference counter for the next element is > 1.
Is this a result of the statement that the value returned by use_count() is approximate in a multi-threaded environment?
Is it possible to adjust the underlying mechanism to make it, uh, deterministic and behave as I would have liked it to behave?
If my thinking is correct - use_count() (or rather the real number of users) of the next element should never EVER increase above 1 when inside the get() function of Circular, since there are only two consumers, and every time a thread calls get(), it's already released its old (copied) std::shared_ptr<int> (which in turn means that the remaining std::shared_ptr<int> residing in Circular::ints_ should have a reference count of only 1).
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <thread>
class Circular {
public:
Circular() {
for (auto& i : ints_) { i = std::make_shared<int>(0); }
}
std::shared_ptr<int> get() {
std::lock_guard<std::mutex> lock_guard(guard_);
index_ = index_ % 2; // Re-set the index pointer.
if (ints_.at(index_).use_count() > 1) {
// This shouldn't happen - right? (but it does)
std::string excp = std::string("OOPSIE: ") + std::to_string(index_) + " " + std::to_string(ints_.at(index_).use_count());
throw std::logic_error(excp);
}
return ints_.at(index_++);
}
private:
std::mutex guard_;
unsigned int index_{0};
std::array<std::shared_ptr<int>, 2> ints_;
};
Circular circ;
void func() {
do {
auto scoped_shared_int_pointer{circ.get()};
}while(1);
}
int main() {
std::thread t1(func), t2(func);
t1.join(); t2.join();
}
While use_count is fraught with problems, the core issue right now is outside of that logic.
Assume thread t1 takes the shared_ptr at index 0, and then t2 runs its loop twice before t1 finishes its first loop iteration. t2 will obtain the shared_ptr at index 1, release it, and then attempt to acquire the shared_ptr at index 0, and will hit your failure condition, since t1 is just running behind.
Now, that said, in a broader context, it's not particularly safe, as if a user creates a weak_ptr, it's entirely possible for the use_count to go from 1 to 2 without passing through this function. In this simple example, it would work to have it loop through the index array until it finds the free shared pointer.
use_count is for debugging only and shouldn't be used. If you want to know when nobody else has a reference to a pointer any more just let the shared pointer die and use a custom deleter to detect that and do whatever you need to do with the now unused pointer.
This is an example of how you might implement this in your code:
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <thread>
#include <vector>
#include <iostream>
class Circular {
public:
Circular() {
size_t index = 0;
for (auto& i : ints_)
{
i = 0;
unused_.push_back(index++);
}
}
std::shared_ptr<int> get() {
std::lock_guard<std::mutex> lock_guard(guard_);
if (unused_.empty())
{
throw std::logic_error("OOPSIE: none left");
}
size_t index = unused_.back();
unused_.pop_back();
return std::shared_ptr<int>(&ints_[index], [this, index](int*) {
std::lock_guard<std::mutex> lock_guard(guard_);
unused_.push_back(index);
});
}
private:
std::mutex guard_;
std::vector<size_t> unused_;
std::array<int, 2> ints_;
};
Circular circ;
void func() {
do {
auto scoped_shared_int_pointer{ circ.get() };
} while (1);
}
int main() {
std::thread t1(func), t2(func);
t1.join(); t2.join();
}
A list of unused indexes is kept, when the shared pointer is destroyed the custom deleter returns the index back to the list of unused indexes ready to be used in the next call to get.
Foo.h
#pragma once
#include <stdint.h>
#include <cstddef>
#include <thread>
#include <mutex>
#include <shared_mutex>
#include <atomic>
class Foo: public FooAncestor
{
private:
std::atomic_bool start = false;
//I do a shared mutex on the currentValue,
//but I took it out in order to make sure that it was not part of the problem
mutable std::shared_mutex valueMutex;
float currentValue = 0.0;
public:
Foo();
void Start();
inline float GetValue() { return currentValue; }
};
Foo.cpp
#include "Foo.h"
using namespace std::chrono_literals;
Foo::Foo()
{
//setting up things not related to the subject
}
void Foo::Start()
{
std::thread updateThread( [=]()
{
start = true;
while( start )
{
//a whole bunch of commented out code which doesn't make any difference to this
std::this_thread::sleep_for( 200ms );
currentTemp += 1;
}
} );
updateThread.detach();
}
Main.cpp
int main()
{
std::cout << "hello from PiCBake!\n";
int i = 0;
//a bunch of additional setup code here, not related to this class
Foo foo = Foo();
foo.Start();
for( ;;)
{
//this part is just for testing of the class and has no actual function
std::cout << max.GetTemp() << std::endl;
delay( 500 );
if( ++i == 10 )
i = 0;
}
return 0;
}
Whenever I include this, or a separate test at the beginning of the updateThread part (like: if(!run) return;) the flow of the code does weird things. In debugging, it keeps jumping to the part where 'run' is being tested, even jumping out of the while loop when run is set in the thread, but the behavior is the same when run is set before the thread is created.
If the thread is very long it actually never gets to finish at all. Whenever I remove the variable 'run' and replace it with "true" all works as expected. If I replace the "while(run)" with "while(true)", but test 'run' right after in an if-statement, the same thing happens.
Tried making run atomic, tried a bunch of other things, removing everything etc.
Please note that I didn't yet implemented a thread-exit condition. "run" could be set to false in the destructor or something in order to quit the thread. But that's not really the issue.
Update:
It must have been late, and I was probably going around in circles. It seems that changing the start value to atomic already solved it, either way, it works and I have no ^&*() clue what was going wrong...
std::thread is not simply inheritable by classes, cannot auto join when destruction, etc.
Lots of pitfalls like need to use std::atomic_bool for stopping, cannot simply share this of object when use std::thread as member variables to execute a member method.
Is there any good practice to implement QThread like classes using std::thread?
I'm aiming at an inheritable Thread class, enables start(), detach(), stop() functionality.
For example I wrote one like the following:
#include <atomic>
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <vector>
struct Thread {
Thread(int id) {
std::atomic_init(&(this->id), id);
std::atomic_init(&(this->m_stop), false);
}
Thread(Thread &&rhs) :
id(),
m_stop(),
m_thread(std::move(rhs.m_thread))
{
std::atomic_init(&(this->id), rhs.id.load());
rhs.id.store(-1);
std::atomic_init(&(this->m_stop), rhs.m_stop.load());
}
virtual ~Thread() {
this->stop();
}
void start() {
this->m_thread = std::move(std::thread(&Thread::work, this));
}
void stop() {
this->m_stop.store(true);
if (this->m_thread.joinable()) {
this->m_thread.join();
}
}
virtual void work() {
while (!(this->m_stop)) {
std::chrono::milliseconds ts(5000);
std::this_thread::sleep_for(ts);
}
}
std::atomic_int id;
std::atomic_bool m_stop;
std::thread m_thread;
};
int main() {
srand(42);
while (true) {
std::vector<Thread> v;
for (int i = 0; i < 10; ++i) {
auto t = Thread(i);
v.push_back(std::move(t));
printf("Start %d\n", i);
v[i].start();
}
printf("Start fin!\n");
int time_sleep = rand() % 2000 + 1000;
std::chrono::milliseconds ts(time_sleep);
std::this_thread::sleep_for(ts);
for (int i = 0; i < 10; ++i) {
printf("Stop %d\n", i);
v[i].stop();
printf("Pop %d\n", i);
v.pop_back();
}
printf("Stop fin!\n");
}
return 0;
}
But I'm having a hard time getting it right, just after Stop 0 it deadlocked, or sometimes it core dumps.
std::thread is intended as the lowest building block for implementing a threading primitive. As such, it does not offer as rich an interface as QThread, but together with the synchronization primitives from the standard library, it allows you to implement more complex behavior like what's offered by QThread very easily.
You correctly noted that inheriting from std::thread is a bad idea (the absence of a virtual destructor is a dead giveaway) and I would argue that having a polymorphic Thread type is not the smartest design in the first place, but you can easily encapsulate a std::thread as a member of any class (polymorphic or not) if you want to.
Joining on destruction is really just a policy. A class encapsulating a std::thread as a member can simply call join in its destructor, effectively implementing self-joining on destruction. Concurrency is of no concern here, as object destruction is always executed non-concurrently, by definition. If you want to share ownership of a thread between multiple (possible concurrently invoked) execution paths, std::shared_ptr will handle that for you. But even here, the destructor is always executed non-concurrently by the last remaining thread that gives up its shared_ptr.
Stuff like QThread::isInterruptionRequested can be implemented with a single flag, access to which of course must be synchronized by the class (either with a mutex or by using an atomic flag).
Changing a thread's priority is not specified by the standard, as not all platforms envisioned by the standard would allow this, but you can use native_handle to implement this yourself using platform-specific code.
And so on. All the pieces are there, you just have to assemble them according to your needs.
Consider the following simplified program modelling a real scenario where different users can make concurrent requests to the same resource:
#include <thread>
#include <memory>
#include <mutex>
#include <iostream>
using namespace std;
struct T {
void op() { /* some stuff */ }
~T() noexcept { /* some stuff */ }
};
std::shared_ptr<T> t;
std::mutex mtx;
std::weak_ptr<T> w{t};
enum action { destroy, op};
void request(action a) {
if (a == action::destroy) {
lock_guard<mutex> lk{mtx};
t.reset();
std::cout << "*t certainly destroyed\n";
} else if (a == action::op) {
lock_guard<mutex> lk{mtx};
if (auto l = w.lock()) {
l->op();
}
}
}
int main() {
// At some point in time and different points in the program,
// two different users make two different concurrent requests
std::thread th1{request, destroy}; std::thread th2{request, op};
// ....
th2.join();
th1.join();
}
I am not asking if the program is formally correct - I think it is, but I have never seen this approach for guaranteeing a synchronous destruction of a resource shared via smart pointers. I personally think it is fine and has a valid use.
However, I am wondering if others think the same and, in case, if there are more elegant alternatives apart from the classic synchronization with unique_locks and condition variables and from introducing modifications (e.g. atomic flags) to T.
It would be ideal if I could even get rid of the mtx somehow.
Yes, it's fine. The reference counting in the shared_ptr is atomic and the locked copy stays in scope for the duration of the op, so the object can't be destroyed during the op.
In this case the mutex is not actually protecting the lifetime of T, but sequencing calls to op() and destruction. If you don't mind multiple concurrent calls to op(), or the destruction time being indeterminate (i.e. after the last running op() has completed) then you can do away with it, since std::shared_ptr<>::reset() and std::weak_ptr<>::lock() are both thread-safe.
However, I would advise caution as the author clearly meant for calls to op() to be serialised.
I know there are a lot of similar questions with answers around, but since I still don't understand this particular case, I decided to pose a question.
What I have is a map of shared_ptrs to a dynamically allocated array (MyVector). What I want is limited concurrent access without the need to lock. I know that the map per se is not thread safe, but I always thought what I'm doing here should be ok, which is:
I fill the map in a single threaded environment like that:
typedef shared_ptr<MyVector<float>> MyVectorPtr;
for (int i = 0; i < numElements; i++)
{
content[i] = MyVectorPtr(new MyVector<float>(numRows));
}
After the initialization, I have one thread that reads from the elements and one that replaces what the shared_ptrs point to.
Thread 1:
for(auto i=content.begin();i!=content.end();i++)
{
MyVectorPtr p(i->second);
if (p)
{
memory_use+=sizeof(int) + sizeof(float) * p->number;
}
}
Thread 2:
for (auto itr=content.begin();content.end()!=itr;++itr)
{
itr->second.reset(new MyVector<float>(numRows));
}
After a while I get either a seg fault or a double free in one of the two threads. Somehow not really surprisingly, but still I don't really get it.
The reasons why I thought this would work, are:
I don't add or remove any items of the map in the multi-threaded
environment, so the iterators should always point to something valid.
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic. SO Question
Obviously, either one ore more of my assumptions are wrong, or I'm not doing what I think I am. I think that reset actually is not thread safe, would std::atomic_exchange help?
Can someone release me? Thanks a lot!
If someone wants to try out, here is the full code example:
#include <stdio.h>
#include <iostream>
#include <string>
#include <map>
#include <unistd.h>
#include <pthread.h>
using namespace std;
template<class T>
class MyVector
{
public:
MyVector(int length)
: number(length)
, array(new T[length])
{
}
~MyVector()
{
if (array != NULL)
{
delete[] array;
}
array = NULL;
}
int number;
private:
T* array;
};
typedef shared_ptr<MyVector<float>> MyVectorPtr;
static map<int,MyVectorPtr> content;
const int numRows = 1000;
const int numElements = 10;
//pthread_mutex_t write_lock;
double get_cache_size_in_megabyte()
{
double memory_use=0;
//BlockingLockGuard guard(write_lock);
for(auto i=content.begin();i!=content.end();i++)
{
MyVectorPtr p(i->second);
if (p)
{
memory_use+=sizeof(int) + sizeof(float) * p->number;
}
}
return memory_use/(1024.0*1024.0);
}
void* write_content(void*)
{
while(true)
{
//BlockingLockGuard guard(write_lock);
for (auto itr=content.begin();content.end()!=itr;++itr)
{
itr->second.reset(new MyVector<float>(numRows));
cout << "one new written" <<endl;
}
}
return NULL;
}
void* loop_size_checker(void*)
{
while (true)
{
cout << get_cache_size_in_megabyte() << endl;;
}
return NULL;
}
int main(int argc, const char* argv[])
{
for (int i = 0; i < numElements; i++)
{
content[i] = MyVectorPtr(new MyVector<float>(numRows));
}
pthread_attr_t attr;
pthread_attr_init(&attr) ;
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
pthread_t *grid_proc3 = new pthread_t;
pthread_create(grid_proc3, &attr, &loop_size_checker,NULL);
pthread_t *grid_proc = new pthread_t;
pthread_create(grid_proc, &attr, &write_content,(void*)NULL);
// to keep alive and avoid content being deleted
sleep(10000);
}
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
Changing the element in a map is not atomic unless you have a atomic type like std::atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic.
That is correct. Unfortunately you are also changing the underlying pointer. That pointer is not atomic. Since it is not atomic you need synchronization.
One thing you can do though is use the atomic free functions that are introduced with std::shared_ptr. This will let you avoid having to use a mutex.
Lets expand MyVectorPtr p(i->second); which is running on thread-1:
The constructor called for this is:
template< class Y >
shared_ptr( const shared_ptr<Y>& r ) = default;
Which probably boils down to 2 assignments of the underlying shared pointer and the reference count.
It may very well happen that thread 2 would delete the shared pointer while in thread-1 the pointer is being assigned to p. The underlying pointer stored inside shared_ptr is not atomic.
Thus, you usage of std::shared_ptr is not thread safe. It is thread safe as long as you do not update or modify the underlying pointer.
TL;DR;
Changing std::map isn't thread safe, while using std::shared_ptr regarding additional references is.
You should protect accessing your map regarding read/write operations using an appropriate synchronization mechanism, like e.g. a std::mutex.
Also if the state of an instance referenced by the std::shared_ptr should change, it needs to be protected against data races if it's accessed from concurrent threads.
BTW, the MyVector you are showing is a way too naive implementation.