Woes with std::shared_ptr<T>::use_count() - c++

https://en.cppreference.com/w/cpp/memory/shared_ptr/use_count states:
In multithreaded environment, the value returned by use_count is approximate (typical implementations use a memory_order_relaxed load)
But does this mean that use_count() is totally useless in a multi-threaded environment?
Consider the following example, where the Circular class implements a circular buffer of std::shared_ptr<int>.
One method is supplied to users - get(), which checks whether the reference count of the next element in the std::array<std::shared_ptr<int>> is greater than 1 (which we don't want, since it would mean the element is still being held by a user who previously called get()).
If it's <= 1, a copy of the std::shared_ptr<int> is returned to the user.
In this case, the users are two threads which do nothing at all except love to call get() on the circular buffer - that's their purpose in life.
What happens in practice when I execute the program is that it runs for a few cycles (verified by adding a counter to the circular buffer class), after which it throws the exception, complaining that the reference count of the next element is > 1.
Is this a result of the statement that the value returned by use_count() is approximate in a multi-threaded environment?
Is it possible to adjust the underlying mechanism to make it deterministic and behave as I would have liked?
If my thinking is correct, use_count() (or rather the real number of users) of the next element should never increase above 1 while inside Circular::get(): there are only two consumers, and every time a thread calls get(), it has already released its old (copied) std::shared_ptr<int>, which means the remaining std::shared_ptr<int> residing in Circular::ints_ should have a reference count of exactly 1.
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <stdexcept>  // std::logic_error
#include <string>     // std::string, std::to_string
#include <thread>

class Circular {
public:
    Circular() {
        for (auto& i : ints_) { i = std::make_shared<int>(0); }
    }

    std::shared_ptr<int> get() {
        std::lock_guard<std::mutex> lock_guard(guard_);
        index_ = index_ % 2; // Wrap the index back around.
        if (ints_.at(index_).use_count() > 1) {
            // This shouldn't happen - right? (but it does)
            std::string excp = std::string("OOPSIE: ") + std::to_string(index_) + " " +
                               std::to_string(ints_.at(index_).use_count());
            throw std::logic_error(excp);
        }
        return ints_.at(index_++);
    }

private:
    std::mutex guard_;
    unsigned int index_{0};
    std::array<std::shared_ptr<int>, 2> ints_;
};

Circular circ;

void func() {
    do {
        auto scoped_shared_int_pointer{circ.get()};
    } while (1);
}

int main() {
    std::thread t1(func), t2(func);
    t1.join(); t2.join();
}

While use_count is fraught with problems, the core issue right now is outside of that logic.
Assume thread t1 takes the shared_ptr at index 0, and then t2 runs its loop twice before t1 finishes its first loop iteration. t2 will obtain the shared_ptr at index 1, release it, and then attempt to acquire the shared_ptr at index 0, and will hit your failure condition, since t1 is just running behind.
Now, that said, in a broader context this isn't particularly safe: if a user creates a weak_ptr from the returned pointer, the use_count can go from 1 to 2 without ever passing through this function. In this simple example, it would work to loop through the index array until a free shared pointer is found, as in the sketch below.
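For illustration, here is a minimal sketch of that idea - a hypothetical replacement for Circular::get() that scans for a slot whose only owner is the buffer itself (it keeps the debatable use_count() check, so the weak_ptr caveat above still applies):
std::shared_ptr<int> get() {
    std::lock_guard<std::mutex> lock_guard(guard_);
    for (auto& i : ints_) {
        if (i.use_count() == 1) {  // only the buffer itself holds this element
            return i;              // hand out a second reference
        }
    }
    throw std::logic_error("OOPSIE: no free element");
}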

use_count is for debugging only and shouldn't be used. If you want to know when nobody else has a reference to a pointer any more just let the shared pointer die and use a custom deleter to detect that and do whatever you need to do with the now unused pointer.
This is an example of how you might implement this in your code:
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <stdexcept>  // std::logic_error
#include <thread>
#include <vector>
#include <iostream>

class Circular {
public:
    Circular() {
        size_t index = 0;
        for (auto& i : ints_)
        {
            i = 0;
            unused_.push_back(index++);
        }
    }

    std::shared_ptr<int> get() {
        std::lock_guard<std::mutex> lock_guard(guard_);
        if (unused_.empty())
        {
            throw std::logic_error("OOPSIE: none left");
        }
        size_t index = unused_.back();
        unused_.pop_back();
        // The custom deleter does not free anything; it just marks the slot as unused again.
        return std::shared_ptr<int>(&ints_[index], [this, index](int*) {
            std::lock_guard<std::mutex> lock_guard(guard_);
            unused_.push_back(index);
        });
    }

private:
    std::mutex guard_;
    std::vector<size_t> unused_;
    std::array<int, 2> ints_;
};

Circular circ;

void func() {
    do {
        auto scoped_shared_int_pointer{ circ.get() };
    } while (1);
}

int main() {
    std::thread t1(func), t2(func);
    t1.join(); t2.join();
}
A list of unused indexes is kept; when the shared pointer is destroyed, the custom deleter returns its index to the list of unused indexes, ready to be handed out by the next call to get().

Related

How to properly assign a member value in a std::thread and read it from another?

I have an issue with multithreading where I probably lack understanding. I want to assign a member variable of an object inside a thread and then print the variable from another thread (or main in this case). It seems like I am doing something wrong, like there are two objects.
Here is a quick example:
thread_test.hpp
#pragma once
#include <iostream>
#include <chrono>
#include <thread>

class thread_test
{
public:
    thread_test() :
        test_variable{}
    {}

    int get_test_variable() const;
    void updater_method();

private:
    /* private data */
    int test_variable;
};
thread_test.cpp
#include "thread_test.hpp"

void thread_test::updater_method()
{
    int i{};
    while (true)
    {
        test_variable = ++i;
        std::this_thread::sleep_for(std::chrono::microseconds{1800});
    }
}

int thread_test::get_test_variable() const
{
    return test_variable;
}
main.cpp
#include <iostream>
#include <thread>
#include <unistd.h>
#include "thread_test.hpp"

int main()
{
    thread_test obj{};
    std::thread updater_thread(&thread_test::updater_method, obj);
    while (true)
    {
        std::cout << obj.get_test_variable() << std::endl;
        sleep(1);
    }
    return 0;
}
Output:
❯ ./main
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
I don't understand why the variable stays at the initialized value, because I am assigning it in the thread for the test object. Is there a better way to start the thread or should I start the thread from the class itself? What is the right way to do this?
std::thread stores the arguments it passes to the thread function by value (the callable itself as well, which matters when it is a functor, as supported here) - i.e. it creates copies of them, and that applies to your test object obj too. The result is that the main thread and the worker thread each operate on their own, distinct instance of your thread_test class, so the main thread never sees the updates.
To avoid this, you can either use a std::reference_wrapper or a pointer:
std::thread updater_thread(&thread_test::updater_method, std::ref(obj));
// or alternatively:
std::thread updater_thread(&thread_test::updater_method, &obj);
This way you only operate on one single instance. Yet another alternative (thanks, #chi, for the hint) is using a lambda:
std::thread updater_thread([&obj]() { obj.updater_method(); });
//                          ^^^^^ the explicit capture is optional; a plain default capture [&] works too
Note, too, that updating and reading a variable is in general not atomic, so you can run into race conditions. You can solve that either with std::atomic (wrapping the entire object or just individual members, whichever suits you better), or by protecting access to the variables with a std::mutex, or possibly a std::shared_mutex if you want to allow simultaneous reads while writes get exclusive access.
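For illustration, a minimal sketch combining both points - std::ref to avoid the copy plus an atomic member to avoid the data race (a hypothetical rewrite of the example, not the only way to do it):
// thread_test.hpp (sketch)
#pragma once
#include <atomic>
#include <chrono>
#include <thread>

class thread_test
{
public:
    thread_test() : test_variable{0} {}
    int get_test_variable() const { return test_variable.load(); }
    void updater_method()
    {
        int i{};
        while (true)
        {
            test_variable.store(++i);  // atomic write, visible to other threads
            std::this_thread::sleep_for(std::chrono::microseconds{1800});
        }
    }
private:
    std::atomic<int> test_variable;
};

// main.cpp (sketch)
#include <functional>  // std::ref
#include <iostream>
#include <thread>
#include "thread_test.hpp"

int main()
{
    thread_test obj{};
    // std::ref makes the thread operate on this very object instead of a copy.
    std::thread updater_thread(&thread_test::updater_method, std::ref(obj));
    while (true)
    {
        std::cout << obj.get_test_variable() << std::endl;  // now sees the updates
        std::this_thread::sleep_for(std::chrono::seconds{1});
    }
}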

Why does std::shared_ptr call my destructor twice?

In this program, why is the destructor on line 14 called twice for the same instance of mystruct_t?
I'm assuming that all pointer manipulation in this program is thread safe. I think the atomic updates do not work on my system or compiler.
I tried this on MSVC 2017, MSVC 2019 and on clang
/* This crashes for me (line 19) */
#include <iostream>
#include <vector>
#include <thread>
#include <memory>
#include <chrono>
#include <assert.h>

struct mystruct_t {
    int32_t nInvocation = 0;
    ~mystruct_t();
    mystruct_t() = default;
};

mystruct_t::~mystruct_t() {
    nInvocation++;
    int nInvoke = nInvocation;
    if (nInvoke > 1) {
        /* destructor was invoked twice */
        assert(0);
    }
    /* sleep is not necessary for crash */
    //std::this_thread::sleep_for(std::chrono::microseconds(525));
}

std::shared_ptr<mystruct_t> globalPtr;

void thread1() {
    for (;;) {
        std::this_thread::sleep_for(std::chrono::microseconds(1000));
        std::shared_ptr<mystruct_t> ptrNewInstance = std::make_shared<mystruct_t>();
        globalPtr = ptrNewInstance;
    }
}

void thread2() {
    for (;;) {
        std::shared_ptr<mystruct_t> pointerCopy = globalPtr;
    }
}

int main()
{
    std::thread t1;
    t1 = std::thread([]() {
        thread1();
    });
    std::thread t2;
    t2 = std::thread([]() {
        thread2();
    });
    for (int i = 0;; ++i) {
        std::this_thread::sleep_for(std::chrono::microseconds(1000));
        std::shared_ptr<mystruct_t> pointerCopy = globalPtr;
        globalPtr = nullptr;
    }
    return 0;
}
As several users here already mentioned, you're running into undefined behavior: one thread reassigns the globally shared object while another thread is copy-constructing or copy-assigning from it. A drawback of shared_ptr, especially for newcomers, is the rather hidden danger in the assumption that copying one is always thread safe. Since you don't pass references around, this can be even harder to spot. Think of shared_ptr first of all as a regular class with an ordinary ('trivial') member-wise copy assignment, where interference is always possible in unprotected threading environments.
If you run into similar situations with this or similar code in the future, consider a robust channel/event-based broadcasting scheme instead of locally placed locks. The channels (buffered or single-value) then take care of the proper data lifetime themselves, keeping the shared_ptr's underlying data alive for as long as it is needed.
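For illustration (not the channel-based design suggested above), a minimal sketch of one conventional fix: route every access to globalPtr through the std::atomic_... free-function overloads for shared_ptr, available since C++11 (deprecated in C++20 in favour of std::atomic<std::shared_ptr<T>>):
void thread1() {
    for (;;) {
        std::this_thread::sleep_for(std::chrono::microseconds(1000));
        std::atomic_store(&globalPtr, std::make_shared<mystruct_t>());  // atomic publish
    }
}
void thread2() {
    for (;;) {
        std::shared_ptr<mystruct_t> pointerCopy = std::atomic_load(&globalPtr);  // atomic read
    }
}
// ... and in main's loop:
std::shared_ptr<mystruct_t> pointerCopy = std::atomic_load(&globalPtr);
std::atomic_store(&globalPtr, std::shared_ptr<mystruct_t>());  // atomic equivalent of "= nullptr"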

C++ thread safe class, not working as expected

I am trying to implement a thread-safe class. I use a lock_guard in the setter and getter for each member variable.
#include <iostream>
#include <omp.h>
#include <mutex>
#include <vector>
#include <map>
#include <string>

class B {
    std::mutex m_mutex;
    std::vector<int> m_vec;
    std::map<int, std::string> m_map;
public:
    void insertInVec(int x) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_vec.push_back(x);
    }
    void insertInMap(int x, const std::string& s) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_map[x] = s;
    }
    const std::string& getValAtKey(int k) {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_map[k];
    }
    int getValAtIdx(int i) {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_vec[i];
    }
};

int main() {
    B b;
    #pragma omp parallel for num_threads(4)
    for (int i = 0; i < 100000; i++) {
        b.insertInVec(i);
        b.insertInMap(i, std::to_string(i));
    }
    std::cout << b.getValAtKey(20) << std::endl;
    std::cout << b.getValAtIdx(20) << std::endl;
    return 0;
}
When I run this code, the output from the map is correct but the output from the vector is garbage. I get output such as:
20
50008
Of course the second output changes on each run.
1. What is wrong with this code? (I also have to consider the scenario where there are multiple instances of class B used from multiple threads.)
2. Do I need a separate mutex for each member variable? For example:
std::mutex vec_mutex;
std::mutex map_mutex;
I don't understand why you think that the output is garbage. Your loop is executed by 4 threads, so one possible split of the work could be:
Thread 1: 0 <= i < 25000
Thread 2: 25000 <= i < 50000
Thread 3: 50000 <= i < 75000
Thread 4: 75000 <= i < 100000
Each thread does a push_back of its i values to the vector. If thread 1 starts and writes 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, then thread 2 writes 25000, 25001, and then thread 3 writes 50000, 50001, 50002, 50003, 50004, 50005, 50006, 50007, 50008, you end up with the value 50008 at index 20. Of course other thread interleavings are possible as well, and you might also see values like 25003 or 75004.
The output you see is fine, only your expectations are off.
You add elements into the vector via:
void insertInVec(int x) {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_vec.push_back(x);
}
and then retrieve them via:
int getValAtIdx(int i) {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_vec[i];
}
Because the loop is executed in parallel, there is no guarantee that the values are inserted in the order you expect. Whichever thread grabs the mutex first inserts its values first. If you wanted to insert the values in a specific order, you would need to resize the vector upfront and then use something along the lines of:
void setInVecAtIndex(int x, size_t index) {
    std::lock_guard<std::mutex> lock(m_mutex); // maybe not needed, see PS
    m_vec[index] = x;
}
So this isn't the problem with your code. However, I do see two problems:
getValAtKey returns a reference to the value in the map. It is a const reference, but that does not prevent somebody else from modifying the value via a call to insertInMap. Returning a reference here defeats the purpose of the lock; using that reference is not thread safe. To make it thread safe you need to return a copy.
You forgot to protect the compiler-generated methods. For an overview, see What are all the member-functions created by compiler for a class? Does that happen all the time?. The compiler-generated methods will not use your getters and setters and hence are not thread-safe by default. You should either define them yourself or delete them (see also the rule of 3/5). A sketch of both fixes follows after the PS below.
PS: Accessing different elements in a vector from different threads needs no synchronization. As long as you do not resize the vector and only access different elements you do not need the mutex if you can ensure that no two threads access the same index.
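For illustration, a minimal sketch of both fixes (a hypothetical rewrite of the relevant parts of B; the insert functions and getValAtIdx stay as they were):
#include <map>
#include <mutex>
#include <string>
#include <vector>

class B {
    std::mutex m_mutex;
    std::vector<int> m_vec;
    std::map<int, std::string> m_map;
public:
    B() = default;
    B(const B&) = delete;             // compiler-generated copy would bypass the mutex
    B& operator=(const B&) = delete;  // same for copy assignment

    std::string getValAtKey(int k) {  // return a copy instead of a reference
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_map[k];
    }

    // insertInVec, insertInMap and getValAtIdx unchanged ...
};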

Synchronous destruction through std::shared_ptr<T>::reset()

Consider the following simplified program modelling a real scenario where different users can make concurrent requests to the same resource:
#include <thread>
#include <memory>
#include <mutex>
#include <iostream>
using namespace std;

struct T {
    void op() { /* some stuff */ }
    ~T() noexcept { /* some stuff */ }
};

std::shared_ptr<T> t = std::make_shared<T>(); // the shared resource (initialised so w below is non-empty)
std::mutex mtx;
std::weak_ptr<T> w{t};

enum action { destroy, op };

void request(action a) {
    if (a == action::destroy) {
        lock_guard<mutex> lk{mtx};
        t.reset();
        std::cout << "*t certainly destroyed\n";
    } else if (a == action::op) {
        lock_guard<mutex> lk{mtx};
        if (auto l = w.lock()) {
            l->op();
        }
    }
}

int main() {
    // At some point in time and different points in the program,
    // two different users make two different concurrent requests
    std::thread th1{request, destroy};
    std::thread th2{request, op};
    // ....
    th2.join();
    th1.join();
}
I am not asking if the program is formally correct - I think it is, but I have never seen this approach for guaranteeing a synchronous destruction of a resource shared via smart pointers. I personally think it is fine and has a valid use.
However, I am wondering if others think the same and, in case, if there are more elegant alternatives apart from the classic synchronization with unique_locks and condition variables and from introducing modifications (e.g. atomic flags) to T.
It would be ideal if I could even get rid of the mtx somehow.
Yes, it's fine. The reference counting in the shared_ptr is atomic and the locked copy stays in scope for the duration of the op, so the object can't be destroyed during the op.
In this case the mutex is not actually protecting the lifetime of T, but sequencing calls to op() and destruction. If you don't mind multiple concurrent calls to op(), or the destruction time being indeterminate (i.e. after the last running op() has completed) then you can do away with it, since std::shared_ptr<>::reset() and std::weak_ptr<>::lock() are both thread-safe.
However, I would advise caution as the author clearly meant for calls to op() to be serialised.
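To make that trade-off concrete, here is a minimal sketch of the lock-free variant (a hypothetical rewrite of request(), acceptable only if concurrent op() calls and an indeterminate destruction point are fine):
void request(action a) {
    if (a == action::destroy) {
        t.reset();                 // thread-safe; the object dies once the last op() copy is gone
        std::cout << "*t will be destroyed as soon as no op() is using it\n";
    } else if (a == action::op) {
        if (auto l = w.lock()) {   // thread-safe; keeps the object alive for the call
            l->op();
        }
    }
}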

Thread safety in std::map of std::shared_ptr

I know there are a lot of similar questions with answers around, but since I still don't understand this particular case, I decided to pose a question.
What I have is a map of shared_ptrs to a dynamically allocated array (MyVector). What I want is limited concurrent access without the need to lock. I know that the map per se is not thread safe, but I always thought what I'm doing here should be ok, which is:
I fill the map in a single threaded environment like that:
typedef shared_ptr<MyVector<float>> MyVectorPtr;
for (int i = 0; i < numElements; i++)
{
    content[i] = MyVectorPtr(new MyVector<float>(numRows));
}
After the initialization, I have one thread that reads from the elements and one that replaces what the shared_ptrs point to.
Thread 1:
for (auto i = content.begin(); i != content.end(); i++)
{
    MyVectorPtr p(i->second);
    if (p)
    {
        memory_use += sizeof(int) + sizeof(float) * p->number;
    }
}
Thread 2:
for (auto itr = content.begin(); content.end() != itr; ++itr)
{
    itr->second.reset(new MyVector<float>(numRows));
}
After a while I get either a seg fault or a double free in one of the two threads. Somehow not really surprisingly, but still I don't really get it.
The reasons why I thought this would work, are:
I don't add or remove any items of the map in the multi-threaded environment, so the iterators should always point to something valid.
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic. SO Question
Obviously, either one or more of my assumptions are wrong, or I'm not doing what I think I am. I suspect that reset is actually not thread safe here; would std::atomic_exchange help?
Can someone release me? Thanks a lot!
If someone wants to try it out, here is the full code example:
#include <stdio.h>
#include <iostream>
#include <string>
#include <map>
#include <memory>
#include <unistd.h>
#include <pthread.h>
using namespace std;

template<class T>
class MyVector
{
public:
    MyVector(int length)
        : number(length)
        , array(new T[length])
    {
    }

    ~MyVector()
    {
        if (array != NULL)
        {
            delete[] array;
        }
        array = NULL;
    }

    int number;

private:
    T* array;
};

typedef shared_ptr<MyVector<float>> MyVectorPtr;

static map<int, MyVectorPtr> content;
const int numRows = 1000;
const int numElements = 10;
//pthread_mutex_t write_lock;

double get_cache_size_in_megabyte()
{
    double memory_use = 0;
    //BlockingLockGuard guard(write_lock);
    for (auto i = content.begin(); i != content.end(); i++)
    {
        MyVectorPtr p(i->second);
        if (p)
        {
            memory_use += sizeof(int) + sizeof(float) * p->number;
        }
    }
    return memory_use / (1024.0 * 1024.0);
}

void* write_content(void*)
{
    while (true)
    {
        //BlockingLockGuard guard(write_lock);
        for (auto itr = content.begin(); content.end() != itr; ++itr)
        {
            itr->second.reset(new MyVector<float>(numRows));
            cout << "one new written" << endl;
        }
    }
    return NULL;
}

void* loop_size_checker(void*)
{
    while (true)
    {
        cout << get_cache_size_in_megabyte() << endl;
    }
    return NULL;
}

int main(int argc, const char* argv[])
{
    for (int i = 0; i < numElements; i++)
    {
        content[i] = MyVectorPtr(new MyVector<float>(numRows));
    }

    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    pthread_t* grid_proc3 = new pthread_t;
    pthread_create(grid_proc3, &attr, &loop_size_checker, NULL);
    pthread_t* grid_proc = new pthread_t;
    pthread_create(grid_proc, &attr, &write_content, (void*)NULL);

    // to keep alive and avoid content being deleted
    sleep(10000);
}
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
Changing an element in a map is not atomic unless the element itself is an atomic type like std::atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic.
That is correct. Unfortunately you are also changing the underlying pointer. That pointer is not atomic. Since it is not atomic you need synchronization.
One thing you can do though is use the atomic free functions that are introduced with std::shared_ptr. This will let you avoid having to use a mutex.
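For illustration, a minimal sketch of that suggestion applied to the two loops of this example (std::atomic_load/std::atomic_store for shared_ptr exist since C++11; in C++20 they are deprecated in favour of std::atomic<std::shared_ptr<T>>):
// Reader thread:
for (auto i = content.begin(); i != content.end(); i++)
{
    MyVectorPtr p = std::atomic_load(&i->second);  // atomic copy of the element
    if (p)
    {
        memory_use += sizeof(int) + sizeof(float) * p->number;
    }
}

// Writer thread:
for (auto itr = content.begin(); content.end() != itr; ++itr)
{
    std::atomic_store(&itr->second, MyVectorPtr(new MyVector<float>(numRows)));
}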
Let's expand MyVectorPtr p(i->second);, which runs on thread 1.
The constructor called for this is the copy constructor:
shared_ptr( const shared_ptr& r ) noexcept;
which boils down to copying two things: the underlying raw pointer and the pointer to the control block that holds the reference count.
It may very well happen that thread 2 resets the shared pointer (destroying the old object) while thread 1 is in the middle of copying it into p. The underlying pointer stored inside shared_ptr is not atomic.
Thus, your usage of std::shared_ptr is not thread safe. It is only thread safe as long as you do not update or modify the underlying pointer.
TL;DR;
Modifying a std::map (or an element stored in it) isn't thread safe, while taking additional std::shared_ptr references to the same object is.
You should protect read/write access to your map with an appropriate synchronization mechanism, e.g. a std::mutex.
Also, if the state of an instance referenced by the std::shared_ptr can change, it needs to be protected against data races when it's accessed from concurrent threads.
BTW, the MyVector you are showing is a far too naive implementation; see the sketch below for a simpler alternative.
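For illustration, a minimal sketch of what that remark alludes to (a hypothetical replacement, assuming the element count and the raw data are all that is needed): letting std::vector own the storage removes the hand-written destructor and the rule-of-three concerns entirely.
#include <vector>

template <class T>
class MyVector
{
public:
    explicit MyVector(int length) : number(length), array(length) {}
    int number;
private:
    std::vector<T> array;  // owns and releases the storage automatically
};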