I know that need mutex when try to delete an element from a vector.
so, I wrote a sample code to check this.
class Test
{
public:
Test(int idx) : m_index(idx) {}
int m_index = { -1 };
int m_count = { 0 };
};
std::vector<std::unique_ptr<Test>> m_vec;
std::mutex m_mutex;
void foo1() // print element data
{
while (true)
{
std::unique_lock ulock(m_mutex);
for (auto& e : m_vec)
{
e->m_count++;
printf("%d : Count : %d\n", e->m_index, e->m_count);
}
ulock.unlock();
std::this_thread::sleep_for(std::chrono::milliseconds(5));
}
}
void foo2() // Only insert element
{
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dis(0, 99);
while (true)
{
int t = dis(gen);
if (t >= 0 && t < 10)
{
//std::unique_lock ulock(m_mutex);
m_vec.push_back(std::make_unique<Test>(m_vec.size()));
}
std::this_thread::sleep_for(std::chrono::milliseconds(t));
}
}
void foo3() // Only remove element
{
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dis(0, 99);
while (true)
{
int t = dis(gen);
if (t >= 0 && t < 10)
{
std::unique_lock ulock(m_mutex);
if (m_vec.empty() == false)
m_vec.erase(m_vec.begin());
}
std::this_thread::sleep_for(std::chrono::milliseconds(t));
}
}
int main()
{
m_vec.push_back(std::make_unique<Test>(1));
m_vec.push_back(std::make_unique<Test>(2));
m_vec.push_back(std::make_unique<Test>(3));
std::thread t1(foo1);
std::thread t2(foo2);
std::thread t3(foo3);
t1.join();
t2.join();
t3.join();
return 0;
}
If I proceed with erase() without using mutex, segment fault almost immediately occurred.
So I used mutex for the erase() routine, which seemed to work normally.
About 10 minutes later, however, a nullptr exception occurred when referring to e in the foo1() function.
Q1. push_back inserts data at the end. But why does the NULL error occur at the middle point? ex. vector size: 521, error index: 129)
Q2. When using ordered containers such as vector and deque, do I need mutex in insert functions?
Q3. What about Unordered containers? (like, unorderd_map)
(Remove mutex from insert and operate without any problem for about 20 minutes.)
foo2() is accessing/modifying the vector outside of the mutex lock. As such, foo1() and/or foo3() (which do use the mutex) are able to modify the vector at the same time as foo2(). That is undefined behavior.
When push_back inserts data at the end, why does the middle point (ex. vector size: 521, error index: 129) Null point error occur?
Pushing a new element into a vector may require it to reallocate its internal array, thus moving all of the existing elements to a new memory block. You are doing that in foo2() without the protection of the mutex lock, so it is possible that the elements which foo1() and foo3() are accessing may disappear behind their backs unexpectedly.
Erasing elements from a vector will not reallocate the internal array, but it may still shift elements around within the array's existing memory.
When using ordered containers such as vector and deque, do I need mutex in insert and delete functions?
Yes. All standard containers are not thread-safe, so modifications must be serialized when used across thread boundaries.
What about Unordered containers? (like, unorderd_map) (Remove mutex from insert and operate without any problem for about 20 minutes.)
Same thing.
Related
https://en.cppreference.com/w/cpp/memory/shared_ptr/use_count states:
In multithreaded environment, the value returned by use_count is approximate (typical implementations use a memory_order_relaxed load)
But does this mean that use_count() is totally useless in a multi-threaded environment?
Consider the following example, where the Circular class implements a circular buffer of std::shared_ptr<int>.
One method is supplied to users - get(), which checks whether the reference count of the next element in the std::array<std::shared_ptr<int>> is greater than 1 (which we don't want, since it means that it's being held by a user which previously called get()).
If it's <= 1, a copy of the std::shared_ptr<int> is returned to the user.
In this case, the users are two threads which do nothing at all except love to call get() on the circular buffer - that's their purpose in life.
What happens in practice when I execute the program is that it runs for a few cycles (tested by adding a counter to the circular buffer class), after which it throws the exception, complaining that the reference counter for the next element is > 1.
Is this a result of the statement that the value returned by use_count() is approximate in a multi-threaded environment?
Is it possible to adjust the underlying mechanism to make it, uh, deterministic and behave as I would have liked it to behave?
If my thinking is correct - use_count() (or rather the real number of users) of the next element should never EVER increase above 1 when inside the get() function of Circular, since there are only two consumers, and every time a thread calls get(), it's already released its old (copied) std::shared_ptr<int> (which in turn means that the remaining std::shared_ptr<int> residing in Circular::ints_ should have a reference count of only 1).
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <thread>
class Circular {
public:
Circular() {
for (auto& i : ints_) { i = std::make_shared<int>(0); }
}
std::shared_ptr<int> get() {
std::lock_guard<std::mutex> lock_guard(guard_);
index_ = index_ % 2; // Re-set the index pointer.
if (ints_.at(index_).use_count() > 1) {
// This shouldn't happen - right? (but it does)
std::string excp = std::string("OOPSIE: ") + std::to_string(index_) + " " + std::to_string(ints_.at(index_).use_count());
throw std::logic_error(excp);
}
return ints_.at(index_++);
}
private:
std::mutex guard_;
unsigned int index_{0};
std::array<std::shared_ptr<int>, 2> ints_;
};
Circular circ;
void func() {
do {
auto scoped_shared_int_pointer{circ.get()};
}while(1);
}
int main() {
std::thread t1(func), t2(func);
t1.join(); t2.join();
}
While use_count is fraught with problems, the core issue right now is outside of that logic.
Assume thread t1 takes the shared_ptr at index 0, and then t2 runs its loop twice before t1 finishes its first loop iteration. t2 will obtain the shared_ptr at index 1, release it, and then attempt to acquire the shared_ptr at index 0, and will hit your failure condition, since t1 is just running behind.
Now, that said, in a broader context, it's not particularly safe, as if a user creates a weak_ptr, it's entirely possible for the use_count to go from 1 to 2 without passing through this function. In this simple example, it would work to have it loop through the index array until it finds the free shared pointer.
use_count is for debugging only and shouldn't be used. If you want to know when nobody else has a reference to a pointer any more just let the shared pointer die and use a custom deleter to detect that and do whatever you need to do with the now unused pointer.
This is an example of how you might implement this in your code:
#include <mutex>
#include <array>
#include <memory>
#include <exception>
#include <thread>
#include <vector>
#include <iostream>
class Circular {
public:
Circular() {
size_t index = 0;
for (auto& i : ints_)
{
i = 0;
unused_.push_back(index++);
}
}
std::shared_ptr<int> get() {
std::lock_guard<std::mutex> lock_guard(guard_);
if (unused_.empty())
{
throw std::logic_error("OOPSIE: none left");
}
size_t index = unused_.back();
unused_.pop_back();
return std::shared_ptr<int>(&ints_[index], [this, index](int*) {
std::lock_guard<std::mutex> lock_guard(guard_);
unused_.push_back(index);
});
}
private:
std::mutex guard_;
std::vector<size_t> unused_;
std::array<int, 2> ints_;
};
Circular circ;
void func() {
do {
auto scoped_shared_int_pointer{ circ.get() };
} while (1);
}
int main() {
std::thread t1(func), t2(func);
t1.join(); t2.join();
}
A list of unused indexes is kept, when the shared pointer is destroyed the custom deleter returns the index back to the list of unused indexes ready to be used in the next call to get.
I am trying to implement a thread-safe class. I put lock_guard for setter and getter for each member variable.
#include <iostream>
#include <omp.h>
#include <mutex>
#include <vector>
#include <map>
#include <string>
class B {
std::mutex m_mutex;
std::vector<int> m_vec;
std::map<int,std::string> m_map;
public:
void insertInVec(int x) {
std::lock_guard<std::mutex> lock(m_mutex);
m_vec.push_back(x);
}
void insertInMap(int x, const std::string& s) {
std::lock_guard<std::mutex> lock(m_mutex);
m_map[x] = s;
}
const std::string& getValAtKey(int k) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_map[k];
}
int getValAtIdx(int i) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_vec[i];
}
};
int main() {
B b;
#pragma omp parallel for num_threads(4)
for (int i = 0; i < 100000; i++) {
b.insertInVec(i);
b.insertInMap(i, std::to_string(i));
}
std::cout << b.getValAtKey(20) << std::endl;
std::cout << b.getValAtIdx(20) << std::endl;
return 0;
}
when I run this code, the output from map is correct but output from vector is garbage. I get o/p as
20
50008
Ofcourse the second o/p changes at each run.
1. what is wrong with this code? (I also have to consider the scenario, where there can be multiple instances of class B, running at multiple threads)
2. For each member variable do I need separate mutex variable? like
std::mutex vec_mutex;
std::mutex map_mutex;
I don't understand why you think that the output is garbage. Your loop is executed in 4 threads, so a possible sharing of the tasks could be:
Thread 1: 0 <= i < 25000
Thread 2: 25000 <= i < 50000
Thread 3: 50000 <= i < 75000
Thread 4: 75000 <= i < 100000
Each thread is doing a push_back of i to the vector. If now thread 1 starts and writes "0, 1, 2, 3, 4, 5, 6, 7, 8, 9" and then thread 1 writes "25000, 25001" and then thread 3 writes "50000, 50001, 50002, 50003, 50004, 50005, 50006, 50007, 50008". So you will end up with value 50008 at index 20. Of course other thread interleaving is also possible and you might also see values like for example 25003 or 75004.
The output you see is fine, only your expectations are off.
You add elements into the vector via:
void insertInVec(int x) {
std::lock_guard<std::mutex> lock(m_mutex);
m_vec.push_back(x);
}
and then retrive them via:
int getValAtIdx(int i) {
std::lock_guard<std::mutex> lock(m_mutex);
return m_vec[i];
}
Because the loop is executed in parallel, there is no guarantee that the values are inserted in the order you expect. Whichever thread first grabs the mutex will insert values first. If you wanted to insert the values in some specified order you would need to resize the vector upfront and then use something along the line of:
void setInVecAtIndex(int x,size_t index) {
std::lock_guard<std::mutex> lock(m_mutex); // maybe not needed, see PS
m_vec[index] = x;
}
So this isn't the problem with your code. However, I see two problems:
getValAtKey returns a reference to the value in the map. It is a const reference, but that does not prevent somebody else to modify the value via a call to insertInMap. Returning a reference here defeats the purpose of using the lock. Using that reference is not thread safe! To make it thread safe you need to return a copy.
You forgot to protect the compiler generated methods. For an overview, see What are all the member-functions created by compiler for a class? Does that happen all the time?. The compiler generated methods will not use your getters and setter, hence are not thread-safe by default. You should either define them yourself or delete them (see also rule of 3/5).
PS: Accessing different elements in a vector from different threads needs no synchronization. As long as you do not resize the vector and only access different elements you do not need the mutex if you can ensure that no two threads access the same index.
I'm pretty new to concurrent programming and I have a specific issue to which I could not find a solution by browsing the internet..
Basically I have this situation (schematic pseudocode):
void fun1(std::vector<std::shared_ptr<SmallObj>>& v) {
for(int i=0; i<v.size(); i++)
.. read and write on *v[i] ..
}
void fun2(std::vector<std::shared_ptr<SmallObj>>& w) {
for(int i=0; i<w.size(); i++)
.. just read on *w[i] ..
}
int main() {
std::vector<std::shared_ptr<SmallObj>> tot;
for(int iter=0; iter<iterMax; iter++) {
for(int nObj=0; nObj<nObjMax; nObj++)
.. create a SmallObj in the heap and store a shared_ptr in tot ..
std::vector<std::shared_ptr<SmallObj>> v, w;
.. copy elements of "tot" in v and w ..
fun1(v);
fun2(w);
}
return 0;
}
What I want to do is operating concurrently spawning two threads to execute fun1 and fun2 but I need to regulate the access to the SmallObjs using some locking mechanism. How can I do it? In the literature I can only find examples of using mutexes to lock the access to a specific object or a portion of code, but not on the same pointed variables by different objects (in this case v and w)..
Thank you very much and sorry for my ignorance on the matter..
I need to regulate the access to the SmallObjs using some locking mechanism. How can I do it?
Use getters and setters for your data members. Use a std::mutex (or a std::recursive_mutex depending on whether recursive locking is needed) data member to guard the accesses, then always lock with a lock guard.
Example (also see the comments in the code):
class SmallObject{
int getID() const{
std::lock_guard<std::mutex> lck(m_mutex);
return ....;
}
void setID(int id){
std::lock_guard<std::mutex> lck(m_mutex);
....;
}
MyType calculate() const{
std::lock_guard<std::mutex> lck(m_mutex);
//HERE is a GOTCHA if `m_mutex` is a `std::mutex`
int k = this->getID(); //Ooopsie... Deadlock
//To do the above, change the decaration of `m_mutex` from
//std::mutex, to std::recursive_mutex
}
private:
..some data
mutable std::mutex m_mutex;
};
The simplest solution is to hold an std::mutex for the whole vector:
#include <mutex>
#include <thread>
#include <vector>
void fun1(std::vector<std::shared_ptr<SmallObj>>& v,std::mutex &mtx) {
for(int i=0; i<v.size(); i++)
//Anything you can do before read/write of *v[i]...
{
std::lock_guard<std::mutex> guard(mtx);
//read-write *v[i]
}
//Anything you can do after read/write of *v[i]...
}
void fun2(std::vector<std::shared_ptr<SmallObj>>& w,std::mutex &mtx) {
for(int i=0; i<w.size(); i++) {
//Anything that can happen before reading *w[i]
{
std::lock_guard<std::mutex> guard(mtx);
//read *w[i]
}
//Anything that can happen after reading *w[i]
}
int main() {
std::mutex mtx;
std::vector<std::shared_ptr<SmallObj>> tot;
for(int iter=0; iter<iterMax; iter++) {
for(int nObj=0; nObj<nObjMax; nObj++)
.. create a SmallObj in the heap and store a shared_ptr in tot ..
std::vector<std::shared_ptr<SmallObj>> v, w;
.. copy elements of "tot" in v and w ..
std::thread t1([&v,&mtx] { fun1(v,mtx); });
std::thread t2([&w,&mtx] { fun2(w,mtx); });
t1.join();
t2.join();
}
return 0;
}
However you will only realistically get any parallelism on the bits done in the before/after blocks in the loop of fun1() and fun2().
You could further increase parallelism by introducing more locks.
For example you can maybe get away with only 2 mutexes which control odd and even elements:
void fun1(std::vector<int>&v,std::mutex& mtx0,std::mutex& mtx1 ){
for(size_t i{0};i<v.size();++i){
{
std::lock_guard<std::mutex> guard(i%2==0?mtx0:mtx1);
//read-write *v[i]
}
}
}
With a similar format for fun2().
You might be able to reduce contention by working from opposite ends of the vectors or using try_lock and moving onto subsequent elements and 'coming back' to the locked element when available.
That can be most significant if the execution of an iteration of one function is much greater than the other and there's some advantage in getting the results from the 'faster' one before the other finishes.
Alternatives:
It's obviously possible to add an std::mutex to each object.
Whether that works / is necessary will depend on what is actually done in the functions fun1 and fun2 as well as how those mutexes are managed.
If it's necessary to lock before either of the loops start there may be in fact no benefit in parallelism because one of fun1() or fun2() will essentially wait for the other to finish and the two will in effect run in series.
I am using a std::map and have large number of elements in it. If I need to clear the map, I can just call clear() on it. It can take some time to clear and especially if done under a lock in a multi-threaded environment, it can block other calls. To avoid calling clear(), I tried this:
std::mutex m;
std::map<int, int> my_map; // the map which I want to clear
void func()
{
std::map<int, int> temp_map;
{
std::lock_guard<std::mutex> l(m);
temp_map = std::move(my_map);
}
}
This will move my_map to temp_map under the lock which will empty it out. And then once the func ends, the temp_map will be destroyed.
Is this a better way to do this to prevent taking the lock for a long time? Are there any performance hits?
I would recommend using swap instead of move. A moved object is not guaranteed to be actually empty or even usable. But with swap and a freshly created object you are sure of the results:
void func()
{
std::map<int, int> temp_map;
using std::swap;
{
std::lock_guard<std::mutex> l(m);
swap(my_map, temp_map);
}
}
I'm trying to understand multi threading in C++. In the following bit of code, will the deque 'tempData' declared in retrieve() always have every element processed once and only once, or could there be multiple copies of tempData across multiple threads with stale data, causing some elements to be processed multiple times? I'm not sure if passing by reference actually causes there to be only one copy in this case?
static mutex m;
void AudioAnalyzer::analysisThread(deque<shared_ptr<AudioAnalysis>>& aq)
{
while (true)
{
m.lock();
if (aq.empty())
{
m.unlock();
break;
}
auto aa = aq.front();
aq.pop_front();
m.unlock();
if (false) //testing
{
retrieveFromDb(aa);
}
else
{
analyzeAudio(aa);
}
}
}
void AudioAnalyzer::retrieve()
{
deque<shared_ptr<AudioAnalysis>>tempData(data);
vector<future<void>> futures;
for (int i = 0; i < NUM_THREADS; ++i)
{
futures.push_back(async(bind(&AudioAnalyzer::analysisThread, this, _1), ref(tempData)));
}
for (auto& f : futures)
{
f.get();
}
}
Looks OK to me.
Threads have shared memory and if the reference to tempData turns up as a pointer in the thread then every thread sees exactly the same pointer value and the same single copy of tempData. [You can check that if you like with a bit of global code or some logging.]
Then the mutex ensures single-threaded access, at least in the threads.
One problem: somewhere there must be a push onto the deque, and that may need to be locked by the mutex as well. [Obviously the push_back onto the futures queue is just local.]