Modifying data in threads - C++

I have a class with the following structure:
class Nginx_sender
{
private:
    std::vector<std::string> mMessagesBuffer;
    boost::mutex mMutex;

    void SendMessage(const std::string &msg)
    {
        mMutex.lock();
        mMessagesBuffer.push_back(msg);
        mMutex.unlock();
        std::cout << "Vector size: " << mMessagesBuffer.size() << std::endl;
    }

    void NewThreadFunction()
    {
        while (true) {
            mMutex.lock();
            if (mMessagesBuffer.size() >= 1) std::cout << ">=1\n";
            mMutex.unlock();
            boost::this_thread::sleep(boost::posix_time::milliseconds(200));
        }
    }
};

int main()
{
    Nginx_sender *NginxSenderHandle;
    boost::thread sender(boost::bind(&Nginx_sender::NewThreadFunction, &NginxSenderHandle));
    // ...
}
NewThreadFunction runs in the new thread and checks the size of mMessagesBuffer. Now I call, anywhere in the main function: NginxSenderHandle->SendMessage("Test");
This shows Vector size: 1 the first time, 2 the second time, etc.
But! In NewThreadFunction it's always == 0. Why could that be?

You are most probably creating another copy of Nginx_sender when you bind it. Do you really need to take the address of NginxSenderHandle before passing it to bind() (it's already a pointer)? http://www.boost.org/doc/libs/1_49_0/libs/bind/bind.html#with_member_pointers
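For illustration, a minimal sketch of the suggested fix. It assumes the handle is initialized before use and the member functions are made public (in the question's snippet they are private and the pointer is uninitialized):

Nginx_sender sender_obj;                       // one shared instance
Nginx_sender *NginxSenderHandle = &sender_obj;

// Pass the pointer itself, not its address:
boost::thread sender(boost::bind(&Nginx_sender::NewThreadFunction, NginxSenderHandle));

NginxSenderHandle->SendMessage("Test");        // both threads now see the same buffer
// ...
sender.join();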

I bet the compiler is caching some of mMessagesBuffer's internals in a thread-local cache. Try adding the 'volatile' keyword to mMessagesBuffer to disable such optimizations.

Signal from one process to another C++

I know the title is a bit broad, so let me elaborate.
I have 2 processes running: one writes into the shared memory, the other reads from it.
To achieve the shared-memory effect I am using boost::interprocess (by the way, let me know if there are more convenient libraries).
So I implemented the following:
//Writer
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/windows_shared_memory.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <iostream>
#include <memory>

namespace ip = boost::interprocess;

class SharedMemory
{
public:
    template<typename OpenOrCreate>
    SharedMemory(OpenOrCreate criteria, const char* name, ip::mode_t mode, size_t size) :
        name_(name),
        sm_(std::make_shared<ip::windows_shared_memory>(criteria, name, mode, size))
    {
    }

    template<typename OpenOrCreate>
    SharedMemory(OpenOrCreate criteria, const char* name, ip::mode_t mode) :
        name_(name),
        sm_(std::make_shared<ip::windows_shared_memory>(criteria, name, mode))
    {
    }

    std::shared_ptr<ip::windows_shared_memory> getSM()
    {
        return sm_;
    }

private:
    std::function<void()> destroyer_;
    std::string name_;
    std::shared_ptr<ip::windows_shared_memory> sm_;
};

int main()
{
    SharedMemory creator(ip::create_only, "SharedMemory", ip::read_write, 10);
    ip::mapped_region region(*creator.getSM(), ip::read_write);
    std::memset(region.get_address(), 1, region.get_size());

    int status = system("reader.exe");
    std::cout << status << std::endl;
}
So I am creating shared memory, writing 1s to it, then calling the reader exe. (I skip the reader part as it's pretty much the same, except it reads instead of writes.)
This code works fine: I write into memory and the other process reads it and prints my 1's.
But what if I have these 2 exes running at the same time and I want to write into memory, then notify the other process that there is an update? How do I signal from one exe/process to another?
The scenario is that I am streaming some live data, writing it into memory, and then telling the other process that there is an update.
I think there are more convenient approaches indeed.
In principle to synchronize between processes you use all the same approaches as synchronizing inside a process (between threads): using synchronization primitives (mutex/critical section, condition variable, semaphores, barriers etc.).
In addition, you need to have a data structure that you synchronize. This is precisely the Achilles' heel at the moment. There is a total absence of data structure here.
Though you can do raw byte access with your own logic, I don't see the appeal of using a high-level library in doing so. Instead I'd use a managed memory segment, that lets you find or construct typed objects by name. This may include your synchronization primitives.
In fact, you can expedite the process by using a message_queue which has all the synchronization already built-in.
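For illustration, a minimal sketch of that route (not part of the code below); the queue name and size limits are arbitrary, and the writer and reader would normally live in the two separate programs:

#include <boost/interprocess/ipc/message_queue.hpp>
#include <iostream>
#include <string>

namespace ip = boost::interprocess;

int main()
{
    // Writer side: create a queue of at most 100 messages of 256 bytes each.
    ip::message_queue mq(ip::open_or_create, "DemoQueue", 100, 256);

    std::string msg = "update";
    mq.send(msg.data(), msg.size(), /*priority*/ 0);

    // Reader side (in the other process): blocks until a message arrives.
    char buf[256];
    ip::message_queue::size_type recvd;
    unsigned prio;
    mq.receive(buf, sizeof(buf), recvd, prio);
    std::cout << std::string(buf, recvd) << "\n";
}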
Manual Sync: Writer using Segment Manager
I'll provide portable code because I do not have a Windows machine. First let's think of a data structure. A simple example would be a queue of messages; let's use a deque<string>.
Not exactly trivial data structures, but the great news is that Boost Interprocess comes with all the nuts and bolts to make them work (using interprocess allocators).
namespace Shared {
    using Segment = ip::managed_shared_memory;
    using Mgr = Segment::segment_manager;

    template <typename T>
    using Alloc = bc::scoped_allocator_adaptor<ip::allocator<T, Mgr>>;
    template <typename T> using Deque = bc::deque<T, Alloc<T>>;
    using String = bc::basic_string<char, std::char_traits<char>, Alloc<char>>;

    using DataStructure = Deque<String>;

    class Memory {
      public:
        Memory(const char* name, size_t size)
            : name_(name)
            , sm_(ip::open_or_create, name, size)
            , data_(*sm_.find_or_construct<DataStructure>("data")(
                  sm_.get_segment_manager()))
        {
        }

        DataStructure& get() { return data_; }
        DataStructure const& get() const { return data_; }

      private:
        std::string name_;
        Segment sm_;
        DataStructure& data_;
    };
} // namespace Shared
There, now we can have the writer be something like:
int main()
{
    Shared::Memory creator("SharedMemory", 10*1024*1024);
    creator.get().emplace_back("Hello");
    creator.get().emplace_back("World");

    std::cout << "Total queued: " << creator.get().size() << "\n";
}
Which will print e.g.
Total queued: 2
Total queued: 4
Total queued: 6
Depending on the number of times you ran it.
The Reader side
Now let's do the reader side. In fact, it's so much the same that we can put it in the same main program:
int main(int argc, char**)
{
    Shared::Memory mem("SharedMemory", 10*1024*1024);
    auto& data = mem.get();

    bool is_reader = argc > 1;

    if (not is_reader) {
        data.emplace_back("Hello");
        data.emplace_back("World");
        std::cout << "Total queued: " << data.size() << "\n";
    } else {
        std::cout << "Found entries: " << data.size() << "\n";
        while (!data.empty()) {
            std::cout << "Dequeued " << data.front() << "\n";
            data.pop_front();
        }
    }
}
Simple for a start. Running e.g. test.exe READER will conversely dequeue and print the entries queued so far.
Locking & Synchronization
The goal is to run writer and reader concurrently. That's not safe as it is now, because of a lack of locking and synchronization. Let's add it:
class Memory {
    static constexpr size_t max_capacity = 100;

  public:
    Memory(const char* name, size_t size)
        : name_(name)
        , sm_(ip::open_or_create, name, size)
        , mx_(*sm_.find_or_construct<Mutex>("mutex")())
        , cv_(*sm_.find_or_construct<Cond>("condition")())
        , data_(*sm_.find_or_construct<DataStructure>("data")(
              sm_.get_segment_manager()))
    { }

    // ...

  private:
    std::string name_;
    Segment sm_;
    Mutex& mx_;
    Cond& cv_;
    DataStructure& data_;
};
Now let's be careful. Because we want all operations on the data_ queue to be synchronized, we shall not expose it as we did before (with the get() member function). Instead we expose the exact interface of operations we support:
size_t queue_length() const;
void enqueue(std::string message);        // blocking when queue at max_capacity
std::string dequeue();                    // blocking dequeue
std::optional<std::string> try_dequeue(); // non-blocking dequeue
These all do the locking as required, simply as you'd expect:
size_t queue_length() const {
    ip::scoped_lock<Mutex> lk(mx_);
    return data_.size();
}
It gets more interesting on the potentially blocking operations. I chose to have a maximum capacity, so enqueue needs to wait for capacity:
// blocking when queue at max_capacity
void enqueue(std::string message) {
    ip::scoped_lock<Mutex> lk(mx_);
    cv_.wait(lk, [this] { return data_.size() < max_capacity; });
    data_.emplace_back(std::move(message));
    cv_.notify_one();
}
Conversely, dequeue needs to wait for a message to become available:
// blocking dequeue
std::string dequeue() {
    ip::scoped_lock<Mutex> lk(mx_);
    cv_.wait(lk, [this] { return not data_.empty(); });
    return do_pop();
}
Alternatively, you could make it non-blocking, just optionally returning a value:
// non-blocking dequeue
std::optional<std::string> try_dequeue() {
    ip::scoped_lock<Mutex> lk(mx_);
    if (data_.empty())
        return std::nullopt;
    return do_pop();
}
Now in main let's have three versions: writer, reader and continuous reader (where the latter demonstrates the blocking interface):
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/sync/interprocess_condition.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <boost/container/scoped_allocator.hpp>
#include <boost/interprocess/containers/deque.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <iostream>
#include <iomanip>
#include <optional>

namespace ip = boost::interprocess;
namespace bc = boost::container;

namespace Shared {
    using Segment = ip::managed_shared_memory;
    using Mgr = Segment::segment_manager;

    template <typename T>
    using Alloc = bc::scoped_allocator_adaptor<ip::allocator<T, Mgr>>;
    template <typename T> using Deque = ip::deque<T, Alloc<T>>;
    using String = ip::basic_string<char, std::char_traits<char>, Alloc<char>>;

    using DataStructure = Deque<String>;
    using Mutex = ip::interprocess_mutex;
    using Cond = ip::interprocess_condition;

    class Memory {
        static constexpr size_t max_capacity = 100;

      public:
        Memory(const char* name, size_t size)
            : name_(name)
            , sm_(ip::open_or_create, name, size)
            , mx_(*sm_.find_or_construct<Mutex>("mutex")())
            , cv_(*sm_.find_or_construct<Cond>("condition")())
            , data_(*sm_.find_or_construct<DataStructure>("data")(
                  sm_.get_segment_manager()))
        { }

        size_t queue_length() const {
            ip::scoped_lock<Mutex> lk(mx_);
            return data_.size(); // caution: racy by design!
        }

        // blocking when queue at max_capacity
        void enqueue(std::string message) {
            ip::scoped_lock<Mutex> lk(mx_);
            cv_.wait(lk, [this] { return data_.size() < max_capacity; });
            data_.emplace_back(std::move(message));
            cv_.notify_one();
        }

        // blocking dequeue
        std::string dequeue() {
            ip::scoped_lock<Mutex> lk(mx_);
            cv_.wait(lk, [this] { return not data_.empty(); });
            return do_pop();
        }

        // non-blocking dequeue
        std::optional<std::string> try_dequeue() {
            ip::scoped_lock<Mutex> lk(mx_);
            if (data_.empty())
                return std::nullopt;
            return do_pop();
        }

      private:
        std::string name_;
        Segment sm_;
        Mutex& mx_;
        Cond& cv_;
        DataStructure& data_;

        // Assumes mx_ locked by current thread!
        std::string do_pop() {
            auto&& tmp = std::move(data_.front());
            data_.pop_front();
            cv_.notify_all(); // any of the waiters might be a/the writer
            return std::string(tmp.begin(), tmp.end());
        }
    };
} // namespace Shared

int main(int argc, char**)
{
    Shared::Memory mem("SharedMemory", 10*1024*1024);

    switch (argc) {
    case 1:
        mem.enqueue("Hello");
        mem.enqueue("World");
        std::cout << "Total queued: " << mem.queue_length() << "\n";
        break;
    case 2:
        std::cout << "Found entries: " << mem.queue_length() << "\n";
        while (auto msg = mem.try_dequeue()) {
            std::cout << "Dequeued " << *msg << "\n";
        }
        break;
    case 3:
        std::cout << "Continuous reader\n";
        while (true) {
            std::cout << "Dequeued " << mem.dequeue() << "\n";
        }
        break;
    }
}
Summary, Caution
Note there are some loose ends in the above. Notably, the absence of robust locks in Boost Interprocess means proper shutdown needs some extra care so a process doesn't exit while holding the lock.
I'd suggest contrasting with ip::message_queue as well:
How to put file in boost::interprocess::managed_shared_memory? (contrasts shared memory, message_queue and pure TCP sockets)
For the error C2664 under MSVC mentioned above, it can be solved by changing
data_.emplace_back(std::move(message));
to:
data_.emplace_back(std::move(message.data()));
Hope this helps.

Failure to create a monitor with mutex and condition_variable in C++11

I am trying to create a monitor class to get the sum of data[]. In the class, I use a monitor to do the reader/writer work. However, the condition_variable, unique_lock and mutex confuse me. When the data size is not over 56 my code works correctly, but with a bigger data size the code fails, and the condition_variable.wait(cond_lock) does not work when debugging in lldb.
The error:
type std::__1::system_error: unique_lock::unlock: not locked: Operation not permitted
does not help me understand the problem.
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>

const int datasize = 58;//*****datasize cannot be bigger than 56*****//
int *data = new int[datasize];

void data_init() {
    for (int i = 0; i < datasize; ++i) {
        data[i] = random() % datasize + 1;
    }
    int sum = 0;
    for (int i = 0; i < datasize; ++i) {
        sum += data[i];
    }
    std::cout << "true answer: " << sum << std::endl;
}

class monitor {
public:
    std::mutex the_mutex;//the mutex to lock the function
    std::unique_lock<std::mutex> cond_mutex;//trying to use this for condition_variable
    std::condition_variable read_to_go, write_to_go;
    int active_reader, active_writer, waiting_reader, waiting_writer;
    bool write_flag;

    void getTask(int Rank, int& task_one, int& task_two, int& second_rank);//**reader**//
    void putResult(int Rank, int the_answer, int next_Rank);//**writer**//

    explicit monitor() {
        write_flag = true;
        active_reader = active_writer = waiting_reader = waiting_writer = 0;
    }

private:
    inline void startRead();
    inline void endRead();
    inline void startWrite();
    inline void endWrite();
};

monitor imonitor;

inline void monitor::startRead() {
    the_mutex.lock();//lock the function code
    cond_mutex.lock();//updated 1st
    while ((active_writer + active_reader) > 0) {//if there are working readers and writers
        waiting_reader++;//add one
        read_to_go.wait(cond_mutex);//wait the thread
        /*****when debugging with lldb, error appears here*****/
        waiting_reader--;//one less reader waiting when notified
    }
    active_reader++;//one more working reader
    the_mutex.unlock();
}

inline void monitor::endRead() {
    the_mutex.lock();
    active_reader--;//one less reader working
    if (active_reader == 0 && waiting_writer > 0) {//if no reader is working and some writers are waiting
        write_to_go.notify_one();//notify one writer
    }//else get out directly
    the_mutex.unlock();
}

inline void monitor::startWrite() {
    the_mutex.lock();
    cond_mutex.lock();//updated 1st
    while ((active_writer + active_reader) > 0) {//if any reader or writer is working
        waiting_writer++;//one more writer waiting
        write_to_go.wait(cond_mutex);//block this thread
        waiting_writer--;//when notified, fewer writers are waiting
    }
    active_writer++;//one more active writer
    the_mutex.unlock();//updated 1st
}

inline void monitor::endWrite() {//write is over
    the_mutex.lock();
    active_writer--;//one less writer working
    if (waiting_writer > 0) {//if any writer is waiting
        write_to_go.notify_one();//notify one of them
    }
    else if (waiting_reader > 0) {//if any reader is waiting
        read_to_go.notify_all();//notify all of them
    }
    the_mutex.unlock();
}

void monitor::getTask(int Rank, int &task_one, int &task_two, int &second_rank) {
    startRead();
    task_one = data[Rank];
    while (Rank < (datasize - 1) && data[++Rank] == 0);
    task_two = data[Rank];
    second_rank = Rank;
    //std::cout << "the second Rank is " << Rank << std::endl;
    endRead();
}

void monitor::putResult(int Rank, int the_answer, int next_Rank) {
    startWrite();
    data[Rank] = the_answer;
    data[next_Rank] = 0;
    endWrite();
}

void reducer(int Rank) {
    //std::cout << "a reducer begins" << Rank << std::endl;
    do {
        int myTask1, myTask2, secondRank;
        imonitor.getTask(Rank, myTask1, myTask2, secondRank);
        if (myTask2 == 0) return;
        //std::cout << "the second value Rank: " << secondRank << std::endl;
        int answer = myTask1 + myTask2;
        imonitor.putResult(Rank, answer, secondRank);
    } while (true);
}

int main() {
    std::cout << "Hello, World!" << std::endl;
    data_init();

    std::thread Reduce1(reducer, 0);
    std::thread Reduce2(reducer, datasize/2);
    /*std::thread Reduce3(reducer, 4);
    std::thread Reduce4(reducer, 6);
    std::thread Reduce5(reducer, 8);
    std::thread Reduce6(reducer, 10);
    std::thread Reduce7(reducer, 12);
    std::thread Reduce8(reducer, 14);*/

    Reduce1.join(); //std::cout << "A reducer in" << std::endl;
    Reduce2.join();
    /*Reduce3.join();
    Reduce4.join();
    Reduce5.join();
    Reduce6.join();
    Reduce7.join();
    Reduce8.join();*/

    std::cout << data[0] << std::endl;
    return 0;
}
My goal was to use 8 threads, but for now the code works with only one thread. Some couts for debugging are left in the code. Thank you for any help!
Update 1: I added cond_mutex.lock() in startWrite() and startRead() after the_mutex.lock(). The error in the last line of startWrite() about cond_mutex.unlock() is fixed by replacing it with the_mutex.unlock(). However, the problem is not fixed.
OK, guys. Thanks for your comments; they inspired me to work this problem out.
At the very beginning I used std::unique_lock<std::mutex> cond_mutex; in the declaration of the monitor class, meaning the default constructor of unique_lock is called.

class monitor {
public:
    std::mutex the_mutex, assist_lock;/***NOTE HERE***/
    std::unique_lock<std::mutex> cond_mutex;/***NOTE HERE***/
    std::condition_variable read_to_go, write_to_go;
    int active_reader, active_writer, waiting_reader, waiting_writer;
    ......
};
Let us check the header __mutex_base, where mutex and unique_lock are defined (beginning at line 104):

template <class _Mutex>
class _LIBCPP_TEMPLATE_VIS unique_lock
{
public:
    typedef _Mutex mutex_type;

private:
    mutex_type* __m_;
    bool __owns_;

public:
    _LIBCPP_INLINE_VISIBILITY
    unique_lock() _NOEXCEPT : __m_(nullptr), __owns_(false) {} // first and default

    _LIBCPP_INLINE_VISIBILITY
    explicit unique_lock(mutex_type& __m)
        : __m_(_VSTD::addressof(__m)), __owns_(true) {__m_->lock();} // second, owns an object

    ......
};
Obviously, the first (default) constructor is used instead of the second one, so __m_ is nullptr.
When read_to_go.wait(cond_mutex) calls lock(), we get the error "unique_lock::lock: references null mutex" (in the header __mutex_base, line 207).
When cond_mutex.unlock() is called, __owns_ is false because lock() never succeeded (lines 116 & 211), so we get the error "unique_lock::unlock: not locked" (line 257).
To fix the problem, we cannot use one all-covering unique_lock cond_mutex. Instead we construct cond_mutex from the_mutex each time we enter startRead() and startWrite(). That way <unique_lock> cond_mutex(the_mutex) is built by the second constructor, which locks the_mutex and makes cond_mutex own it. At the end of startRead() and startWrite(), cond_mutex.unlock() unlocks itself and the_mutex; when cond_mutex dies it releases the_mutex.
class monitor {
public:
    std::mutex the_mutex;
    std::condition_variable read_to_go, write_to_go;
    int active_reader, active_writer, waiting_reader, waiting_writer;
    bool write_flag;

    void getTask(int Rank, int& task_one, int& task_two, int& second_rank);//reader
    void putResult(int Rank, int the_answer, int next_Rank);//writer

    explicit monitor() {
        write_flag = true;
        active_reader = active_writer = waiting_reader = waiting_writer = 0;
    }

private:
    inline void startRead();
    inline void endRead();
    inline void startWrite();
    inline void endWrite();
};

inline void monitor::startRead() {
    std::unique_lock<std::mutex> cond_mutex(the_mutex);
    while ((active_writer + active_reader) > 0) {
        waiting_reader++;
        read_to_go.wait(cond_mutex);
        waiting_reader--;
    }
    active_reader++;
    cond_mutex.unlock();
}

inline void monitor::startWrite() {
    std::unique_lock<std::mutex> cond_mutex(the_mutex);
    while ((active_writer + active_reader) > 0) {
        waiting_writer++;
        write_to_go.wait(cond_mutex);
        waiting_writer--;
    }
    active_writer++;
    cond_mutex.unlock();
}
The solution is here: just replace part of my original code with the code above. Again, thanks to everyone who commented.
You seem to be writing this:
class write_wins_mutex {
    std::condition_variable cv_read;
    std::condition_variable cv_write;
    std::mutex m;
    int readers = 0;
    int writers = 0;
    int waiting_writers = 0;
    int waiting_readers = 0;

    std::unique_lock<std::mutex> internal_lock() { return std::unique_lock<std::mutex>(m); }

public:
    void shared_lock() {
        auto l = internal_lock();
        ++waiting_readers;
        cv_read.wait(l, [&]{ return !readers && !writers && !waiting_writers; });
        --waiting_readers;
        ++readers;
    }

    void lock() {
        auto l = internal_lock();
        ++waiting_writers;
        cv_write.wait(l, [&]{ return !readers && !writers; });
        --waiting_writers;
        ++writers;
    }

private:
    void notify() {
        if (waiting_writers) {
            if (!readers) cv_write.notify_one();
        }
        else if (waiting_readers) cv_read.notify_all();
    }

public:
    void unlock_shared() {
        auto l = internal_lock();
        --readers;
        notify();
    }

    void unlock() {
        auto l = internal_lock();
        --writers;
        notify();
    }
};
Probably has typos; I wrote this on a phone without compiling it.
But this is a mutex compatible with std::shared_lock and std::unique_lock (there is no try_lock (for), but that just means those methods don't work).
In this model readers are shared, but if any writer shows up it gets priority (even to the level of being able to starve readers).
If you want something less biased, just use std::shared_mutex directly.
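For reference, a minimal sketch of that route using C++17's std::shared_mutex; the data and thread setup here are illustrative, not from the question:

#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

std::shared_mutex rw_mutex;
std::vector<int> shared_data;

void reader() {
    std::shared_lock<std::shared_mutex> lk(rw_mutex); // many readers may hold this at once
    for (int v : shared_data) std::cout << v << ' ';
    std::cout << '\n';
}

void writer(int v) {
    std::unique_lock<std::shared_mutex> lk(rw_mutex); // writers get exclusive access
    shared_data.push_back(v);
}

int main() {
    std::thread w(writer, 42), r(reader);
    w.join();
    r.join();
}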

std::atomic_flag to stop multiple threads

I'm trying to stop multiple worker threads using a std::atomic_flag. Starting from Issue using std::atomic_flag with worker thread, the following works:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>

std::atomic_flag continueFlag;
std::thread t;

void work()
{
    while (continueFlag.test_and_set(std::memory_order_relaxed)) {
        std::cout << "work ";
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

void start()
{
    continueFlag.test_and_set(std::memory_order_relaxed);
    t = std::thread(&work);
}

void stop()
{
    continueFlag.clear(std::memory_order_relaxed);
    t.join();
}

int main()
{
    std::cout << "Start" << std::endl;
    start();
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    std::cout << "Stop" << std::endl;
    stop();
    std::cout << "Stopped." << std::endl;
    return 0;
}
Trying to rewrite into multiple worker threads:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>
#include <memory>

struct thread_data {
    std::atomic_flag continueFlag;
    std::thread thread;
};

std::vector<thread_data> threads;

void work(int threadNum, std::atomic_flag &continueFlag)
{
    while (continueFlag.test_and_set(std::memory_order_relaxed)) {
        std::cout << "work" << threadNum << " ";
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

void start()
{
    const unsigned int numThreads = 2;
    for (int i = 0; i < numThreads; i++) {
        ////////////////////////////////////////////////////////////////////
        //PROBLEM SECTOR
        ////////////////////////////////////////////////////////////////////
        thread_data td;
        td.continueFlag.test_and_set(std::memory_order_relaxed);
        td.thread = std::thread(&work, i, td.continueFlag);
        threads.push_back(std::move(td));
        ////////////////////////////////////////////////////////////////////
        //PROBLEM SECTOR
        ////////////////////////////////////////////////////////////////////
    }
}

void stop()
{
    //Flag stop
    for (auto &data : threads) {
        data.continueFlag.clear(std::memory_order_relaxed);
    }
    //Join
    for (auto &data : threads) {
        data.thread.join();
    }
    threads.clear();
}

int main()
{
    std::cout << "Start" << std::endl;
    start();
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    std::cout << "Stop" << std::endl;
    stop();
    std::cout << "Stopped." << std::endl;
    return 0;
}
My issue is the "PROBLEM SECTOR" above, namely creating the threads. I cannot wrap my head around how to instantiate the threads and pass the variables to the worker thread.
The error right now references the line threads.push_back(std::move(td)); with error C2280 'thread_data::thread_data(const thread_data &)': attempting to reference a deleted function.
Trying to use unique_ptr like this:

auto td = std::make_unique<thread_data>();
td->continueFlag.test_and_set(std::memory_order_relaxed);
td->thread = std::thread(&work, i, td->continueFlag);
threads.push_back(std::move(td));

gives the error std::atomic_flag::atomic_flag(const std::atomic_flag &)': attempting to reference a deleted function at the line td->thread = std::thread(&work, i, td->continueFlag);. Am I fundamentally misunderstanding the use of std::atomic_flag? Is it really both immovable and uncopyable?
Your first approach was actually closer to the truth. The problem is that it passed, as a parameter, a reference to an object within the local for-loop scope. Of course, once the loop iteration ended, that object went out of scope and got destroyed, leaving each thread with a reference to a destroyed object, resulting in undefined behavior.
Nobody cared that you moved the object into the std::vector after creating the thread. The thread received a reference to a locally-scoped object, and that's all it knew. End of story.
Moving the object into the vector first, and then passing to each thread a reference to the object in the std::vector will not work either. As soon as the vector internally reallocates, as part of its natural growth, you'll be in the same pickle.
What needs to happen is to have the entire threads array created first, before actually starting any std::threads. If the RAII principle is religiously followed, that means nothing more than a simple call to std::vector::resize().
Then, in a second loop, iterate over the fully-cooked threads array, and go and spawn off a std::thread for each element in the array.
I was almost there with my unique_ptr solution. I just needed to pass the flag as a std::ref(), like so:
std::vector<std::unique_ptr<thread_data>> threads;

void start()
{
    const unsigned int numThreads = 2;
    for (int i = 0; i < numThreads; i++) {
        auto td = std::make_unique<thread_data>();
        td->continueFlag.test_and_set(std::memory_order_relaxed);
        td->thread = std::thread(&work, i, std::ref(td->continueFlag));
        threads.push_back(std::move(td));
    }
}
However, inspired by Sam above, I also figured out a non-pointer way:
std::vector<thread_data> threads;

void start()
{
    const unsigned int numThreads = 2;
    // create a new vector; resize doesn't work, as it tries to assign/copy,
    // which atomic_flag does not support
    threads = std::vector<thread_data>(numThreads);
    for (int i = 0; i < numThreads; i++) {
        auto& t = threads.at(i);
        t.continueFlag.test_and_set(std::memory_order_relaxed);
        t.thread = std::thread(&work, i, std::ref(t.continueFlag));
    }
}

Accessing random number engine from multiple threads

This is my first question, so please forgive any violations of your policy. I want to have one global random number engine per thread, to which purpose I've devised the following scheme: each thread I start gets a unique index from an atomic global int. There is a static vector of random engines, whose i-th member is meant to be used by the thread with index i. If the index is greater than the vector size, elements are added to it in a synchronized manner. To prevent performance penalties, I check twice whether the index is greater than the vector size: once in an unsynced manner, and once more after locking the mutex. So far so good, but the following example fails with all sorts of errors (heap corruption, malloc errors, etc.).
#include <vector>
#include <thread>
#include <mutex>
#include <atomic>
#include <random>
#include <iostream>

using std::cout;

std::atomic_uint INDEX_GEN{};
std::vector<std::mt19937> RNDS{};
float f = 0.0f;
std::mutex m{};

class TestAThread {
public:
    TestAThread() : thread(nullptr) {
        cout << "Calling constructor TestAThread\n";
        thread = new std::thread(&TestAThread::run, this);
    }
    TestAThread(TestAThread&& source) : thread(source.thread) {
        source.thread = nullptr;
        cout << "Calling move constructor TestAThread. My ptr is " << thread << ". Source ptr is" << source.thread << "\n";
    }
    TestAThread(const TestAThread& source) = delete;

    ~TestAThread() {
        cout << "Calling destructor TestAThread. Pointer is " << thread << "\n";
        if (thread != nullptr) {
            cout << "Deleting thread pointer\n";
            thread->join();
            delete thread;
            thread = nullptr;
        }
    }

    void run() {
        int index = INDEX_GEN.fetch_add(1);
        std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
        while (true) {
            if (index >= RNDS.size()) {
                m.lock();
                // add randoms in a synchronized manner.
                while (index >= RNDS.size()) {
                    cout << "index is " << index << ", size is " << RNDS.size() << std::endl;
                    RNDS.emplace_back();
                }
                m.unlock();
            }
            f += uniformRnd(RNDS[index]);
        }
    }

    std::thread* thread;
};

int main(int argc, char* argv[]) {
    std::vector<TestAThread> threads;
    for (int i = 0; i < 10; ++i) {
        threads.emplace_back();
    }
    cout << f;
}
What am I doing wrong?!
Obviously f += ... would be a race-condition regardless of the right-hand side, but I suppose you already knew that.
The main problem that I see is your use of the global std::vector<std::mt19937> RNDS. Your mutex-protected critical section only encompasses adding new elements, not accessing existing elements:
... uniformRnd(RNDS[index]);
That's not thread-safe because resizing RNDS in another thread could cause RNDS[index] to be moved into a new memory location. In fact, this could happen after the reference RNDS[index] is computed but before uniformRnd gets around to using it, in which case what uniformRnd thinks is a Generator& will be a dangling pointer, possibly to a newly-created object. In any event, uniformRnd's operator() makes no guarantee about data races [Note 1], and neither does RNDS's operator[].
You could get around this problem by:
- computing a reference (or pointer) to the generator within the protected section (which cannot be contingent on whether the container's size is sufficient), and
- using a std::deque instead of a std::vector, which does not invalidate references when it is resized (unless the referenced object has been removed from the container by the resizing).
Something like this (focusing on the race condition; there are other things I'd probably do differently):
std::mt19937& get_generator(int index) {
    std::lock_guard<std::mutex> l(m);
    if (index >= RNDS.size()) RNDS.resize(index + 1);
    return RNDS[index];
}

void run() {
    int index = INDEX_GEN.fetch_add(1);
    auto& gen = get_generator(index);
    std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
    while (true) {
        /* Do something with uniformRnd(gen); */
    }
}
[1] The prototype for operator() of uniformRnd is template< class Generator > result_type operator()( Generator& g );. In other words, the argument must be a mutable reference, which means that it is not implicitly thread-safe; only const& arguments to standard library functions are free of data races.
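Putting those pieces together, a minimal self-contained sketch, assuming RNDS is switched to a std::deque as suggested (the bounded loop and per-thread sum are illustrative, replacing the question's racy global f):

#include <atomic>
#include <deque>
#include <iostream>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

std::atomic_uint INDEX_GEN{};
std::deque<std::mt19937> RNDS; // deque: growing it leaves existing elements in place
std::mutex m;

std::mt19937& get_generator(unsigned index) {
    std::lock_guard<std::mutex> l(m);
    if (index >= RNDS.size()) RNDS.resize(index + 1);
    return RNDS[index];
}

void worker() {
    unsigned index = INDEX_GEN.fetch_add(1);
    auto& gen = get_generator(index);           // reference stays valid: deque + lock
    std::uniform_real_distribution<float> dist{0.0f, 1.0f};
    float local = 0.0f;                         // thread-local sum avoids a data race
    for (int i = 0; i < 1000; ++i) local += dist(gen);
    std::lock_guard<std::mutex> l(m);
    std::cout << "thread " << index << " sum: " << local << "\n";
}

int main() {
    std::vector<std::thread> pool;
    for (int i = 0; i < 10; ++i) pool.emplace_back(worker);
    for (auto& t : pool) t.join();
}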

C++ vector erase iterator out of range due to mutexes

So, there is a vector of strings. Since it's a static member of the cl_mgr class, it acts as a global variable.
std::vector<std::string> cl_mgr::to_send_queue;
However, I don't ever directly access this vector in my code. To add strings to it I call the following function:
void cl_mgr::sendmsg(std::string msg)
{
    std::mutex mtx;
    mtx.lock();
    if ( connected )
    {
        cl_mgr::to_send_queue.push_back(msg + '\r');
    }
    mtx.unlock();
}
This is where it goes wrong: the line
cl_mgr::to_send_queue.erase(cl_mgr::to_send_queue.begin());
sometimes gives iterator out of range.
This should only happen when the vector is empty, but I already check for that in the while condition.
So next I added a sizes array, filled it with to_send_queue.size(), and found out it sometimes returns zero! Usually the whole array consists of 1's, but sometimes an element like sizes[9500] is 0.
What's wrong, and how do I fix this?
std::mutex mtx;
mtx.lock();
while ( !cl_mgr::to_send_queue.empty() )
{
    string tosend = cl_mgr::to_send_queue[0];

    int sizes[10000];
    sizes[0] = 0;
    for (int i = 1; i < 10000; ++i)
    {
        sizes[i] = cl_mgr::to_send_queue.size();
        if ( sizes[i] < sizes[i-1] )
        {
            int breakpoint = 0; //should never be hit but it does !
        }
    }

    cl_mgr::to_send_queue.erase(cl_mgr::to_send_queue.begin()); //CRASH HERE
    send(hSocket, tosend.c_str(), tosend.length(), 0 );
    Sleep(5);
}
mtx.unlock();
This std::mutex is local to the method. That means every invocation of the method gets its own mutex, which protects nothing.
To fix this, you must move the mutex to the same scope as the vector to_send_queue and use a std::lock_guard. The std::lock_guard documentation has an example of how to use it:
int g_i = 0;
std::mutex g_i_mutex;  // protects g_i

void safe_increment()
{
    std::lock_guard<std::mutex> lock(g_i_mutex);
    ++g_i;
    std::cout << std::this_thread::get_id() << ": " << g_i << '\n';
    // g_i_mutex is automatically released when lock
    // goes out of scope
}
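Applied to the question's code, a minimal self-contained sketch; queue_mutex, pop() and the stripped-down cl_mgr are illustrative names, not from the original:

#include <mutex>
#include <string>
#include <vector>

// The mutex lives next to the queue it protects, so every caller locks the same object.
struct cl_mgr {
    static std::vector<std::string> to_send_queue;
    static std::mutex queue_mutex;   // added member (name is illustrative)
    static bool connected;

    static void sendmsg(std::string msg)
    {
        std::lock_guard<std::mutex> lock(queue_mutex);
        if (connected)
            to_send_queue.push_back(msg + '\r');
    }

    // Pop one message under the lock; returns false when the queue is empty.
    static bool pop(std::string& out)
    {
        std::lock_guard<std::mutex> lock(queue_mutex);
        if (to_send_queue.empty())
            return false;
        out = to_send_queue.front();
        to_send_queue.erase(to_send_queue.begin());
        return true;
    }
};

std::vector<std::string> cl_mgr::to_send_queue;
std::mutex cl_mgr::queue_mutex;
bool cl_mgr::connected = true;

// The sender loop then does the send() outside the lock:
//   std::string tosend;
//   while (cl_mgr::pop(tosend)) { send(hSocket, tosend.c_str(), tosend.length(), 0); Sleep(5); }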