C++ return value from multithreads using reference - c++

Here is my code:
vector<MyClass> objs;
objs.resize(4);
vector<thread> multi_threads;
multi_threads.resize(4);
for(int i = 0; i < 4; i++)
{
multi_threads[i] = std::thread(&MyFunction, &objs[i]);
// each thread change some member variable in objs[i]
multi_threads[i].join();
}
I expect each thread to change its own element of objs, so that I can read the member data after the threads have finished.
However, once the loop above has completed, the member variables I want to read have not changed at all.
I guess this is caused by how multithreading works in C++, but I don't know exactly what I did wrong. How can I achieve what I expect?
Many thanks.
=================================================================================
EDIT:
Here is the source code of MyFunc:
void MyFunc(MyClass &obj)
{
vector<thread> myf_threads;
myf_threads.resize(10);
for(int i = 0; i < 10; i++)
{
myf_threads[i] = std::thread(&AnotherClass::increaseData, &obj);
myf_threads[i].join();
}
}
And here is AnotherClass::increaseData:
void AnotherClass::increaseData(MyClass& obj)
{
obj.add();
}
void MyClass::add()
{
data++;
}

objs is empty when first accessed, causing undefined behaviour:
multi_threads[i] = std::thread(&MyFunction, &objs[i]);
//^^ 'objs' is empty, so this access
// is out-of-bounds.
There must be instances of MyClass within objs before accessing it. Avoid potential reallocation of the internal buffer used by the objs vector by allocating the required number of elements up front. If reallocation occurs, any previously acquired pointers would be left dangling:
std::vector<MyClass> objs(4);
std::vector<std::thread> multi_threads(objs.size());
To avoid sequential execution of the threads, join() them in a subsequent loop instead of in the creation loop:
for(int i = 0; i < 4; i++)
{
multi_threads[i] = std::thread(&MyFunction, &objs[i]);
}
for (auto& t: multi_threads) t.join();
See demo.
After the update, it appears the function being used for the thread is taking a reference, not a pointer, to a MyClass instance (even though a pointer is being passed to the std::thread constructor?). In this case, std::ref(objs[i]) must be used to avoid copying of the MyClass instance by the std::thread constructor (see demo). Note from the std::thread::thread() reference page:
The arguments to the thread function are copied by value. If a reference argument needs to be passed to the thread function, it has to be wrapped (e.g. with std::ref or std::cref).
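Putting the two points together (join in a second loop, wrap reference arguments in std::ref), a minimal sketch could look like the following. The MyClass body, the loop count inside MyFunction and the helper name runAll are assumptions made for illustration; they are not taken from the question.

```cpp
#include <functional>  // std::ref
#include <thread>
#include <vector>

// Hypothetical stand-in for the asker's MyClass.
struct MyClass {
    int data = 0;
    void add() { ++data; }
};

// Thread function that takes its argument by reference.
void MyFunction(MyClass& obj) {
    for (int i = 0; i < 10; ++i) obj.add();
}

// Runs one thread per element and returns the sum of all data members.
int runAll(std::vector<MyClass>& objs) {
    std::vector<std::thread> threads;
    threads.reserve(objs.size());
    for (auto& obj : objs) {
        // std::ref makes the thread work on obj itself rather than on a copy.
        threads.emplace_back(MyFunction, std::ref(obj));
    }
    // Join only after every thread has been started, so they can run in parallel.
    for (auto& t : threads) t.join();

    int total = 0;
    for (const auto& obj : objs) total += obj.data;
    return total;
}
```

Each thread touches a different element, so no further synchronization is needed here. Calling runAll on a vector of four default-constructed objects leaves every element with data equal to 10.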

Maybe
std::ref(objs[i])
helps you.
I had the same problem with referenced parameters in the function for the thread
and that helped me.

Related

Is it OK to access a value (an entry in a thread-safe map) through a pointer stored in a non-thread-safe container?

For example,
// I am using thread safe map from
// code.google.com/p/thread-safe-stl-containers
#include <thread_safe_map.h>
class B{
public: // b1 is accessed from outside the class below
vector<int> b1;
};
//Thread safe map
thread_safe::map<int, B> A;
B b_object;
A[1] = b_object;
// Non thread safe map.
map<int, B*> C;
C[1] = &A[1]; // operator[] already returns a B&, so there is no .second
So are following operations still thread safe?
Thread1:
for(int i=0; i<10000; i++) {
cout << C[1]->b1[i];
}
Thread2:
for(int i=0; i<10000; i++) {
C[1]->b1.push_back(i);
}
Is there any problem in the above code? If so how can I fix it?
Is it OK to access value(entry in thread safe map) pointed by pointer inside non-thread safe container?
No, what you are doing there is not safe. The way your thread_safe_map is implemented is to take a lock for the duration of every function call:
//Element Access
T & operator[]( const Key & x ) { boost::lock_guard<boost::mutex> lock( mutex ); return storage[x]; }
The lock is released as soon as the access function returns, which means that any modification you make through the returned reference has no protection.
As well as being not entirely safe, this method is very slow.
A safe(er), efficient, but highly experimental way to lock containers is proposed here: https://github.com/isocpp/CppCoreGuidelines/issues/924
with source code here https://github.com/galik/GSL/blob/lockable-objects/include/gsl/gsl_lockable (shameless self promotion disclaimer).
In general, STL containers can be accessed from multiple threads as long as all threads either:
- read from the same container
- modify elements in a thread safe manner
You cannot push_back (or erase, insert, etc.) from one thread and read from another thread. Suppose that you are trying to access an element in thread 1 while push_back in thread 2 is in the middle of reallocation of vector's storage. This might crash the application, might return garbage (or might work, if you're lucky).
The second bullet point applies to situations like this:
std::vector<std::atomic_int> elements(16); // sized up front so that index 10 exists
// Thread 1:
elements[10].store(5);
// Thread 2:
int v = elements[10].load();
In this case, you're concurrently reading and writing an atomic variable, but the vector itself is not modified - only its element is.
Edit: using thread_safe::map doesn't change anything in your case. While modifying the map is OK, modifying its elements is not. Putting a std::vector in a thread-safe collection doesn't automagically make it thread-safe too.
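One way to make the push_back/read pair from the question safe is to put the vector behind its own mutex and hold that mutex for the whole operation. The sketch below is only an illustration of that idea: GuardedInts and fillConcurrently are hypothetical names, and the reader takes a copy of the contents instead of indexing into the live vector.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: every access to the vector goes through the same mutex.
class GuardedInts {
public:
    void push(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        values_.push_back(value);
    }
    // Returns a copy, so the caller never touches the shared vector unlocked.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return values_;
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return values_.size();
    }
private:
    mutable std::mutex mutex_;
    std::vector<int> values_;
};

// A writer appends count values while a reader takes snapshots concurrently.
std::size_t fillConcurrently(GuardedInts& ints, int count) {
    std::thread writer([&] {
        for (int i = 0; i < count; ++i) ints.push(i);
    });
    std::thread reader([&] {
        for (int i = 0; i < 100; ++i) {
            std::vector<int> copy = ints.snapshot();
            (void)copy;  // a real reader would print or inspect the copy here
        }
    });
    writer.join();
    reader.join();
    return ints.size();
}
```

The lock is held for the duration of each push or copy, which is exactly what a per-call lock inside operator[] cannot provide once the reference has been returned.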

Thread safety in std::map of std::shared_ptr

I know there are a lot of similar questions with answers around, but since I still don't understand this particular case, I decided to pose a question.
What I have is a map of shared_ptrs to a dynamically allocated array (MyVector). What I want is limited concurrent access without the need to lock. I know that the map per se is not thread safe, but I always thought what I'm doing here should be ok, which is:
I fill the map in a single threaded environment like that:
typedef shared_ptr<MyVector<float>> MyVectorPtr;
for (int i = 0; i < numElements; i++)
{
content[i] = MyVectorPtr(new MyVector<float>(numRows));
}
After the initialization, I have one thread that reads from the elements and one that replaces what the shared_ptrs point to.
Thread 1:
for(auto i=content.begin();i!=content.end();i++)
{
MyVectorPtr p(i->second);
if (p)
{
memory_use+=sizeof(int) + sizeof(float) * p->number;
}
}
Thread 2:
for (auto itr=content.begin();content.end()!=itr;++itr)
{
itr->second.reset(new MyVector<float>(numRows));
}
After a while I get either a seg fault or a double free in one of the two threads. Somehow not really surprisingly, but still I don't really get it.
The reasons why I thought this would work, are:
I don't add or remove any items of the map in the multi-threaded environment, so the iterators should always point to something valid.
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic. SO Question
Obviously, either one or more of my assumptions are wrong, or I'm not doing what I think I am. I think that reset actually is not thread safe; would std::atomic_exchange help?
Can someone release me? Thanks a lot!
If someone wants to try out, here is the full code example:
#include <stdio.h>
#include <iostream>
#include <string>
#include <map>
#include <memory> // std::shared_ptr
#include <unistd.h>
#include <pthread.h>
using namespace std;
template<class T>
class MyVector
{
public:
MyVector(int length)
: number(length)
, array(new T[length])
{
}
~MyVector()
{
if (array != NULL)
{
delete[] array;
}
array = NULL;
}
int number;
private:
T* array;
};
typedef shared_ptr<MyVector<float>> MyVectorPtr;
static map<int,MyVectorPtr> content;
const int numRows = 1000;
const int numElements = 10;
//pthread_mutex_t write_lock;
double get_cache_size_in_megabyte()
{
double memory_use=0;
//BlockingLockGuard guard(write_lock);
for(auto i=content.begin();i!=content.end();i++)
{
MyVectorPtr p(i->second);
if (p)
{
memory_use+=sizeof(int) + sizeof(float) * p->number;
}
}
return memory_use/(1024.0*1024.0);
}
void* write_content(void*)
{
while(true)
{
//BlockingLockGuard guard(write_lock);
for (auto itr=content.begin();content.end()!=itr;++itr)
{
itr->second.reset(new MyVector<float>(numRows));
cout << "one new written" <<endl;
}
}
return NULL;
}
void* loop_size_checker(void*)
{
while (true)
{
cout << get_cache_size_in_megabyte() << endl;
}
return NULL;
}
int main(int argc, const char* argv[])
{
for (int i = 0; i < numElements; i++)
{
content[i] = MyVectorPtr(new MyVector<float>(numRows));
}
pthread_attr_t attr;
pthread_attr_init(&attr) ;
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
pthread_t *grid_proc3 = new pthread_t;
pthread_create(grid_proc3, &attr, &loop_size_checker,NULL);
pthread_t *grid_proc = new pthread_t;
pthread_create(grid_proc, &attr, &write_content,(void*)NULL);
// to keep alive and avoid content being deleted
sleep(10000);
}
I thought concurrently changing a single element of the map is fine as long as the operation is atomic.
Changing an element in a map is not atomic unless the element has an atomic type like std::atomic.
I thought the operations I do on the shared_ptr (increment ref count, decrement ref count in Thread 1, reset in Thread 2) are atomic.
That is correct. Unfortunately you are also changing the underlying pointer. That pointer is not atomic. Since it is not atomic you need synchronization.
One thing you can do though is use the atomic free functions that are introduced with std::shared_ptr. This will let you avoid having to use a mutex.
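A minimal sketch of those free functions, std::atomic_load and std::atomic_store from <memory>, is shown below. The function name readWhileReplacing and the use of int instead of MyVector<float> are assumptions for illustration. Note that these overloads date from C++11 and are deprecated in C++20 in favour of std::atomic<std::shared_ptr<T>>, so a compiler may warn about them.

```cpp
#include <atomic>
#include <memory>
#include <thread>

// One thread keeps replacing what slot points to while the caller reads it.
// All concurrent access to slot goes through the atomic free functions.
// Returns the number of reads that saw a non-null pointer.
int readWhileReplacing(std::shared_ptr<int>& slot, int iterations) {
    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i) {
            // Atomically swap in a new object (instead of slot.reset(new ...)).
            std::atomic_store(&slot, std::make_shared<int>(i));
        }
    });

    int seen = 0;
    for (int i = 0; i < iterations; ++i) {
        // Atomically take a local copy (instead of copy-constructing from slot).
        // The copy keeps the old object alive even if the writer replaces it.
        std::shared_ptr<int> local = std::atomic_load(&slot);
        if (local) ++seen;
    }

    writer.join();
    return seen;
}
```

Applied to the question, the reader would use std::atomic_load(&i->second) and the writer std::atomic_store(&itr->second, ...), so that no thread ever touches a map element's shared_ptr directly while another thread may be changing it.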
Let's expand MyVectorPtr p(i->second);, which is running on thread 1:
The constructor called for this is:
template< class Y >
shared_ptr( const shared_ptr<Y>& r ) = default;
This probably boils down to two assignments: one of the underlying pointer and one of the reference count.
It may very well happen that thread 2 deletes the pointed-to object while thread 1 is still copying the pointer into p. The underlying pointer stored inside the shared_ptr is not atomic.
Thus, your usage of std::shared_ptr is not thread safe. It is thread safe only as long as you do not update or modify the underlying pointer.
TL;DR;
Changing a std::map isn't thread safe, whereas creating additional references through std::shared_ptr is.
You should protect read and write access to your map using an appropriate synchronization mechanism, such as a std::mutex.
Also, if the state of an instance referenced by the std::shared_ptr can change, it needs to be protected against data races when it is accessed from concurrent threads.
BTW, the MyVector you are showing is far too naive an implementation.

Creating std threads in C++ crashes the program

Whenever I execute the following piece of code using threads, the program has this error:
Debug Error!
Program: ... /path/to/.exe
abort() has been called
I want to create a thread that calls a member function. Here is the function I am using:
void ServerVote::createConnexionThreads()
{
for (int i = 0; i <= 50; ++i)
{
m_connexionThreads.push_back(&(std::thread(&ServerVote::acceptConnection,*this, i)));
}
for (int i = 0; i <= 50; ++i)
{
m_connexionThreads[i]->join();
}
}
I can provide additional code if required. When using the debugger, I find that the program crashes right after the first thread is created, after the thread is pushed_back. ~thread() is then called and it crashes inside this function. Here is the vector declaration:
std::vector<std::thread*> m_connexionThreads;
I am using Visual Studio 2015. The acceptConnection function has a while(true) inside it and is planned to be terminated later.
Edit:
Thank you for your answers, but I cannot compile when using a thread object instead of a pointer. So when I try to push into this vector:
std::vector<std::thread> m_connexionThreads;
for (int i = 0; i <= 50; ++i)
{
m_connexionThreads.push_back((std::thread(&ServerVote::acceptConnection,*this, i)));
}
I get this error while compiling:
error C2280: 'std::thread::thread(const std::thread &)': attempting to reference a deleted function
You should never take the address of a temporary, in any context. As a matter of fact, MSVC is wrong to accept this code; any standard-conforming compiler would produce an error here.
Instead, you should use the thread object like this (see my edit below the code on why this is preferred):
#include <thread>
#include <vector>
void acceptConnection(int);
void foo() {
std::vector<std::thread> vec;
for (int i = 0; i <= 50; ++i)
vec.push_back(std::thread(acceptConnection, i));
}
Why is this approach preferred over using an allocated pointer to the thread object? There are multiple benefits:
It is less typing - and even if nothing else, all things being equal (though they are not!) less typing wins over more typing.
It takes caution to use the pointers. For instance, you shouldn't use the raw pointer as vector data type, you should use unique_ptr to ensure automatic memory cleanup - which makes the syntax even uglier!
Using dynamically allocated memory is a drag on performance. You are hit twice - first time when you allocate memory, second time when you free it. Why suffer this penalty?
You are creating a temporary thread object, taking its address and pushing that address into the vector. The temporary is destroyed at the end of the statement while its thread is still joinable, so ~thread() calls std::terminate(), which is the abort() you see. Even without that, you would be left with a pointer to a destroyed object.
You should either use new to create the thread object on the heap, so that it is not destroyed automatically, or not use pointers to thread objects at all.
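For the option without pointers, a hedged sketch for a member function follows. The class below is a hypothetical stand-in for ServerVote with a placeholder acceptConnection body and an added counter. One likely cause of the C2280 error in the question's edit, stated here as an assumption, is that passing *this asks std::thread to copy the whole object, and a class that holds a std::vector<std::thread> cannot be copied; passing this avoids the copy.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the asker's ServerVote.
class ServerVote {
public:
    void createConnexionThreads(int count) {
        m_connexionThreads.reserve(count);
        for (int i = 0; i < count; ++i) {
            // Construct the thread directly inside the vector.
            // `this` (a pointer) is passed so that the object is not copied.
            m_connexionThreads.emplace_back(&ServerVote::acceptConnection, this, i);
        }
        // Every thread must be joined (or detached) before it is destroyed.
        for (auto& t : m_connexionThreads) t.join();
        m_connexionThreads.clear();
    }

    int accepted() const { return m_accepted.load(); }

private:
    // Placeholder body; the real function would accept a connection.
    void acceptConnection(int /*id*/) { ++m_accepted; }

    std::atomic<int> m_accepted{0};
    std::vector<std::thread> m_connexionThreads;
};
```

The vector stores std::thread objects by value, so there is nothing to delete and no dangling pointer; the only remaining obligation is to join each thread before the vector is cleared or destroyed.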

Am I using this deque in a thread safe manner?

I'm trying to understand multi threading in C++. In the following bit of code, will the deque 'tempData' declared in retrieve() always have every element processed once and only once, or could there be multiple copies of tempData across multiple threads with stale data, causing some elements to be processed multiple times? I'm not sure if passing by reference actually causes there to be only one copy in this case?
static mutex m;
void AudioAnalyzer::analysisThread(deque<shared_ptr<AudioAnalysis>>& aq)
{
while (true)
{
m.lock();
if (aq.empty())
{
m.unlock();
break;
}
auto aa = aq.front();
aq.pop_front();
m.unlock();
if (false) //testing
{
retrieveFromDb(aa);
}
else
{
analyzeAudio(aa);
}
}
}
void AudioAnalyzer::retrieve()
{
deque<shared_ptr<AudioAnalysis>>tempData(data);
vector<future<void>> futures;
for (int i = 0; i < NUM_THREADS; ++i)
{
futures.push_back(async(bind(&AudioAnalyzer::analysisThread, this, _1), ref(tempData)));
}
for (auto& f : futures)
{
f.get();
}
}
Looks OK to me.
Threads share memory, and because tempData is passed with ref(), the reference arrives in each thread as a pointer to the same object: every thread sees exactly the same pointer value and the same single copy of tempData. [You can check that if you like with a bit of global code or some logging.]
Then the mutex ensures single-threaded access, at least in the threads.
One problem: somewhere there must be a push onto the deque, and that may need to be locked by the mutex as well. [Obviously the push_back onto the futures queue is just local.]
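The same pop-under-lock pattern can be written with std::lock_guard, which releases the mutex automatically on every exit path. The sketch below is an illustration under assumptions: it uses a deque of int instead of shared_ptr<AudioAnalysis>, the names drain and processAll are hypothetical, and std::launch::async is passed explicitly so that each task is guaranteed its own thread.

```cpp
#include <deque>
#include <functional>  // std::ref
#include <future>
#include <mutex>
#include <vector>

// Each worker pops items under the lock and processes them outside it.
// Returns how many items this worker handled.
int drain(std::deque<int>& queue, std::mutex& m) {
    int handled = 0;
    while (true) {
        int item;
        {
            // Released at the end of this scope, including on break.
            std::lock_guard<std::mutex> lock(m);
            if (queue.empty()) break;
            item = queue.front();
            queue.pop_front();
        }
        (void)item;  // the real work (analyzeAudio in the question) would go here
        ++handled;
    }
    return handled;
}

// Starts numThreads workers on one shared deque and sums their counts.
int processAll(std::deque<int> queue, int numThreads) {
    std::mutex m;
    std::vector<std::future<int>> futures;
    for (int i = 0; i < numThreads; ++i) {
        futures.push_back(
            std::async(std::launch::async, drain, std::ref(queue), std::ref(m)));
    }
    int total = 0;
    for (auto& f : futures) total += f.get();
    return total;
}
```

Because every item is removed from the deque while the lock is held, the per-worker counts always add up to the original size, which shows that each element is handled exactly once.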

Deleting a pointer at different places results in different behaviors (crash or not)

This question is a refinement of this one, which went in a different direction than expected.
In my multithreaded application, the main thread creates parameters and stores them:
typedef struct {
int parameter1;
double parameter2;
float* parameter3;
} jobParams;
typedef struct {
int ID;
void* params;
} jobData;
std::vector<jobData> jobs;
// main thread
for (int i = 0; i < nbJobs; ++i) {
jobParams* p = new jobParams;
// fill and store params
jobData data;
data.ID = i;
data.params = p;
jobs.push_back(data);
}
// start threads and wait for their execution
// delete parameters
for (int i = 0; i < jobs.size(); ++i) {
delete jobs[i].params;
}
Then, each thread gets a pointer to a set of parameters, and calls a job function with it:
// thread (generic for any job function and any type of params)
jobData* job = main->getNextParams();
jobFunction(job->ID, job->params);
The whole thing takes void* as argument to be able to use any structure for the parameters, but then the job function casts it back to the right struct:
void* jobFunction(void* param) {
jobParams* params = (jobParams*) param;
// do stuff
return 0;
}
My problem is the following: if I delete params at the end of jobFunction(), it works perfectly. However, I'd prefer to have the deletion taken care of by the threads or the main thread, such that I don't have to remember to delete the params for each jobFunction() that I write.
If I try to delete params just after calling jobFunction() in the threads, or even in the main thread after being sure that all threads are done (and thus the params are not needed anymore), I get a heap corruption error:
HEAP[prog]: Invalid Address specified to RtlFreeHeap( 02E90000, 03C2EE38 )
I'm using Visual Studio 2008 Pro, and I thus can't use valgrind or other *nix tools for debugging. All access to the main thread from the "child threads" are synchronized using a mutex, so the problem is not that I delete the same parameters twice.
In fact, by using VS memory viewer, I know that the memory pointed by the jobParams pointer does not change between the end of jobFunction() and the point where I try to delete it (either in the main thread or in the "child threads").
I added the definition of both structures, as well as the way I'd like to delete the params.
Just as a thought .. can you try
for (int i = 0; i < jobs.size(); ++i) {
delete (jobParams*)jobs[i].params;
}
Allocating a jobParams with new and then deleting it through a void* is undefined behaviour and might be the cause of your problems.
Is there any reason you store params as a void* in jobData? I'd argue if you wish to have different types of jobParams then you should be using an inheritance hierarchy and not blindly casting to a void*.
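A minimal sketch of such a hierarchy is given below. All type and function names (JobParamsBase, MyJobParams, JobData, makeJobs) are hypothetical, and std::unique_ptr and std::make_unique require C++11 and C++14 respectively, which is newer than the compiler mentioned in the question.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base class. The virtual destructor makes deletion through
// a base pointer well defined for every derived parameter type.
struct JobParamsBase {
    virtual ~JobParamsBase() = default;
};

// One concrete parameter set, mirroring the fields from the question.
struct MyJobParams : JobParamsBase {
    int parameter1 = 0;
    double parameter2 = 0.0;
    std::vector<float> parameter3;  // owns its storage, unlike a raw float*
};

struct JobData {
    int ID = 0;
    std::unique_ptr<JobParamsBase> params;  // freed automatically, with the right type
};

// The job function casts back to the concrete type it expects.
int jobFunction(const JobData& job) {
    const auto* p = dynamic_cast<const MyJobParams*>(job.params.get());
    return p ? p->parameter1 : -1;  // -1 signals an unexpected parameter type
}

// Main-thread setup: create and store the parameters.
std::vector<JobData> makeJobs(int nbJobs) {
    std::vector<JobData> jobs;
    for (int i = 0; i < nbJobs; ++i) {
        auto p = std::make_unique<MyJobParams>();
        p->parameter1 = i * 2;
        JobData data;
        data.ID = i;
        data.params = std::move(p);
        jobs.push_back(std::move(data));
    }
    return jobs;
}
```

With this layout there is no explicit delete at all: the parameters are released when the jobs vector goes out of scope, so no job function has to remember to do it.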
That sort of bug generally means you have a data race somewhere. Does main->getNextParams() do the right thing even if it's called by several threads at once? If it gives the same params to two threads, you could have a double-free on your hands.
Also, instead of
jobFunction(jobData->ID, jobData->params);
You probably meant
jobFunction(job->ID, job->params);
To debug it, you could add a deleted member to the jobParams class and set that to true instead of actually deleting the object. Then check the deleted flag in every method of jobParams and throw an exception if it's true. Then see where the exception gets thrown.