I know that std::map is not thread-safe when one thread reads while another thread writes. But is it OK to insert from multiple threads?
void writeMap()
{
for (int i = 0; i < 1000; i++)
{
long long random_variable = (std::rand()) % 1000;
std::cout << "Thread ID -> " << std::this_thread::get_id() << " with looping index " << i << std::endl;
k1map.insert(std::make_pair(i, new p(i)));
}
}
int main()
{
std::srand((int)std::time(0));
for (int i = 0; i < 1000; ++i)
{
long long random_variable = (std::rand()) % 1000;
std::thread t(writeMap);
std::cout << "Thread created " << t.get_id() << std::endl;
t.detach();
}
return 0;
}
Code like this runs normally no matter how many times I try.
Programs are complex and, to some extent, like magic (LOL).
The results of running the code differ between IDEs.
Before, I used VS2013, and it was always right.
But on VS2019 and Linux, the result of the same code is wrong.
Maybe the implementation of map in VS2013 works in some special way.
No, std::map::insert is not thread-safe.
Most standard library types are thread safe only if you are using separate object instances in separate threads. Take a look at the thread safety part of container's docs.
As @NutCracker has mentioned, std::map::insert is not thread-safe.
But if the posted code appears to work, I think the reason is that one thread fills the map very quickly. Every thread inserts the same keys, and inserting a key that already exists leaves the map unchanged, so the other threads no longer modify it.
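To illustrate the answers above, here is a minimal sketch of one way to make concurrent inserts safe: guard every insert with a mutex and join the threads instead of detaching them. The function name fill_map_concurrently, the int value type, and the disjoint key ranges are assumptions made for this example; they are not taken from the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: starts `thread_count` workers that each insert
// `per_thread` distinct keys into one shared map, then returns the map size.
std::size_t fill_map_concurrently(int thread_count, int per_thread)
{
    assert(thread_count > 0 && per_thread > 0);

    std::map<int, int> shared_map;
    std::mutex map_mutex;  // guards every access to shared_map
    std::vector<std::thread> workers;

    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&shared_map, &map_mutex, t, per_thread] {
            for (int i = t * per_thread; i < (t + 1) * per_thread; ++i) {
                // Only one thread at a time may modify the map.
                std::lock_guard<std::mutex> guard(map_mutex);
                shared_map.insert(std::make_pair(i, i * i));
            }
        });
    }

    // Join rather than detach, so all inserts have finished before we read.
    for (auto& w : workers)
        w.join();

    return shared_map.size();
}
```

Calling fill_map_concurrently(4, 250) should leave 1000 entries in the map. Joining also avoids the situation in the question where main returns while detached threads may still be running.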
I am trying to create a proof of concept for inter-thread communication by means of shared state: the main thread creates worker threads, giving each a separate vector by reference, lets each do its work and fill its vector with results, and finally collects the results.
However, weird things are happening for which I can't find an explanation other than some race between the initialization of the vectors and the launch of the worker threads. Here is the code.
#include <iostream>
#include <vector>
#include <thread>
class Case {
public:
int val;
Case(int i):val(i) {}
};
void
run_thread (std::vector<Case*> &case_list, int idx)
{
std::cout << "size in thread " << idx <<": " << case_list.size() << '\n';
for (int i=0; i<10; i++) {
case_list.push_back(new Case(i));
}
}
int
main(int argc, char **argv)
{
int nthrd = 3;
std::vector<std::thread> threads;
std::vector<std::vector<Case*>> case_lists;
for (int i=0; i<nthrd; i++) {
case_lists.push_back(std::vector<Case*>());
std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
}
std::cout << "All threads lauched.\n";
for (int i=0; i<nthrd; i++) {
threads[i].join();
for (const auto cp:case_lists[i]) {
std::cout << cp->val << '\n';
}
}
return 0;
}
Tested on repl.it (gcc 4.6.3), the program gives the following result:
size of 0 in main:0
size of 1 in main:0
size of 2 in main:0
All threads lauched.
size in thread 0: 18446744073705569740
size in thread 2: 0
size in thread 1: 0
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
exit status -1
On my computer, besides something like the above, I also get:
Segmentation fault (core dumped)
It appears thread 0 is getting a vector that hasn't been initialized, although the vector appears properly initialized in main.
To isolate the problem, I have tried going single threaded by changing the line:
threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
to
run_thread(case_lists[i], i);
and commenting out:
threads[i].join();
Now the program runs as expected, with the "threads" running one after another before the main collects the results.
My question is: what is wrong with the multi-threaded version above?
References (and iterators) for a vector are invalidated any time the capacity of the vector changes. The exact rules for overallocation vary by implementation, but odds are, you've got at least one capacity change between the first push_back and the last, and all the references made before that final capacity increase are garbage the moment it occurs, invoking undefined behavior.
Either reserve your total vector size up front (so push_backs don't cause capacity increases), initialize the whole vector to the final size up front (so no resizes occur at all), or have one loop populate completely, then launch the threads (so all resizes occur before you extract any references). The simplest fix here would be to initialize it to the final size, changing:
std::vector<std::vector<Case*>> case_lists;
for (int i=0; i<nthrd; i++) {
case_lists.push_back(std::vector<Case*>());
std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
}
to:
std::vector<std::vector<Case*>> case_lists(nthrd); // Default initialize nthrd elements up front
for (int i=0; i<nthrd; i++) {
// No push_back needed
std::cout << "size of " << i << " in main:" << case_lists[i].size() << '\n';
threads.push_back( std::thread( run_thread, std::ref(case_lists[i]), i) );
}
You might be thinking that vectors would overallocate fairly aggressively, but at least on many popular compilers, this is not the case; both gcc and clang follow a strict doubling pattern, so the first three insertions reallocate every time (capacity goes from 1, to 2, to 4); the reference to the first element is invalidated by the insertion of the second, and the reference to the second is invalidated by the insertion of the third.
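Putting the suggested fix together, a minimal sketch might look like the following. It sizes the outer vector before any thread starts, so the references handed to the threads stay valid. To keep the example short and leak-free it stores plain int values instead of Case* pointers; that substitution, and the names fill_results and run_workers, are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Worker: appends ten results to the vector it was given by reference.
void fill_results(std::vector<int>& results, int idx)
{
    for (int i = 0; i < 10; ++i)
        results.push_back(idx * 100 + i);
}

// Hypothetical driver: one result vector per thread, all created up front.
std::vector<std::vector<int>> run_workers(int nthrd)
{
    assert(nthrd > 0);

    // The outer vector gets its final size here and never grows afterwards,
    // so result_lists[i] is never moved while a thread holds a reference to it.
    std::vector<std::vector<int>> result_lists(static_cast<std::size_t>(nthrd));

    std::vector<std::thread> threads;
    for (int i = 0; i < nthrd; ++i)
        threads.emplace_back(fill_results, std::ref(result_lists[i]), i);

    for (auto& t : threads)
        t.join();

    return result_lists;
}
```

Each inner vector may still reallocate while its own thread pushes into it; that is fine, because only that one thread touches it and the reference refers to the inner vector object, not to its heap buffer.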
In the following code, threadCount is one of 1, 2, 3, 4. In the output, the string part is printed correctly, but the numeric value is randomly missing and sometimes gets appended a few lines later.
void *SPWork(void *t)
{
int* threadC = (int*)t;
int threadCount = *threadC;
cout<<"\n Thread count" << threadCount << endl;
cout << flush;
long long int i, adjustedIterationCount;
adjustedIterationCount = 100/(threadCount);
for (i=0; i< adjustedIterationCount; i++)
{
i++ ;
}
pthread_exit((void*) t);
}
Output
......
.....
Thread count1
Thread count1
Thread count2
Thread count1
Thread count
Thread count
Thread count234
.....
.....
Notice that in the last line the thread value is 234, but that value can never be 234. In the previous two lines the value was not appended, so 2 and 3 ended up on this line.
I know it has to do with flushing or appending "\n"; I have tried many combinations, but the issue persists.
N.B. This is a worker function of a pthread; the compiler flags are "-g -Wall -O3 -lpthread".
While the standard streams are guaranteed to be thread-safe, there is no guarantee that the output won't be interleaved. If you want to print to a standard stream from multiple threads in a predictable way, you will need to do some synchronization yourself:
std::mutex cout_mutex;
void *SPWork(void *t)
{
//...
{
std::lock_guard<std::mutex> guard(cout_mutex);
std::cout << "\n Thread count" << threadCount << std::endl;
}
//...
}
There is no requirement that your calls to cout are an atomic operation. If you need them to be so, you can simply protect the code (just the output code) with a mutex.
In addition, injecting std::endl into the stream already flushes the data so there's little point in following that with a std::flush.
So, in its simplest form:
pthread_mutex_lock(&myMutex);
std::cout << "\n Thread count" << threadCount << std::endl;
pthread_mutex_unlock(&myMutex);
Note that, for recent C++ implementations, it's probably better to use std::mutex and std::lock_guard, since they guarantee correct cleanup (see the other answers for this). Since you have pthread_exit() in your code, I assume you're limited to the POSIX threading model.
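As a small variation on the mutex approach, the sketch below formats each line into a private std::ostringstream first and then writes it to the shared stream in a single insertion while holding the lock. The names out_mutex, report, and run_reporters are hypothetical, and the example writes to a string stream rather than std::cout only so that the result can be inspected.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

std::mutex out_mutex;  // hypothetical name; guards the shared output stream

// Builds the whole line privately, then emits it in one locked write.
void report(std::ostream& out, int threadCount)
{
    std::ostringstream line;
    line << "Thread count " << threadCount << '\n';

    std::lock_guard<std::mutex> guard(out_mutex);
    out << line.str() << std::flush;
}

// Hypothetical driver: n threads report to one shared stream.
std::string run_reporters(int n)
{
    assert(n > 0);
    std::ostringstream out;
    std::vector<std::thread> threads;
    for (int i = 1; i <= n; ++i)
        threads.emplace_back([&out, i] { report(out, i); });
    for (auto& t : threads)
        t.join();
    return out.str();
}
```

The order of the lines still depends on scheduling, but each line arrives intact. With std::cout in place of the string stream, the same report function can be called from a pthread worker as well.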
Can iterating over an unsorted data structure, such as an array or a tree, with multiple threads make it faster?
For example, I have a big array with unsorted data.
int array[1000];
I'm searching for array[i] == 8.
Can running:
Thread 1:
for(auto i = 0; i < 500; i++)
{
if(array[i] == 8)
std::cout << "found" << std::endl;
}
Thread 2:
for(auto i = 500; i < 1000; i++)
{
if(array[i] == 8)
std::cout << "found" << std::endl;
}
be faster than normal iteration?
Update:
I've written a simple test which describes the problem better.
Searching int* array = new int[100000000];
and repeating it 1000 times,
I got this result:
a
Number of threads = 2
End of multithread iteration
End of normal iteration
Time with 2 threads 73581
Time with 1 thread 154070
Bool values:0
0
0
Process returned 0 (0x0) execution time : 256.216 s
Press any key to continue.
What's more, when the program was running with 2 threads, CPU usage of the process was around 90%, and when iterating with 1 thread it was never more than 50%.
So Smeeheey and erip are right that it can make iteration faster.
Of course, it can be trickier for less trivial problems.
What I also learned from this test is that the compiler can optimize the main thread (when I was not printing the boolean that stores the result, the search loop in the main thread was optimized away), but it does not do that for the other threads.
This is the code I used:
#include<cstdlib>
#include<thread>
#include<ctime>
#include<iostream>
#define SIZE_OF_ARRAY 100000000
#define REPEAT 1000
inline bool threadSearch(int* array){
for(auto i = 0; i < SIZE_OF_ARRAY/2; i++)
if(array[i] == 101) // there is no array[i]==101
return true;
return false;
}
int main(){
int i;
std::cin >> i; // stops program enabling to set real time priority of the process
clock_t with_multi_thread;
clock_t normal;
srand(time(NULL));
std::cout << "Number of threads = "
<< std::thread::hardware_concurrency() << std::endl;
int* array = new int[SIZE_OF_ARRAY];
bool true_if_found_t1 =false;
bool true_if_found_t2 =false;
bool true_if_found_normal =false;
for(auto i = 0; i < SIZE_OF_ARRAY; i++)
array[i] = rand()%100;
with_multi_thread=clock();
for(auto j=0; j<REPEAT; j++){
std::thread t([&](){
if(threadSearch(array))
true_if_found_t1=true;
});
std::thread u([&](){
if(threadSearch(array+SIZE_OF_ARRAY/2))
true_if_found_t2=true;
});
if(t.joinable())
t.join();
if(u.joinable())
u.join();
}
with_multi_thread=(clock()-with_multi_thread);
std::cout << "End of multithread iteration" << std::endl;
for(auto i = 0; i < SIZE_OF_ARRAY; i++)
array[i] = rand()%100;
normal=clock();
for(auto j=0; j<REPEAT; j++)
for(auto i = 0; i < SIZE_OF_ARRAY; i++)
if(array[i] == 101) // there is no array[i]==101
true_if_found_normal=true;
normal=(clock()-normal);
std::cout << "End of normal iteration" << std::endl;
std::cout << "Time with 2 threads " << with_multi_thread<<std::endl;
std::cout << "Time with 1 thread " << normal<<std::endl;
std::cout << "Bool values:" << true_if_found_t1<<std::endl
<< true_if_found_t2<<std::endl
<<true_if_found_normal<<std::endl;// showing bool values to prevent compiler from optimization
return 0;
}
The answer is yes, it can make it faster, but not necessarily. In your case, when you're iterating over pretty small arrays, it is likely that the overhead of launching a new thread will be much higher than the benefit gained. If your array were much bigger, that overhead would shrink as a proportion of the overall runtime and eventually become worth it. Note that you will only get a speed-up if your system has more than 1 physical core available to it.
Additionally, note that while the code that reads the array is perfectly thread-safe in your case, writes to std::cout from several threads are not synchronized with each other: the characters may interleave, so you can get very strange-looking output if you try this. Instead, perhaps each thread should return an integer indicating the number of instances found.
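A minimal sketch of that last suggestion, assuming std::async is acceptable in place of hand-managed threads: each task counts matches in its own slice and returns the count, and the caller adds the counts up. The function name parallel_count and the chunking scheme are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: counts occurrences of `value` using up to `parts` tasks.
std::size_t parallel_count(const std::vector<int>& data, int value, std::size_t parts)
{
    assert(parts > 0);

    // Ceiling division, so every element belongs to exactly one chunk.
    const std::size_t chunk = (data.size() + parts - 1) / parts;
    std::vector<std::future<std::size_t>> futures;

    for (std::size_t begin = 0; begin < data.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, data.size());
        futures.push_back(std::async(std::launch::async,
            [&data, value, begin, end]() -> std::size_t {
                // Read-only access to a slice no other task writes to.
                const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
                const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
                return static_cast<std::size_t>(std::count(first, last, value));
            }));
    }

    std::size_t total = 0;
    for (auto& f : futures)
        total += f.get();  // waits for each task and collects its result
    return total;
}
```

Because the tasks only read the array and report through their return values, no locking is needed, and nothing is printed from inside the workers.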
My apologies if this question looks simple. I'm still learning about threads. I already tried searching for a solution to this on here but didn't find any.
I'm trying to get my program to create a number of threads based on user input (e.g. entering 5 at "cin >> numThreads" will create 5 threads), but the compiler says the array size in "thread myThreads[numThreads]" needs to be a constant value. The code is below:
void exec(int n)
{
cout << "Thread " << n << endl;
}
int main()
{
int numThreads = 0;
// create threads
cin >> numThreads;
thread myThreads[numThreads]; // this part says myThreads "must be a constant value"
for (int i = 0; i < numThreads; i++)
{
myThreads[i] = thread(exec, i);
}
for (int i = 0; i < numThreads; i++)
{
myThreads[i].join();
}
cout << "Done!" << endl;
}
Any ideas as to how that section can be fixed? I've tried a few different ways but they haven't worked so far. Thank you very much.
There's no problem with the multithreading. The problem is that you are using a static array as if it were a dynamic array: the size of a built-in array must be a compile-time constant.
Try something like this:
thread* myThreads = new thread[numThreads];
More about dynamic memory in C++:
http://www.cplusplus.com/doc/tutorial/dynamic/
UPD By James Adkison:
Do not forget to delete[] your array to avoid a memory leak.
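As an alternative that needs no manual delete[], a std::vector<std::thread> can hold a number of threads chosen at run time. The sketch below is one possible shape; the function name run_threads and the atomic counter that stands in for the real work are assumptions for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: starts numThreads threads, joins them all,
// and returns how many of them actually ran.
int run_threads(int numThreads)
{
    assert(numThreads >= 0);

    std::atomic<int> ran{0};  // stands in for the real per-thread work
    std::vector<std::thread> myThreads;
    myThreads.reserve(static_cast<std::size_t>(numThreads));

    for (int i = 0; i < numThreads; ++i)
        myThreads.emplace_back([&ran] { ++ran; });

    for (auto& t : myThreads)
        t.join();

    return ran.load();
}
```

In the original program, numThreads would come from cin and the lambda would call exec(i) instead. The vector releases its storage automatically when it goes out of scope.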
I have a search problem which I want to parallelize. If one thread has found a solution, I want all other threads to stop. Otherwise, if all threads exit regularly, I know that there is no solution.
The following code (which demonstrates my cancelling strategy) seems to work, but I'm not sure whether it is safe and whether it is the most efficient variant:
#include <iostream>
#include <thread>
#include <cstdint>
#include <chrono>
using namespace std;
struct action {
uint64_t* ii;
action(uint64_t *ii) : ii(ii) {};
void operator()() {
uint64_t k = 0;
for(; k < *ii; ++k) {
//do something useful
}
cout << "counted to " << k << " in 2 seconds" << endl;
}
void cancel() {
*ii = 0;
}
};
int main(int argc, char** argv) {
uint64_t ii = 1000000000;
action a{&ii};
thread t(a);
cout << "start sleeping" << endl;
this_thread::sleep_for(chrono::milliseconds(2000));
cout << "finished sleeping" << endl;
a.cancel();
cout << "cancelled" << endl;
t.join();
cout << "joined" << endl;
}
Can I be sure that the value to which ii points always gets properly reloaded? Is there a more efficient variant that doesn't require a dereference at every step? I tried to make the upper bound of the loop a member variable, but since the constructor of thread copies the instance of action, I wouldn't have access to that member later.
Also: if my code is exception-safe and does not do I/O (and I am sure that my platform is Linux), is there a reason not to use pthread_cancel on the native thread?
No, there's no guarantee that this will do anything sensible. The code has one thread reading the value of ii and another thread writing to it, without any synchronization. The result is that the behavior of the program is undefined.
I'd just add a flag to the class:
std::atomic<bool> time_to_stop;
The constructor of action should set that to false, and the cancel member function should set it to true. Then change the loop to look at that value:
for(; !time_to_stop && k < *ii; ++k)
You might, instead, make ii atomic. That would work, but it wouldn't be as clear as having a named member to look at.
First off, there is no reason to make ii a pointer. You can have it as a plain uint64_t.
Secondly, if you have multiple threads and at least one of them writes to a shared variable, then you need some sort of synchronization. In this case you could just use std::atomic<uint64_t> to get that synchronization. Otherwise you would have to use a mutex or some sort of memory fence.
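Combining the two answers, a minimal sketch of the flag-based approach could look like this. It keeps the loop bound as a plain member, adds the std::atomic<bool> flag, and passes the action object to std::thread by reference so that cancel() reaches the same object the thread is running. The counted member and the run_and_cancel driver are assumptions added so the result can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

struct action {
    std::atomic<bool> time_to_stop{false};  // written by cancel(), read by the loop
    std::uint64_t limit;                    // plain value, no pointer needed
    std::uint64_t counted = 0;              // hypothetical: reports how far the loop got

    explicit action(std::uint64_t limit) : limit(limit) {}

    void operator()() {
        std::uint64_t k = 0;
        for (; !time_to_stop.load() && k < limit; ++k) {
            // do something useful
        }
        counted = k;
    }

    void cancel() { time_to_stop.store(true); }
};

// Hypothetical driver: runs the action, cancels it after `delay`, returns the count.
std::uint64_t run_and_cancel(std::uint64_t limit, std::chrono::milliseconds delay)
{
    assert(delay.count() >= 0);
    action a{limit};
    // std::ref avoids the copy that std::thread would otherwise make
    // (action is not copyable anyway, because of the atomic member).
    std::thread t(std::ref(a));
    std::this_thread::sleep_for(delay);
    a.cancel();
    t.join();  // after join, reading a.counted from this thread is safe
    return a.counted;
}
```

The atomic load in the loop condition is the only synchronization the loop needs. If that check turns out to be too costly in a hot loop, testing the flag only every N iterations is a common compromise, at the price of slightly slower cancellation.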