I am trying to modify some strings from threads (each thread would have its own string), but all strings are stored in a vector because I need to be able to access them after the threads have done their thing.
I haven't used threads in C++ before, so if this is a terrible thing to do, all suggestions are welcome :)
Basically the only thing the program does now is:
create some threads
send a string and an id to each thread
thread function modifies the string to add the id to it
end
This gives a segfault :(
Is this just a bad approach? How else could I do this?
static const int cores = 8;
void bmh_t(std::string & resr, int tid){
resr.append(std::to_string(tid));
resr.append(",");
return;
}
std::vector<std::string> parbmh(std::string text, std::string pat){
std::vector<std::string> distlists;
std::thread t[cores];
//Launch a group of threads
for (int i = 0; i < cores; ++i) {
distlists.push_back(" ");
t[i] = std::thread(bmh_t,std::ref(distlists[i]), i);
}
for (int i = 0; i < cores; ++i) {
t[i].join();
}
return distlists;
}
Your basic approach is fine. The main thing to consider when writing parallel code is that any data shared between threads must be accessed in a safe way. Because your algorithm uses a different string for each thread, it's a good approach.
The reason you're seeing a crash is that you're calling push_back on your vector of strings after you've already given earlier threads references to data stored within the vector. This is a problem because push_back has to grow the vector whenever its size reaches its capacity. That growth reallocates the storage and invalidates the references you've handed to the threads, causing them to write to freed memory.
The fix is very simple: just make sure ahead of time that your vector doesn't need to grow. This can be accomplished with a constructor argument specifying an initial number of elements; a call to reserve(); or a call to resize().
Here's an implementation that doesn't crash:
static const int cores = 8;
void bmh_t(std::string & resr, int tid){
resr.append(std::to_string(tid));
resr.append(",");
return;
}
std::vector<std::string> parbmh(){
std::vector<std::string> distlists;
std::thread t[cores];
distlists.reserve(cores);
//Launch a group of threads
for (int i = 0; i < cores; ++i) {
distlists.push_back(" ");
t[i] = std::thread(bmh_t, std::ref(distlists[i]), i);
}
for (int i = 0; i < cores; ++i) {
t[i].join();
}
return distlists;
}
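As a rough sketch of the constructor-argument variant mentioned above: if the vector gets its final size before any thread starts, no push_back happens while threads hold references. The names parbmh_presized and thread_count are made up for this illustration and are not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Same worker as in the answer: appends the thread id and a comma.
void bmh_t(std::string& resr, int tid) {
    resr.append(std::to_string(tid));
    resr.append(",");
}

// Hypothetical variant of parbmh: the vector is created with all of its
// elements up front, so it never reallocates while the threads are running.
std::vector<std::string> parbmh_presized(int thread_count) {
    assert(thread_count > 0);
    std::vector<std::string> distlists(static_cast<std::size_t>(thread_count),
                                       std::string(" "));
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(thread_count));
    for (int i = 0; i < thread_count; ++i) {
        // std::ref passes a reference to element i instead of a copy.
        workers.emplace_back(bmh_t, std::ref(distlists[static_cast<std::size_t>(i)]), i);
    }
    for (auto& w : workers) {
        w.join();  // all threads finish before the vector is returned
    }
    return distlists;
}
```

Calling parbmh_presized(8) should give eight strings, each a space followed by its own thread id and a comma.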
The vector of strings is being destructed before the threads can act on the contained strings. You'll want to join the threads before returning so that the vector of strings isn't destroyed.
Related
I'm trying to create a multithreaded part in my program, where a loop creates multiple threads that each get a vector of objects, some integers, and the vector which holds the results.
The problem is that I can't seem to wrap my head around how the thread part works. I tried different things, but they all end in the same three errors.
This is where I don't know how to proceed:
std::thread thread_superdiecreator;
for (int64_t i = 0; i < dicewithside.back().sides; i++) {
thread_superdiecreator(func_multicreator(dicewithside, i, amount, lastdiepossibilities, superdie));
}
term does not evaluate to a function taking 1 arguments
I tried this:
thread_superdiecreator(func_multicreator, dicewithside, i, amount, lastdiepossibilities, superdie);
call of an object of a class type without appropriate operator() or conversion functions to pointer-to-function type
And this:
std::thread thread_superdiecreator(func_multicreator, dicewithside, i, amount, lastdiepossibilities, superdie);
Invoke error in thread.
The whole code snippet:
#pragma once
#include <mutex>
#include <thread>
#include <algorithm>
#include "class_Diewithside.h"
#include "struct_Sortedinput.h"
#include "func_maximumpossibilities.h"
std::mutex superdielock;
void func_multicreator(std::vector<Diewithside> dicewithside, int64_t lastdieside, int64_t size, int64_t lastdiepossibilities, std::vector<int64_t> &superdie) {
// Set the last die side to number of the thread
dicewithside[size-1].dieside = lastdieside;
//
std::vector<int64_t> partsuperdie;
partsuperdie.reserve(lastdiepossibilities);
// Calculate all possible results of all dice thrown with the last one set
for (int64_t i = 0; i < lastdiepossibilities; i++) {
// Reset the result
int64_t result = 0;
for (int64_t j = 0; j < size; j++) {
result += dicewithside[j].alleyes[dicewithside[j].dieside];
}
partsuperdie.push_back(result);
//
for (int64_t j = 0; j < size - 1; j++) {
if (dicewithside[j].dieside == dicewithside[j].sides - 1) {
dicewithside[j].dieside = 0;
}
else {
dicewithside[j].dieside += 1;
break;
}
}
}
superdielock.lock();
for (int64_t i = 0; i < lastdiepossibilities; i++) {
superdie.push_back(partsuperdie[i]);
}
superdielock.unlock();
}
// The function superdie creates an array that holds all possible outcomes of the dice thrown
std::vector<int64_t> func_superdiecreator(sortedinput varsortedinput) {
// Get the size of the diceset vector and create a new vector out of class Diewithside
int64_t size = varsortedinput.dicesets.size();
std::vector<Diewithside> dicewithside;
// Initialize the integer amount and iterate through all the amounts of vector dicesets adding them together to set the vector dicewithside reserve
int64_t amount = 0;
for (int64_t i = 0; i < size; i++) {
amount += varsortedinput.dicesets[i].amount;
}
dicewithside.reserve(amount);
// Fill the new vector dicewithside with each single die and add the starting value of 0
for (int64_t i = 0; i < size; i++) {
for (int64_t j = 0; j < varsortedinput.dicesets[i].amount; j++) {
dicewithside.push_back(Diewithside{varsortedinput.dicesets[i].plusorminus, varsortedinput.dicesets[i].sides, varsortedinput.dicesets[i].alleyes, 0});
}
}
// Get the maximum possibilities and divide by sides of the last die to get the amount of iterations each thread has to run
int64_t maximumpossibilities = func_maximumpossibilities(varsortedinput.dicesets, size);
int64_t lastdiepossibilities = maximumpossibilities / dicewithside[amount-1].sides;
// Multithread calculate all possibilities and save them in array
std::vector<int64_t> superdie;
superdie.reserve(maximumpossibilities);
std::thread thread_superdiecreator;
for (int64_t i = 0; i < dicewithside.back().sides; i++) {
thread_superdiecreator(func_multicreator(dicewithside, i, amount, lastdiepossibilities, superdie));
}
thread_superdiecreator.join();
return superdie;
}
Thanks for any help!
You indeed need to create the thread using the third alternative mentioned in the question, i.e. use the constructor of std::thread to start the thread.
The issue with this approach is that the last parameter of func_multicreator is an lvalue reference: std::thread creates copies of its arguments and passes those copies as rvalues when calling the function on the new thread, and an rvalue cannot bind to a non-const lvalue reference. You need to use std::reference_wrapper (for example via std::ref) here to be able to "pass" an lvalue reference to the thread.
You should join every thread you create, so you need a collection of threads.
Simplified example:
(The interesting stuff is between the ---... comments.)
struct Diewithside
{
int64_t sides;
};
void func_multicreator(std::vector<Diewithside> dicewithside, int64_t lastdieside, int64_t size, int64_t lastdiepossibilities, std::vector<int64_t>& superdie)
{
}
std::vector<int64_t> func_superdiecreator() {
std::vector<Diewithside> dicewithside;
// Initialize the integer amount and iterate through all the amounts of vector dicesets adding them together to set the vector dicewithside reserve
int64_t amount = 0;
int64_t lastdiepossibilities = 0;
std::vector<int64_t> superdie;
// -----------------------------------------------
std::vector<std::thread> threads;
for (int64_t i = 0; i < dicewithside.back().sides; i++) {
// create thread using constructor std::thread(func_multicreator, dicewithside, i, amount, lastdiepossibilities, std::reference_wrapper(superdie));
threads.emplace_back(func_multicreator, dicewithside, i, amount, lastdiepossibilities, std::reference_wrapper(superdie));
}
for (auto& t : threads)
{
t.join();
}
// -----------------------------------------------
return superdie;
}
std::thread thread_superdiecreator;
A single std::thread object always represents a single execution thread. You seem to be trying to use this single object to represent multiple execution threads. No matter what you try, it won't work. You need multiple std::thread objects, one for each execution thread.
thread_superdiecreator(func_multicreator, dicewithside, i, amount, lastdiepossibilities, superdie);
An actual execution thread gets created by constructing a new std::thread object, and not by invoking it as a function.
Constructing an execution thread object corresponds to the creation of a new execution thread, it's just that simple. And the simplest way to have multiple execution threads is to have a vector of them.
std::vector<std::thread> all_execution_threads;
With that in place, creating a new execution thread involves nothing more than constructing a new std::thread object and moving it into the vector. Or, better yet, emplace it directly:
all_execution_threads.emplace_back(
func_multicreator, dicewithside, i,
amount, lastdiepossibilities, superdie
);
This presumes that everything else is correct: that func_multicreator accepts these arguments; that none of them are passed by reference (you need to fix this at the very least, since your attempt to pass a plain reference into a thread function will not work); that no dangling references are left behind; that all access to shared objects by multiple execution threads is correctly synchronized with mutexes; and that all the other usual pitfalls of working with multiple execution threads are avoided. But this covers the basics of creating some unspecified number of concurrent execution threads. When all is said and done, you end up with a std::vector of std::threads, one for each actual execution thread.
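To make those caveats concrete, here is a minimal sketch, not the asker's actual code: a vector of threads, a shared result vector passed with std::ref, and a mutex held only while appending. The names append_block, run_blocks and results_lock are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

std::mutex results_lock;  // hypothetical name; guards the shared results vector

// Hypothetical worker: computes its values into a private vector first,
// then appends them to the shared vector while holding the mutex.
void append_block(std::int64_t block, std::int64_t per_block,
                  std::vector<std::int64_t>& results) {
    std::vector<std::int64_t> local;
    local.reserve(static_cast<std::size_t>(per_block));
    for (std::int64_t i = 0; i < per_block; ++i) {
        local.push_back(block * per_block + i);
    }
    std::lock_guard<std::mutex> guard(results_lock);  // unlocked on scope exit
    results.insert(results.end(), local.begin(), local.end());
}

std::vector<std::int64_t> run_blocks(std::int64_t blocks, std::int64_t per_block) {
    assert(blocks >= 0 && per_block >= 0);
    std::vector<std::int64_t> results;
    std::vector<std::thread> threads;
    for (std::int64_t b = 0; b < blocks; ++b) {
        // One std::thread object per execution thread; std::ref lets the
        // worker see the caller's vector rather than a copy.
        threads.emplace_back(append_block, b, per_block, std::ref(results));
    }
    for (auto& t : threads) {
        t.join();  // results must outlive every thread that references it
    }
    return results;
}
```

The order of the blocks inside the returned vector depends on which thread gets the mutex first, so only the contents, not the ordering, should be relied on.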
I have read various articles on C++ threading, among others the GeeksForGeeks article. I have also read this question, but none of these has an answer for my need. In my project (which is too complex to describe here), I would need something along these lines:
#include <iostream>
#include <thread>
using namespace std;
class Simulate{
public:
int Numbers[100][100];
thread Threads[100][100];
// Method to be passed to thread - in the same way as function pointer?
void DoOperation(int i, int j) {
Numbers[i][j] = i + j;
}
// Method to start the thread from
void Update(){
// Start executing threads
for (int i = 0; i < 100; i++) {
for (int j = 0; j < 100; j++) {
Threads[i][j] = thread(DoOperation, i, j);
}
}
// Wait till all of the threads finish
for (int i = 0; i < 100; i++) {
for (int j = 0; j < 100; j++) {
if (Threads[i][j].joinable()) {
Threads[i][j].join();
}
}
}
}
};
int main()
{
Simulate sim;
sim.Update();
}
How can I do this, please? Any help is appreciated, and alternative solutions are welcome. I am a mathematician by training and have been learning C++ for less than a week, so simplicity is preferred. I desperately need something along these lines to make my research simulations faster.
The easiest way to call member functions and pass arguments is to use a lambda expression:
Threads[i][j] = std::thread([this, i, j](){ this->DoOperation(i, j); });
The variables listed in [] are captured, and their values can be used by the code inside {}. The lambda itself has a unique anonymous type; the std::thread constructor is a template that accepts any callable object, so the lambda can be passed to it directly.
However, starting 100x100 = 10000 threads is likely to exhaust memory or hit thread limits on many systems. Adding more threads than there are CPU cores does not improve performance for computational tasks. Instead, it is a better idea to start e.g. 10 threads that each process 1000 items in a loop.
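A sketch of that idea, assuming the work is split by rows: each thread gets a contiguous band of rows and loops over it. The thread_count parameter and the constant N are additions for this illustration; the class otherwise follows the question.

```cpp
#include <cassert>
#include <thread>
#include <vector>

class Simulate {
public:
    static constexpr int N = 100;  // grid size, taken from the question
    int Numbers[N][N] = {};

    void DoOperation(int i, int j) { Numbers[i][j] = i + j; }

    // Hypothetical variant of Update(): a handful of threads, each looping
    // over its own band of rows, instead of one thread per cell.
    void Update(int thread_count) {
        assert(thread_count > 0);
        std::vector<std::thread> workers;
        const int rows_per_thread = (N + thread_count - 1) / thread_count;
        for (int first = 0; first < N; first += rows_per_thread) {
            const int last =
                (first + rows_per_thread < N) ? first + rows_per_thread : N;
            // The lambda captures this and the row range by value.
            workers.emplace_back([this, first, last]() {
                for (int i = first; i < last; ++i) {
                    for (int j = 0; j < N; ++j) {
                        DoOperation(i, j);
                    }
                }
            });
        }
        // Wait until every band has been processed.
        for (auto& w : workers) {
            w.join();
        }
    }
};
```

No locking is needed here because no two threads ever write to the same row. A reasonable starting value for thread_count is std::thread::hardware_concurrency(), keeping in mind that it may return 0 when the value cannot be determined.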
First of all, I think it is important to say that I am new to multithreading and know very little about it. I was trying to write some programs in C++ using threads and ran into a problem (question) that I will try to explain to you now:
I wanted to use several threads to fill an array, here is my code:
static const int num_threads = 5;
int A[50], n;
//------------------------------------------------------------
void ThreadFunc(int tid)
{
for (int q = 0; q < 5; q++)
{
A[n] = tid;
n++;
}
}
//------------------------------------------------------------
int main()
{
thread t[num_threads];
n = 0;
for (int i = 0; i < num_threads; i++)
{
t[i] = thread(ThreadFunc, i);
}
for (int i = 0; i < num_threads; i++)
{
t[i].join();
}
for (int i = 0; i < n; i++)
cout << A[i] << endl;
return 0;
}
As a result of this program I get:
0
0
0
0
0
1
1
1
1
1
2
2
2
2
2
and so on.
As I understand, the second thread starts writing elements to an array only when the first thread finishes writing all elements to an array.
The question is: why don't the threads work concurrently? I mean, why don't I get something like this:
0
1
2
0
3
1
4
and so on.
Is there any way to solve this problem?
Thank you in advance.
Since n is accessed from more than one thread, those accesses need to be synchronized so that changes made in one thread don't conflict with changes made in another. There are (at least) two ways to do this.
First, you can make n an atomic variable. Just change its definition, and do the increment where the value is used:
std::atomic<int> n;
...
A[n++] = tid;
Or you can wrap all the accesses inside a critical section:
std::mutex mtx;
int next_n() {
std::unique_lock<std::mutex> lock(mtx);
return n++;
}
And in each thread, instead of directly incrementing n, call that function:
A[next_n()] = tid;
This is much slower than the atomic access, so not appropriate here. In more complex situations it will be the right solution.
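For reference, a self-contained sketch of the atomic variant. The function name fill_with_ids and its parameters are invented here, and a std::vector stands in for the global array from the question.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: thread_count threads each claim per_thread slots of a
// shared vector through one atomic counter and write their own id there.
std::vector<int> fill_with_ids(int thread_count, int per_thread) {
    assert(thread_count > 0 && per_thread > 0);
    std::vector<int> slots(
        static_cast<std::size_t>(thread_count) * static_cast<std::size_t>(per_thread), -1);
    std::atomic<int> next{0};
    std::vector<std::thread> workers;
    for (int tid = 0; tid < thread_count; ++tid) {
        workers.emplace_back([&slots, &next, tid, per_thread]() {
            for (int q = 0; q < per_thread; ++q) {
                // fetch_add returns the previous value, so every index is
                // handed out exactly once, whichever thread asks for it.
                const int index = next.fetch_add(1);
                slots[static_cast<std::size_t>(index)] = tid;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return slots;
}
```

Every id ends up in the result exactly per_thread times, but which positions a given thread gets is up to the scheduler, so the output may or may not look interleaved on any particular run.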
The worker function is so short, i.e., finishes executing so quickly, that it's possible that each thread is completing before the next one even starts. Also, you may need to link with a thread library to get real threads, e.g., -lpthread. Even with that, the results you're getting are purely by chance and could appear in any order.
There are two corrections you need to make for your program to be properly synchronized. Change:
int n;
// ...
A[n] = tid; n++;
to
std::atomic_int n;
// ...
A[n++] = tid;
Often it's preferable to avoid synchronization issues altogether and split the workload across threads. Since the work done per iteration is the same here, it's as easy as dividing the work evenly:
void ThreadFunc(int tid, int first, int last)
{
for (int i = first; i < last; i++)
A[i] = tid;
}
Inside main, modify the thread create loop:
for (int first = 0, i = 0; i < num_threads; i++) {
// num_threads may not evenly divide the array size, so the last thread takes the remainder.
int last = (i != num_threads-1) ? std::size(A)/num_threads*(i+1) : std::size(A);
t[i] = thread(ThreadFunc, i, first, last);
first = last;
}
Of course by doing this, even though the array may be written out of order, the values will be stored to the same locations every time.
I have a program which reads a file line by line and then stores each possible substring of length 50 in a hash table along with its frequency. I tried to use threads in my program so that it reads 5 lines and then uses five different threads to do the processing. The processing involves reading each substring of that line and putting it into the hash map with its frequency. But it seems there is something wrong which I could not figure out, because the program is not faster than the serial approach. Also, for a large input file it is aborted. Here is the piece of code I am using:
unordered_map<string, int> m;
mutex mtx;
void parseLine(char *line, int subLen){
int no_substr = strlen(line) - subLen;
for(int i = 0; i <= no_substr; i++) {
char *subStr = (char*) malloc(sizeof(char)* subLen + 1);
strncpy(subStr, line+i, subLen);
subStr[subLen]='\0';
mtx.lock();
string s(subStr);
if(m.find(s) != m.end()) m[s]++;
else {
pair<string, int> ret(s, 1);
m.insert(ret);
}
mtx.unlock();
}
}
int main(){
char **Array = (char **) malloc(sizeof(char *) * num_thread +1);
int num = 0;
while (NOT END OF FILE) {
if(num < num_th) {
if(num == 0)
for(int x = 0; x < num_th; x++)
Array[x] = (char*) malloc(sizeof(char)*strlen(line)+1);
strcpy(Array[num], line);
num++;
}
else {
vector<thread> threads;
for(int i = 0; i < num_th; i++) {
threads.push_back(thread(parseLine, Array[i], 50));
}
for(int i = 0; i < num_th; i++){
if(threads[i].joinable()) {
threads[i].join();
}
}
for(int x = 0; x < num_th; x++) free(Array[x]);
num = 0;
}
}
}
It's a myth that just by the virtue of using threads, the end result must be faster. In general, in order to take advantage of multithreading, two conditions must be met(*):
1) You actually have to have sufficient physical CPU cores, that can run the threads at the same time.
2) The threads have independent tasks to do, that they can do on their own.
From a cursory examination of the shown code, it seems to fail on the second part. It seems to me that, most of the time all of these threads will be fighting each other in order to acquire the same mutex. There's little to be gained from multithreading, in this situation.
(*) Of course, you don't always use threads purely for performance reasons. Multithreading is also useful in many other situations. For example, in a program with a GUI, having a separate thread update the GUI keeps the UI responsive even while the main execution thread is chewing on something for a while.
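One way to give the threads independent work, sketched here under the assumption that the lines are already in memory: each thread counts into its own private map, and the maps are merged once the threads have joined, so no mutex is needed during counting. The names count_line, count_substrings and Counts are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

using Counts = std::unordered_map<std::string, int>;

// Counts every substring of length sub_len of one line into a private map.
void count_line(const std::string& line, std::size_t sub_len, Counts& local) {
    if (line.size() < sub_len) return;
    for (std::size_t i = 0; i + sub_len <= line.size(); ++i) {
        ++local[line.substr(i, sub_len)];  // missing keys start at 0
    }
}

// Hypothetical driver: one thread and one private map per line, merged on
// the calling thread after all workers have joined.
Counts count_substrings(const std::vector<std::string>& lines, std::size_t sub_len) {
    assert(sub_len > 0);
    std::vector<Counts> partial(lines.size());  // sized up front, never resized
    std::vector<std::thread> workers;
    for (std::size_t k = 0; k < lines.size(); ++k) {
        workers.emplace_back(count_line, std::cref(lines[k]), sub_len,
                             std::ref(partial[k]));
    }
    for (auto& w : workers) {
        w.join();
    }
    Counts total;
    for (const auto& part : partial) {
        for (const auto& entry : part) {
            total[entry.first] += entry.second;
        }
    }
    return total;
}
```

For a real input file it would probably make more sense to hand each thread a large batch of lines rather than a single line, since thread creation and the final merge both have a cost.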
I'm having trouble coming up with an algorithm to ensure mutual exclusion while threads are accessing a shared global variable. I'm trying to write a threaded function that can use a global variable, instead of switching it to a local variable.
I have this code so far:
int sumGlobal = 0;
void sumArray(int** array, int size){
for (int i=0; i<size; i++){
for (int j=0; j<size; j++){
sumGlobal += array[i][j];
}
}
}
int main(){
int size = 4000;
int numThreads = 4;
int** a2DArray = new int*[size];
for (int i=0; i<size; i++){
a2DArray[i] = new int[size];
for (int j=0; j<size; j++){
a2DArray[i][j] = genRandNum(0,100);
}
}
std::vector<std::future<void>> thread_Pool;
for (int i = 0; i < numThreads; ++i) {
thread_Pool.push_back( std::async(launch::async,
sumArray, a2DArray, size));
}
}
I'm unsure of how to guarantee that sumGlobal is not rewritten with every thread. I want to update it correctly, so that each thread adds its value to the global variable when it's finished. I'm just trying to learn threading, and not be restricted to non-void functions.
Make the variable atomic:
#include <atomic>
...
std::atomic<int> sumGlobal {0};
An atomic variable is exempt from data races: it behaves well even when several threads are trying to read and write it. Whether the atomicity is implemented through mutual exclusion or in a lock-free manner is left to the implementation. Because you use += to update the variable atomically, there is no risk of inconsistencies in your example.
This nice video explains in much more detail what atomics are, why they are needed, and how they work.
You could also keep your variable as it is and use a mutex/lock_guard to protect it, as explained by @Miles Budnek. The problem is that only one thread at a time can execute code protected by the mutex. In your example, this would mean that the processing in the different threads would not really run concurrently. The atomic approach should have superior performance: one thread may still compute indexes and read the array while another is updating the global variable.
If you don't want to use a synchronized object like std::atomic<int> as @Christophe suggests, you can use std::mutex and std::lock_guard to manually synchronize access to your sum.
int sumGlobal = 0;
std::mutex sumMutex;
void sumArray(...) {
for(...) {
for(...) {
std::lock_guard<std::mutex> lock(sumMutex);
sumGlobal += ...;
}
}
}
Keep in mind that all that locking and unlocking will incur quite a bit of overhead.
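One common way to cut that overhead, shown here only as a sketch: let each thread add into a local variable and touch the shared sum once at the end. This version splits the rows between threads (an assumption; in the question every task sums the whole array) and uses std::thread with nested vectors instead of std::async and int**. The names sumRows and sumArrayParallel are made up.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

std::atomic<long long> sumGlobal{0};

// Sums rows [first, last) into a local variable, then publishes the result
// to the global with a single atomic update.
void sumRows(const std::vector<std::vector<int>>& array,
             std::size_t first, std::size_t last) {
    long long local = 0;
    for (std::size_t i = first; i < last; ++i) {
        for (int value : array[i]) {
            local += value;
        }
    }
    sumGlobal += local;  // one synchronized operation per thread
}

// Hypothetical driver: splits the rows evenly across thread_count threads.
long long sumArrayParallel(const std::vector<std::vector<int>>& array,
                           std::size_t thread_count) {
    assert(thread_count > 0);
    sumGlobal = 0;
    const std::size_t rows = array.size();
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < thread_count; ++t) {
        const std::size_t first = rows * t / thread_count;
        const std::size_t last = rows * (t + 1) / thread_count;
        workers.emplace_back(sumRows, std::cref(array), first, last);
    }
    for (auto& w : workers) {
        w.join();
    }
    return sumGlobal.load();
}
```

The same pattern works with the mutex version: take the lock_guard once after the loops rather than inside the innermost loop.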