I'm trying to figure out multithreading and I'm pretty new to it. I'm using a thread_pool type that I found here. For sufficiently large N, the following code segfaults. Could you help me understand why, and how to fix it?
#include "thread_pool.hpp"
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

static std::mutex mtx;

void printString(const std::string &s) {
    std::lock_guard lock(mtx);
    std::hash<std::thread::id> tid{};
    auto id = tid(std::this_thread::get_id()) % 16;
    std::cout << "thread: " << id << " " << s << std::endl;
}

TEST(test, t) {
    thread_pool pool(16);
    int N = 1000000;
    std::vector<std::string> v(N);
    for (int i = 0; i < N; i++) {
        v[i] = std::to_string(i);
    }
    for (auto &s: v) {
        pool.push_task([&s]() {
            printString(s);
        });
    }
}
Here's the thread sanitizer output (note the ===> comments, where I point to the corresponding source line):
SEGV on unknown address 0x000117fbdee8 (pc 0x000102fa35b6 bp 0x7e8000186b50 sp 0x7e8000186b30 T257195)
0x102fa35b6 std::basic_string::__get_short_size const string:1514
0x102fa3321 std::basic_string::size const string:970
0x102f939e6 std::operator<<<…> ostream:1056
0x102f9380b printString RoadRunnerMapTests.cpp:37 // ==> this line: void printString(const std::string &s) {
0x102fabbd5 $_0::operator() const RoadRunnerMapTests.cpp:49 // ===> this line: v[i] = std::to_string(i);
0x102fabb3d (test_cxx_api_RoadRunnerMapTests:x86_64+0x10001eb3d) type_traits:3694
0x102fabaad std::__invoke_void_return_wrapper::__call<…> __functional_base:348
0x102faba5d std::__function::__alloc_func::operator() functional:1558
0x102fa9669 std::__function::__func::operator() functional:1732
0x102f9d383 std::__function::__value_func::operator() const functional:1885
0x102f9c055 std::function::operator() const functional:2560
0x102f9bc29 thread_pool::worker thread_pool.hpp:389 // ==> [this]( line
0x102fa00bc (test_cxx_api_RoadRunnerMapTests:x86_64+0x1000130bc) type_traits:3635
0x102f9ff1e std::__thread_execute<…> thread:286
0x102f9f005 std::__thread_proxy<…> thread:297
0x1033e9a2c __tsan_thread_start_func
0x7fff204828fb _pthread_start
0x7fff2047e442 thread_start

Destructors run in the reverse order of declaration, i.e. v is destroyed before pool. By the time some of the pool's threads call printString(), the string argument may no longer be a valid object, because v and its contents have already been destroyed. To resolve this, I'd recommend declaring v before pool.

The tasks passed to the thread pool hold references to the elements of vector v. However, v goes out of scope before pool does, leaving the tasks with dangling references. To fix this, reorder the variable declarations:
int N = 1000000;
std::vector<std::string> v(N);
thread_pool pool(16);


Threading returns unexpected result - c++

I'm learning about threads for homework, and I've tried to add threading to a simple program I've made. Without threading the program works perfectly, but when I run the two random number generator functions in threads, it returns incorrect results. The result always seems to be '42' for both number generators, and I'm not sure why.
Also for context, I'm just starting with threads so I understand this program doesn't need multithreading. I'm doing it just for learning purposes.
Thanks for any help!
// struct for vector to use
struct readings {
    std::string name;
    int data;
};

// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
    readings h = {"Heat", rand() % 100 + 1};
    storage.insert(storage.begin(), h);
}

// random generator for light value - stores in vector of struct
void gen_light(std::vector<readings>& storage) {
    readings l = {"Light", rand() % 100 + 1};
    storage.insert(storage.begin(), l);
}

int main() {
    // vector of readings struct
    std::vector<readings> storage;
    // initialising threads of random generators
    std::thread H(gen_heat, std::ref(storage));
    std::thread L(gen_light, std::ref(storage));
    // waiting for both to finish
    H.join();
    L.join();
    // print values in vec of struct
    for (const auto& e : storage) {
        std::cout << "Type: " << e.name << std::endl
                  << "Numbers: " << e.data << std::endl;
    }
    // send to another function
    return 0;
}
Since several threads access a shared resource, in this case the vector of readings, and some of them modify it, you need to make access to that resource exclusive. There are many ways to synchronize the access; one of them, simple enough and without using mutexes directly, is a binary semaphore (since C++20). You basically:
take ownership of the resource by acquiring the semaphore,
use the resource, and then
release the semaphore so others can access the resource.
If a thread A tries to acquire the semaphore while another thread B is using the resource, thread A blocks until the resource is freed.
Notice that the semaphore is initialized to 1, indicating that the resource is free. Once a thread acquires the semaphore, the count drops to 0, and no other thread can acquire it until the count goes back to 1 (which happens after a release).
#include <cstdlib> // rand
#include <functional> // ref
#include <iostream> // cout
#include <semaphore>
#include <string>
#include <thread>
#include <vector>

std::binary_semaphore readings_sem{1};

// struct for vector to use
struct readings {
    std::string name;
    int data;
};

// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
    for (auto i{0}; i < 5; ++i) {
        readings_sem.acquire();  // blocks while the other thread holds the resource
        readings h = {"Heat", rand() % 100 + 1};
        storage.insert(storage.begin(), h);
        readings_sem.release();  // lets the other thread in
    }
}

// random generator for light value - stores in vector of struct
void gen_light(std::vector<readings>& storage) {
    for (auto i{0}; i < 5; ++i) {
        readings_sem.acquire();
        readings l = {"Light", rand() % 100 + 1};
        storage.insert(storage.begin(), l);
        readings_sem.release();
    }
}

int main() {
    // vector of readings struct
    std::vector<readings> storage;
    // initialising threads of random generators
    std::thread H(gen_heat, std::ref(storage));
    std::thread L(gen_light, std::ref(storage));
    // waiting for both to finish
    H.join();
    L.join();
    // print values in vec of struct
    for (const auto& e : storage) {
        std::cout << "Type: " << e.name << std::endl
                  << "Numbers: " << e.data << std::endl;
    }
}

// Outputs (something like):
// Type: Heat
// Numbers: 5
// Type: Light
// Numbers: 83
// Type: Light
// Numbers: 40
// ...
[Update on Ben Voigt's comment]
The acquisition and release of the resource can be encapsulated by using RAII (Resource Acquisition Is Initialization), a mechanism which is already provided by the language. E.g.:
Both threads still try and acquire a mutex to get access to the vector of readings resource.
But they acquire it by just creating a lock guard.
Once the lock guard goes out of scope and is destroyed, the mutex is released.
#include <mutex> // mutex, lock_guard
std::mutex mtx{};

// random generator for heat value - stores in vector of struct
void gen_heat(std::vector<readings>& storage) {
    for (auto i{0}; i < 5; ++i) {
        std::lock_guard<std::mutex> lg{ mtx };  // released at the end of each iteration
        readings h = {"Heat", rand() % 100 + 1};
        storage.insert(storage.begin(), h);
    }
}

Segmentation Fault when assigning value to a pointer C++

When I run the following parallel code I get a segmentation fault at the assignment to v (between the two prints). I don't really understand what is causing it.
This is a minimal working example which describes the problem:
#include <iostream>
#include <memory>
#include <numeric>
#include <vector>
#include <thread>

struct Worker{
    std::vector<int>* v;

    void f(){
        std::vector<int> a(20);
        std::iota(a.begin(), a.end(), 1);
        auto b = new std::vector<int>(a);
        std::cout << "Test 1" << std::endl;
        v = b;
        std::cout << "Test 2" << std::endl;
    }
};

int main(int argc, char** argv) {
    int nw = 1;
    std::vector<std::thread> threads(nw);
    std::vector<std::unique_ptr<Worker>> W;

    for(int i = 0; i < nw; i++){
        W.push_back(std::make_unique<Worker>());
        threads[i] = std::thread([&]() { W[i]->f(); } );
        // Pinning threads to cores
        cpu_set_t cpuset;
        CPU_SET(i, &cpuset);
        pthread_setaffinity_np(threads[i].native_handle(), sizeof(cpu_set_t), &cpuset);
    }

    for (int i = 0; i < nw; i++) {
        threads[i].join();
        std::cout << (*(W[i]->v))[0] << std::endl;
    }
}
Compiled with -fsanitize=address, the code seems to work fine, but performance is worse. How can I make it work?
std::vector is not thread-safe. None of the containers in the C++ library are thread safe.
threads[i] = std::thread([&]() { W[i]->f(); } );
The new execution thread captures the vector by reference and accesses it.
The original execution thread continuously modifies the vector here, without synchronizing access to the W vector with any of the new execution threads. Any push_back may invalidate the existing contents of the vector in order to reallocate it, and if a different execution thread attempts to get W[i] at the same time, while it's being reallocated, hilarity ensues.
This is undefined behavior.
You must either synchronize access to the vector using a mutex, or make sure that the vector will never be reallocated, using any number of known techniques. A sufficiently-large reserve(), in advance, should do the trick.
Additionally, it's been pointed out that i is also captured by reference, so by the time each new execution thread starts, its value could be anything.
In addition to the vector synchronization problem mentioned by Sam, there is another problem.
This line:
threads[i] = std::thread([&]() { W[i]->f(); } );
captures i by reference. There is a good chance that i goes out of scope (and is destroyed) before the thread starts running. The statement W[i]->f(); is then likely to read an invalid value of i, one that is negative or too large. Note that before i goes out of scope, the last value written to it is nw, so even if the memory that previously contained i is still accessible, it's likely to hold the value nw, which is too large.
You could fix this problem by capturing i by value:
threads[i] = std::thread([&W, i]() { W[i]->f(); } );
// ^^^^^
// captures W by reference, and i by value
As noted by others, the capture is the problem.
I've added the i parameter to the f() call:
void f(int i){
std::vector<int> a(20);
std::iota(a.begin(), a.end(), 1);
auto b = new std::vector<int>(a);
std::cout << "Test 1 " << i << std::endl;
v = b;
std::cout << "Test 2 " << v->size() << std::endl;
and the output is: Test 1 1
The call to f itself goes through, but it is made without a valid Worker instance, so the assignment to v writes to an invalid memory location.

std::atomic_flag to stop multiple threads

I'm trying to stop multiple worker threads using a std::atomic_flag. Starting from the question "Issue using std::atomic_flag with worker thread", the following works:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
std::atomic_flag continueFlag;
std::thread t;
void work()
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work ";
void start()
t = std::thread(&work);
void stop()
int main()
std::cout << "Start" << std::endl;
std::cout << "Stop" << std::endl;
std::cout << "Stopped." << std::endl;
return 0;
Trying to rewrite into multiple worker threads:
#include <iostream>
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>
#include <memory>
struct thread_data {
std::atomic_flag continueFlag;
std::thread thread;
std::vector<thread_data> threads;
void work(int threadNum, std::atomic_flag &continueFlag)
while (continueFlag.test_and_set(std::memory_order_relaxed)) {
std::cout << "work" << threadNum << " ";
void start()
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
thread_data td;
td.thread = std::thread(&work, i, td.continueFlag);
void stop()
//Flag stop
for (auto &data : threads) {
for (auto &data : threads) {
int main()
std::cout << "Start" << std::endl;
std::cout << "Stop" << std::endl;
std::cout << "Stopped." << std::endl;
return 0;
My issue is the part marked "Problem Sector" above, namely creating the threads. I cannot wrap my head around how to instantiate the threads and pass the variables to the work thread.
The error right now is referencing this line threads.push_back(std::move(td)); with error Error C2280 'thread_data::thread_data(const thread_data &)': attempting to reference a deleted function.
Trying to use unique_ptr like this:
auto td = std::make_unique<thread_data>();
td->thread = std::thread(&work, i, td->continueFlag);
Gives error 'std::atomic_flag::atomic_flag(const std::atomic_flag &)': attempting to reference a deleted function at line td->thread = std::thread(&work, i, td->continueFlag);. Am I fundamentally misunderstanding the use of std::atomic_flag? Is it really both immovable and uncopyable?
Your first approach was actually closer to the truth. The problem is that it passed a reference to an object within the local for loop scope to each thread, as a parameter. But, of course, once the loop iteration ended, that object went out of scope and got destroyed, leaving each thread with a reference to a destroyed object, resulting in undefined behavior.
Nobody cared about the fact that you moved the object into the std::vector, after creating the thread. The thread received a reference to a locally-scoped object, and that's all it knew. End of story.
Moving the object into the vector first, and then passing to each thread a reference to the object in the std::vector will not work either. As soon as the vector internally reallocates, as part of its natural growth, you'll be in the same pickle.
What needs to happen is to have the entire threads array created first, before actually starting any std::threads. If the RAII principle is religiously followed, that means nothing more than a simple call to std::vector::resize().
Then, in a second loop, iterate over the fully-cooked threads array, and go and spawn off a std::thread for each element in the array.
I was almost there with my unique_ptr solution. I just needed to pass the flag wrapped in std::ref(), like so:
std::vector<std::unique_ptr<thread_data>> threads;
void start()
const unsigned int numThreads = 2;
for (int i = 0; i < numThreads; i++) {
auto td = std::make_unique<thread_data>();
td->thread = std::thread(&work, i, std::ref(td->continueFlag));
However, inspired by Sam's answer above, I also figured out a non-pointer way:
std::vector<thread_data> threads;
void start()
const unsigned int numThreads = 2;
//create new vector, resize doesn't work as it tries to assign/copy which atomic_flag
//does not support
threads = std::vector<thread_data>(numThreads);
for (int i = 0; i < numThreads; i++) {
auto& t = threads[i];
t.thread = std::thread(&work, i, std::ref(t.continueFlag));

Segmentation fault multithreading C++ 11

I have a vector entities containing 44 million names. I want to split it into 4 parts and process each part in parallel. Class Freebase contains the function loadData() which is used to split the vector and call function multiThread in order to do the processing.
loadEntities() reads a text file containing the names. I didn't include its implementation in the class because it's not important.
loadData() splits the vector entities, which was initialized in the constructor, into 4 parts and adds a thread for each part to the vector<thread> threads as follows:
threads.push_back(thread(&Freebase::multiThread, this, i, i + right, ref(data)));
multiThread is the function where I process the files
i and i+right are the indices used in the for loop of multithread to loop through entities
returnValues is a subfunction of multiThread and is used to call an external function.
cout <<"Entity " << entities[i] << endl; is showing the following results:
Entity m.0rzf6wv (ok)
Entity m.0rzf70 (ok)
Entity m.068s4h9 m.0n_k8bz (WRONG)
Entity Entity m.068s5_1 (WRONG)
The last 2 outputs are wrong. The output should be "Entity name", not "Entity Entity name" nor "Entity name name".
This is causing a segmentation fault when the input is being sent to function returnValues. How can I solve it?
Source Code
#ifndef FREEBASE_H
#define FREEBASE_H
class Freebase
Freebase(const std::string &, const std::string &, const std::string &, const std::string &);
void loadData();
std::string _serverURL;
std::string _entities;
std::string _xmlFile;
void multiThread(int,int, std::vector<std::pair<std::string, std::string>> &);
//private data members
std::vector<std::string> entities;
#include "Freebase.h"
#include "queries/SparqlQuery.h"
Freebase::Freebase(const string & url, const string & e, const string & xmlFile, const string & tfidfDatabase):_serverURL(url), _entities(e), _xmlFile(xmlFile), _tfidfDatabase(tfidfDatabase)
entities = loadEntities();
void Freebase::multiThread(int start, int end, vector<pair<string,string>> & data)
string basekb = "PREFIX basekb:<> ";
for(int i = start; i < end; i++)
cout <<"Entity " << entities[i] << endl;
vector<pair<string, string>> description = returnValues(basekb + "select ?description where {"+ entities[i] +" basekb:common.topic.description ?description. FILTER (lang(?description) = 'en') }");
string desc = "";
for(auto &d: description)
desc += d.first + " ";
data.push_back(make_pair(entities[i], desc));
void Freebase::loadData()
vector<pair<string, string>> data;
vector<thread> threads;
int Size = entities.size();
//split database into 4 parts
int p = 4;
int right = round((double)Size / (double)p);
int left = Size % p;
float totalduration = 0;
vector<pair<int, int>> coordinates;
int counter = 0;
for(int i = 0; i < Size; i += right)
if(i < Size - right)
threads.push_back(thread(&Freebase::multiThread, this, i, i + right, ref(data)));
threads.push_back(thread(&Freebase::multiThread, this, i, Size, ref(data)));
}//end outer for
for(auto &t : threads)
vector<pair<string, string>> Freebase::returnValues(const string & query)
vector<pair<string, string>> data;
SparqlQuery sparql(query, _serverURL);
string result = sparql.retrieveInformations();
istringstream str(result);
string line;
//skip first line
while(getline(str, line))
vector<string> values;
line.erase(remove( line.begin(), line.end(), '\"' ), line.end());
boost::split(values, line, boost::is_any_of("\t"));
if(values.size() == 2)
pair<string,string> fact = make_pair(values[0], values[1]);
data.push_back(make_pair(line, ""));
return data;
}//end function
Arnon Zilca is correct in his comments. You are writing to a single vector from multiple threads (in Freebase::multiThread()), a recipe for disaster. You can use a mutex as described below to protect the push_back operation.
For more info on thread safety on containers see Is std::vector or boost::vector thread safe?.
data.push_back(make_pair(entities[i], desc));
Another option is to use the same strategy as you do in returnValues: create a local vector in multiThread and only push its contents to the data vector when the thread is done processing.
void Freebase::multiThread(int start, int end, vector<pair<string,string>> & data)
vector<pair<string,string>> threadResults;
string basekb = "PREFIX basekb:<> ";
for(int i = start; i < end; i++)
cout <<"Entity " << entities[i] << endl;
vector<pair<string, string>> description = returnValues(basekb + "select ?description where {"+ entities[i] +" basekb:common.topic.description ?description. FILTER (lang(?description) = 'en') }");
string desc = "";
for(auto &d: description)
desc += d.first + " ";
threadResults.push_back(make_pair(entities[i], desc));
// merge once per thread; this single insert still has to be protected by a mutex
data.insert(data.end(), threadResults.begin(), threadResults.end());
Note: I would suggest using a different mutex than the one you use for cout. The overall result vector data is a different resource than cout, so threads that want to use cout should not have to wait for another thread to finish with data.
You could use a mutex around
cout <<"Entity " << entities[i] << endl;
That would prevent multiple threads using cout at "the same time". That way you can be sure that an entire message is printed by a thread before another thread gets to print a message. Note that this will impact your performance since threads will have to wait for the mutex to become available before they are allowed to print.
Note: Protecting cout will only clean up your output on the stream; it will not influence the behavior of the rest of the code (see above for that).
See the following example.
// mutex::lock/unlock
#include <iostream> // std::cout
#include <thread> // std::thread
#include <mutex> // std::mutex

std::mutex mtx; // mutex for critical section

void print_thread_id (int id) {
    // critical section (exclusive access to std::cout signaled by locking mtx):
    mtx.lock();
    std::cout << "thread #" << id << '\n';
    mtx.unlock();
}

int main ()
{
    std::thread threads[10];
    // spawn 10 threads:
    for (int i=0; i<10; ++i)
        threads[i] = std::thread(print_thread_id,i+1);

    for (auto& th : threads) th.join();

    return 0;
}

Accessing random number engine from multiple threads

This is my first question, so please forgive any violations of your policy. I want to have one global random number engine per thread, and to that end I've devised the following scheme: each thread I start gets a unique index from an atomic global int. There is a static vector of random engines, whose i-th member is meant to be used by the thread with index i. If the index is greater than the vector size, elements are added to it in a synchronized manner. To prevent performance penalties, I check twice whether the index is greater than the vector size: once in an unsynced manner, and once more after locking the mutex. So far so good, but the following example fails with all sorts of errors (heap corruption, malloc errors, etc.).
using std::cout;
std::atomic_uint INDEX_GEN{};
std::vector<std::mt19937> RNDS{};
float f = 0.0f;
std::mutex m{};
class TestAThread {
TestAThread() :thread(nullptr){
cout << "Calling constructor TestAThread\n";
thread = new std::thread(&TestAThread::run, this);
TestAThread(TestAThread&& source) : thread(source.thread){
source.thread = nullptr;
cout << "Calling move constructor TestAThread. My ptr is " << thread << ". Source ptr is" << source.thread << "\n";
TestAThread(const TestAThread& source) = delete;
~TestAThread() {
cout << "Calling destructor TestAThread. Pointer is " << thread << "\n";
if (thread != nullptr){
cout << "Deleting thread pointer\n";
delete thread;
thread = nullptr;
void run(){
int index = INDEX_GEN.fetch_add(1);
std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
while (true){
if (index >= RNDS.size()){
// add randoms in a synchronized manner.
while (index >= RNDS.size()){
cout << "index is " << index << ", size is " << RNDS.size() << std::endl;
f += uniformRnd(RNDS[index]);
std::thread* thread;
int main(int argc, char* argv[]){
std::vector<TestAThread> threads;
for (int i = 0; i < 10; ++i){
cout << f;
What am I doing wrong?!
Obviously f += ... would be a race-condition regardless of the right-hand side, but I suppose you already knew that.
The main problem that I see is your use of the global std::vector<std::mt19937> RNDS. Your mutex-protected critical section only encompasses adding new elements; not accessing existing elements:
... uniformRnd(RNDS[index]);
That's not thread-safe because resizing RNDS in another thread could cause RNDS[index] to be moved into a new memory location. In fact, this could happen after the reference RNDS[index] is computed but before uniformRnd gets around to using it, in which case what uniformRnd thinks is a Generator& will be a dangling pointer, possibly to a newly-created object. In any event, uniformRnd's operator() makes no guarantee about data races [Note 1], and neither does RNDS's operator[].
You could get around this problem by:
computing a reference (or pointer) to the generator within the protected section (which cannot be contingent on whether the container's size is sufficient), and
using a std::deque instead of a std::vector, which does not invalidate references when it is resized (unless the referenced object has been removed from the container by the resizing).
Something like this (focusing on the race condition; there are other things I'd probably do differently):
// assumes RNDS is now a std::deque<std::mt19937>
std::mt19937& get_generator(int index) {
    std::lock_guard<std::mutex> l(m);
    // grow only when index is not yet a valid position
    if (static_cast<std::size_t>(index) >= RNDS.size()) RNDS.resize(index + 1);
    return RNDS[index];
}

void run(){
    int index = INDEX_GEN.fetch_add(1);
    auto& gen = get_generator(index);
    std::uniform_real_distribution<float> uniformRnd{ 0.0f, 1.0f };
    while (true) {
        /* Do something with uniformRnd(gen); */
    }
}
[1] The prototype for operator() of uniformRnd is template< class Generator > result_type operator()( Generator& g );. In other words, the argument must be a mutable reference, which means that it is not implicitly thread-safe; only const& arguments to standard library functions are free of data races.