right way to write to file from different threads (using Qt c++ )

I have a thread pool that creates threads. Each worker thread does some calculation and, when it is done, writes the result to a file. There is only one result file, and every worker thread needs to write to it.
My question is: how can I guarantee that there won't be any locking problems or lost writes when a lot of threads try to write to a single file? What is the right strategy for such a scenario?
Maybe keep all the results in memory, or chunks of results?
I am already using the QThreadPool framework, and I need to find a solution that works with it.
I also wonder: to write to a single file from worker threads, will I have to use a singleton file manager or a static class? Is that a good idea for a multithreaded app?

So you have many concurrent threads competing for one shared resource. That begs a synchronization primitive of some sort, for example a mutex.
Here's (non-Qt specific) code showcasing 10 threads simultaneously writing to a single file. On a side note, C++11 introduced a lot of goodies like std::mutex (and std::thread too, so that can help eliminate some Qt-specific threading code).
#include <fstream>
#include <mutex>
#include <thread>
#include <vector>
std::mutex m;
std::ofstream file;
int main() {
file.open("file.txt");
std::vector<std::thread> workers;
for (int i = 0; i < 10; ++i) {
workers.push_back(std::thread([i] { // capture i by value
for (int j = 0; j < 10; ++j) {
std::lock_guard<std::mutex> lock(m);
file << "thread " << i << ": " << j << std::endl;
}
}));
}
for (auto& worker : workers) {
worker.join();
}
file.close();
return 0;
}
Of course, if you have a lot of places in your code accessing the shared resource, it's better to encapsulate it, along with the mutex, in some "accessor" class that would manage all state-modifying calls to the resource.
P.S. If you're not on C++11 compiler, you can use boost::mutex or Qt-specific mutex wrapper. The key thing is that you need some synchronization primitive associated with the shared resource.
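The "accessor" idea mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SynchronizedWriter` and `demo_line_count` are hypothetical names, and the demo writes to a `std::ostringstream` as a stand-in so that it does not touch the file system.

```cpp
#include <algorithm>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical accessor: bundles the output stream with the mutex guarding it.
class SynchronizedWriter {
public:
    explicit SynchronizedWriter(std::ostream& out) : out_(out) {}

    // Only one thread at a time can be inside this function.
    void writeLine(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);
        out_ << line << '\n';
    }

private:
    std::ostream& out_;
    std::mutex mutex_;
};

// Hypothetical demo: 4 workers write 25 lines each; returns the number of
// complete lines that ended up in the stream.
int demo_line_count() {
    std::ostringstream sink;  // stand-in for a std::ofstream
    {
        SynchronizedWriter writer(sink);
        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i) {
            workers.emplace_back([&writer, i] {
                for (int j = 0; j < 25; ++j) {
                    writer.writeLine("thread " + std::to_string(i) + ": " + std::to_string(j));
                }
            });
        }
        for (auto& worker : workers) worker.join();
    }
    const std::string text = sink.str();
    return static_cast<int>(std::count(text.begin(), text.end(), '\n'));
}
```

For the file case, a `std::ofstream` opened on the result file can be passed to the constructor instead of the string stream. Because the mutex lives next to the stream, callers cannot forget to lock it.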

I assume you create your own Runnable, deriving from QRunnable.
You can pass some context information when constructing your Runnable class. For example, you need to pass the device to write to and the mutex that guards the device.
#include <QIODevice>
#include <QMutex>
#include <QMutexLocker>
#include <QRunnable>

class Runnable : public QRunnable
{
public:
    Runnable(QIODevice* device, QMutex* mutex) : _d(device), _m(mutex) {}
    void run() override
    {
        saveInFile();
    }
private:
    void saveInFile()
    {
        QMutexLocker lock(_m);
        // now only one thread at a time can write to the file
        _d->write(...);
    }
    QIODevice* _d;
    QMutex* _m;
};

Related

Two questions on std::condition_variables

I have been trying to figure out std::condition_variables and I am particularly confused by wait() and whether to use notify_all or notify_one.
First, I've written some code and attached it below. Here's a short explanation: Collection is a class that holds onto a bunch of Counter objects. These Counter objects have a Counter::increment() method, which needs to be called on all the objects, over and over again. To speed everything up, Collection also maintains a thread pool to distribute the work over, and sends out all the work with its Collection::increment_all() method.
These threads don't need to communicate with each other, and there are usually many more Counter objects than there are threads. It's fine if one thread processes more Counters than others, just as long as all the work gets done. Adding work to the queue is easy and only needs to be done in the "main" thread. As far as I can see, the only bad thing that can happen is if other methods (e.g. Collection::printCounts) are allowed to be called on the counters in the middle of the work being done.
#include <iostream>
#include <thread>
#include <vector>
#include <mutex>
#include <condition_variable>
#include <queue>
class Counter{
private:
int m_count;
public:
Counter() : m_count(0) {}
void increment() {
m_count ++;
}
int getCount() const { return m_count; }
};
class Collection{
public:
Collection(unsigned num_threads, unsigned num_counters)
: m_shutdown(false)
{
// start workers
for(size_t i = 0; i < num_threads; ++i){
m_threads.push_back(std::thread(&Collection::work, this));
}
// instantiate counters
for(size_t j = 0; j < num_counters; ++j){
m_counters.emplace_back();
}
}
~Collection()
{
m_shutdown = true;
for(auto& t : m_threads){
if(t.joinable()){
t.join();
}
}
}
void printCounts() {
// wait for work to be done
std::unique_lock<std::mutex> lk(m_mtx);
m_work_complete.wait(lk); // q2: do I need a while loop?
// print all current counters
for(const auto& cntr : m_counters){
std::cout << cntr.getCount() << ", ";
}
std::cout << "\n";
}
void increment_all()
{
std::unique_lock<std::mutex> lock(m_mtx);
m_work_complete.wait(lock);
for(size_t i = 0; i < m_counters.size(); ++i){
m_which_counters_have_work.push(i);
}
}
private:
void work()
{
while(!m_shutdown){
bool action = false;
unsigned which_counter;
{
std::unique_lock<std::mutex> lock(m_mtx);
if(m_which_counters_have_work.size()){
which_counter = m_which_counters_have_work.front();
m_which_counters_have_work.pop();
action = true;
}else{
m_work_complete.notify_one(); // q1: notify_all
}
}
if(action){
m_counters[which_counter].increment();
}
}
}
std::vector<Counter> m_counters;
std::vector<std::thread> m_threads;
std::condition_variable m_work_complete;
std::mutex m_mtx;
std::queue<unsigned> m_which_counters_have_work;
bool m_shutdown;
};
int main() {
int num_threads = std::thread::hardware_concurrency()-1;
int num_counters = 10;
Collection myCollection(num_threads, num_counters);
myCollection.printCounts();
myCollection.increment_all();
myCollection.printCounts();
myCollection.increment_all();
myCollection.printCounts();
return 0;
}
I compile this on Ubuntu 18.04 with g++ -std=c++17 -pthread thread_pool.cpp -o tp && ./tp. I think the code accomplishes all of those objectives, but a few questions remain:
I am using m_work_complete.wait(lk) to make sure the work is finished before I start printing all the new counts. Why do I sometimes see this written inside a while loop, or with a second argument as a lambda predicate function? These docs mention spurious wake ups. If a spurious wake up occurs, does that mean printCounts could prematurely print? If so, I don't want that. I just want to ensure the work queue is empty before I start using the numbers that should be there.
I am using m_work_complete.notify_all instead of m_work_complete.notify_one. I've read this thread, and I don't think it matters--only the main thread is going to be blocked by this. Is it faster to use notify_one just so the other threads don't have to worry about it?

std::condition_variable is not really a condition variable, it's more of a synchronization tool for reaching a certain condition. What that condition is is up to the programmer, and it should still be checked after each condition_variable wake-up, since it can wake-up spuriously, or "too early", when the desired condition isn't yet reached.
On POSIX systems, condition_variable::wait() delegates to pthread_cond_wait, which is susceptible to spurious wake-up (see "Condition Wait Semantics" in the Rationale section). On Linux, pthread_cond_wait is in turn implemented via a futex, which is again susceptible to spurious wake-up.
So yes you still need a flag (protected by the same mutex) or some other way to check that the work is actually complete. A convenient way to do this is by wrapping the check in a predicate and passing it to the wait() function, which would loop for you until the predicate is satisfied.
notify_all unblocks all threads waiting on the condition variable; notify_one unblocks just one (or at least one, to be precise). If there is more than one waiting thread, the waiters are equivalent (i.e. any one of them can handle the condition fully), and the condition is sufficient to let just one thread continue (as when submitting a work unit to a thread pool), then notify_one is more efficient: it won't unblock other threads unnecessarily, only for them to notice there is no work to be done and go back to waiting. If you only ever have one waiter, there is no difference between notify_one and notify_all.
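The predicate form of wait() described above might look like the following minimal sketch. `WorkState` and `wait_for_result` are hypothetical names, and the "work" is just a placeholder multiplication.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical shared state: `done` is the actual condition, protected by mtx.
struct WorkState {
    std::mutex mtx;
    std::condition_variable cv;
    bool done = false;
    int result = 0;
};

// Starts one worker and blocks until it has published its result.
int wait_for_result() {
    WorkState state;
    std::thread worker([&state] {
        int value = 6 * 7;  // stand-in for real work
        {
            std::lock_guard<std::mutex> lock(state.mtx);
            state.result = value;
            state.done = true;  // set the flag under the same mutex
        }
        state.cv.notify_one();  // there is only one waiter here
    });

    int result = 0;
    {
        std::unique_lock<std::mutex> lock(state.mtx);
        // Equivalent to: while (!state.done) state.cv.wait(lock);
        // A spurious wake-up just re-checks the flag and waits again.
        state.cv.wait(lock, [&state] { return state.done; });
        result = state.result;
    }
    worker.join();
    return result;
}
```

Because the flag is checked under the mutex, it also does not matter whether the worker finishes before or after the main thread starts waiting.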

It's pretty simple. Use notify_one() when:
There is no reason why more than one thread needs to know about the event (e.g., use notify_one() to announce the availability of an item that a worker thread will "consume," thereby making the item unavailable to other workers), AND
There is no wrong thread that could be awakened (e.g., you're probably safe if all of the threads are wait()ing on the same line of the same exact function).
Use notify_all() in all other cases.
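As a rough illustration of both rules, the sketch below uses notify_one() for each queued item (any single worker can consume it) and notify_all() for shutdown (every worker has to notice it). The function name `process_items` and the summing workload are assumptions made for the example.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical helper: sums the integers 1..item_count using num_workers threads.
int process_items(int num_workers, int item_count) {
    std::mutex mtx;
    std::condition_variable cv;
    std::queue<int> items;   // protected by mtx
    bool shutdown = false;   // protected by mtx
    int total = 0;           // protected by mtx

    std::vector<std::thread> workers;
    for (int w = 0; w < num_workers; ++w) {
        workers.emplace_back([&] {
            for (;;) {
                std::unique_lock<std::mutex> lock(mtx);
                cv.wait(lock, [&] { return shutdown || !items.empty(); });
                if (items.empty()) return;  // shutdown requested, nothing left
                total += items.front();
                items.pop();
            }
        });
    }

    for (int i = 1; i <= item_count; ++i) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            items.push(i);
        }
        cv.notify_one();  // one item, so waking one worker is enough
    }
    {
        std::lock_guard<std::mutex> lock(mtx);
        shutdown = true;
    }
    cv.notify_all();  // every worker must see the shutdown flag

    for (auto& t : workers) t.join();
    return total;
}
```

Using notify_one() for the shutdown would leave the remaining workers asleep, so the joins would never return.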

Thread usage counter C++

In a C++ class, how can I limit the number of calls to a certain function for each thread?
For example, each thread is allowed to use a certain data setter only 3 times.

You just have to count how often the method has been called for each thread and then react accordingly:
void Foo::set(int x) {
static std::map<std::thread::id,unsigned> counter;
auto counts = ++counter[std::this_thread::get_id()];
if (counts > max_counts) return;
x_member = x;
}
This is just to outline the basic idea. I am not so sure about the static map. I am not even sure if it is a good idea to let the method itself implement the counter. I would rather put this elsewhere, eg each thread could get a CountedFoo instance that holds a reference to the actual Foo object and the CountedFoo controls the maximum number of calls.
PS: And of course, don't forget to use some synchronisation when multiple threads are calling the method concurrently (for the sake of brevity I did not include any mutex or similar in the above code).
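A version of the outline above with the synchronisation filled in might look like this. It is a sketch, not a definitive implementation: the limit of 3, the bool return value, and the helper `accepted_calls` are assumptions added for illustration.

```cpp
#include <map>
#include <mutex>
#include <thread>

class Foo {
public:
    // Returns true if the value was stored, false if the calling thread
    // has already used up its allowed number of calls.
    bool set(int x) {
        std::lock_guard<std::mutex> lock(mtx_);  // guards the map and x_member_
        unsigned& calls = counter_[std::this_thread::get_id()];
        if (calls >= max_counts) return false;
        ++calls;
        x_member_ = x;
        return true;
    }

    int get() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return x_member_;
    }

private:
    static constexpr unsigned max_counts = 3;  // assumed limit
    mutable std::mutex mtx_;
    std::map<std::thread::id, unsigned> counter_;
    int x_member_ = 0;
};

// Hypothetical demo: tries `attempts` calls from one new thread and
// returns how many of them were accepted.
int accepted_calls(Foo& foo, int attempts) {
    int accepted = 0;
    std::thread t([&] {
        for (int i = 0; i < attempts; ++i) {
            if (foo.set(i)) ++accepted;
        }
    });
    t.join();
    return accepted;
}
```

One caveat: a std::thread::id may be reused after its thread has finished, so with short-lived threads a new thread could inherit an old counter.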

Using a std::map to store thread IDs, as suggested by @formerlyknownas_463035818, would probably be the most robust solution, but the synchronization might prove more complex.
The fastest solution to this issue is to use thread_local, which gives each thread its own copy of the counter. Here is a working example which might prove useful.
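#include <chrono>
#include <cstdlib>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
thread_local unsigned int N_Calls = 0;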
std::mutex mtx;
void controlledIncreese(const std::string& thread_name){
while (N_Calls < 3) {
++N_Calls;
std::this_thread::sleep_for(std::chrono::seconds(rand() % 2));
std::lock_guard<std::mutex> lock(mtx);
std::cout << "Call for thread " << thread_name.c_str() << ": " << N_Calls << '\n';
}
}
int main(){
std::thread first_t(controlledIncreese, "first"), second_t(controlledIncreese, "second");
first_t.join();
second_t.join();
}
Since both threads write to std::cout under the same mutex, the actual output will be sequential, so this specific example is not very useful in itself, but it does provide an easy working solution to the problem of counting calls per thread.

Creating a class to store threads and calling them

Here is a simplified version of what I am trying to do:
#include <iostream>
#include <vector>
#include <thread>
#include <atomic>
#include <string>
class client {
private:
std::vector<std::thread> threads;
std::atomic<bool> running;
void main() {
while(running) {
std::cout << "main" << std::endl;
}
}
void render() {
while(running) {
std::cout << "render" << std::endl;
}
}
public:
client() {
running = true;
threads.push_back(std::thread(&client::main, this));
threads.push_back(std::thread(&client::render, this));
}
~client() {
running = false;
for(auto& th : threads) th.join();
};
};
int main() {
client c;
std::string inputString;
getline(std::cin, inputString);
return 0;
}
(Note: code has been changed since question was written)
What I am trying to do is create a class that holds threads for the main loop(of the class), rendering, and a couple other things. However I cannot get this simplified version to work. I have tried using mutex to lock and unlock the threads, but didn't seem to help any. I do not know why it is not working, but I suspect that it is a result of the use of this in threads.push_back(std::thread(this->main, this));.
The current structure of the code doesn't have to remain. The only requirement is that the class uses one of its own member functions as a thread (and that the thread is stored in the class). I am not sure if this requires two classes or if my attempt to do it in one class was the correct approach. I have seen many examples of creating an object and then calling a member that creates threads. I am trying to avoid this and instead create the threads within the constructor.

The problem here is that you do not wait for the threads to end. In main you create c. This then spawns the threads. The next thing to happen is the return, which destroys c. When c is destroyed, it destroys its members. When a std::thread is destroyed without having been joined or detached, std::terminate is called and the program ends.
What you need to do in the destructor is set running to false and then call join on both threads. This will stop the loop in each thread and allow c to be destructed correctly.
Doing this, however, brings up another issue. running is not an atomic variable, so writing to it while threads are reading it is undefined behavior. We can fix that by changing running to a std::atomic<bool>, which provides synchronization.
I also had to make a change to the thread construction. When you want to use a member function the syntax should be
std::thread(&class_name::function_name, pointer_to_instance_of_class_name, function_parameters)
so in this case it would be
threads.push_back(std::thread(&client::main, this));
threads.push_back(std::thread(&client::render, this));

What is critical resources to a std::mutex object

I am new to concurrency and I have doubts about std::mutex. Say I have an int a; and books tell me to declare a mutex amut; to get exclusive access to a. My question is: how does a mutex object recognize which critical resource it has to protect? I mean, which variable?
Say I have two variables int a, b; and I declare mutex abmut;. What will abmut protect?
Both a and b, or only a, or only b?

Your doubts are justified: it doesn't. That's your job as a programmer, to make sure you only access a if you've got the mutex. If somebody else has the mutex, do not access a, or you will have the same problems you'd have without the mutex. That goes for all thread-synchronization constructs. You can use them to protect a resource. They don't do it on their own.

Mutex is more like a sign rather than a lock. When someone sees a sign saying "occupied" in a public washroom, he will wait until the user gets out and flips the sign. But you have to teach him to wait when seeing the sign. The sign itself won't prevent him from breaking in. Of course, the "wait" order is already set by mutex.lock(), so you can use it conveniently.

A std::mutex does not protect any data at all. A mutex works like this:
When you try to lock a mutex, you check whether it is already locked; if it is, you wait until it is unlocked.
When you're finished using a mutex, you unlock it; otherwise the threads that are waiting for it will wait forever.
How does that protect things? Consider the following:
#include <iostream>
#include <future>
#include <vector>
struct example {
static int shared_variable;
static void incr_shared()
{
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
int example::shared_variable = 0;
int main() {
std::vector<std::future<void> > handles;
handles.reserve(10000);
for(int i = 0; i < 10000; i++) {
handles.push_back(std::async(std::launch::async, example::incr_shared));
}
for(auto& handle: handles) handle.wait();
std::cout << example::shared_variable << std::endl;
}
You might expect it to print 1000000000, but you don't really have a guarantee of that. We should include a mutex, like this:
struct example {
static int shared_variable;
static std::mutex guard;
static void incr_shared()
{
std::lock_guard<std::mutex> lock{ guard }; // must be named: an unnamed temporary would unlock immediately
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
So what exactly does this do? First of all, std::lock_guard uses RAII to call mutex.lock() when it's created and mutex.unlock() when it's destroyed; the latter happens when it leaves scope (here, when the function exits). So in this case only one thread at a time can be executing the for loop, because as soon as a thread passes the lock_guard it holds the lock, and we saw before that no other thread can hold it at the same time. Therefore this loop is now safe. Note that we could also put the lock_guard inside the loop, but that might make your program slow (locking and unlocking is relatively expensive).
So in conclusion, a mutex protects blocks of code (in our example the for loop), not the variable itself. If you want variable protection, consider taking a look at std::atomic. The following example, for instance, is unsafe again, because decr_shared modifies shared_variable without taking the lock and can be called from any thread at the same time.
struct example {
static int shared_variable;
static std::mutex guard;
static void decr_shared() { shared_variable--; }
static void incr_shared()
{
std::lock_guard<std::mutex> lock{ guard }; // must be named: an unnamed temporary would unlock immediately
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
This however is again safe, because now the variable itself is protected, in any code that uses it.
struct example {
static std::atomic_int shared_variable;
static void decr_shared() { shared_variable--; }
static void incr_shared()
{
for(int i = 0; i < 100000; i++)
{
shared_variable++;
}
}
};
std::atomic_int example::shared_variable{0};

A mutex doesn't inherently protect any specific variables... instead, the programmer needs to realise that they have some group of 1 or more variables that several threads may attempt to use, then use a mutex so that only one of those threads can be running such variable-using/changing code at any point in time.
Note especially that you're only protected from other threads' code accessing/modifying those variables if their code locks the same mutex during the variable access. A mutex used by only one thread protects nothing.
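The point about every accessor locking the same mutex can be illustrated with a small variation of the earlier `example` struct. This is a sketch: the thread count and the helper `run_example` are assumptions made for the demonstration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

struct example {
    static int shared_variable;
    static std::mutex guard;

    static void decr_shared() {
        std::lock_guard<std::mutex> lock(guard);  // same mutex as incr_shared
        shared_variable--;
    }

    static void incr_shared() {
        std::lock_guard<std::mutex> lock(guard);  // held until the function returns
        for (int i = 0; i < 100000; i++) {
            shared_variable++;
        }
    }
};
int example::shared_variable = 0;
std::mutex example::guard;

// Hypothetical demo: runs 4 incrementing and 4 decrementing threads
// and returns the final value.
int run_example() {
    example::shared_variable = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; i++) {
        threads.emplace_back(example::incr_shared);
        threads.emplace_back(example::decr_shared);
    }
    for (auto& t : threads) t.join();
    std::lock_guard<std::mutex> lock(example::guard);
    return example::shared_variable;
}
```

With both functions taking the lock, the result is always 4 * 100000 - 4. If decr_shared skipped the lock, the program would contain a data race and the result would no longer be guaranteed.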

A mutex is used to synchronize access to a resource. Say you have some data, such as an int, which you read and write through a getter and a setter. Both the getter and the setter will then use the same mutex to synchronize the read/write operations.
Both of these functions lock the mutex at the beginning and unlock it before they return. You can use scoped_lock, which automatically unlocks in its destructor.
void setter(value_type v){
boost::mutex::scoped_lock lock(mutex);
value = v;
}
value_type getter() const{
boost::mutex::scoped_lock lock(mutex);
return value;
}
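For readers without Boost, the same getter/setter pattern can be sketched with the standard library. `GuardedValue` is a hypothetical name chosen for the illustration.

```cpp
#include <mutex>

// Hypothetical wrapper: every read and write goes through the same mutex.
template <typename T>
class GuardedValue {
public:
    explicit GuardedValue(T initial) : value_(initial) {}

    void set(T v) {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = v;
    }

    T get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;  // copied while the lock is still held
    }

private:
    mutable std::mutex mutex_;  // mutable so the const getter can lock it
    T value_;
};
```

A `GuardedValue<int> g(0);` can then be shared between threads that call `g.set(...)` and `g.get()`. Note that the mutex has to be declared mutable for a const getter to lock it.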

Imagine you sit at a table with your friends and a delicious cake (the resources you want to guard, e.g. some integer a) in the middle. In addition you have a single tennis ball (this is our mutex). Now, only a single person can have the ball (lock the mutex using a lock_guard or similar mechanisms), but everyone can eat the cake (access the integer a).
As a group you can decide to set up a rule that only whoever has the ball may eat from the cake (only the person who has locked the mutex may access a). This person may relinquish the ball by putting it on the table (unlock the mutex), so another person can grab it (lock the mutex). This way you ensure that no one stabs another person with the fork, while frantically eating the cake.
Setting up and upholding a rule like described in the last paragraph is your job as a programmer. There is no inherent connection between a mutex and some resource (e.g. some integer a).

Is it possible to get a thread-locking mechanism in C++ with a std::atomic_flag?

Using MS Visual C++2012
A class has a member of type std::atomic_flag
class A {
public:
...
std::atomic_flag lockFlag;
A () { std::atomic_flag_clear (&lockFlag); }
};
There is an object of type A
A object;
which can be accessed by two (Boost) threads
void thr1(A* objPtr) { ... }
void thr2(A* objPtr) { ... }
The idea is to make a thread wait if the object is being accessed by the other thread.
The question is: is it possible to construct such a mechanism with an atomic_flag object? For the moment, I want something more lightweight than a boost::mutex.
By the way, the process involved in one of the threads is a very long query to a dBase that returns many rows. I only need to suspend it in a certain zone of code where the collision occurs (when processing each row), and I can't wait for the entire thread to finish with join().
I've tried something like this in each thread:
void thr1 (A* objPtr) {
...
while (std::atomic_flag_test_and_set_explicit (&objPtr->lockFlag, std::memory_order_acquire)) {
boost::this_thread::sleep(boost::posix_time::millisec(100));
}
... /* Zone to protect */
std::atomic_flag_clear_explicit (&objPtr->lockFlag, std::memory_order_release);
... /* the process continues */
}
But with no success, because the second thread hangs. In fact, I don't completely understand the mechanism involved in the atomic_flag_test_and_set_explicit function, nor whether the function returns immediately or can block until the flag can be locked.
It is also a mystery to me how to build a lock mechanism with a function that always sets the value and returns the previous value, with no option to only read the current setting.
Any suggestions are welcome.

"By the way, the process involved in one of the threads is a very long query to a dBase that returns many rows. I only need to suspend it in a certain zone of code where the collision occurs (when processing each row), and I can't wait for the entire thread to finish with join()."
Such a zone is known as the critical section. The simplest way to work with a critical section is to lock by mutual exclusion.
The mutex solution suggested is indeed the way to go, unless you can prove that this is a hotspot and the lock contention is a performance problem. Lock-free programming using just atomic and intrinsics is enormously complex and cannot be recommended at this level.
Here's a simple example showing how you could do this (live on http://liveworkspace.org/code/6af945eda5132a5221db823fa6bde49a):
#include <iostream>
#include <thread>
#include <mutex>
struct A
{
std::mutex mux;
int x;
A() : x(0) {}
};
void threadf(A* data)
{
for(int i=0; i<10; ++i)
{
std::lock_guard<std::mutex> lock(data->mux);
data->x++;
}
}
int main(int argc, const char *argv[])
{
A instance;
auto t1 = std::thread(threadf, &instance);
auto t2 = std::thread(threadf, &instance);
t1.join();
t2.join();
std::cout << instance.x << std::endl;
return 0;
}

It looks like you're trying to write a spinlock. Yes, you can do that with std::atomic_flag, but you are better off using std::mutex instead. Don't use atomics unless you really know what you're doing.

To actually answer the question asked: Yes, you can use std::atomic_flag to create a thread locking object called a spinlock.
#include <atomic>
class atomic_lock
{
public:
void lock()
{
while ( lock_.test_and_set() ) { } // Spin until the lock is acquired.
}
void unlock()
{
lock_.clear();
}
private:
// The standard only specifies ATOMIC_FLAG_INIT for this form of initialization.
std::atomic_flag lock_ = ATOMIC_FLAG_INIT;
};
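A class with lock() and unlock() members like this can be used with std::lock_guard, which makes the critical section exception-safe. The sketch below is one possible variant, not the answer's exact code: the name `SpinLock`, the explicit acquire/release memory orders, the yield() call in the spin loop, and the helper `count_with_spinlock` are all choices made for this illustration.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value: true means another thread
        // holds the lock, so keep trying. The call itself never blocks.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while spinning
        }
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical demo: several threads increment a plain int under the spinlock.
int count_with_spinlock(int num_threads, int increments) {
    SpinLock lock;
    int counter = 0;  // protected by lock
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&lock, &counter, increments] {
            for (int i = 0; i < increments; ++i) {
                std::lock_guard<SpinLock> guard(lock);  // unlocks on scope exit
                ++counter;
            }
        });
    }
    for (auto& th : threads) th.join();
    return counter;
}
```

This also addresses the "always sets the value" concern from the question: the lock is acquired exactly when test_and_set reports that the flag was previously clear. For long critical sections, a std::mutex is still likely the better choice, as the other answers point out.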
