Static member function and thread safety - c++

I've a private static method in a class that has all the methods as static. It's a helper class that helps with logging and other stuff. This helper class would be called by multiple threads. I don't understand how the method in question is working safely with multiple threads without a lock.
//Helper.cpp
std::recursive_mutex Logger::logMutex_;
void Logger::write(const std::string &logFilePath, const std::string &formattedLog)
{
std::lock_guard<std::mutex> guard(logMutex_);
std::ofstream logFile(logFilePath.c_str(), std::ios::out | std::ios::app);
if (logFile.is_open())
{
logFile << formattedLog;
logFile.close();
}
}
void Logger::error(const string &appId, const std::string &fmt, ...)
{
auto logFile = validateLogFile(appId); //Also a private static method that validates if a file exists for this app ID.
if (!logFile.empty())
{
//Format the log
write(logFile, log);
}
}
//Helper.h
class Logger
{
public:
static void error(const std::string &Id, const std::string &fmt, ...);
private:
static void write(const std::string &fileName, const std::string &formattedLog);
static std::recursive_mutex logMutex_;
};
I understand that the local variables inside the static methods are purely local. i.e., A stack is created every time these methods are called and the variables are initialized in them. Now, in the Logger::write method I'm opening a file and writing to it. So when multiple threads calls the write method through the Logger::error method(Which is again static) and when there's no lock, I believe I should see some data race/crash.
Because multiple threads are trying to open the same file. Even if the kernel allows to open a file multiple times, I must see some fault in the data written to the file.
I tested this with running up to 100 threads and I see no crash, all the data is being written to the file concurrently. I can't completely understand how this is working. With or without the lock, I see the data is being written to the file perfectly.
TEST_F(GivenALogger, WhenLoggerMethodsAreCalledFromMultipleThreads_AllTheLogsMustBeLogged)
{
std::vector<std::thread> threads;
int num_threads = 100;
int i = 0;
for (; i < num_threads / 2; i++)
{
threads.push_back(std::thread(&Logger::error, validId, "Trial %d", i));
}
for (; i < num_threads; i++)
{
threads.push_back(std::thread(&Logger::debug, validId, "Trial %d", i));
}
std::for_each(threads.begin(), threads.end(), [](std::thread &t) { t.join(); });
auto actualLog = getActualLog(); // Returns a vector of log lines.
EXPECT_EQ(num_threads, actualLog.size());
}
Also, how should I properly/safely access the file?

The key is this line:
std::lock_guard<std::mutex> guard(logMutex_);
The std::lock_guard<std::mutex> will lock the mutex logMutex_ in the constructor and will unlock the mutex in the destructor when it goes out of scope, ie when the method returns.
If another thread attempts to write while the first thread is within the guard scope, the new (local) guard will try to lock the logMutex_ and that thread will be put to sleep until the lock is released.

Related

About shared_mutex and shared_ptr across multiple threads

I implemented code such that multiple instances running on different threads reads other instances' data using reader-writer lock and shared_ptr. It seemed fine, but I am not 100% sure about that and I came up with some questions about usage of those.
Detail
I have multiple instances of a class called Chunk and each instance does some calculations in a dedicated thread. A chunk needs to read neighbour chunks' data as well as its own data, but it doesn't write neighbours' data, so reader-writer lock is used. Also, neighbours can be set at runtime. For example, I might want o set a different neighbour chunk at runtime, sometimes just nullptr. It is possible to delete a chunk at runtime, too. Raw pointers can be used but I thought shared_ptr and weak_ptr are better for this, in order to keep track of the lifetime. Own data in shared_ptr and neighbours' data in weak_ptr.
I provided a simpler version of my code below. ChunkData has data and a mutex for it. I use InitData for data initialization and DoWork function is called in a dedicated thread after that. other functions can be called from main thread.
This seems to work, but I am not so confident. Especially, about use of shared_ptr across multiple threads.
What happens if a thread calls shared_ptr's reset() (in ctor and InitData) and other uses it with weak_ptr's lock (in DoWork)? Does this need a lock dataMutex or chunkMutex?
How about copy(in SetNeighbour)? Do I need locks for this as well?
I think other parts are ok, but please let me know if you find anything dangerous. Appreciate that.
By the way, I considered about storing shared_ptr of Chunk instead of ChunkData, but decided not to use this method because internal code, which I don't manage, has GC system and it can delete a pointer to Chunk when I don't expect it.
class Chunk
{
public:
class ChunkData
{
public:
shared_mutex dataMutex; // mutex to read/write data
int* data;
int size;
ChunkData() : data(nullptr) { }
~ChunkData()
{
if (data)
{
delete[] data;
data = nullptr;
}
}
};
private:
mutex chunkMutex; // mutex to read/write member variables
shared_ptr<ChunkData> chunkData;
weak_ptr<ChunkData> neighbourChunkData;
string result;
public:
Chunk(string _name)
: chunkData(make_shared<ChunkData>())
{
}
~Chunk()
{
EndProcess();
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset();
}
void InitData(int size)
{
ChunkData* NewData = new ChunkData();
NewData->size = size;
NewData->data = new int[size];
{
unique_lock lock(chunkMutex); // is this needed?
chunkData.reset(NewData);
cout << "init chunk " << name << endl;
}
}
// This is executed in other thread. e.g. thread t(&Chunk::DoWork, this);
void DoWork()
{
lock_guard lock(chunkMutex); // we modify some members such as result(string) reading chunk data, so need this.
if (chunkData)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data)
{
// read chunkData->data[i] and modify some members such as result(string)
for (int i = 0; i < chunkData->size; ++i)
{
// Is this fine, or should I write data result outside of readLock scope?
result += to_string(chunkData->data[i]) + " ";
}
}
}
// does this work?
if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock())
{
shared_lock readLock(neighbour->dataMutex);
if (neighbour->data)
{
// read neighbour->data[i] and modify some members as above
}
}
}
shared_ptr<ChunkData> GetChunkData()
{
unique_lock lock(chunkMutex);
return chunkData;
}
void SetNeighbour(Chunk* neighbourChunk)
{
if (neighbourChunk)
{
// safe?
shared_ptr<ChunkData> newNeighbourData = neighbourChunk->GetChunkData();
unique_lock lock(chunkMutex); // lock for chunk properties
{
shared_lock readLock(newNeighbourData->dataMutex); // not sure if this is needed.
neighbourChunkData = newNeighbourData;
}
}
}
int GetDataAt(int index)
{
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
};
Edit 1
I added more detail for DoWork function. Chunk data is read and chunk's member variables are edited in the function.
After Homer512's anwer, I came up with other questions.
A) In DoWork function I write a member variable inside a read lock. Should I only read data in a read lock scope and if I need to modify other data based on read data, do I have to do outside of the read lock? For example, copy the whole array to a local variable in a read lock, and modify other members outside of the read lock using the local.
B) I followed Homer512 and modifed GetDataAt/SetDataAt as below. I do read/write lock chunkData->dataMutex before unlocking chunkMutex. I also do this in DoWork function. Should I instead do locks separately? For example, make a local variable shared_ptr and set chunkData to it in a chunkMutex lock, unlock it, then lastly read/write lock that local variable's dataMutex and read/write data.
int GetDataAt(int index)
{
lock_guard chunkLock(chunkMutex);
shared_lock readLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
return chunkData->data[index];
}
return 0;
}
void SetDataAt(int index, int element)
{
lock_guard chunkLock(chunkMutex);
unique_lock writeLock(chunkData->dataMutex);
if (chunkData->data && 0 <= index && index < chunkData->size)
{
chunkData->data[index] = element;
}
}
I have several remarks:
~ChunkData: You could change your data member from int* to unique_ptr<int[]> to get the same result without an explicit destructor. Your code is correct though, just less convenient.
~Chunk: I don't think you need a lock or call the reset method. By the time the destructor runs, by definition, no one should have a reference to the Chunk object. So the lock can never be contested. And reset is unnecessary because the shared_ptr destructor will handle that.
InitData: Yes, the lock is needed because InitData can race with DoWork. You could avoid this by moving InitData to the constructor but I assume there are reasons for this division. You could also change the shared_ptr to std::atomic<std::shared_ptr<ChunkData> > to avoid the lock.
It is more efficient to write InitData like this:
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
// deletes old chunkData outside locked region if it was initialized before
}
make_shared avoids an additional memory allocation for the reference counter. This also moves all allocations and deallocations out of the critical section.
DoWork: Your comment "ready chunkData->data[i] and modify some members". You only take a shared_lock but say that you modify members. Well, which is it, reading or writing? Or do you mean to say that you modify Chunk but not ChunkData, with Chunk being protected by its own mutex?
SetNeighbour: You need to lock both your own chunkMutex and the neighbour's. You should not lock both at the same time to avoid the dining philosopher's problem (though std::lock solves this).
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
std::shared_ptr<ChunkData> newNeighbourData;
{
std::lock_guard<std::mutex> lock(neighbourChunk->chunkMutex);
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> lock(this->chunkMutex);
this->neighbourChunkData = newNeighbourData;
}
GetDataAt and SetDataAt: You need to lock chunkMutex. Otherwise you might race with InitData. There is no need to use std::lock because the order of locks is never swapped around.
EDIT 1:
DoWork: The line if (shared_ptr<ChunkData> neighbour = neighbourChunkData.lock()) doesn't keep the neighbur alive. Move the variable declaration out of the if to keep the reference.
EDIT: Alternative design proposal
What I'm bothered with is that your DoWork may be unable to proceed if InitData is still running or waiting to run. How do you want to deal with this? I suggest you make it possible to wait until the work can be done. Something like this:
class Chunk
{
std::mutex chunkMutex;
std::shared_ptr<ChunkData> chunkData;
std::weak_ptr<ChunkData> neighbourChunkData;
std::condition_variable chunkSet;
void waitForChunk(std::unique_lock<std::mutex>& lock)
{
while(! chunkData)
chunkSet.wait(lock);
}
public:
// modified version of my code above
void InitData(int size)
{
std::shared_ptr<ChunkData> NewData = std::make_shared<ChunkData>();
NewData->size = size;
NewData->data = new int[size]; // or std::make_unique<int[]>(size)
{
std::lock_guard<std::mutex> lock(chunkMutex);
chunkData.swap(NewData);
}
chunkSet.notify_all();
}
void DoWork()
{
std::unique_lock<std::mutex> ownLock(chunkMutex);
waitForChunk(lock); // blocks until other thread finishes InitData
{
shared_lock readLock(chunkData->dataMutex);
...
}
shared_ptr<ChunkData> neighbour = neighbourChunkData.lock();
if(! neighbour)
return;
shared_lock readLock(neighbour->dataMutex);
...
}
void SetNeighbour(Chunk* neighbourChunk)
{
if(! neighbourChunk)
return;
shared_ptr<ChunkData> newNeighbourData;
{
std::unique_lock<std::mutex> lock(neighbourChunk->chunkMutex);
neighbourChunk->waitForChunk(lock); // wait until neighbor has finished InitData
newNeighbourData = neighbourChunk->chunkData;
}
std::lock_guard<std::mutex> ownLock(this->chunkMutex);
this->neighbourChunkData = std::move(newNeighbourData);
}
};
The downside to this is that you could deadlock if InitData is never called or if it failed with an exception. There are ways around this, like using an std::shared_future which knows that it is valid (set when InitData is scheduled) and whether it failed (records exception of associated promise or packaged_task).

How to create thread safe cache of shared_ptr

I have a code that reads a lot of files. Some files can be cached. Consumer receives shared_ptr when asks for a file. Other consumers can ask for this file and get it from the cache if file is still in memory. If file is not in memory, it will be loaded and put in to the cache.
Simplified code:
struct File
{
File(std::string);
bool AllowCache() const;
};
typedef std::shared_ptr<File> SharedPtr;
typedef std::weak_ptr<File> WeakPtr;
std::map<std::string, WeakPtr> Cache;
SharedPtr GetFile(std::wstring Name)
{
auto Found = Cache.find(Name);
if (Found != Cache.end())
if (auto Exist = Found->second.lock())
return Exist;
auto New = boost::make_shared<File>(Name);
if (New->AllowCache())
Cache[Name] = New;
return New;
}
My question is: how to make this code thead safe? Even if I protect content of GetFile() by a mutex, it still can return non-null pointer from weak_ptr::lock() while other thread is running destructor of pointed File object.
I see some solutions, like:
Store shared_ptrs in the cache and run a separate thread that will
continuously remove shared_ptr-s with use_count()==1 (let's call it Cleanup()).
Store shared_ptrs in the cache and require consumers to use special wrapper of shared_ptr<File>. This wrapper will have shared_ptr<File> as a member and will reset() it at destructor and then call Cleanup().
1st solution is a bit ofoverkill. 2nd solution require to refactor all the code in my project. Both solutions are bad for me. Is any other way to make it thead safe?
Unless I'm misunderstanding the scenario you've described, A lock() of a WeakPtr will fail (i.e. return a dummy shared_ptr) if another thread is running a destructor of the File object. That's the logic of shared and weak pointers. So - your current solution should be thread safe in this respect; but - it may be non-thread-safe as you add or remove elemnrs. Read about that, say, in this question: C++ Thread-Safe Map .
I expected the following code will fail. But it's not. It seems like weak_ptr::lock() will not return pointer to an object that is in destruction process. And if so, it is a simpliest solution to just add a mutex and don't worry about returning dead objects by weak_ptr::lock().
char const *TestPath = "file.xml";
void Log(char const *Message, std::string const &Name, void const *File)
{
std::cout << Message
<< ": " << Name
<< " at memory=" << File
<< ", thread=" << std::this_thread::get_id()
<< std::endl;
}
void Sleep(int Seconds)
{
std::this_thread::sleep_for(std::chrono::seconds(Seconds));
}
struct File
{
File(std::string Name) : Name(Name)
{
Log("created", Name, this);
}
~File()
{
Log("destroying", Name, this);
Sleep(5);
Log("destroyed", Name, this);
}
std::string Name;
};
std::map<std::string, std::weak_ptr<File>> Cache;
std::mutex Mutex;
std::shared_ptr<File> GetFile(std::string Name)
{
std::unique_lock<std::mutex> Lock(Mutex); // locking is added
auto Found = Cache.find(Name);
if (Found != Cache.end())
if (auto Exist = Found->second.lock())
{
Log("found in cache", Name, Exist.get());
return Exist;
}
auto New = std::make_shared<File>(Name);
Cache[Name] = New;
return New;
}
void Thread2()
{
auto File = GetFile(TestPath);
//Sleep(3); // uncomment to share file with main thead
}
int main()
{
std::thread thread(&Thread2);
Sleep(1);
auto File = GetFile(TestPath);
thread.join();
return 0;
}
My expectation:
thread2: created
thread2: destroying
thread1: found in cache <--- fail. dead object :(
thread2: destroyed
VS2017 results:
thread2: created
thread2: destroying
thread1: created <--- old object is not re-used! great ;)
thread2: destroyed

c++: Writing to a file in multithreaded program

So I have multiple threads writing to the same file by calling Log::write method.
class Log
{
private:
ofstream log;
string file_path;
public:
Log(string);
void write(string);
};
Log::Log(string _file_path)
{
file_path=_file_path;
}
void Log::write(string str)
{
EnterCriticalSection(&CriticalSection);
log.open(file_path.c_str(),std::ofstream::app);
log<<str+'\n';
log.close();
LeaveCriticalSection(&CriticalSection);
}
Is it safe if threads will call Log::write method of the same object at the same time?
Your code is wasteful and does not follow C++ idioms.
Starting from the end : yes, write is thread safe, because win32 CRITICAL_SECTION protects it from concurrent modifications.
although:
why open and close the stream each time? this is very wasteful thing to do. open the stream in the constructor and leave it open. the destructor will deal with closing the stream.
if you want to use Win32 critical section, at least make it RAII safe. make a class which wraps a reference to critical section, locking it in a constructor and unlocking it in the destructor. this way even if an exception is thrown - you are guaranteed that the lock will be unlocked.
where is the deceleration of CriticalSection anyway? it should be a member of Log.
are you aware of std::mutex?
why are you passing strings by value? it is very un-efficient. pass then by const reference.
you use snake_case for some of the variables (file_path) but upper camel case for other (CriticalSection). use the same convention.
str is never a good name for a string variable, and the file stream is not a log. is the thing that does the actual logging. logger is a better name. in my correction is just named it m_file_stream.
Corrected code:
class Log
{
private:
std::mutex m_lock;
std::ofstream m_file_stream;
std::string m_file_path;
public:
Log(const std::string& file_path);
void write(const std::string& log);
};
Log::Log(const std::string& file_path):
m_file_path(file_path)
{
m_file_stream.open(m_file_path.c_str());
if (!m_file_stream.is_open() || !m_file_stream.good())
{
//throw relevant exception.
}
}
void Log::write(const std::string& log)
{
std::lock_guard<std::mutex> lock(m_lock);
m_file_stream << log << '\n';
}

When one worker thread fails, how to abort remaining workers?

I have a program which spawns multiple threads, each of which executes a long-running task. The main thread then waits for all worker threads to join, collects results, and exits.
If an error occurs in one of the workers, I want the remaining workers to stop gracefully, so that the main thread can exit shortly afterwards.
My question is how best to do this, when the implementation of the long-running task is provided by a library whose code I cannot modify.
Here is a simple sketch of the system, with no error handling:
void threadFunc()
{
// Do long-running stuff
}
void mainFunc()
{
std::vector<std::thread> threads;
for (int i = 0; i < 3; ++i) {
threads.push_back(std::thread(&threadFunc));
}
for (auto &t : threads) {
t.join();
}
}
If the long-running function executes a loop and I have access to the code, then
execution can be aborted simply by checking a shared "keep on running" flag at the top of each iteration.
std::mutex mutex;
bool error;
void threadFunc()
{
try {
for (...) {
{
std::unique_lock<std::mutex> lock(mutex);
if (error) {
break;
}
}
}
} catch (std::exception &) {
std::unique_lock<std::mutex> lock(mutex);
error = true;
}
}
Now consider the case when the long-running operation is provided by a library:
std::mutex mutex;
bool error;
class Task
{
public:
// Blocks until completion, error, or stop() is called
void run();
void stop();
};
void threadFunc(Task &task)
{
try {
task.run();
} catch (std::exception &) {
std::unique_lock<std::mutex> lock(mutex);
error = true;
}
}
In this case, the main thread has to handle the error, and call stop() on
the still-running tasks. As such, it cannot simply wait for each worker to
join() as in the original implementation.
The approach I have used so far is to share the following structure between
the main thread and each worker:
struct SharedData
{
std::mutex mutex;
std::condition_variable condVar;
bool error;
int running;
}
When a worker completes successfully, it decrements the running count. If
an exception is caught, the worker sets the error flag. In both cases, it
then calls condVar.notify_one().
The main thread then waits on the condition variable, waking up if either
error is set or running reaches zero. On waking up, the main thread
calls stop() on all tasks if error has been set.
This approach works, but I feel there should be a cleaner solution using some
of the higher-level primitives in the standard concurrency library. Can
anyone suggest an improved implementation?
Here is the complete code for my current solution:
// main.cpp
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>
#include "utils.h"
// Class which encapsulates long-running task, and provides a mechanism for aborting it
class Task
{
public:
Task(int tidx, bool fail)
: tidx(tidx)
, fail(fail)
, m_run(true)
{
}
void run()
{
static const int NUM_ITERATIONS = 10;
for (int iter = 0; iter < NUM_ITERATIONS; ++iter) {
{
std::unique_lock<std::mutex> lock(m_mutex);
if (!m_run) {
out() << "thread " << tidx << " aborting";
break;
}
}
out() << "thread " << tidx << " iter " << iter;
std::this_thread::sleep_for(std::chrono::milliseconds(100));
if (fail) {
throw std::exception();
}
}
}
void stop()
{
std::unique_lock<std::mutex> lock(m_mutex);
m_run = false;
}
const int tidx;
const bool fail;
private:
std::mutex m_mutex;
bool m_run;
};
// Data shared between all threads
struct SharedData
{
std::mutex mutex;
std::condition_variable condVar;
bool error;
int running;
SharedData(int count)
: error(false)
, running(count)
{
}
};
void threadFunc(Task &task, SharedData &shared)
{
try {
out() << "thread " << task.tidx << " starting";
task.run(); // Blocks until task completes or is aborted by main thread
out() << "thread " << task.tidx << " ended";
} catch (std::exception &) {
out() << "thread " << task.tidx << " failed";
std::unique_lock<std::mutex> lock(shared.mutex);
shared.error = true;
}
{
std::unique_lock<std::mutex> lock(shared.mutex);
--shared.running;
}
shared.condVar.notify_one();
}
int main(int argc, char **argv)
{
static const int NUM_THREADS = 3;
std::vector<std::unique_ptr<Task>> tasks(NUM_THREADS);
std::vector<std::thread> threads(NUM_THREADS);
SharedData shared(NUM_THREADS);
for (int tidx = 0; tidx < NUM_THREADS; ++tidx) {
const bool fail = (tidx == 1);
tasks[tidx] = std::make_unique<Task>(tidx, fail);
threads[tidx] = std::thread(&threadFunc, std::ref(*tasks[tidx]), std::ref(shared));
}
{
std::unique_lock<std::mutex> lock(shared.mutex);
// Wake up when either all tasks have completed, or any one has failed
shared.condVar.wait(lock, [&shared](){
return shared.error || !shared.running;
});
if (shared.error) {
out() << "error occurred - terminating remaining tasks";
for (auto &t : tasks) {
t->stop();
}
}
}
for (int tidx = 0; tidx < NUM_THREADS; ++tidx) {
out() << "waiting for thread " << tidx << " to join";
threads[tidx].join();
out() << "thread " << tidx << " joined";
}
out() << "program complete";
return 0;
}
Some utility functions are defined here:
// utils.h
#include <iostream>
#include <mutex>
#include <thread>
#ifndef UTILS_H
#define UTILS_H
#if __cplusplus <= 201103L
// Backport std::make_unique from C++14
#include <memory>
namespace std {
template<typename T, typename ...Args>
std::unique_ptr<T> make_unique(
Args&& ...args)
{
return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}
} // namespace std
#endif // __cplusplus <= 201103L
// Thread-safe wrapper around std::cout
class ThreadSafeStdOut
{
public:
ThreadSafeStdOut()
: m_lock(m_mutex)
{
}
~ThreadSafeStdOut()
{
std::cout << std::endl;
}
template <typename T>
ThreadSafeStdOut &operator<<(const T &obj)
{
std::cout << obj;
return *this;
}
private:
static std::mutex m_mutex;
std::unique_lock<std::mutex> m_lock;
};
std::mutex ThreadSafeStdOut::m_mutex;
// Convenience function for performing thread-safe output
ThreadSafeStdOut out()
{
return ThreadSafeStdOut();
}
#endif // UTILS_H
I've been thinking about your situation for sometime and this maybe of some help to you. You could probably try doing a couple of different methods to achieve you goal. There are 2-3 options that maybe of use or a combination of all three. I will at minimum show the first option for I'm still learning and trying to master the concepts of Template Specializations as well as using Lambdas.
Using a Manager Class
Using Template Specialization Encapsulation
Using Lambdas.
Pseudo code of a Manager Class would look something like this:
class ThreadManager {
private:
std::unique_ptr<MainThread> mainThread_;
std::list<std::shared_ptr<WorkerThread> lWorkers_; // List to hold finished workers
std::queue<std::shared_ptr<WorkerThread> qWorkers_; // Queue to hold inactive and waiting threads.
std::map<unsigned, std::shared_ptr<WorkerThread> mThreadIds_; // Map to associate a WorkerThread with an ID value.
std::map<unsigned, bool> mFinishedThreads_; // A map to keep track of finished and unfinished threads.
bool threadError_; // Not needed if using exception handling
public:
explicit ThreadManager( const MainThread& main_thread );
void shutdownThread( const unsigned& threadId );
void shutdownAllThreads();
void addWorker( const WorkerThread& worker_thread );
bool isThreadDone( const unsigned& threadId );
void spawnMainThread() const; // Method to start main thread's work.
void spawnWorkerThread( unsigned threadId, bool& error );
bool getThreadError( unsigned& threadID ); // Returns True If Thread Encountered An Error and passes the ID of that thread,
};
Only for demonstration purposes did I use bool value to determine if a thread failed for simplicity of the structure, and of course this can be substituted to your like if you prefer to use exceptions or invalid unsigned values, etc.
Now to use a class of this sort would be something like this: Also note that a class of this type would be considered better if it was a Singleton type object since you wouldn't want more than 1 ManagerClass since you are working with shared pointers.
SomeClass::SomeClass( ... ) {
// This class could contain a private static smart pointer of this Manager Class
// Initialize the smart pointer giving it new memory for the Manager Class and by passing it a pointer of the Main Thread object
threadManager_ = new ThreadManager( main_thread ); // Wouldn't actually use raw pointers here unless if you had a need to, but just shown for simplicity
}
SomeClass::addThreads( ... ) {
for ( unsigned u = 1, u <= threadCount; u++ ) {
threadManager_->addWorker( some_worker_thread );
}
}
SomeClass::someFunctionThatSpawnsThreads( ... ) {
threadManager_->spawnMainThread();
bool error = false;
for ( unsigned u = 1; u <= threadCount; u++ ) {
threadManager_->spawnWorkerThread( u, error );
if ( error ) { // This Thread Failed To Start, Shutdown All Threads
threadManager->shutdownAllThreads();
}
}
// If all threads spawn successfully we can do a while loop here to listen if one fails.
unsigned threadId;
while ( threadManager_->getThreadError( threadId ) ) {
// If the function passed to this while loop returns true and we end up here, it will pass the id value of the failed thread.
// We can now go through a for loop and stop all active threads.
for ( unsigned u = threadID + 1; u <= threadCount; u++ ) {
threadManager_->shutdownThread( u );
}
// We have successfully shutdown all threads
break;
}
}
I like the design of manager class since I have used them in other projects, and they come in handy quite often especially when working with a code base that contains many and multiple resources such as a working Game Engine that has many assets such as Sprites, Textures, Audio Files, Maps, Game Items etc. Using a Manager Class helps to keep track and maintain all of the assets. This same concept can be applied to "Managing" Active, Inactive, Waiting Threads, and knows how to intuitively handle and shutdown all threads properly. I would recommend using an ExceptionHandler if your code base and libraries support exceptions as well as thread safe exception handling instead of passing and using bools for errors. Also having a Logger class is good to where it can write to a log file and or a console window to give an explicit message of what function the exception was thrown in and what caused the exception where a log message might look like this:
Exception Thrown: someFunctionNamedThis in ThisFile on Line# (x)
threadID 021342 failed to execute.
This way you can look at the log file and find out very quickly what thread is causing the exception, instead of using passed around bool variables.
The implementation of the long-running task is provided by a library whose code I cannot modify.
That means you have no way to synchronize the job done by working threads
If an error occurs in one of the workers,
Let's suppose that you can really detect worker errors; some of then can be easily detected if reported by the used library others cannot i.e.
the library code loops.
the library code prematurely exit with an uncaught exception.
I want the remaining workers to stop **gracefully**
That's just not possible
The best you can do is writing a thread manager checking on worker thread status and if an error condition is detected it just (ungracefully) "kills" all the worker threads and exits.
You should also consider detecting a looped working thread (by timeout) and offer to the user the option to kill or continue waiting for the process to finish.
Your problem is that the long running function is not your code, and you say you cannot modify it. Consequently you cannot make it pay any attention whatsoever to any kind of external synchronisation primitive (condition variables, semaphores, mutexes, pipes, etc), unless the library developer has done that for you.
Therefore your only option is to do something that wrestles control away from any code no matter what it's doing. This is what signals do. For that, you're going to have to use pthread_kill(), or whatever the equivalent is these days.
The pattern would be that
The thread that detects an error needs to communicate that error back to the main thread in some manner.
The main thread then needs to call pthread_kill() for all the other remaining threads. Don't be confused by the name - pthread_kill() is simply a way of delivering an arbitrary signal to a thread. Note that signals like STOP, CONTINUE and TERMINATE are process-wide even if raised with pthread_kill(), not thread specific so don't use those.
In each of those threads you'll need a signal handler. On delivery of the signal to a thread the execution path in that thread will jump to the handler no matter what the long running function was doing.
You are now back in (limited) control, and can (probably, well, maybe) do some limited cleanup and terminate the thread.
In the meantime the main thread will have been calling pthread_join() on all the threads it's signaled, and those will now return.
My thoughts:
This is a really ugly way of doing it (and signals / pthreads are notoriously difficult to get right and I'm no expert), but I don't really see what other choice you have.
It'll be a long way from looking 'graceful' in source code, though the end user experience will be OK.
You will be aborting execution part way through running that library function, so if there's any clean up it would normally do (e.g. freeing up memory it has allocated) that won't get done and you'll have a memory leak. Running under something like valgrind is a way of working out if this is happening.
The only way of getting the library function to clean up (if it needs it) will be for your signal handler to return control to the function and letting it run to completion, just what you don't want to do.
And of course, this won't work on Windows (no pthreads, at least none worth speaking of, though there may be an equivalent mechanism).
Really the best way is going to be to re-implement (if at all possible) that library function.

How do I make a non concurrent print to file system in C++?

I am programming in C++ with the intention to provide some client/server communication between Unreal Engine 4 and my server.
I am in need of a logging system but the current ones are flooded by system messages.
So I made a Logger class with a ofstream object which I do file << "Write message." << endl.
Problem is that each object makes another instance of the ofstream and several longer writes to the file get cut off by newer writes.
I am looking for a way to queue writing to a file, this system/function/stream being easy to include and call.
Bonus points: the ofstream seems to complain whenever I try to write std::string and Fstring :|
log asynchronously using i.e. g2log or using a non-blocking socket wrapper, such as zeromq
ofstream can't be used across multiple threads. It needs to be synchronized using mutex or similar objects. Check the below thread for details:ofstream shared by mutiple threads - crashes after awhile
I wrote a quick example of how you can implement something like that. Please keep in mind that this may not be a final solution and still requires additional error checking and so on ...
#include <concurrent_queue.h>
#include <string>
#include <thread>
#include <fstream>
#include <future>
class Message
{
public:
Message() : text_(), sender_(), quit_(true)
{}
Message(std::string text, std::thread::id sender)
: text_(std::move(text)), sender_(sender), quit_(false)
{}
bool isQuit() const { return quit_; }
std::string getText() const { return text_; }
std::thread::id getSender() const { return sender_; }
private:
bool quit_;
std::string text_;
std::thread::id sender_;
};
class Log
{
public:
Log(const std::string& fileName)
: workerThread_(&Log::threadFn, this, fileName)
{}
~Log()
{
queue_.push(Message()); // push quit message
workerThread_.join();
}
void write(std::string text)
{
queue_.push(Message(std::move(text), std::this_thread::get_id()));
}
private:
static void threadFn(Log* log, std::string fileName)
{
std::ofstream out;
out.open(fileName, std::ios::out);
assert(out.is_open());
// Todo: ... some error checking here
Message msg;
while(true)
{
if(log->queue_.try_pop(msg))
{
if(msg.isQuit())
break;
out << msg.getText() << std::endl;
}
else
{
std::this_thread::yield();
}
}
}
concurrency::concurrent_queue<Message> queue_;
std::thread workerThread_;
};
int main(int argc, char* argv[])
{
Log log("test.txt");
Log* pLog = &log;
auto fun = [pLog]()
{
for(int i = 0; i < 100; ++i)
pLog->write(std::to_string(i));
};
// start some test threads
auto f0 = std::async(fun);
auto f1 = std::async(fun);
auto f2 = std::async(fun);
auto f3 = std::async(fun);
// wait for all
f0.get();
f1.get();
f2.get();
f3.get();
return 0;
}
The main idea is to use one Log class that has a thread safe write() method that may be called from multiple threads simultaneously. The Log class uses a worker thread to put all the file access to another thread. It uses a threadsafe (possibly lock-free) data structure to transfer all messages from the sending thread to the worker thread (I used concurrent_queue here - but there are others as well). Using a small Message wrapper it is very simple to tell the worker thread to shut down. Afterwards join it and everything is fine.
You have to make sure that the Log is not destroyed as long as any thread that may possibly write to it is still running.