So I have multiple threads writing to the same file by calling the Log::write method.
class Log
{
private:
ofstream log;
string file_path;
public:
Log(string);
void write(string);
};
Log::Log(string _file_path)
{
file_path=_file_path;
}
void Log::write(string str)
{
EnterCriticalSection(&CriticalSection);
log.open(file_path.c_str(),std::ofstream::app);
log<<str+'\n';
log.close();
LeaveCriticalSection(&CriticalSection);
}
Is it safe if multiple threads call the Log::write method of the same object at the same time?
Your code is wasteful and does not follow C++ idioms.
Starting from the end: yes, write is thread safe, because the Win32 CRITICAL_SECTION protects it from concurrent modifications.
Although:
Why open and close the stream each time? This is a very wasteful thing to do. Open the stream in the constructor and leave it open; the destructor will deal with closing the stream.
If you want to use a Win32 critical section, at least make it RAII safe: make a class that wraps a reference to the critical section, locking it in its constructor and unlocking it in its destructor. This way, even if an exception is thrown, you are guaranteed that the lock will be released. A minimal sketch is shown right after this list.
Where is the declaration of CriticalSection anyway? It should be a member of Log.
Are you aware of std::mutex?
Why are you passing strings by value? It is very inefficient. Pass them by const reference.
You use snake_case for some of the variables (file_path) but upper camel case for others (CriticalSection). Use the same convention.
str is never a good name for a string variable, and the file stream is not a log; it is the thing that does the actual logging, so a name like logger or file stream is better. In my correction I just named it m_file_stream.
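A minimal sketch of such a RAII wrapper (assuming <windows.h> is included and the CRITICAL_SECTION has already been initialized with InitializeCriticalSection; names are illustrative):
class CriticalSectionLock
{
    CRITICAL_SECTION& m_cs;
public:
    explicit CriticalSectionLock(CRITICAL_SECTION& cs) : m_cs(cs)
    {
        EnterCriticalSection(&m_cs);   // acquire on construction
    }
    ~CriticalSectionLock()
    {
        LeaveCriticalSection(&m_cs);   // release on destruction, even if an exception unwinds the stack
    }
    CriticalSectionLock(const CriticalSectionLock&) = delete;
    CriticalSectionLock& operator=(const CriticalSectionLock&) = delete;
};
write would then simply start with CriticalSectionLock guard(CriticalSection); and never call LeaveCriticalSection manually.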
Corrected code:
#include <fstream>
#include <mutex>
#include <string>

class Log
{
private:
std::mutex m_lock;
std::ofstream m_file_stream;
std::string m_file_path;
public:
Log(const std::string& file_path);
void write(const std::string& log);
};
Log::Log(const std::string& file_path):
m_file_path(file_path)
{
m_file_stream.open(m_file_path.c_str());
if (!m_file_stream.is_open() || !m_file_stream.good())
{
//throw relevant exception.
}
}
void Log::write(const std::string& log)
{
std::lock_guard<std::mutex> lock(m_lock);
m_file_stream << log << '\n';
}
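For illustration, the corrected class could be exercised from several threads like this (a sketch only; the file name and thread count are made up):
#include <string>
#include <thread>
#include <vector>

int main()
{
    Log log("app.log");
    std::vector<std::thread> writers;
    for (int i = 0; i < 4; ++i)
    {
        writers.emplace_back([&log, i] {
            log.write("message from thread " + std::to_string(i));
        });
    }
    for (auto& t : writers)
        t.join();
}
Each call to write holds the mutex for the duration of one line, so the lines may interleave in any order but are never mixed within a single line.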
I have a private static method in a class where all the methods are static. It's a helper class that helps with logging and other stuff. This helper class is called by multiple threads. I don't understand how the method in question works safely with multiple threads without a lock.
//Helper.cpp
std::recursive_mutex Logger::logMutex_;
void Logger::write(const std::string &logFilePath, const std::string &formattedLog)
{
std::lock_guard<std::recursive_mutex> guard(logMutex_);
std::ofstream logFile(logFilePath.c_str(), std::ios::out | std::ios::app);
if (logFile.is_open())
{
logFile << formattedLog;
logFile.close();
}
}
void Logger::error(const string &appId, const std::string &fmt, ...)
{
auto logFile = validateLogFile(appId); //Also a private static method that validates if a file exists for this app ID.
if (!logFile.empty())
{
//Format the log
write(logFile, log);
}
}
//Helper.h
class Logger
{
public:
static void error(const std::string &Id, const std::string &fmt, ...);
private:
static void write(const std::string &fileName, const std::string &formattedLog);
static std::recursive_mutex logMutex_;
};
I understand that the local variables inside the static methods are purely local, i.e., a stack frame is created every time these methods are called and the variables are initialized in it. Now, in the Logger::write method I'm opening a file and writing to it. So when multiple threads call the write method through the Logger::error method (which is again static) and there's no lock, I believe I should see some data race/crash.
This is because multiple threads are trying to open the same file, and even if the kernel allows the file to be opened multiple times, I would still expect to see some fault in the data written to the file.
I tested this by running up to 100 threads and I see no crash; all the data is written to the file. I can't completely understand how this works. With or without the lock, I see the data being written to the file perfectly.
TEST_F(GivenALogger, WhenLoggerMethodsAreCalledFromMultipleThreads_AllTheLogsMustBeLogged)
{
std::vector<std::thread> threads;
int num_threads = 100;
int i = 0;
for (; i < num_threads / 2; i++)
{
threads.push_back(std::thread(&Logger::error, validId, "Trial %d", i));
}
for (; i < num_threads; i++)
{
threads.push_back(std::thread(&Logger::debug, validId, "Trial %d", i));
}
std::for_each(threads.begin(), threads.end(), [](std::thread &t) { t.join(); });
auto actualLog = getActualLog(); // Returns a vector of log lines.
EXPECT_EQ(num_threads, actualLog.size());
}
Also, how should I properly/safely access the file?
The key is this line:
std::lock_guard<std::recursive_mutex> guard(logMutex_);
The std::lock_guard will lock the mutex logMutex_ in its constructor and unlock it in its destructor when the guard goes out of scope, i.e. when the method returns.
If another thread attempts to write while the first thread is within the guard scope, the new (local) guard will try to lock the logMutex_ and that thread will be put to sleep until the lock is released.
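Conceptually, the guard is roughly equivalent to the manual locking below (a sketch with a stand-in mutex, not the asker's exact code), except that the RAII version also unlocks if the method exits via an exception:
#include <mutex>

std::recursive_mutex mtx;        // stand-in for Logger::logMutex_

void write_without_guard()
{
    mtx.lock();                  // what the guard's constructor does
    // ... open the log file and write to it ...
    mtx.unlock();                // what the guard's destructor does; skipped if an exception escapes
}

void write_with_guard()
{
    std::lock_guard<std::recursive_mutex> guard(mtx);
    // ... open the log file and write to it ...
}                                // guard destroyed here; the mutex is released even on exceptions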
Problem
I think I'm misunderstanding the CV-Mutex design pattern because I'm creating a program that seems to not need a mutex, only CV.
Goal Overview
I am parsing a feed from a website for two different accounts, Alice and Bob. The parsing task is slow, so I have two separate threads, each dedicated to handling the feed of one account.
I then have a thread that receives messages from the network and assigns the work to either thread A or thread B, depending on who the update message is for. That way the reader/network thread isn't stalled, and the messages for Alice are in order and the messages for Bob are in order, too.
I don't care if the Alice thread is a little bit behind the Bob thread chronologically, as long as each individual account feed stays in order.
Implementation Details
This is very similar to a thread pool, except the threads are essentially locked to a fixed-size array of size 2, and I use the same thread for each feed.
I create an AccountThread class which maintains a queue of JSON messages to be processed as soon as possible. Here is the code for that:
#include <queue>
#include <string>
#include <condition_variable>
#include <mutex>
using namespace std;
class AccountThread {
public:
AccountThread(const string& name) : name(name) { }
void add_message(const string& d) {
this->message_queue.push(d);
this->cv.notify_all(); // could also do notify_one but whatever
}
void run_parsing_loop() {
while (true) {
std::unique_lock<std::mutex> mlock(lock_mutex);
cv.wait(mlock, [&] {
return this->is_dead || this->message_queue.size() > 0;
});
if (this->is_dead) { break; }
const auto message = this->message_queue.front();
this->message_queue.pop();
// Do message parsing...
}
}
void kill_thread() {
this->is_dead = true;
}
private:
const string& name;
condition_variable cv;
mutex lock_mutex;
queue<string> message_queue;
// To Kill Thread if Needed
bool is_dead;
};
I can add the main.cpp code, but it's essentially just a reader loop that calls thread.add_message(message) based on what the account name is.
Question
Why do I need the lock_mutex here? I don't see its purpose, since this class is essentially single-threaded. Is there a better design pattern for this? I feel like if I'm including a variable that I don't really need, such as the mutex, then I'm using the wrong design pattern for this task.
I'm just adapting the code from some article I saw online about a threadpool implementation and was curious.
First things first: there's no condition_variable::wait without a mutex. The interface of wait requires a mutex. So regarding
I'm creating a program that seems to not need a mutex, only CV
note that the mutex is needed to protect the condition variable itself. If the notion of how you'd have a data race without the mutex doesn't immediately make sense, check Why do pthreads’ condition variable functions require a mutex.
Secondly, there are multiple pain points in the code you provided. Consider this version where the problems are addressed; I'll explain the issues below:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
using namespace std;

class AccountThread {
public:
AccountThread(const string& name) : name(name)
{
consumer = std::thread(&AccountThread::run_parsing_loop, this); // 1
}
~AccountThread()
{
kill_thread(); // 2
consumer.join();
}
void add_message(const string& d) {
{
std::lock_guard lok(lock_mutex); // 3
this->message_queue.push(d);
}
this->cv.notify_one();
}
private:
void run_parsing_loop()
{
while (true) { // is_dead is only read under the lock, inside the wait predicate
std::unique_lock<std::mutex> mlock(lock_mutex);
cv.wait(mlock, [this] { // 4
return is_dead || !message_queue.empty();
});
if (this->is_dead) { break; }
std::string message = this->message_queue.front();
this->message_queue.pop();
string parsingMsg = name + " is processing " + message + "\n";
std::cout << parsingMsg;
}
}
void kill_thread() {
{
std::lock_guard lock(lock_mutex);
this->is_dead = true;
}
cv.notify_one(); // 5
}
private:
string name; // 6
mutable condition_variable cv; // 7
mutable mutex lock_mutex;
std::thread consumer;
queue<string> message_queue;
bool is_dead{false}; // 8
};
Top to bottom, the problems noted in the numbered comments are:
If you have a worker thread class, like AccountThread, it's easier to get right when the class provides the thread. This way only the relevant interface is exposed and you have better control over the lifetime and workings of the consumer.
Case in point, when an AccountThread "dies" the worker should also die. In the example above I fix this dependency by killing the consumer thread inside the destructor.
add_message caused a data race in your code. Since you intend to run the parsing loop in a different thread, it's wrong to simply push to the queue without having a critical section.
It's cleaner to capture this here, e.g. you probably don't need the reference to mlock captured.
kill_thread was not correct. You need to notify the (potentially waiting) consumer thread that a change in state happened. To do that correctly, you need to protect the state checked in the predicate with a lock.
The initial version with const string &name is probably not something you want. Member const references don't extend the lifetime of temporaries, and the way your constructor is written can leave an instance with a dangling reference. Even if you do the typical checks and overload the constructor with an rvalue-reference version, you'll be depending on an external string staying alive longer than your AccountThread object. Better to use a value member.
Remember the M&M rule: mutable and mutex go together. A brief sketch follows after the demo link below.
You had undefined behavior. The is_dead member was used without being initialized.
Demo
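To illustrate the M&M rule with a minimal sketch (illustrative names, not part of the original code): a const member function still needs to lock the mutex, so the mutex has to be mutable.
#include <mutex>
#include <queue>
#include <string>

class MessageBox
{
public:
    bool empty() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);   // locking "modifies" the mutex
        return m_queue.empty();
    }
private:
    mutable std::mutex m_mutex;                      // mutable so const methods can lock it
    std::queue<std::string> m_queue;
};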
All in all, I think the suggested changes point in the right direction. You can also check an implementation of a Go-like communication channel if you want more insight into how something like the TBB component you mention is implemented. Such a channel (or buffer queue) would simplify the implementation by avoiding manual use of mutexes, CVs and alive flags:
class AccountThread {
public:
AccountThread(const string& name) : name(name) {
consumer = std::thread(&AccountThread::run_parsing_loop, this);
}
~AccountThread() {
kill_thread();
consumer.join();
}
void add_message(const string& d) { _data.push(d); }
private:
void run_parsing_loop() {
try {
while (true) {
// This pop waits until there's data or the channel is closed.
auto message = _data.pop();
// TODO: Implement parsing here
}
} catch (...) {
// Single exception thrown per thread lifetime
}
}
void kill_thread() { _data.set(yap::BufferBehavior::Closed); }
private:
string name;
std::thread consumer;
yap::BufferQueue<string> _data;
};
Demo2
I have two threads, and each one captures packets from the same device at the same time, but the program crashes when the second thread reaches the pcap_compile() function. Each thread also has its own variables and doesn't use globals. It seems that they get the same handle for the device, and therefore the program crashes. Why do I need two threads? Because I want to separate packets into sent and received using a specified pcap filter. So how do I solve this? Or is it better to use one thread and sort the sent and received packets manually using the address from the TCP header?
pcap_compile is not thread safe. You must surround all calls to it that may be encountered by separate threads with a critical section/mutex to prevent errors because of non thread-safe state within the parser that compiles the expression (for the gory details, it uses YACC to create code for parsing the expression and the code generated for that is eminently not thread safe).
You need to explicitly open the device once per thread that you're planning on using for the capture; if you reuse the same device handle across multiple threads, it will simply not do what you're asking for. You should open the pcap handle within the thread that's going to use it, so each thread that's planning on doing capture should do its own pcap_open.
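A rough sketch of both points (a per-thread handle plus a serialized pcap_compile), assuming libpcap and C++11; the snapshot length, timeout and netmask constant below are placeholder choices:
#include <pcap.h>
#include <mutex>

static std::mutex compile_mutex;   // shared by every thread that calls pcap_compile

void capture_thread(const char* device, const char* filter_expr)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    // Each capture thread opens its own handle.
    pcap_t* handle = pcap_open_live(device, 65535, 1, 1000, errbuf);
    if (handle == nullptr)
        return;                                      // real code should report errbuf

    bpf_program filter;
    {
        std::lock_guard<std::mutex> lock(compile_mutex);   // pcap_compile is not thread safe
        // PCAP_NETMASK_UNKNOWN needs a reasonably recent libpcap; otherwise pass the real netmask.
        if (pcap_compile(handle, &filter, filter_expr, 1, PCAP_NETMASK_UNKNOWN) == -1)
        {
            pcap_close(handle);
            return;
        }
    }
    pcap_setfilter(handle, &filter);
    pcap_freecode(&filter);

    // ... capture on this thread's handle with pcap_loop or pcap_next_ex ...

    pcap_close(handle);
}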
To guard the call to pcap_compile with a critical section, you could create a simple wrapper (a C++ wrapper of the Windows critical section):
class lock_interface {
public:
virtual void lock() = 0;
virtual void unlock() = 0;
};
class cs : public lock_interface {
CRITICAL_SECTION crit;
public:
cs() { InitializeCriticalSection(&crit); }
~cs() { DeleteCriticalSection(&crit); }
virtual void lock() {
EnterCriticalSection(&crit);
}
virtual void unlock() {
LeaveCriticalSection(&crit);
}
private:
cs(const cs &);
cs &operator=(const cs &);
};
class locker {
lock_interface &m_ref;
public:
locker(lock_interface &ref) : m_ref(ref) { m_ref.lock(); }
~locker() { m_ref.unlock(); }
private:
locker(const locker &);
locker &operator=(const locker &);
};
static cs section;
int
wrapped_pcap_compile(pcap_t *p, struct bpf_program *fp, const char *str, int optimize, bpf_u_int32 netmask)
{
locker locked(section);
return pcap_compile(p, fp, str, optimize, netmask);
}
If you are using C++11, you can have something like:
int thread_safe_pcap_compile_nopcap(int snap_len, int link_type,
struct bpf_program *fp, char const *str,
int optimize, bpf_u_int32 netmask) {
static std::mutex mtx;
std::lock_guard<std::mutex> lock(mtx);
return pcap_compile_nopcap(snap_len, link_type, fp, str, optimize, netmask);
}
It is similar for the pcap_compile function.
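For example, the analogous wrapper might look like this (a sketch; if both wrappers are used in the same program, they should share a single mutex so the two entry points cannot run the parser concurrently):
int thread_safe_pcap_compile(pcap_t *p, struct bpf_program *fp, char const *str,
                             int optimize, bpf_u_int32 netmask) {
  static std::mutex mtx;   // ideally shared with the _nopcap wrapper above
  std::lock_guard<std::mutex> lock(mtx);
  return pcap_compile(p, fp, str, optimize, netmask);
}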
I am programming in C++ with the intention to provide some client/server communication between Unreal Engine 4 and my server.
I am in need of a logging system but the current ones are flooded by system messages.
So I made a Logger class with an ofstream object, to which I write with file << "Write message." << endl.
The problem is that each object makes another instance of the ofstream, and several longer writes to the file get cut off by newer writes.
I am looking for a way to queue writing to a file, this system/function/stream being easy to include and call.
Bonus points: the ofstream seems to complain whenever I try to write std::string and FString :|
Log asynchronously using e.g. g2log, or use a non-blocking socket wrapper such as ZeroMQ.
ofstream can't be used across multiple threads without synchronization. It needs to be synchronized using a mutex or similar objects. Check the thread below for details: ofstream shared by multiple threads - crashes after a while
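A minimal sketch of that mutex approach (illustrative names; the queue-based design shown below is the more scalable option):
#include <fstream>
#include <mutex>
#include <string>

class SyncFileLog
{
public:
    explicit SyncFileLog(const std::string& path)
        : out_(path, std::ios::app) {}

    void write(const std::string& line)
    {
        std::lock_guard<std::mutex> lock(mutex_);   // only one thread touches the stream at a time
        out_ << line << '\n';
    }

private:
    std::mutex mutex_;
    std::ofstream out_;
};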
I wrote a quick example of how you can implement something like that. Please keep in mind that this may not be a final solution and still requires additional error checking and so on ...
#include <concurrent_queue.h>
#include <cassert>
#include <string>
#include <thread>
#include <fstream>
#include <future>
class Message
{
public:
Message() : text_(), sender_(), quit_(true)
{}
Message(std::string text, std::thread::id sender)
: text_(std::move(text)), sender_(sender), quit_(false)
{}
bool isQuit() const { return quit_; }
std::string getText() const { return text_; }
std::thread::id getSender() const { return sender_; }
private:
std::string text_;
std::thread::id sender_;
bool quit_;
};
class Log
{
public:
Log(const std::string& fileName)
: workerThread_(&Log::threadFn, this, fileName)
{}
~Log()
{
queue_.push(Message()); // push quit message
workerThread_.join();
}
void write(std::string text)
{
queue_.push(Message(std::move(text), std::this_thread::get_id()));
}
private:
static void threadFn(Log* log, std::string fileName)
{
std::ofstream out;
out.open(fileName, std::ios::out);
assert(out.is_open());
// Todo: ... some error checking here
Message msg;
while(true)
{
if(log->queue_.try_pop(msg))
{
if(msg.isQuit())
break;
out << msg.getText() << std::endl;
}
else
{
std::this_thread::yield();
}
}
}
concurrency::concurrent_queue<Message> queue_;
std::thread workerThread_;
};
int main(int argc, char* argv[])
{
Log log("test.txt");
Log* pLog = &log;
auto fun = [pLog]()
{
for(int i = 0; i < 100; ++i)
pLog->write(std::to_string(i));
};
// start some test threads
auto f0 = std::async(fun);
auto f1 = std::async(fun);
auto f2 = std::async(fun);
auto f3 = std::async(fun);
// wait for all
f0.get();
f1.get();
f2.get();
f3.get();
return 0;
}
The main idea is to use one Log class that has a thread safe write() method that may be called from multiple threads simultaneously. The Log class uses a worker thread to put all the file access to another thread. It uses a threadsafe (possibly lock-free) data structure to transfer all messages from the sending thread to the worker thread (I used concurrent_queue here - but there are others as well). Using a small Message wrapper it is very simple to tell the worker thread to shut down. Afterwards join it and everything is fine.
You have to make sure that the Log is not destroyed as long as any thread that may possibly write to it is still running.
I am trying to create a logger for multithreaded C++ code using Boost. Here's my code:
class logger
{
private:
boost::mutex logMtx;
public:
logger()
{
}
~logger()
{
}
void logString(string z)
{
boost::mutex::scoped_lock lock(logMtx);
std::cout<<z<<std::endl;
std::cout.flush();
}
};
Then I share an instance of this (created in the main thread before creating the other threads) with multiple threads and call the logString function for logging. It does not seem to work. Some lines come out truncated (the whole string is not printed; i.e., if I pass "abcd" it sometimes prints "bcd").
Is there something wrong with this approach?