What is the easiest way to create my own std::cerr so that it is line-by-line thread-safe?
I am preferably looking for the code to do it.
What I need is for a line of output (terminated with std::endl) generated by one thread to remain a single line when it actually appears on my console, not mixed with some other thread's output.
Solution: std::cerr is much slower than cstdio, so I prefer using fprintf(stderr, "The message") inside the scope of a CriticalSectionLocker, a class whose constructor acquires a lock and whose destructor releases it.
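The answer does not show CriticalSectionLocker itself, so here is a minimal sketch of what such a RAII wrapper could look like; the class name comes from the answer, but the single process-wide std::mutex and the report() helper are my assumptions:

#include <cstdio>
#include <mutex>

// Hypothetical RAII lock guard; the shared mutex is an assumption,
// not code from the answer above.
class CriticalSectionLocker
{
public:
    CriticalSectionLocker()  { mutex().lock(); }    // acquire on construction
    ~CriticalSectionLocker() { mutex().unlock(); }  // release on destruction

private:
    static std::mutex& mutex()
    {
        static std::mutex m;   // one lock guarding all stderr output
        return m;
    }
};

void report(int code)
{
    CriticalSectionLocker lock;   // held until the end of this scope
    std::fprintf(stderr, "error code %d\n", code);
}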
If available, std::osyncstream (C++20) solves this problem:
#include <syncstream> // C++20
std::osyncstream tout(std::cout);
std::osyncstream terr(std::cerr);
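Each osyncstream accumulates its output and transfers it to the wrapped stream as one atomic block when it is destroyed (or when emit() is called). A short usage sketch; the worker function and message text are just illustrative:

#include <syncstream> // C++20
#include <iostream>
#include <thread>
#include <vector>

void worker(int id)
{
    // Everything inserted into this temporary osyncstream is emitted
    // atomically when it is destroyed at the end of the statement.
    std::osyncstream(std::cout) << "Thread " << id << ": launched\n";
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i)
        threads.emplace_back(worker, i);
    for (auto& t : threads)
        t.join();
}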
If the above feature is not available, here is a drop-in header file containing two macros for thread-safe writing to std::cout and std::cerr (they must share a mutex in order to avoid interleaving of output). It is based on two other answers, but I have made some changes so it is easy to drop into an existing code base. It works with C++11 and later.
I've tested this with 4 threads on a 4-core processor, with each thread writing 25,000 lines per second to tout and occasional output to terr, and it solves the output interleaving problem. Unlike a struct-based solution, there was no measurable performance hit for my application when dropping in this header file. The only drawback I can think of is that since this relies on macros, they can't be placed into a namespace.
threadstream.h
#ifndef THREADSTREAM
#define THREADSTREAM
#include <iostream>
#include <sstream>
#include <mutex>
#define terr ThreadStream(std::cerr)
#define tout ThreadStream(std::cout)
/**
 * Thread-safe std::ostream class.
 *
 * Usage:
 *     tout << "Hello world!" << std::endl;
 *     terr << "Hello world!" << std::endl;
 */
class ThreadStream : public std::ostringstream
{
public:
    ThreadStream(std::ostream& os) : os_(os)
    {
        // copyfmt causes odd problems with lost output
        // probably some specific flag
        // copyfmt(os);

        // copy whatever properties are relevant
        imbue(os.getloc());
        precision(os.precision());
        width(os.width());
        setf(std::ios::fixed, std::ios::floatfield);
    }

    ~ThreadStream()
    {
        std::lock_guard<std::mutex> guard(_mutex_threadstream);
        os_ << this->str();
    }

private:
    static std::mutex _mutex_threadstream;
    std::ostream& os_;
};

std::mutex ThreadStream::_mutex_threadstream{};
#endif
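One caveat: because the static mutex is defined at namespace scope inside the header, including threadstream.h from more than one translation unit will produce a multiple-definition linker error. A possible adjustment (my assumption, not part of the original answer) is to hand out the mutex from an inline accessor, whose function-local static C++11 guarantees to be initialized exactly once and which is shared across translation units:

// Replaces the class-static mutex and its out-of-line definition.
inline std::mutex& threadstream_mutex()
{
    static std::mutex m;   // a single instance shared by every includer
    return m;
}

// ~ThreadStream() would then lock it with:
//     std::lock_guard<std::mutex> guard(threadstream_mutex());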
test.cc
#include <thread>
#include <vector>
#include <iomanip>
#include "threadstream.h"
void test(const unsigned int threadNumber)
{
    tout << "Thread " << threadNumber << ": launched" << std::endl;
}

int main()
{
    std::locale mylocale("");              // get global locale
    std::cerr.imbue(mylocale);             // imbue global locale
    std::ios_base::sync_with_stdio(false); // disable synch with stdio (enables input buffering)

    std::cout << std::fixed << std::setw(4) << std::setprecision(5);
    std::cerr << std::fixed << std::setw(2) << std::setprecision(2);

    std::vector<std::thread> threads;
    for (unsigned int threadNumber = 0; threadNumber < 16; threadNumber++)
    {
        std::thread t(test, threadNumber);
        threads.push_back(std::move(t));
    }

    for (std::thread& t : threads)
    {
        if (t.joinable())
        {
            t.join();
        }
    }

    terr << std::endl << "Main: " << "Test completed." << std::endl;

    return 0;
}
compiling
g++ -g -O2 -Wall -c -o test.o test.cc
g++ -o test test.o -pthread
output
./test
Thread 0: launched
Thread 4: launched
Thread 3: launched
Thread 1: launched
Thread 2: launched
Thread 6: launched
Thread 5: launched
Thread 7: launched
Thread 8: launched
Thread 9: launched
Thread 10: launched
Thread 11: launched
Thread 12: launched
Thread 13: launched
Thread 14: launched
Thread 15: launched
Main: Test completed.
Here's a thread-safe, line-based logging solution I cooked up at some point. It uses a boost mutex for thread safety. It is slightly more complicated than necessary because you can plug in output policies (should it go to a file, stderr, or somewhere else?):
logger.h:
#ifndef LOGGER_20080723_H_
#define LOGGER_20080723_H_
#include <boost/thread/mutex.hpp>
#include <iostream>
#include <cassert>
#include <sstream>
#include <ctime>
#include <ostream>
namespace logger {
namespace detail {
template<class Ch, class Tr, class A>
class no_output {
private:
struct null_buffer {
template<class T>
null_buffer &operator<<(const T &) {
return *this;
}
};
public:
typedef null_buffer stream_buffer;
public:
void operator()(const stream_buffer &) {
}
};
template<class Ch, class Tr, class A>
class output_to_clog {
public:
typedef std::basic_ostringstream<Ch, Tr, A> stream_buffer;
public:
void operator()(const stream_buffer &s) {
static boost::mutex mutex;
boost::mutex::scoped_lock lock(mutex);
std::clog << now() << ": " << s.str() << std::endl;
}
private:
static std::string now() {
char buf[64];
const time_t tm = time(0);
strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", localtime(&tm));
return buf;
}
};
template<template <class Ch, class Tr, class A> class OutputPolicy, class Ch = char, class Tr = std::char_traits<Ch>, class A = std::allocator<Ch> >
class logger {
typedef OutputPolicy<Ch, Tr, A> output_policy;
public:
~logger() {
output_policy()(m_SS);
}
public:
template<class T>
logger &operator<<(const T &x) {
m_SS << x;
return *this;
}
private:
typename output_policy::stream_buffer m_SS;
};
}
class log : public detail::logger<detail::output_to_clog> {
};
}
#endif
Usage looks like this:
logger::log() << "this is a test" << 1234 << "testing";
Note the lack of a '\n' or std::endl; the newline is implicit. The contents are buffered and then output atomically using the policy given as a template parameter. This implementation also prepends a timestamp to each line, since it is intended for logging. The no_output policy is strictly optional; it's what I use when I want to disable logging.
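To illustrate the pluggable-policy idea, here is a rough sketch of what an additional file-writing policy could look like. The policy name, the file name, and its placement alongside output_to_clog inside namespace logger::detail are my assumptions (it also needs <fstream>):

// Hypothetical extra policy; same interface as output_to_clog above.
template<class Ch, class Tr, class A>
class output_to_file {
public:
    typedef std::basic_ostringstream<Ch, Tr, A> stream_buffer;

    void operator()(const stream_buffer &s) {
        static boost::mutex mutex;
        boost::mutex::scoped_lock lock(mutex);
        static std::ofstream out("log.txt", std::ios::app); // illustrative file name
        out << s.str() << std::endl;
    }
};

// A matching logger type, analogous to logger::log above, would be:
// class file_log : public detail::logger<detail::output_to_file> {};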
This:
#define myerr(e) {CriticalSectionLocker crit; std::cerr << e << std::endl;}
works on most compilers for the common case of myerr("ERR: " << message << number).
Why not just create a locking class and use it wherever you want to do thread-safe IO?
class LockIO
{
static pthread_mutex_t *mutex;
public:
LockIO() { pthread_mutex_lock( mutex ); }
~LockIO() { pthread_mutex_unlock( mutex ); }
};
static pthread_mutex_t* getMutex()
{
pthread_mutex_t *mutex = new pthread_mutex_t;
pthread_mutex_init( mutex, NULL );
return mutex;
}
pthread_mutex_t* LockIO::mutex = getMutex();
Then you put any IO you want in a block:
std::cout <<"X is " <<x <<std::endl;
becomes:
{
LockIO lock;
std::cout <<"X is " <<x <<std::endl;
}
An improvement (that doesn't really fit in a comment) on the approach in unixman's comment.
#define LOCKED_ERR \
if(ErrCriticalSectionLocker crit = ErrCriticalSectionLocker()); \
else std::cerr
Which can be used like
LOCKED_ERR << "ERR: " << message << endl;
if ErrCriticalSectionLocker is implemented carefully.
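Here is a minimal sketch of what "implemented carefully" could mean; the class name follows the macro above, but the shared mutex and the conversion-to-bool trick are my assumptions. The locker must convert to false so that the empty if-body is skipped and the "else std::cerr" branch runs while the lock object is still alive for the whole statement:

#include <mutex>

// Hypothetical implementation; not from the original answer.
class ErrCriticalSectionLocker
{
public:
    ErrCriticalSectionLocker()  { mutex().lock(); }
    ~ErrCriticalSectionLocker() { mutex().unlock(); }

    // Must be false so the empty "if" body is skipped and the
    // "else std::cerr << ..." branch runs with the lock held.
    explicit operator bool() const { return false; }

private:
    static std::mutex& mutex()
    {
        static std::mutex m;   // one lock shared by all uses of LOCKED_ERR
        return m;
    }
};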
But, I would personally prefer Ken's suggestion.
Related
[It is not necessary to follow the links to understand the question].
I combined the singleton-pattern implementation from this answer with the synchronized file writing from this other answer.
Then I wanted to see if the interface of SynchronizedFile could provide a variadic templated write method, but I couldn't figure out how to properly combine this with the std::lock_guard.
Below is a non-working example. In this case it doesn't work because (I think) the two threads manage to pump stuff into the buffer i_buf in a non-synchronized way, resulting in a garbled LOGFILE.txt.
If I put the std::lock_guard inside the general (variadic) template of write, then the program hangs and never finishes.
#include <iostream>
#include <mutex>
#include <sstream>
#include <fstream>
#include <string>
#include <memory>
#include <thread>
static const int N_LOOP_LENGTH{10};
// This class manages a log file and provides write method(s)
// that allow passing a variable number of parameters of different
// types to be written to the file in a line and separated by commas.
class SynchronizedFile {
public:
static SynchronizedFile& getInstance()
{
static SynchronizedFile instance;
return instance;
}
private:
std::ostringstream i_buf;
std::ofstream i_fout;
std::mutex _writerMutex;
SynchronizedFile () {
i_fout.open("LOGFILE.txt", std::ofstream::out);
}
public:
SynchronizedFile(SynchronizedFile const&) = delete;
void operator=(SynchronizedFile const&) = delete;
template<typename First, typename... Rest>
void write(First param1, Rest...param)
{
i_buf << param1 << ", ";
write(param...);
}
void write()
{
std::lock_guard<std::mutex> lock(_writerMutex);
i_fout << i_buf.str() << std::endl;
i_buf.str("");
i_buf.clear();
}
};
// This is just some class that is using the SynchronizedFile class
// to write stuff to the log file.
class Writer {
public:
Writer (SynchronizedFile& sf, const std::string& prefix)
: syncedFile(sf), prefix(prefix) {}
void someFunctionThatWritesToFile () {
syncedFile.write(prefix, "AAAAA", 4343, "BBBBB", 0.2345435, "GGGGGG");
}
private:
SynchronizedFile& syncedFile;
std::string prefix;
};
void thread_method()
{
SynchronizedFile &my_file1 = SynchronizedFile::getInstance();
Writer writer1(my_file1, "Writer 1:");
for (int i = 0; i < N_LOOP_LENGTH; ++ i)
writer1.someFunctionThatWritesToFile();
}
int main()
{
std::thread t(thread_method);
SynchronizedFile &my_file2 = SynchronizedFile::getInstance();
Writer writer2(my_file2, "Writer 2:");
for (int i = 0; i < N_LOOP_LENGTH; ++i)
writer2.someFunctionThatWritesToFile();
t.join();
std::cout << "Done" << std::endl;
return 0;
}
How could I successfully combine these three ideas?
The program deadlocks because write calls itself recursively while still holding the lock.
Either use a std::recursive_mutex or release the lock after writing your data out but before calling write.
Edit: Unlocking doesn't do the job; I didn't think this through...
Edit: Or lock once and defer to another private method to do the writing.
template<typename... Args>
void write(Args&&... args)
{
std::unique_lock<std::mutex> lock(_writerMutex);
_write(std::forward<Args>(args)...);
}
template<typename First, typename... Rest>
void _write(First&& param1, Rest&&... param) // private method
{
i_buf << std::forward<First>(param1) << ", ";
_write(std::forward<Rest>(param)...);
}
void _write()
{
i_fout << i_buf.str() << std::endl;
i_buf.str(""); // reset the buffered contents; clear() below only resets the stream's state flags
i_buf.clear();
}
I'm using the CRTP design pattern to implement a logging mechanism for my project. The base CRTP class looks like this:
#include <fstream>
#include <memory>
#include <mutex>
#include <iostream>
#include <sstream>
#include <string>    // std::string
#include <ctime>     // time, ctime
#include <algorithm> // std::remove
template <typename LogPolicy>
class Logger
{
public:
template <typename... Args>
void operator()(Args... args)
{
loggingMutex.lock();
putTime();
print_impl(args...);
}
void setMaxLogFileSize(unsigned long maxLogFileSizeArg)
{
//if (dynamic_cast<FileLogPolicy *>(policy.get()))
// policy->setMaxLogFileSize(maxLogFileSizeArg);
}
~Logger()
{
print_impl(END_OF_LOGGING);
}
protected:
std::stringstream buffer;
std::mutex loggingMutex;
std::string d_time;
private:
static constexpr auto END_OF_LOGGING = "***END OF LOGGING***";
void putTime()
{
time_t raw_time;
time(&raw_time);
std::string localTime = ctime(&raw_time);
localTime.erase(std::remove(localTime.begin(), localTime.end(), '\n'), localTime.end());
buffer << localTime;
}
template <typename First, typename... Rest>
void print_impl(First first, Rest... rest)
{
buffer << " " << first;
print_impl(rest...);
}
void print_impl()
{
static_cast<LogPolicy*>(this)->write(buffer.str());
buffer.str("");
}
};
One of the concrete logging classes writes to a file and looks like this:
#include "Logger.hpp"
class FileLogPolicy : public Logger<FileLogPolicy>
{
public:
FileLogPolicy(std::string fileName) : logFile(new std::ofstream)
{
logFile->open(fileName, std::ofstream::out | std::ofstream::binary);
if (logFile->is_open())
{
std::cout << "Opening stream with addr " << (logFile.get()) << std::endl;
}
}
void write(const std::string content)
{
std::cout << "Writing stream with addr " << (logFile.get()) << std::endl;
(*logFile) << " " << content << std::endl;
loggingMutex.unlock();
}
virtual ~FileLogPolicy()
{
}
private:
std::unique_ptr<std::ofstream> logFile; //Pointer to logging stream
static const char *const S_FILE_NAME; //File name used to store logging
size_t d_maxLogFileSize; //File max size used to store logging
};
Basically I create an object of the policy class and would like to log stuff, depending on the policy chosen. So, for example, I create a logger like this:
FileLogPolicy log("log.txt");
In this case it should use Logger to save logs to a file by calling static_cast<LogPolicy*>(this)->write(buffer.str()). Calling the write function appears to work, but the stream object changes to null. How is that possible if the FileLogPolicy destructor has not been called yet? When I change logFile to a plain pointer, everything works. I don't see where the difference is.
~Logger()
{
print_impl(END_OF_LOGGING);
}
This code runs after the derived class has been destroyed.
void print_impl()
{
static_cast<LogPolicy*>(this)->write(buffer.str());
buffer.str("");
}
It then casts this to a pointer to a class that this no longer is.
The unique_ptr is gone, and even accessing the member is undefined behaviour.
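One common fix (my own sketch, not code from this answer) is to perform the final flush from the most-derived destructor, while its members are still alive, and leave the base destructor empty. A stand-alone illustration of the ordering:

#include <iostream>
#include <memory>
#include <string>

// Minimal illustration of destruction order in a CRTP hierarchy; the class
// and member names are invented for this example.
template <typename Derived>
struct Base
{
    ~Base()
    {
        // At this point the Derived part has already been destroyed;
        // calling static_cast<Derived*>(this)->write(...) here would be UB.
        std::cout << "~Base: derived members are already gone\n";
    }
};

struct Derived : Base<Derived>
{
    std::unique_ptr<std::string> resource =
        std::make_unique<std::string>("log stream");

    ~Derived()
    {
        // Safe: members such as 'resource' are still alive here.
        write("***END OF LOGGING***");
    }

    void write(const std::string& msg)
    {
        std::cout << *resource << ": " << msg << '\n';
    }
};

int main()
{
    Derived d;   // prints the closing line from ~Derived, then ~Base runs
}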
This question already has answers here:
Using boost thread and a non-static class function
(5 answers)
Closed 5 years ago.
Here is my class.h
class threads_queue{
private:
boost::condition_variable the_condition_variable;
public:
//boost::atomic<bool> done;
boost::lockfree::spsc_queue<std::pair<char, std::string>> q{100};
//threads_queue() : done(false) {};
void static run_function();
void add_query(std::string, std::string);
void get_query(void);
};
And here is class.cpp:
void threads_queue::get_query(void){
std::pair<char, std::string> value;
//to do..
}
void threads_queue::add_query(std::string str, std::string work){
//to do ..
}
void run_function(){
//Here I want to create two threads
//First thread like
boost::thread producer_thread(add_query);
boost::thread consumer_thread(get_query);
producer_thread.join();
//done = true;
consumer_thread.join()
}
I'm following this example:
http://www.boost.org/doc/libs/1_54_0/doc/html/lockfree/examples.html
But the problem is that when I want to create a thread I always get an error; it does not work.
Here are my attempts to solve the error:
1.
boost::thread consumer_thread(&threads_queue::get_query);
I got this error:
Called object type 'void (threads_queue::*)()' is not a function or
function pointer
2.
boost::thread consumer_thread(&threads_queue::get_query, this);
I got this error:
Invalid use of 'this' outside of a non-static member function
3.
boost::thread* thr = new boost::thread(boost::bind(&threads_queue::get_query));
I got this error:
/usr/local/include/boost/bind/bind.hpp:75:22: Type 'void
(threads_queue::*)()' cannot be used prior to '::' because it has no
members
I am not sure how this problem could be solved. Any help?
UPDATE
This topic has a great discussion of the problem:
Using boost thread and a non-static class function
My main problem was I forgot to add
threads_queue::
before run_function() in my cpp file; Mikhail's comments below were a great help.
There is a lot wrong with your code.
Non-static member functions are tied to an instance of the class. In addition to the function, you need to pass an instance of the class to boost::thread's constructor. This has nothing to do with threads or boost.
threads_queue should "own" the threads and should probably be renamed to something like a thread container. The whole class should, at a minimum, be non-copyable. A minimal sketch of launching member functions on an instance is shown below.
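In this sketch the simplified class body and the argument values are purely illustrative; only the pattern of passing the member-function pointer together with the object matters:

#include <boost/thread/thread.hpp>
#include <string>

struct threads_queue
{
    void get_query() { /* consume items */ }
    void add_query(std::string str, std::string work) { /* produce items */ }
};

int main()
{
    threads_queue queue;

    // Pass the member-function pointer AND the object to invoke it on,
    // followed by any arguments the member function takes.
    boost::thread producer_thread(&threads_queue::add_query, &queue,
                                  std::string("query"), std::string("work"));
    boost::thread consumer_thread(&threads_queue::get_query, &queue);

    producer_thread.join();
    consumer_thread.join();
    return 0;
}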
Here is a complete example written by somebody other than me.
// Copyright (C) 2009 Tim Blechmann
//
// Distributed under the Boost Software License, Version 1.0. (See
// accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)
//[spsc_queue_example
#include <boost/thread/thread.hpp>
#include <boost/lockfree/spsc_queue.hpp>
#include <iostream>
#include <boost/atomic.hpp>
int producer_count = 0;
boost::atomic_int consumer_count (0);
boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024> > spsc_queue;
const int iterations = 10000000;
void producer(void)
{
for (int i = 0; i != iterations; ++i) {
int value = ++producer_count;
while (!spsc_queue.push(value))
;
}
}
boost::atomic<bool> done (false);
void consumer(void)
{
int value;
while (!done) {
while (spsc_queue.pop(value))
++consumer_count;
}
while (spsc_queue.pop(value))
++consumer_count;
}
int main(int argc, char* argv[])
{
using namespace std;
cout << "boost::lockfree::queue is ";
if (!spsc_queue.is_lock_free())
cout << "not ";
cout << "lockfree" << endl;
boost::thread producer_thread(producer);
boost::thread consumer_thread(consumer);
producer_thread.join();
done = true;
consumer_thread.join();
cout << "produced " << producer_count << " objects." << endl;
cout << "consumed " << consumer_count << " objects." << endl;
}
//]
I've tried to implement a very basic Thread Local Singleton class in C++ - it's a template class that other classes then inherit from. The problem is that it almost always works, but every now and again (say, 1 run in 15), it will fail with an error along the lines of:
*** glibc detected *** ./myExe: free(): invalid next size (fast): 0x00002b61a40008c0 ***
Please forgive the rather contrived example below, but it serves to demonstrate the problem.
#include <thread>
#include <atomic>
#include <iostream>
#include <memory>
#include <vector>
using namespace std;
template<class T>
class ThreadLocalSingleton
{
public:
/// Return a reference to an instance of the object
static T& instance();
typedef unique_ptr<T> UPtr;
protected:
ThreadLocalSingleton() {}
ThreadLocalSingleton(ThreadLocalSingleton const&);
void operator=(ThreadLocalSingleton const&);
};
template<class T>
T& ThreadLocalSingleton<T>::instance()
{
thread_local T m_instance;
return m_instance;
}
// Create two atomic variables to keep track of the number of times the
// TLS class is created and accessed.
atomic<size_t> creationCount(0);
atomic<size_t> accessCount(0);
// Very simple class which derives from TLS
class MyClass : public ThreadLocalSingleton<MyClass>
{
friend class ThreadLocalSingleton<MyClass>;
public:
MyClass()
{
++creationCount;
}
string getType() const
{
++accessCount;
return "MyClass";
}
};
int main(int,char**)
{
vector<thread> threads;
vector<string> results;
threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&]() { results.emplace_back(MyClass::instance().getType()); MyClass::instance().getType(); });
for (auto& t : threads)
{
t.join();
}
// Expecting 4 creations and 8 accesses.
cout << "CreationCount: " << creationCount << " AccessCount: " << accessCount << endl;
}
I can replicate this on coliru, using the build command:
g++ -std=c++11 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
Many thanks!
Thanks to both molbdnilo and Damon, who quickly pointed out the obvious: vector::emplace_back isn't thread-safe, so there is no guarantee that this code will actually work. I've replaced the main() function with the following, which seems to be more reliable.
int main(int,char**)
{
vector<thread> threads;
vector<string> results;
auto addToResult = [&results](const string& val)
{
static mutex m_mutex;
unique_lock<mutex> lock(m_mutex);
results.emplace_back(val);
};
threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
threads.emplace_back([&addToResult]() { addToResult(MyClass::instance().getType()); MyClass::instance().getType(); });
for (auto& t : threads)
{
t.join();
}
// Expecting 4 creations and 8 accesses.
cout << "CreationCount: " << creationCount << " AccessCount: " << accessCount << endl;
}
Thanks!
Can someone give me a TBB example of how to:
set the maximum count of active threads;
execute tasks that are independent of each other and are presented in the form of a class, not static functions.
Here's a couple of complete examples, one using parallel_for, the other using parallel_for_each.
Update 2014-04-12: These show what I'd consider to be a pretty old fashioned way of using TBB now; I've added a separate answer using parallel_for with a C++11 lambda.
#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"
#include "tbb/task_scheduler_init.h"
#include <iostream>
#include <vector>
struct mytask {
mytask(size_t n)
:_n(n)
{}
void operator()() {
for (int i=0;i<1000000;++i) {} // Deliberately run slow
std::cerr << "[" << _n << "]";
}
size_t _n;
};
struct executor
{
executor(std::vector<mytask>& t)
:_tasks(t)
{}
executor(executor& e,tbb::split)
:_tasks(e._tasks)
{}
void operator()(const tbb::blocked_range<size_t>& r) const {
for (size_t i=r.begin();i!=r.end();++i)
_tasks[i]();
}
std::vector<mytask>& _tasks;
};
int main(int,char**) {
tbb::task_scheduler_init init; // Automatic number of threads
// tbb::task_scheduler_init init(2); // Explicit number of threads
std::vector<mytask> tasks;
for (int i=0;i<1000;++i)
tasks.push_back(mytask(i));
executor exec(tasks);
tbb::parallel_for(tbb::blocked_range<size_t>(0,tasks.size()),exec);
std::cerr << std::endl;
return 0;
}
and
#include "tbb/parallel_for_each.h"
#include "tbb/task_scheduler_init.h"
#include <iostream>
#include <vector>
struct mytask {
mytask(size_t n)
:_n(n)
{}
void operator()() {
for (int i=0;i<1000000;++i) {} // Deliberately run slow
std::cerr << "[" << _n << "]";
}
size_t _n;
};
template <typename T> struct invoker {
void operator()(T& it) const {it();}
};
int main(int,char**) {
tbb::task_scheduler_init init; // Automatic number of threads
// tbb::task_scheduler_init init(4); // Explicit number of threads
std::vector<mytask> tasks;
for (int i=0;i<1000;++i)
tasks.push_back(mytask(i));
tbb::parallel_for_each(tasks.begin(),tasks.end(),invoker<mytask>());
std::cerr << std::endl;
return 0;
}
Both compile on a Debian/Wheezy (g++ 4.7) system with g++ tbb_example.cpp -ltbb (then run with ./a.out)
(See this question for replacing that "invoker" thing with a std::mem_fun_ref or boost::bind).
Here's a more modern use of parallel_for with a lambda; compiles and runs on Debian/Wheezy with g++ -std=c++11 tbb_example.cpp -ltbb && ./a.out:
#include "tbb/parallel_for.h"
#include "tbb/task_scheduler_init.h"
#include <iostream>
#include <vector>
struct mytask {
mytask(size_t n)
:_n(n)
{}
void operator()() {
for (int i=0;i<1000000;++i) {} // Deliberately run slow
std::cerr << "[" << _n << "]";
}
size_t _n;
};
int main(int,char**) {
//tbb::task_scheduler_init init; // Automatic number of threads
tbb::task_scheduler_init init(tbb::task_scheduler_init::default_num_threads()); // Explicit number of threads
std::vector<mytask> tasks;
for (int i=0;i<1000;++i)
tasks.push_back(mytask(i));
tbb::parallel_for(
tbb::blocked_range<size_t>(0,tasks.size()),
[&tasks](const tbb::blocked_range<size_t>& r) {
for (size_t i=r.begin();i<r.end();++i) tasks[i]();
}
);
std::cerr << std::endl;
return 0;
}
If you just want to run a couple of tasks concurrently, it might be easier to just use a tbb::task_group. Example taken from the TBB documentation:
#include "tbb/task_group.h"
using namespace tbb;
int Fib(int n) {
if( n<2 ) {
return n;
} else {
int x, y;
task_group g;
g.run([&]{x=Fib(n-1);}); // spawn a task
g.run([&]{y=Fib(n-2);}); // spawn another task
g.wait(); // wait for both tasks to complete
return x+y;
}
}
Note however that
Creating a large number of tasks for a single task_group is not scalable, because task creation becomes a serial bottleneck.
In those cases, use timday's examples with parallel_for or the like.
1-
//!
//! Get the default number of threads
//!
int nDefThreads = tbb::task_scheduler_init::default_num_threads();
//!
//! Init the task scheduler with the wanted number of threads
//!
tbb::task_scheduler_init init(nDefThreads);
2-
Maybe, if your code permits it, the best way to run independent tasks with TBB is parallel_invoke. There is a post in the Intel Developer Zone blog explaining some cases of how helpful parallel_invoke can be. Check out this.
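A short sketch of tbb::parallel_invoke running a few independent tasks (the task object and its member functions are illustrative, not taken from the post mentioned above):

#include "tbb/parallel_invoke.h"
#include <iostream>

// Any callable works, including member functions bound to an instance.
struct Worker {
    void load()    { std::cout << "loading\n"; }
    void analyze() { std::cout << "analyzing\n"; }
};

int main() {
    Worker w;

    // Runs the three independent tasks, potentially in parallel, and
    // returns once all of them have completed.
    tbb::parallel_invoke(
        [&w] { w.load(); },
        [&w] { w.analyze(); },
        []   { std::cout << "independent lambda task\n"; });

    return 0;
}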