How to allocate thread local storage? - c++

I have a variable in my function that is static, but I would like it to be static on a per thread basis.
How can I allocate the memory for my C++ class such that each thread has its own copy of the class instance?
void AnotherClass::threadSpecificAction()
{
// How to allocate this with thread local storage?
static MyClass *instance = new MyClass();
instance->doSomething();
}
This is on Linux. I'm not using C++0x and this is gcc v3.4.6.

#include <boost/thread/tss.hpp>
static boost::thread_specific_ptr< MyClass> instance;
if( ! instance.get() ) {
// first time called by this thread
// construct test element to be used in all subsequent calls from this thread
instance.reset( new MyClass);
}
instance->doSomething();

It is worth noting that C++11 introduces the thread_local keyword.
Here is an example from Storage duration specifiers:
#include <iostream>
#include <string>
#include <thread>
#include <mutex>
thread_local unsigned int rage = 1;
std::mutex cout_mutex;
void increase_rage(const std::string& thread_name)
{
++rage;
std::lock_guard<std::mutex> lock(cout_mutex);
std::cout << "Rage counter for " << thread_name << ": " << rage << '\n';
}
int main()
{
std::thread a(increase_rage, "a"), b(increase_rage, "b");
increase_rage("main");
a.join();
b.join();
return 0;
}
Possible output:
Rage counter for a: 2
Rage counter for main: 2
Rage counter for b: 2

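boost::thread_specific_ptr is the best option because it is portable.
On Linux with GCC you can also use the __thread modifier. A __thread variable only accepts a constant initializer, so the pointer has to start out as NULL and be allocated on first use:
static __thread MyClass *instance = NULL;
if (!instance) instance = new MyClass();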
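A minimal sketch of the lazy-initialization pattern with __thread, assuming GCC or Clang on Linux. MyClass here is a hypothetical stand-in for the class in the question, and std::thread (C++11) is used only to drive the demo; on an older toolchain pthread_create would do the same job. Nothing deletes the per-thread object in this sketch, so a real program needs its own cleanup.

```cpp
#include <cstddef>
#include <thread>

// Hypothetical stand-in for the MyClass from the question.
struct MyClass {
    int calls;
    MyClass() : calls(0) {}
    int doSomething() { return ++calls; }
};

// __thread only accepts constant initializers, so start with NULL.
static __thread MyClass *instance = NULL;

int threadSpecificAction()
{
    if (instance == NULL) {
        // First call on this thread (the object is leaked in this sketch).
        instance = new MyClass();
    }
    return instance->doSomething();
}

// Calls the function twice on a fresh thread and returns the second result.
int callTwiceOnNewThread()
{
    int result = 0;
    std::thread t([&result] {
        threadSpecificAction();
        result = threadSpecificAction();
    });
    t.join();
    return result;
}
```

Each thread counts its own calls: a new thread starts again from 1 regardless of how often the main thread has already called threadSpecificAction().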

If you're using Pthreads you can do the following:
// define static data members
pthread_key_t AnotherClass::key_value;
pthread_once_t AnotherClass::key_init_once = PTHREAD_ONCE_INIT;
// define static function
void AnotherClass::init_key()
{
// While you can pass NULL as the second argument, you
// should pass a valid destructor function that can properly
// delete a pointer to your MyClass.
pthread_key_create(&key_value, NULL);
}
void AnotherClass::threadSpecificAction()
{
//Initialize the key value
pthread_once(&key_init_once, init_key);
//this is where the thread-specific pointer is obtained
//if storage has already been allocated, it won't return NULL
MyClass *instance = NULL;
if ((instance = (MyClass*)pthread_getspecific(key_value)) == NULL)
{
instance = new MyClass;
pthread_setspecific(key_value, (void*)instance);
}
instance->doSomething();
}
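The destructor argument mentioned in the comment above can be sketched as follows. This is only an illustration under assumptions: MyClass, destroy_instance, and the destroyed counter are hypothetical names, and free functions replace the static members of AnotherClass to keep the example short. The pthreads runtime calls the destructor at thread exit for every thread whose value is not NULL.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical stand-in for the MyClass from the question.
struct MyClass {
    int calls;
    MyClass() : calls(0) {}
    int doSomething() { return ++calls; }
};

static pthread_key_t key_value;
static pthread_once_t key_init_once = PTHREAD_ONCE_INIT;
// Demo counter; only read after pthread_join, so no extra locking is needed.
static int destroyed = 0;

// Runs at thread exit for each thread that stored a non-NULL value.
extern "C" void destroy_instance(void *p)
{
    delete static_cast<MyClass *>(p);
    ++destroyed;
}

extern "C" void init_key()
{
    pthread_key_create(&key_value, destroy_instance);
}

int threadSpecificAction()
{
    pthread_once(&key_init_once, init_key);
    MyClass *instance = static_cast<MyClass *>(pthread_getspecific(key_value));
    if (instance == NULL) {
        instance = new MyClass;
        pthread_setspecific(key_value, instance);
    }
    return instance->doSomething();
}

extern "C" void *worker(void *)
{
    threadSpecificAction();
    threadSpecificAction();
    return NULL;
}

// Starts one worker, waits for it, and reports how many objects were freed so far.
int runWorkerAndCountDestroyed()
{
    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, NULL);
    return destroyed;
}
```

Note that the destructor is not run for the main thread when the process ends by returning from main, so an object created there would still need explicit cleanup.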

If you're working with MSVC++, you can read Thread Local Storage (TLS)
And then you can see this example.
Also, be aware of the Rules and Limitations for TLS

C++11 provides the thread_local storage class specifier; just use it.
void AnotherClass::threadSpecificAction()
{
thread_local MyClass *instance = new MyClass();
instance->doSomething();
}
One optional optimization is to place the object itself in thread-local storage instead of allocating it on the heap.
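A minimal sketch of that variant, assuming a C++11 compiler. MyClass and the liveInstances counter are hypothetical and exist only to make construction and destruction visible. The object is constructed on the first call in each thread and destroyed when that thread exits, so no delete is needed.

```cpp
#include <atomic>
#include <thread>

// Demo counter of currently existing MyClass objects.
static std::atomic<int> liveInstances(0);

// Hypothetical stand-in for the MyClass from the question.
struct MyClass {
    int calls;
    MyClass() : calls(0) { ++liveInstances; }
    ~MyClass() { --liveInstances; }
    int doSomething() { return ++calls; }
};

int threadSpecificAction()
{
    // One object per thread: constructed on the first call in that thread,
    // destroyed automatically when the thread exits.
    thread_local MyClass instance;
    return instance.doSomething();
}

// Runs the action on a worker thread and reports how many objects remain
// after that thread has exited.
int liveAfterWorkerExits()
{
    std::thread t([] { threadSpecificAction(); });
    t.join();
    return liveInstances.load();
}
```

After the worker thread has been joined, only the calling thread's instance (if it created one) is still alive.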

On Windows you can use TlsAlloc and TlsFree to allocate and release a thread-local storage index.
To set and retrieve values with TLS, use TlsSetValue and TlsGetValue, respectively.
Here you can see an example on how it would be used.

Just a side note...
MSVC++ supports __declspec(thread) from VC++ 2005
#if (_MSC_VER >= 1400)
#ifndef thread_local
#define thread_local __declspec(thread)
#endif
#endif
The main problem (which boost::thread_specific_ptr solves) is that variables marked with it can't have a constructor or destructor.

Folly (Facebook Open-source Library) has a portable implementation of Thread Local Storage.
According to its authors:
Improved thread local storage for non-trivial types (similar speed as
pthread_getspecific but only consumes a single pthread_key_t, and 4x faster
than boost::thread_specific_ptr).
If you're looking for a portable implementation of thread-local storage, this library is a good option.

Related

Is managing resources in destructor for monostate classes/static members a bad idea in C++?

I'm trying to implement a monostate class that manages a std::thread. The thread runs until a flag becomes false; after the flag changes to false, the thread stops. But it looks like I have to call the stopping method explicitly. Calling it in the destructor gives me runtime errors (tested on GCC 4.8 for ARM, GCC 4.9 for x86_64 and MSVC 2017).
Am I right that such behavior is due to
"Static members of a class are not associated with the objects of the
class: they are independent objects with static storage duration or
regular functions defined in namespace scope, only once in the
program."
so the destructor call is omitted?
Code sample:
#include <iostream>
#include <chrono>
#include <thread>
#include <atomic>
void runThread(const std::atomic<bool> &_isRunning) {
while (_isRunning) {
std::cout << "Me running.." << std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(30));
}
}
class test {
static std::thread thread;
static std::atomic<bool> isRunning;
public:
test();
~test();
static void go();
static void stop();
};
std::thread test::thread;
std::atomic<bool> test::isRunning{ false };
test::test() {}
void test::go() {
isRunning = true;
thread = std::thread(runThread, std::ref(isRunning));
}
void test::stop() {
isRunning = false;
if (thread.joinable()) {
thread.join();
}
}
test::~test() {
stop();
}
int main() {
test::go();
std::this_thread::sleep_for(std::chrono::seconds(5));
std::cout << "Done here!!!!!!!!!!!!!!!!!";
// Will not crash anymore if uncommented
//test::stop();
return 0;
}
Using std::async with std::future gives the same result but without an error. The thread just keeps running.
P.S.
Making the class non-monostate solves runtime errors but leaves me with this question. Is managing resources a bad practice for monostate classes/static members?
~test();
is only called when a "test" object is destroyed. You do not create any "test" objects in your code, so you are right:
Static members of a class are not associated with the objects of the
class: they are independent objects with static storage duration or
regular functions defined in namespace scope, only once in the
program.
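The constructor of a static object is called before main is executed, and the destructor is called after main is completed (from within atexit, typically). In your sample that means the static std::thread member is destroyed while it is still joinable, and the std::thread destructor calls std::terminate in that case, which is the runtime error you see.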
Put a breakpoint in the destructor, it's easy to see.
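One way to get the cleanup to run is to tie the thread to an ordinary object whose destructor stops and joins it, which matches the observation in the question that a non-monostate class avoids the error. The sketch below is an illustration under assumptions, not the asker's code: Worker, ticks, and runWorkerUntil are hypothetical names.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

class Worker {
public:
    // The thread member is declared last, so both flags are initialized
    // before the thread starts running.
    Worker() : isRunning_(true), ticks_(0), thread_(&Worker::run, this) {}

    // The destructor runs when the owning object goes out of scope,
    // so the thread is always stopped and joined.
    ~Worker()
    {
        isRunning_ = false;
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    Worker(const Worker &) = delete;
    Worker &operator=(const Worker &) = delete;

    int ticks() const { return ticks_.load(); }

private:
    void run()
    {
        while (isRunning_) {
            ++ticks_;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> isRunning_;
    std::atomic<int> ticks_;
    std::thread thread_;
};

// Lets a worker run until it has ticked at least minTicks times.
// The worker is joined automatically when w goes out of scope.
int runWorkerUntil(int minTicks)
{
    Worker w;
    while (w.ticks() < minTicks) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return w.ticks();
}
```

With this shape, creating the Worker as a local in main (or in any scope that ends before static destruction) makes an explicit stop() call unnecessary.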

C++ constexpr thread_local id

Is there any way to get a different value in a constexpr thread_local variable for every thread?
constexpr thread_local someType someVar = ......;
It seems like constexpr thread_local is supported, but the thread_local specifier doesn't seem to do anything in this case.
If you think about your question, you yourself can see why this is not possible.
What is constexpr?
According to the informal standard site cppreference:
The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.
The compiler has to resolve the value at compile time and this value should not change throughout the execution of the program.
Thread-local storage
A thread, on the contrary, is a run-time concept. C++11 introduced the thread concept into the language, and thus you could say that a compiler can be "aware" of the thread concept.
But the compiler can't always predict whether a thread is going to be executed (maybe you run the thread only under a specific configuration), or how many instances are going to be spawned, etc.
Possible implementation
Instead of trying to enforce access to a specific module/method to a single thread using hacks and tricks, why not use a very primitive feature of the language?
You could just as well implement this using simple encapsulation. Just make sure that the only object that "sees" this method you are trying to protect is the thread object itself, for example:
#include <iostream>
#include <thread>
#include <chrono>
using namespace std;
class SpecialWorker
{
public:
void start()
{
m_thread = std::thread(&SpecialWorker::run, this);
}
void join()
{
m_thread.join();
}
protected:
virtual void run() { protectedTask(); }
private:
void protectedTask()
{
cout << "PROTECT ME!" << endl;
}
std::thread m_thread;
};
int main(int argc, char ** argv)
{
SpecialWorker a;
a.start();
a.join();
return 0;
}
Please note that this example is lacking in error handling and is not production grade code! Make sure to refine it if you intend to use it.
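If the goal is simply a distinct value per thread, a thread_local variable that is initialized at run time (without constexpr) may be enough. The sketch below assumes C++11; nextId, thisThreadId, and idSeenByNewThread are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <thread>

// Shared counter that hands out the next unused id.
static std::atomic<int> nextId(0);

// Returns an id that is fixed for the calling thread and differs between threads.
int thisThreadId()
{
    // Initialized once per thread, at run time, on the first call in that thread.
    thread_local const int id = nextId.fetch_add(1);
    return id;
}

// Starts a new thread and reports the id it observed.
int idSeenByNewThread()
{
    int seen = -1;
    std::thread t([&seen] { seen = thisThreadId(); });
    t.join();
    return seen;
}
```

The value is const within each thread but cannot be a compile-time constant, because which ids exist depends on how many threads actually run.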

Do thread_local variables need to be locked using a mutex?

I thought of thread_local variables as private variables for each thread that just share the same name. But all examples I found use a mutex to lock the thread_local variable when accessing it. This confused me. If a thread_local variable is private to each thread, there should be no need to worry about concurrency. Or is my understanding of the "private" idea wrong?
Example taken from here:
#include <iostream>
#include <string>
#include <thread>
#include <mutex>
thread_local unsigned int rage = 1;
std::mutex cout_mutex;
void increase_rage(const std::string& thread_name)
{
++rage;
std::lock_guard<std::mutex> lock(cout_mutex);
std::cout << "Rage counter for " << thread_name << ": " << rage << '\n';
}
int main()
{
std::thread a(increase_rage, "a"), b(increase_rage, "b");
increase_rage("main");
a.join();
b.join();
}
In this case, is it necessary to lock the thread_local variable?
If you take a pointer to a thread_local object and pass it to another thread in some way, the other thread can still access the original thread's thread_local object through that pointer (until the originating thread terminates, after which this becomes undefined behavior).
So, if this can happen in your application, you will still need to arrange for mutex protection, or something equivalent, in order to access the thread_local object in a thread-safe manner.
Naming thread_local variables private variables is a bit unfortunate.
A thread_local declared variable is owned by its thread and not accessible by other threads unless the owner thread (for some reason) gives them a pointer to that variable.
A thread_local variable is shared among all functions of its thread and has the thread's lifetime: once constructed, it is destroyed when its thread exits.
A thread_local variable can be static, in which case some care should be taken to make sure the program executes as expected. I won't go into this, since it is not part of the question.
The mutex in your example, as pointed out in the comments, does not guard against a data race on rage. It synchronizes the console output: the mutex is called cout_mutex, which is self-explanatory.
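To make the point concrete, here is a small sketch (assuming C++11) in which several threads increment the thread_local variable with no mutex at all. The helper names bumpRage and everyThreadSeesOwnCopy are hypothetical. Because each thread works on its own copy, every thread ends up with exactly the same count.

```cpp
#include <thread>
#include <vector>

thread_local unsigned int rage = 1;

// Increments the calling thread's own copy of rage; no lock is required.
unsigned int bumpRage(unsigned int times)
{
    for (unsigned int i = 0; i < times; ++i) {
        ++rage;
    }
    return rage;
}

// Runs bumpRage concurrently and checks that each thread saw 1 + times.
bool everyThreadSeesOwnCopy(unsigned int threads, unsigned int times)
{
    std::vector<unsigned int> results(threads, 0);
    std::vector<std::thread> pool;
    for (unsigned int i = 0; i < threads; ++i) {
        // Each thread writes only to its own slot of results.
        pool.emplace_back([&results, i, times] { results[i] = bumpRage(times); });
    }
    for (std::thread &t : pool) {
        t.join();
    }
    for (unsigned int r : results) {
        if (r != 1 + times) {
            return false;
        }
    }
    return true;
}
```

The calling thread's own copy of rage is untouched by the workers and keeps its initial value.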

Singleton multithread code in C++

I have a question about Singletons and multithreaded programming in C++.
Below is example code for a Singleton class with a variable named shared.
I create 1000 threads that modify (+1) that variable of my Singleton global instance. The final value of shared is 1000 but I would expect this value to be under 1000 since I am not protecting this variable for concurrency.
Is the code really thread safe because the class is Singleton or it just happened to be lucky and the value is 1000 but it can perfectly be less than 1000?
#include <iostream>
#include <pthread.h>
using namespace std;
class Singleton {
private:
Singleton() {shared = 0;};
static Singleton * _instance;
int shared;
public:
static Singleton* Instance();
void increaseShared () { shared++; };
int getSharedValue () { return shared; };
};
// Global static pointer used to ensure a single instance of the class.
Singleton* Singleton::_instance = NULL;
Singleton * Singleton::Instance() {
if (!_instance) {
_instance = new Singleton;
}
return _instance;
}
void * myThreadCode (void * param) {
Singleton * theInstance;
theInstance = Singleton::Instance();
theInstance->increaseShared();
return NULL;
}
int main(int argc, const char * argv[]) {
pthread_t threads[1000];
Singleton * theInstance = Singleton::Instance();
for (int i=0; i<1000; i++) {
pthread_create(&threads[i], NULL, &myThreadCode, NULL);
}
cout << "The shared value is: " << theInstance->getSharedValue() << endl;
return 0;
}
Is the code really thread safe because the class is Singleton or it just happened to be lucky and the value is 1000 but it can perfectly be less than 1000?
You got lucky...
In reality, the most likely issue with what you're observing has to do with the fact that the time it takes to increment the value of your singleton on your specific machine is less than the time it takes the operating system to allocate the resources to launch an individual pthread. Thus you never ended up with a scenario where two threads contend for the unprotected resources of the singleton.
A much better test would have been to launch all of your pthreads first, have them block on a barrier or condition variable, and then perform the increment on the singleton once the barrier's condition of all the threads being "active" is met ... at that point you would have been much more likely to have seen the sorts of data-races that occur with non-atomic operations like an increment operation.
If you implement your Singleton like this, the singleton creation will be thread safe (this is guaranteed since C++11):
Singleton & Singleton::Instance() {
static Singleton instance;
return instance;
}
Since the instance can never be null and there is no memory to manage, a reference is returned instead of a pointer.
The increment operation can be made atomic by using platform specific operations (g++ provides built-ins, e.g. __sync_fetch_and_add), or C++11 atomic from STL, or Boost.Atomic, or with mutex guards.
std::atomic<int> shared;
void increaseShared () { ++shared; };
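Putting the two suggestions together, a sketch of the corrected test might look like this, assuming C++11. hammerSingleton and the go flag are hypothetical additions: the flag holds all threads back until every one has been created, so that they really do contend, and the atomic counter still ends up with the exact total.

```cpp
#include <atomic>
#include <thread>
#include <vector>

class Singleton {
public:
    // Function-local static: initialization is thread safe since C++11.
    static Singleton &Instance()
    {
        static Singleton instance;
        return instance;
    }
    void increaseShared() { ++shared; }  // atomic increment
    int getSharedValue() const { return shared.load(); }

private:
    Singleton() : shared(0) {}
    std::atomic<int> shared;
};

// Starts all threads, releases them together, and returns the final count.
int hammerSingleton(int threads, int perThread)
{
    std::atomic<bool> go(false);
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&go, perThread] {
            while (!go.load()) {
                std::this_thread::yield();  // wait until every thread exists
            }
            for (int n = 0; n < perThread; ++n) {
                Singleton::Instance().increaseShared();
            }
        });
    }
    go = true;
    for (std::thread &t : pool) {
        t.join();  // the question's code never joined its threads
    }
    return Singleton::Instance().getSharedValue();
}
```

Joining before reading the value matters as much as the atomic: without it, main may print the counter while threads are still running.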

efficient thread-safe singleton in C++

The usual pattern for a singleton class is something like
static Foo &getInst()
{
static Foo *inst = NULL;
if(inst == NULL)
inst = new Foo(...);
return *inst;
}
However, it's my understanding that this solution is not thread-safe, since 1) Foo's constructor might be called more than once (which may or may not matter) and 2) inst may not be fully constructed before it is returned to a different thread.
One solution is to wrap a mutex around the whole method, but then I'm paying for synchronization overhead long after I actually need it. An alternative is something like
static Foo &getInst()
{
static Foo *inst = NULL;
if(inst == NULL)
{
pthread_mutex_lock(&mutex);
if(inst == NULL)
inst = new Foo(...);
pthread_mutex_unlock(&mutex);
}
return *inst;
}
Is this the right way to do it, or are there any pitfalls I should be aware of? For instance, are there any static initialization order problems that might occur, i.e. is inst always guaranteed to be NULL the first time getInst is called?
If you are using C++11, here is the right way to do this:
Foo& getInst()
{
static Foo inst(...);
return inst;
}
According to the new standard, there is no need to care about this problem any more. The object is initialized by exactly one thread, and other threads wait until initialization completes.
Or you can use std::call_once. (more info here)
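A minimal sketch of the std::call_once variant, assuming C++11. The constructions counter and the constructionsAfterRace helper are hypothetical and only there to show that the constructor runs once even when many threads race to the first call.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

class Foo {
public:
    static Foo &getInst()
    {
        // The callable runs exactly once; concurrent callers block until it is done.
        std::call_once(initFlag, [] { inst.reset(new Foo()); });
        return *inst;
    }
    static int constructions() { return constructed.load(); }

private:
    Foo() { ++constructed; }
    static std::once_flag initFlag;
    static std::unique_ptr<Foo> inst;
    static std::atomic<int> constructed;  // demo counter only
};

std::once_flag Foo::initFlag;
std::unique_ptr<Foo> Foo::inst;
std::atomic<int> Foo::constructed(0);

// Lets several threads race to the first getInst() call.
int constructionsAfterRace(int threads)
{
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([] { Foo::getInst(); });
    }
    for (std::thread &t : pool) {
        t.join();
    }
    return Foo::constructions();
}
```

Compared with the function-local static, this form is mainly useful when the instance has to be created on the heap or through some custom initialization step.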
Your solution is called 'double checked locking' and the way you've written it is not threadsafe.
This Meyers/Alexandrescu paper explains why - but that paper is also widely misunderstood. It started the 'double checked locking is unsafe in C++' meme - but its actual conclusion is that double checked locking in C++ can be implemented safely, it just requires the use of memory barriers in a non-obvious place.
The paper contains pseudocode demonstrating how to use memory barriers to safely implement the DCLP, so it shouldn't be difficult for you to correct your implementation.
Herb Sutter talks about the double-checked locking in CppCon 2014.
Below is the code I implemented in C++11 based on that:
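#include <atomic>
#include <mutex>
class Foo {
public:
static Foo* Instance();
private:
Foo() {}
static std::atomic<Foo*> pinstance;
static std::mutex m_;
};
std::atomic<Foo*> Foo::pinstance { nullptr };
std::mutex Foo::m_;
Foo* Foo::Instance() {
// First check without the lock (sequentially consistent atomic load).
if(pinstance == nullptr) {
std::lock_guard<std::mutex> lock(m_);
// Second check under the lock: another thread may have won the race.
if(pinstance == nullptr) {
pinstance = new Foo();
}
}
return pinstance;
}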
You can also check the complete program here: http://ideone.com/olvK13
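A common refinement of double-checked locking with C++11 atomics spells out the memory orderings instead of relying on the sequentially consistent defaults. The sketch below is an illustration under that assumption and is not taken from the talk or the linked program; allThreadsSeeSameInstance is a hypothetical test helper.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class Foo {
public:
    static Foo *Instance();

private:
    Foo() {}
    static std::atomic<Foo *> pinstance;
    static std::mutex m_;
};

std::atomic<Foo *> Foo::pinstance{nullptr};
std::mutex Foo::m_;

Foo *Foo::Instance()
{
    // Acquire pairs with the release store below, so a thread that sees
    // a non-null pointer also sees the fully constructed object.
    Foo *p = pinstance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(m_);
        // The mutex already orders this load against the other thread's store.
        p = pinstance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Foo();
            pinstance.store(p, std::memory_order_release);
        }
    }
    return p;
}

// Lets several threads race to Instance() and checks they all got the same object.
bool allThreadsSeeSameInstance(int threads)
{
    std::vector<Foo *> seen(threads, nullptr);
    std::vector<std::thread> pool;
    for (int i = 0; i < threads; ++i) {
        pool.emplace_back([&seen, i] { seen[i] = Foo::Instance(); });
    }
    for (std::thread &t : pool) {
        t.join();
    }
    for (Foo *p : seen) {
        if (p != Foo::Instance()) {
            return false;
        }
    }
    return true;
}
```

On the fast path this costs one acquire load, which is the whole point of the pattern over locking the mutex on every call.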
Use pthread_once, which guarantees that the initialization function is run exactly once.
(On Mac OS X it uses a spin lock. Don't know the implementation of other platforms.)
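A sketch of how pthread_once could be applied to the singleton from the question, assuming a POSIX system. The init function, the constructions counter, and the sameInstanceFromThreads helper are hypothetical names added for illustration.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

class Foo {
public:
    static Foo &getInst()
    {
        // init runs exactly once, even if many threads arrive here together.
        pthread_once(&once, &Foo::init);
        return *inst;
    }
    static int constructions() { return constructed; }

private:
    Foo() { ++constructed; }
    static void init() { inst = new Foo(); }
    static pthread_once_t once;
    static Foo *inst;
    static int constructed;  // demo counter, written only inside init
};

pthread_once_t Foo::once = PTHREAD_ONCE_INIT;
Foo *Foo::inst = NULL;
int Foo::constructed = 0;

// Thread entry point: returns the address of the singleton it observed.
extern "C" void *touchSingleton(void *)
{
    return &Foo::getInst();
}

// Starts several threads and checks that all of them saw the same object.
bool sameInstanceFromThreads(int threads)
{
    std::vector<pthread_t> ids(threads);
    for (int i = 0; i < threads; ++i) {
        pthread_create(&ids[i], NULL, touchSingleton, NULL);
    }
    bool same = true;
    for (int i = 0; i < threads; ++i) {
        void *seen = NULL;
        pthread_join(ids[i], &seen);
        if (seen != static_cast<void *>(&Foo::getInst())) {
            same = false;
        }
    }
    return same;
}
```

This version needs no C++11 features, which fits toolchains as old as the one mentioned in the first question.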
To the best of my knowledge, the only guaranteed thread-safe way to do this without locking would be to initialize all your singletons before you ever start a thread.
Your alternative is called "double-checked locking".
There could exist multi-threaded memory models in which it works, but POSIX does not guarantee one.
ACE singleton implementation uses double-checked locking pattern for thread safety, you can refer to it if you like.
You can find source code here.
Does TLS work here? https://en.wikipedia.org/wiki/Thread-local_storage#C_and_C++
For example,
static __thread Foo *inst = NULL;
static Foo &getInst()
{
if(inst == NULL)
inst = new Foo(...);
return *inst;
}
But we also need a way to delete it explicitly, like
static void deleteInst() {
if (!inst) {
return;
}
delete inst;
inst = NULL;
}
The solution is not thread safe because the statement
inst = new Foo();
can be broken down into two steps by the compiler:
Step 1: inst = malloc(sizeof(Foo)); // allocate raw memory and publish the pointer
Step 2: inst->Foo(); // pseudocode: run the constructor in that memory
Suppose a context switch occurs after one thread has executed step 1, and a second thread then calls getInst(). The second thread finds that inst is not NULL, so it returns a reference to an uninitialized object, because the first thread has not yet run the constructor.