Is accessing two different class members of the same object from two different POSIX threads at the same time considered to be thread-safe in C++ 03?
No. (with a little voice of "yes")
From the point of view of the C++03 standard, no such thing as threads exists, so there exist no conditions whatsoever under which the standard would consider anything involving concurrency as "safe".
While this is often no problem (with a little care and proper synchronization primitives that are outside the scope of C++, it will "work anyway"), there are a few things to be aware of, among these:
errno (and other structures) might not be thread-local. The -pthread command line option mostly addresses this.
Class members may alias each other through references, pointers, or unions, so mutating different members might in fact mutate the same memory concurrently.
Without a memory model, the compiler is allowed to (and will!) reorder loads and stores. This means, for example, that the "obvious" way of communicating by first writing a piece of data and then setting a "data is ready" flag may not work as expected (see the sketch after this list).
Under Windows, there exist some not-immediately-obvious static-dynamic CRT issues in presence of threading when your program loads DLLs. Be sure all components do "the same thing" (whatever it is).
Also, some old versions of the CRT may leak a few hundred bytes of memory per thread (usually not an issue).
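As a hedged illustration of the reordering point above, here is a minimal sketch (identifiers are illustrative) of the broken "write data, then set a flag" pattern:

// Minimal sketch of the broken "data is ready" pattern described above.
// This is what NOT to rely on in C++03 without real synchronization.
int payload = 0;
bool ready = false;            // plain bool: no ordering or atomicity guarantees

void producer() {
    payload = 42;              // (1) write the data
    ready = true;              // (2) compiler/CPU may reorder this before (1)
}

void consumer() {
    while (!ready) { }         // the read may even be hoisted out of the loop
    int x = payload;           // may still observe the old value of payload
    (void)x;
}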
Immutable objects are inherently thread-safe, as is read-only access from several threads.
Related
I have a class, used for data storage, of which there is only a single instance.
The caller is message-driven and has become too large; it is a prime candidate for refactoring such that each message is handled by a separate thread. However, these threads could then compete to read/write the data.
If I were using mutexes (mutices?), I would only use them on write operations. I don't think that matters here, as the data are atomic, not the functions which access the data.
Is there any easy way to make all of the data atomic? Currently it consists of simple types, vectors and objects of other classes. If I have to add std::atomic<> to every sub-field, I may as well use mutexes.
std::atomic requires the type to be trivially copyable. Since you say std::vector is involved, that makes it impossible to use std::atomic, either on the whole structure or on the std::vector itself.
The purpose of std::atomic is to let you atomically replace the whole value of the object. You cannot do things like access individual members through it.
From the limited context you gave in your question, I think std::mutex is the correct approach. Each object that should be independently accessible should have its own mutex protecting it.
Also note that the mutex generally needs to protect writes and reads, since a read happening unsynchronized with a write is a data race and causes undefined behavior, not only unsynchronized writes.
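A minimal sketch of that approach, assuming the stored data includes a vector (the class and member names are illustrative, not from the question):

#include <cstddef>
#include <mutex>
#include <vector>

class SharedData {
public:
    void add(int v) {
        std::lock_guard<std::mutex> lock(m_);  // writers take the lock
        values_.push_back(v);
    }
    std::size_t count() const {
        std::lock_guard<std::mutex> lock(m_);  // readers must lock too: an
        return values_.size();                 // unsynchronized read racing a
    }                                          // write is a data race (UB)
private:
    mutable std::mutex m_;                     // one mutex per independently
    std::vector<int> values_;                  // accessible object
};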
I read the C++ Standard (N4713), § 32.6.1/3:
Operations that are lock-free should also be address-free. That is,
atomic operations on the same memory location via two different
addresses will communicate atomically. The implementation should not
depend on any per-process state. This restriction enables
communication by memory that is mapped into a process more than once
and by memory that is shared between two processes.
So it sounds like it is possible to perform a lock-free atomic operation on the same memory location. I wonder how it can be done.
Let's say I have a named shared memory segment on Linux (via shm_open() and mmap()). How can I perform a lockfree operation on the first 4 bytes of the shared memory segment for example?
At first, I thought I could just reinterpret_cast the pointer to std::atomic<int32_t>*. But then I read this. It first points out that std::atomic might not have the same size or alignment as T:
When we designed the C++11 atomics, I was under the misimpression that
it would be possible to semi-portably apply atomic operations to data
not declared to be atomic, using code such as
int x; reinterpret_cast<atomic<int>&>(x).fetch_add(1);
This would clearly fail if the representations of atomic and int
differ, or if their alignments differ. But I know that this is not an
issue on platforms I care about. And, in practice, I can easily test
for a problem by checking at compile time that sizes and alignments
match.
Though, this is fine with me in this case, because I use shared memory on the same machine and casting the pointer in two different processes will "acquire" the same location. However, the article states that the compiler might not treat the cast pointer as a pointer to an atomic type:
However this is not guaranteed to be reliable, even on platforms on
which one might expect it to work, since it may confuse type-based
alias analysis in the compiler. A compiler may assume that an int is
not also accessed as an atomic<int>. (See 3.10, [Basic.lval], last
paragraph.)
Any input is welcome!
The C++ standard doesn't concern itself with multiple processes and no guarantees were given outside of a multi-threaded environment.
However, the standard does recommend that implementations of lock-free atomics be usable across processes, which is the case in most real implementations.
This answer will assume atomics behave more or less the same with processes as with threads.
The first solution requires C++20 atomic_ref
void* shared_mem = /* something */;
auto p1 = new (shared_mem) int;      // For creating the shared object
auto p2 = (int*)shared_mem;          // For getting the shared object
std::atomic_ref<int> i{*p2};         // Use i as if it were std::atomic<int>
You need to make sure the shared int has std::atomic_ref<int>::required_alignment alignment; typically the same as sizeof(int). Normally you'd use alignas() on a struct member or variable, but in shared memory the layout is up to you (relative to a known page boundary).
This avoids placing opaque atomic types in the shared memory itself, which gives you precise control over what exactly goes in there.
A solution prior to C++20 would be:
auto p1 = new (shared_mem) atomic<int>; // For creating the shared object
auto p2 = (atomic<int>*)shared_mem; // For getting the shared object
auto& i = *p2;
Or using C11 atomic_load and atomic_store
_Atomic int* i = (_Atomic int*)shared_mem;
atomic_store(i, 42);
int i2 = atomic_load(i);
Alignment requirements are the same here, alignof(std::atomic<int>) or _Alignof(atomic_int).
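Tying the pieces together with the POSIX calls mentioned in the question, a hedged sketch (C++20; the segment name, the lack of error handling, and the relaxed ordering are my assumptions, not part of the answer) could look like:

#include <atomic>
#include <new>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    int fd = shm_open("/demo_counter", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(int));                    // reserve room for one int
    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);

    int* counter = new (mem) int(0);               // place the shared int
    // mmap returns page-aligned memory, so the first 4 bytes satisfy
    // std::atomic_ref<int>::required_alignment

    std::atomic_ref<int> ref(*counter);
    ref.fetch_add(1, std::memory_order_relaxed);   // lock-free, address-free increment

    munmap(mem, sizeof(int));
    close(fd);
}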
Yes, the C++ standard is a bit mealy-mouthed about all this.
If you are on Windows (which you probably aren't) then you can use InterlockedExchange() etc, which offer all the required semantics and don't care where the referenced object is (it's a LONG *).
On other platforms, gcc has some atomic builtins which might help with this. They might free you from the tyranny of the standards writers. Trouble is, it's hard to test if the resulting code is bullet-proof.
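For example (a hedged sketch; shared_mem stands for a suitably aligned pointer into the mapped region), the __atomic builtins operate directly on a plain int:

int* p = static_cast<int*>(shared_mem);                 // assumed suitably aligned
__atomic_store_n(p, 42, __ATOMIC_RELEASE);              // atomic store, release ordering
int v = __atomic_load_n(p, __ATOMIC_ACQUIRE);           // atomic load, acquire ordering
int old = __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);   // atomic read-modify-write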
On all mainstream platforms, std::atomic<T> does have the same size as T, although possibly a higher alignment requirement if alignof(T) < sizeof(T).
You can check these assumptions with:
static_assert(sizeof(T) == sizeof(std::atomic<T>),
              "atomic<T> isn't the same size as T");
static_assert(std::atomic<T>::is_always_lock_free,  // C++17
              "atomic<T> isn't lock-free, unusable on shared mem");
auto atomic_ptr = static_cast<atomic<int>*>(some_ptr);
// beware strict-aliasing violations
// don't also access the same memory via int*
// unless you're aware of possible issues
// also make sure that the ptr is aligned to alignof(atomic<T>)
// otherwise you might get tearing (non-atomicity)
On exotic C++ implementations where these aren't true, people that want to use your code on shared memory will need to do something else.
Or if all accesses to shared memory from all processes consistently use atomic<T> then there's no problem, you only need lock-free to guarantee address-free. (You do need to check this: std::atomic uses a hash table of locks for non-lock-free. This is address-dependent, and separate processes will have separate hash tables of locks.)
I have a few objects I need to perform actions on from different threads in C++. I know it is necessary to lock any variable that may be used by more than one thread at the same time, but what if each thread is accessing (writing to) a different data member of the same object? For example, each thread is calling a different method of the object and none of the methods called modify the same data member. Is it safe as long as I don't access the same data member, or do I need to lock the whole object anyway?
I've looked around for explanations and details on this topic but every example seems to focus on single variables or non-member functions.
To summarize:
Can I safely access 2 different data members of the same object from 2 different thread without placing a lock on the whole object?
It is effectively safe, but it will strongly reduce the performance of your code if you do it often. Computers use things called "cache lines", and if two processors are working on the same cache line they'll have to pass it back and forth all the time, slowing your work down.
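A hedged sketch of one way to keep independently written members off the same cache line (the 64-byte figure is a common assumption, not a guarantee):

#include <atomic>

struct Counters {
    alignas(64) std::atomic<long> a;   // written only by thread 1
    alignas(64) std::atomic<long> b;   // written only by thread 2
};                                     // each member gets its own cache line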
Yes, it is safe to access different members of one object from different threads.
I think you can do that fine. But you had better be sure that the method internals never change to access the same data, and that the calling program doesn't decide to call another method that another thread is already using, etc.
So it's possible, but potentially dangerous. Then again, it will also be quicker because you'll be avoiding calls to acquire mutexes. Pick your poison.
Well, yes, OK you can do it but, as others have pointed out, you should not have to. IMHO, access to data members should be via getter/setter methods so that any necessary mutexing/criticalSectioning/semaphoring/whatever is encapsulated within the object.
Is it safe as long as I don't access the same data member or do I need to lock the whole object anyway?
The answer totally depends upon the design of the class. However, I would still say that it is always recommended to think a hundred times before allowing multiple threads to access the same object. That said, if you are sure that the data is really independent, there is NO need to lock the whole object.
Then a different question arises: if the variables are indeed independent, why are they in the same class? Be careful: threading kills if you get it wrong.
You might want to be careful. See for example http://gcc.gnu.org/ml/gcc/2012-02/msg00032.html
Depending on how the fields are accessed, you might run across similar hard to find problems.
I open a piece of shared memory and get a handle of it. I'm aware there are several vectors of data stored in the memory. I'd like to access those vectors of data and perform some actions on them. How can I achieve this? Is it appropriate to treat the shared memory as an object so that we can define those vectors as fields of the object and those needed actions as member functions of the object?
I've never dealt with shared memory before. To make things worse, I'm new to C++ and POSIX. Could someone please provide some guidance? Simple examples would be greatly appreciated.
int my_shmid = shmget(key, size, shmflg);
...
void* address_of_my_shm1 = shmat(my_shmid, 0, shmflg);
Object* optr = static_cast<Object*>(address_of_my_shm1);
...or, in some other thread/process to which you arranged to pass address_of_my_shm1
...by some other means:
void* address_of_my_shm2 = shmat(my_shmid, address_of_my_shm1, shmflg);
You may want to assert that address_of_shm1 == address_of_shm2. But note that I say "may" - you don't actually have to do this. Some types/structs/classes can be read equally well at different addresses.
If the object will appear in different address spaces, then pointers outside the shm in process A may not point to the same thing as in process B. In general, pointers outside the shm are bad. (Virtual functions are pointers outside the object, and outside the shm. Bad, unless you have other reason to trust them.)
Pointers inside the shm are usable, if they appear at the same address.
Relative pointers can be quite usable, but, again, only so long as they point inside the shm. Relative pointers may be relative to the base of an object, i.e. they may be offsets. Or they may be relative to the pointer itself. You can define some nice classes/templates that do these calculations, with the casting going on under the hood (see the sketch below).
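A hedged sketch of such a self-relative ("offset") pointer; the class name and layout are illustrative:

#include <cstdint>

template <typename T>
class offset_ptr {
public:
    void set(T* p) {
        offset_ = reinterpret_cast<std::intptr_t>(p) -
                  reinterpret_cast<std::intptr_t>(this);
    }
    T* get() const {
        return reinterpret_cast<T*>(
            reinterpret_cast<std::intptr_t>(this) + offset_);
    }
private:
    std::intptr_t offset_;   // distance from this object to the pointee;
};                           // stays valid in any mapping of the same segment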
Sharing of objects through shmem is simplest if the data is just POD (Plain Old Data). Nothing fancy.
Because you are in different processes that are not sharing the whole address space, you may not be guaranteed that things like virtual functions will appear at the same address in all processes using the shm shared memory segment. So probably best to avoid virtual functions. (If you try hard and/or know linkage, you may in some circumstances be able to share virtual functions. But that is one of the first things I would disable if I had to debug.)
You should only do this if you are aware of your implementation's object memory model, and if advanced (for C++) optimizations like splitting structs into discontiguous hot and cold parts are disabled. Since such optimizations are arguably not legal for C++, you are probably safe.
Obviously you are better off if you are casting to the same object type/class on all sides.
You can get away with non-virtual functions. However, note that it can be quite easy to have the same class, but different versions of the class - e.g. differing in size, e.g. adding a new field and changing the offsets of all of the other fields - so you need to be quite careful to ensure all sides are using the same definitions and declarations.
Is there a way to implement a singleton object in C++ that is:
Lazily constructed in a thread safe manner (two threads might simultaneously be the first user of the singleton - it should still only be constructed once).
Doesn't rely on static variables being constructed beforehand (so the singleton object is itself safe to use during the construction of static variables).
(I don't know my C++ well enough, but is it the case that integral and constant static variables are initialized before any code is executed (ie, even before static constructors are executed - their values may already be "initialized" in the program image)? If so - perhaps this can be exploited to implement a singleton mutex - which can in turn be used to guard the creation of the real singleton..)
Excellent, it seems that I have a couple of good answers now (shame I can't mark 2 or 3 as being the answer). There appear to be two broad solutions:
Use static initialisation (as opposed to dynamic initialisation) of a POD static variable, and implement my own mutex with that using the built-in atomic instructions. This was the type of solution I was hinting at in my question, and I believe I knew about already.
Use some other library function like pthread_once or boost::call_once. These I certainly didn't know about - and am very grateful for the answers posted.
Basically, you're asking for synchronized creation of a singleton, without using any synchronization (previously-constructed variables). In general, no, this is not possible. You need something available for synchronization.
As for your other question, yes, static variables which can be statically initialized (i.e. no runtime code necessary) are guaranteed to be initialized before other code is executed. This makes it possible to use a statically-initialized mutex to synchronize creation of the singleton.
From the 2003 revision of the C++ standard:
Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization. Objects of POD types (3.9) with static storage duration initialized with constant expressions (5.19) shall be initialized before any dynamic initialization takes place. Objects with static storage duration defined in namespace scope in the same translation unit and dynamically initialized shall be initialized in the order in which their definition appears in the translation unit.
If you know that you will be using this singleton during the initialization of other static objects, I think you'll find that synchronization is a non-issue. To the best of my knowledge, all major compilers initialize static objects in a single thread, so thread safety during static initialization is not a concern. You can declare your singleton pointer to be NULL, and then check to see if it's been initialized before you use it.
However, this assumes that you know that you'll use this singleton during static initialization. This is also not guaranteed by the standard, so if you want to be completely safe, use a statically-initialized mutex.
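A minimal sketch of the statically-initialized mutex approach (Singleton and the accessor name are assumed for illustration):

#include <pthread.h>

class Singleton { /* ... */ };

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;  // no runtime construction needed
static Singleton* g_instance = 0;                            // zero-initialized before any dynamic init

Singleton* GetSingleton() {
    pthread_mutex_lock(&g_mutex);
    if (g_instance == 0)
        g_instance = new Singleton();
    pthread_mutex_unlock(&g_mutex);
    return g_instance;
}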
Edit: Chris's suggestion to use an atomic compare-and-swap would certainly work. If portability is not an issue (and creating additional temporary singletons is not a problem), then it is a slightly lower overhead solution.
Unfortunately, Matt's answer features what's called double-checked locking which isn't supported by the C/C++ memory model. (It is supported by the Java 1.5 and later — and I think .NET — memory model.) This means that between the time when the pObj == NULL check takes place and when the lock (mutex) is acquired, pObj may have already been assigned on another thread. Thread switching happens whenever the OS wants it to, not between "lines" of a program (which have no meaning post-compilation in most languages).
Furthermore, as Matt acknowledges, he uses an int as a lock rather than an OS primitive. Don't do that. Proper locks require the use of memory barrier instructions, potentially cache-line flushes, and so on; use your operating system's primitives for locking. This is especially important because the primitives used can change between the individual CPU lines that your operating system runs on; what works on a CPU Foo might not work on CPU Foo2. Most operating systems either natively support POSIX threads (pthreads) or offer them as a wrapper for the OS threading package, so it's often best to illustrate examples using them.
If your operating system offers appropriate primitives, and if you absolutely need it for performance, instead of doing this type of locking/initialization you can use an atomic compare and swap operation to initialize a shared global variable. Essentially, what you write will look like this:
MySingleton *MySingleton::GetSingleton() {
    if (pObj == NULL) {
        // create a temporary instance of the singleton
        MySingleton *temp = new MySingleton();
        // the primitive takes a void * volatile *, hence the cast
        if (OSAtomicCompareAndSwapPtrBarrier(NULL, temp, (void * volatile *)&pObj) == false) {
            // if the swap didn't take place, delete the temporary instance
            delete temp;
        }
    }
    return pObj;
}
This only works if it's safe to create multiple instances of your singleton (one per thread that happens to invoke GetSingleton() simultaneously), and then throw the extras away. The OSAtomicCompareAndSwapPtrBarrier function provided on Mac OS X (most operating systems provide a similar primitive) checks whether pObj is NULL and only actually sets it to temp if it is. This uses hardware support to really, literally only perform the swap once and tell whether it happened.
Another facility to leverage if your OS offers it that's in between these two extremes is pthread_once. This lets you set up a function that's run only once - basically by doing all of the locking/barrier/etc. trickery for you - no matter how many times it's invoked or on how many threads it's invoked.
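A hedged sketch of the pthread_once approach (again, Singleton and the accessor name are illustrative):

#include <pthread.h>

class Singleton { /* ... */ };

static pthread_once_t g_once = PTHREAD_ONCE_INIT;
static Singleton* g_instance = 0;

static void init_singleton() { g_instance = new Singleton(); }

Singleton* GetSingleton() {
    pthread_once(&g_once, init_singleton);  // init_singleton runs exactly once,
    return g_instance;                      // no matter how many threads call this
}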
Here's a very simple lazily constructed singleton getter:
Singleton *Singleton::self() {
    static Singleton instance;
    return &instance;
}
This is lazy, and the next C++ standard (C++0x) requires it to be thread safe. In fact, I believe that at least g++ implements this in a thread safe manner. So if that's your target compiler or if you use a compiler which also implements this in a thread safe manner (maybe newer Visual Studio compilers do? I don't know), then this might be all you need.
Also see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2513.html on this topic.
You can't do it without any static variables, however if you are willing to tolerate one, you can use Boost.Thread for this purpose. Read the "one-time initialisation" section for more info.
Then in your singleton accessor function, use boost::call_once to construct the object, and return it.
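A hedged sketch of that Boost.Thread one-time initialisation (names are illustrative; note that very old Boost versions reverse the argument order of call_once):

#include <boost/thread/once.hpp>

class Singleton { /* ... */ };

static boost::once_flag g_flag = BOOST_ONCE_INIT;
static Singleton* g_instance = 0;

static void create_instance() { g_instance = new Singleton(); }

Singleton* GetSingleton() {
    boost::call_once(g_flag, create_instance);  // runs create_instance exactly once
    return g_instance;
}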
For gcc, this is rather easy:
LazyType* GetMyLazyGlobal() {
    static LazyType* const instance = new LazyType();
    return instance;
}
GCC will make sure that the initialization is atomic. For VC++, this is not the case. :-(
One major issue with this mechanism is the lack of testability: if you need to reset the LazyType to a new one between tests, or want to change the LazyType* to a MockLazyType*, you won't be able to. Given this, it's usually best to use a static mutex + static pointer.
Also, possibly an aside: It's best to always avoid static non-POD types. (Pointers to PODs are OK.) The reasons for this are many: as you mention, initialization order isn't defined -- neither is the order in which destructors are called though. Because of this, programs will end up crashing when they try to exit; often not a big deal, but sometimes a showstopper when the profiler you are trying to use requires a clean exit.
While this question has already been answered, I think there are some other points to mention:
If you want lazy-instantiation of the singleton while using a pointer to a dynamically allocated instance, you'll have to make sure you clean it up at the right point.
You could use Matt's solution, but you'd need to use a proper mutex/critical section for locking, and check "pObj == NULL" both before and after the lock. Of course, pObj would also have to be static ;)
A mutex would be unnecessarily heavy in this case; you'd be better off going with a critical section.
But as already stated, you can't guarantee threadsafe lazy-initialisation without using at least one synchronisation primitive.
Edit: Yup Derek, you're right. My bad. :)
OJ, that doesn't work. As Chris pointed out, that's double-check locking, which is not guaranteed to work in the current C++ standard. See: C++ and the Perils of Double-Checked Locking
Edit: No problem, OJ. It's really nice in languages where it does work. I expect it will work in C++0x (though I'm not certain), because it's such a convenient idiom.
Read up on weak memory models. They can break double-checked locks and spinlocks. Intel has a strong memory model (for now), so on Intel it's easier.
Carefully use "volatile" to avoid caching of parts of the object in registers; otherwise you'll have initialized the object pointer, but not the object itself, and the other thread will crash.
The order of static variable initialization versus shared code loading is sometimes non-trivial. I've seen cases where the code to destruct an object was already unloaded, so the program crashed on exit.
Such objects are hard to destroy properly.
In general singletons are hard to do right and hard to debug. It's better to avoid them altogether.
I suppose saying don't do this because it's not safe and will probably break more often than just initializing this stuff in main() isn't going to be that popular.
(And yes, I know that suggesting that means you shouldn't attempt to do interesting stuff in constructors of global objects. That's the point.)