Is atomic<T*> always lock free? - c++

On my macOS system, std::atomic<T*> is lock free.
#include <iostream>
#include <atomic>

int main() {
    std::cout << std::atomic<void*>().is_lock_free() << std::endl;
    return 0;
}
output: 1
I want to know: is atomic<T*> always lock free?
Is there a reference that covers this?

The standard allows any atomic type (with the exception of std::atomic_flag) to be implemented with locks. Even if the platform would allow lock-free atomics for some type, the standard library developers might not have implemented them that way.
If you need to do something differently when locks are used, this can be checked at compile time using the ATOMIC_POINTER_LOCK_FREE macro.

No, it is not safe to assume that any particular platform's implementation of std::atomic is always lock free.
The standard specifies some marker macros, including ATOMIC_POINTER_LOCK_FREE, which indicates whether pointers are never, sometimes, or always lock free on the platform in question.
You can also get an answer from std::atomic<T *>::is_always_lock_free for your particular T (see note 1).
Note 1: Lock-freedom must be consistent for all objects of a given pointer type, so the instance method std::atomic<T *>::is_lock_free() is redundant.
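For example, here is a minimal sketch (assuming a C++17 compiler, since is_always_lock_free was added in C++17) that combines the compile-time macro check with the per-type constant:
#include <atomic>
#include <iostream>

int main() {
#if ATOMIC_POINTER_LOCK_FREE == 2
    std::cout << "pointers are always lock free on this implementation\n";
#elif ATOMIC_POINTER_LOCK_FREE == 1
    std::cout << "pointers are sometimes lock free on this implementation\n";
#else
    std::cout << "pointers are never lock free on this implementation\n";
#endif
    // C++17: a compile-time constant, usable in a static_assert if lock-freedom is a hard requirement
    std::cout << std::atomic<int*>::is_always_lock_free << std::endl;
    return 0;
}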

Do dependent reads require a load-acquire?

Does the following program expose a data race, or any other concurrency concern?
#include <cstdio>
#include <cstdlib>
#include <atomic>
#include <thread>

class C {
public:
    int i;
};

std::atomic<C *> c_ptr{};

int main(int, char **) {
    std::thread t([] {
        auto read_ptr = c_ptr.load(std::memory_order_relaxed);
        if (read_ptr) {
            // does the following dependent read race?
            printf("%d\n", read_ptr->i);
        }
    });
    c_ptr.store(new C{rand()}, std::memory_order_release);
    t.join();
    return 0;
}
Godbolt
I am interested in whether reads through pointers need load-acquire semantics when loading the pointer, or whether the dependent-read nature of reads through pointers makes that ordering unnecessary. If it matters, assume arm64, and please describe why it matters, if possible.
I have tried searching for discussions of dependent reads and haven't found any explicit recognition of their implicit load-reordering-barriers. It looks safe to me, but I don't trust my understanding enough to know it's safe.
Your code is not safe, and can break in practice with real compilers for DEC Alpha AXP (which can violate causality via tricky cache bank shenanigans IIRC).
As far as the ISO C++ standard guaranteeing anything in the C++ abstract machine, no, there's no guarantee because nothing creates a happens-before relationship between the init of the int and the read in the other thread.
But in practice C++ compilers implement release the same way regardless of context and without checking the whole program for the existence of a possible reader with at least consume ordering.
On some, but not all, concrete implementations for real machines by real compilers, the generated asm will work, because compilers do not go looking for that UB and break the code on purpose with fancy inter-thread analysis of the only possible reads and writes of that atomic variable.
DEC Alpha could famously break this code, not guaranteeing dependency ordering in asm, so needing barriers for memory_order_consume, unlike all(?) other ISAs.
Given the current deprecated state of consume, the only way to get efficient asm on ISAs that do guarantee dependency ordering (i.e. not Alpha) but don't give you acquire for free (unlike x86) is to write code like this. The Linux kernel does this in practice for things like RCU.
That requires keeping it simple enough that compilers can't break the dependency ordering by e.g. proving that any non-NULL read_ptr would have a specific value, like the address of a static variable.
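For portability, the simplest fix to the code in the question is to strengthen the load to acquire; a minimal sketch of the reader with that change (the rest of the program stays as posted):
// Portable version of the reader: memory_order_acquire pairs with the
// release store of c_ptr, so read_ptr->i is guaranteed to see the
// constructor's write without relying on dependency ordering.
auto read_ptr = c_ptr.load(std::memory_order_acquire);
if (read_ptr) {
    printf("%d\n", read_ptr->i);
}
On AArch64 this typically costs only an ldar (or ldapr) instead of a plain ldr; on x86 an acquire load compiles to an ordinary load anyway.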
See also
What does memory_order_consume really do?
C++11: the difference between memory_order_relaxed and memory_order_consume
Memory order consume usage in C11 - more about the hardware mechanism / guarantee that consume is intended to expose to software. Out-of-order exec can only reorder independent work anyway, not start a load before the load address is known, so on most CPUs enforcing dependency ordering happens for free anyway: only a few models of DEC Alpha could violate causality and effectively load data from before it had the pointer that gave it the address.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0371r1.html - and other C++ wg21 documents linked from that about why consume is discouraged.

Is there any potential problem with double-check lock for C++?

Here is a simple code snippet for demonstration.
Somebody told me that the double-checked lock is incorrect. Since the variable is non-volatile, the compiler is free to reorder the calls or optimize them away (for details, see codereview.stackexchange.com/a/266302/226000).
But I really saw such a code snippet is used in many projects indeed. Could somebody shed some light on this matter? I googled and talked about it with my friends, but I still can't find out the answer.
// header
#include <iostream>
#include <mutex>
#include <fstream>

namespace DemoLogger
{
    extern bool is_log_file_ready;
    extern std::mutex log_mutex;
    extern std::ofstream log_stream;

    void InitFd()
    {
        if (!is_log_file_ready)
        {
            std::lock_guard<std::mutex> guard(log_mutex);
            if (!is_log_file_ready)
            {
                log_stream.open("sdk.log", std::ofstream::out | std::ofstream::trunc);
                is_log_file_ready = true;
            }
        }
    }
}

// cpp
namespace DemoLogger
{
    bool is_log_file_ready{false};
    std::mutex log_mutex;
    std::ofstream log_stream;
}
UPDATE:
Thanks to all of you. There are indeed better implementations for InitFd(), but it's only a simple demo; what I really want to know is whether there is any potential problem with the double-checked lock or not.
For the complete code snippet, see https://codereview.stackexchange.com/questions/266282/c-logger-by-template.
The double-checked lock is incorrect because is_log_file_ready is a plain bool, and this flag can be accessed by multiple threads, one of which is a writer - that is a data race.
The simple fix is to change the declaration:
std::atomic<bool> is_log_file_ready{false};
You can then further relax operations on is_log_file_ready:
void InitFd()
{
    if (!is_log_file_ready.load(std::memory_order_acquire))
    {
        std::lock_guard<std::mutex> guard(log_mutex);
        if (!is_log_file_ready.load(std::memory_order_relaxed))
        {
            log_stream.open("sdk.log", std::ofstream::out | std::ofstream::trunc);
            is_log_file_ready.store(true, std::memory_order_release);
        }
    }
}
But in general, double-checked locking should be avoided except in low-level implementations.
As suggested by Arthur P. Golubev, C++ offers primitives to do this, such as std::call_once.
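For illustration, a sketch of what the std::call_once variant might look like for this logger (reusing the question's names; this is an assumption about how the surrounding code is organized, not the questioner's actual code):
#include <fstream>
#include <mutex>

namespace DemoLogger
{
    std::once_flag log_init_flag;
    std::ofstream log_stream;

    void InitFd()
    {
        // call_once runs the lambda exactly once, even under contention, and
        // every thread returning from call_once sees its side effects.
        std::call_once(log_init_flag, [] {
            log_stream.open("sdk.log", std::ofstream::out | std::ofstream::trunc);
        });
    }
}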
Update:
Here's an example that shows one of the problems a race can cause.
#include <thread>
#include <atomic>
#include <chrono>

using namespace std::literals::chrono_literals;

int main()
{
    int flag {0}; // wrong !
    std::thread t{[&] { while (!flag); }};
    std::this_thread::sleep_for(20ms);
    flag = 1;
    t.join();
}
The sleep is there to give the thread some time to initialize.
This program should return immediately, but compiled with full optimization -O3, it probably doesn't. This is caused by a valid compiler transformation that changes the while-loop into something like this:
if (flag) return; while(1);
And if flag is (still) zero, this will run forever (changing the flag type to std::atomic<int> will solve this).
This is only one of the effects of undefined behavior: the compiler does not even have to commit the change to flag to memory.
With a race, or incorrectly set (or missing) barriers, operations can also be re-ordered, causing unwanted effects. These are less likely to occur on X86, since it is generally a more forgiving platform than more weakly ordered architectures (although re-ordering effects do exist on X86).
Somebody told me that the double-check lock is incorrect
It usually is.
IIRC double-checked locking originated in Java (whose more strongly-specified memory model made it viable).
From there it spread a plague of ill-informed and incorrect C++ code, presumably because it looks enough like Java to be vaguely plausible.
Since the variable is non-volatile
Double-checked locking cannot be made correct by using volatile for synchronization, because that's not what volatile is for.
Java is perhaps also the source of this misuse of volatile, since it means something entirely different there.
Thanks for linking to the review that suggested this, I'll go and downvote it.
But I really saw such a code snippet is used in many projects indeed. Could somebody shed some light on this matter?
As I say, it's a plague, or really I suppose a harmful meme in the original sense.
I googled and talked about it with my friends, but I still can't find out the answer.
... Is there any potential problem with double-check lock for C++?
There are nothing but problems with double-checked locking for C++. Almost nobody should ever use it. You should probably never copy code from anyone who does use it.
In preference order:
Just use a static local, which is even less effort and still guaranteed to be correct (see the sketch after this list) - in fact:
If multiple threads attempt to initialize the same static local variable concurrently, the initialization occurs exactly once (similar behavior can be obtained for arbitrary functions with std::call_once).
Note: usual implementations of this feature use variants of the double-checked locking pattern, which reduces runtime overhead for already-initialized local statics to a single non-atomic boolean comparison.
so you can get correct double-checked locking for free.
Use std::call_once if you need more elaborate initialization and don't want to package it into a class
Use (if you must) double-checked locking with a std::atomic_flag or std::atomic_bool flag and never volatile.
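A sketch of the first option, a function-local static, again reusing the question's names (the LogStream helper is hypothetical glue, not the questioner's exact code):
#include <fstream>

namespace DemoLogger
{
    // Initialization of a function-local static is guaranteed to happen exactly
    // once since C++11; the compiler emits the (correct) double-checked locking.
    std::ofstream& LogStream()
    {
        static std::ofstream log_stream("sdk.log",
                                        std::ofstream::out | std::ofstream::trunc);
        return log_stream;
    }
}
Callers just write to LogStream(); as with the original code, concurrent writes to the stream itself still need their own synchronization.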
There is nothing for the compiler to optimize away here (no statements can be removed; see the details below), but there are the following problems:
It is possible that is_log_file_ready is set to true before log_stream has actually opened the file; another thread could then skip the outer if block and start using the stream before std::ofstream::open has completed.
This could be solved by issuing a std::atomic_thread_fence(std::memory_order_release) memory barrier before setting the flag to true.
Also, a compiler is forbidden to reorder accesses to volatile objects on the same thread (https://en.cppreference.com/w/cpp/language/as_if), but as for this code specifically, the available operator<< overloads and the write function of std::ofstream simply do not accept volatile objects - it would not be possible to write to the stream if you made it volatile (and making only the flag volatile would not prevent the reordering).
Note that protecting the is_log_file_ready flag from a data race with the C++ standard library facilities - i.e. storing it with std::memory_order_release or a stronger memory order, the most reasonable choice being std::atomic/std::atomic_bool (see LWimsey's answer for sample code) - would also make the reordering impossible, because of that memory order.
Formally, an execution with a data race is considered to cause undefined behaviour, which in the double-checked lock applies to the is_log_file_ready flag. In code that conforms to the language standard, the flag must be protected from a data race (the most reasonable way being std::atomic/std::atomic_bool).
In practice, though, two things hold. First, a compiler that is not insane will not intentionally spoil your code under the pretext that everything is allowed once undefined behaviour occurs (some people wrongly consider undefined behaviour to be something that happens only at run time and has nothing to do with compilation, but the standard uses undefined behaviour precisely to regulate compilation; by the way, such behaviour must be documented - see the details of compiling C++ code with a data race in https://stackoverflow.com/a/69062080/1790694). Second, if the implementation treats bool reasonably, considering any non-zero physical value as true (which is reasonable, since it must convert arithmetic types, pointers and some others to bool anyway), a partially written flag can never be read as a wrong value, so partially setting the flag to true causes no problem when reading. Under those assumptions, the single memory barrier std::atomic_thread_fence(std::memory_order_release) before setting the flag to true, which prevents the reordering, would make your code work without problems.
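To stay within the standard while keeping that barrier placement, the same idea can be written with an atomic flag and relaxed accesses; a minimal sketch (reusing the question's names, and assuming the flag's declaration is changed to std::atomic<bool>):
#include <atomic>
#include <fstream>
#include <mutex>

extern std::atomic<bool> is_log_file_ready;   // assumed changed to atomic
extern std::mutex log_mutex;
extern std::ofstream log_stream;

void InitFd()
{
    if (!is_log_file_ready.load(std::memory_order_acquire))
    {
        std::lock_guard<std::mutex> guard(log_mutex);
        if (!is_log_file_ready.load(std::memory_order_relaxed))
        {
            log_stream.open("sdk.log", std::ofstream::out | std::ofstream::trunc);
            // The fence orders the open() above before the flag store below,
            // which is the barrier placement described in this answer.
            std::atomic_thread_fence(std::memory_order_release);
            is_log_file_ready.store(true, std::memory_order_relaxed);
        }
    }
}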
At https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables you can read that, since C++11, implementations of static local variable initialization (which you should also consider for one-time actions in general; see the note about one-time actions below) usually use variants of the double-checked locking pattern, which reduces runtime overhead for already-initialized local statics to a single non-atomic boolean comparison.
This is an example of exactly the environment-dependent safety of a non-atomic flag described above. But it should be understood that these solutions are environment-dependent; since they are part of the compiler implementations themselves rather than of a program using the compilers, conformance to the standard is not a concern there.
To make your program conform to the language standard and be protected (as far as the standard is implemented correctly) against the liberties taken by compiler implementation details, you must protect the flag from data races, and the most reasonable way would be to use std::atomic or std::atomic_bool.
Note that even without protecting the flag from data races:
because of the mutex, it is not possible for a thread to miss updates to the values (both the bool flag and the std::ofstream object) made by another thread.
The mutex provides the memory barrier: if we do not see the update when checking the flag in the first condition, we get it when we reach the mutex, and so are guaranteed to see the updated value when checking the flag in the second condition.
because the flag can potentially be accessed in unpredictable ways from other translation units, the compiler cannot eliminate the reads and writes of the flag under the as-if rule, even if the rest of the code in the translation unit were so trivial (e.g. setting the flag to true and then starting the threads, with no reset back to false reachable) that such elimination would be permitted were the flag not accessible from other translation units.
For one-time actions in general, besides raw protection with flags and mutexes, consider using:
std::call_once (https://en.cppreference.com/w/cpp/thread/call_once);
calling a function that initializes a static local variable (https://en.cppreference.com/w/cpp/language/storage_duration#Static_local_variables), if its lifetime suits, since its initialization is data-race safe (be aware that this data-race safety of static local initialization only exists since C++11).
All the multi-threading functionality mentioned here is available since C++11 (and since you are already using std::mutex, which also first appeared in C++11, that is your case anyway).
Also, you should correctly handle failure to open the file.
Also, you must protect the std::ofstream object itself from concurrent write operations on the stream.
Answering the additional question from the update: there are no problems with a properly implemented double-checked lock, and a proper implementation is possible in C++.

C++ std::atomic vs. Boost atomic

In my application, I have an int and a bool variable, which are accessed (multiple write/read) by multiple threads. Currently, I am using two mutexes, one for int and one for bool to protect those variables.
I heard about using atomic variables and operators to write lock-free multi-thread program. My questions are
What's the definition of atomic variables and operators?
What's the main difference between std::atomic and boost/atomic.hpp? Which one is more standard or popular?
Are these libraries platform-dependent? I am using gnu gcc 4.6 on Linux at the moment, but ideally it shall be cross-platform. I heard that the definition of "atomic" actually depends on the hardware as well. Can anyone explain that as well?
What's the best way to share a bool variable among multiple threads? I would prefer not to use the "volatile" keyword.
Is this code thread-safe?
double double_m; // double_m is only accessed by current thread.
std::atomic<bool> atomic_bool_x;
atomic_bool_x = true && (double_m > 12.5);
int int_n; // int_n is only accessed by current thread.
std::atomic<int> atomic_int_x;
std::atomic<int> atomic_int_y;
atomic_int_y = atomic_int_x * int_n;
I'm not an expert or anything, but here's what I know:
std::atomic simply says that calling load and store (and a few other operations) concurrently is well-defined. An atomic operation is indivisible - nothing can happen 'in-between'.
I assume std::atomic is based off of boost::atomic. If you can, use std, otherwise use boost.
They are both portable, with the std version being completely so; however, your compiler will need to support C++11.
Likely std::atomic_bool. You should not need to use volatile.
Also, I believe load/store differs from operator=/operator T in that only load/store are atomic.
Nevermind. I checked the standard and it appears that the operators are defined in terms of load/store/etc.; however, they may return different things.
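To make that concrete, here is a sketch of what the question's second snippet decomposes into (each individual load and store is atomic, but the whole statement is not one atomic operation):
#include <atomic>

std::atomic<int> atomic_int_x{0};
std::atomic<int> atomic_int_y{0};
int int_n = 3;   // only accessed by the current thread, per the question

void demo()
{
    // atomic_int_y = atomic_int_x * int_n;   is equivalent to:
    int tmp = atomic_int_x.load();   // one atomic load  (seq_cst by default)
    tmp *= int_n;                    // plain, non-atomic arithmetic
    atomic_int_y.store(tmp);         // one atomic store (seq_cst by default)
    // Another thread may change atomic_int_x between the load and the store,
    // so the statement as a whole is not atomic, but it is data-race free.
}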
Further reading:
http://en.cppreference.com/w/cpp/atomic/atomic
C++11 Standard
C++ Concurrency in Action
Volatile is orthogonal to what you use to implement atomics. In C++ it tells the compiler that it is not safe to perform certain optimizations on that variable. Herb Sutter lays it out:
To safely write lock-free code that communicates between threads without using locks, prefer to use ordered atomic variables: Java/.NET volatile, C++0x atomic, and C-compatible atomic_T.
To safely communicate with special hardware or other memory that has unusual semantics, use unoptimizable variables: ISO C/C++ volatile. Remember that reads and writes of these variables are not necessarily atomic, however.
Finally, to express a variable that both has unusual semantics and has any or all of the atomicity and/or ordering guarantees needed for lock-free coding, only the ISO C++0x draft Standard provides a direct way to spell it: volatile atomic.
(from http://drdobbs.com/article/print?articleId=212701484&siteSectionName=parallel)
See std::atomic class template
std::atomic is standard since C++11, and the Boost stuff is older. But since it is standard now, I would prefer std::atomic.
You can use std::atomic with any C++11 compiler on any platform you want.
Without any further information: std::atomic<bool>.
I believe std::atomic (C++11) and boost.atomic are equivalent. If std::atomic is not supported by your compiler yet, use boost::atomic.

C++ Multithread with global variables

Does anyone know whether a primitive global variable is thread safe or not?
// global variable
int count = 0;

void thread1()
{
    count++;
}

void thread2()
{
    count--;
    if (count == 0) print("Stuff thing");
}
Can I do it this way without any lock protection for count?
Thank you.
This is not thread-safe. You have a race condition here. The reason is that count++ is not necessarily atomic (i.e. not a single processor operation). The value is first loaded, then incremented, and then written back. Between each of these steps, the other thread can also modify the value.
No, it's not. It may be, depending on the implementation, the compile-time options and even the phase of the moon.
But the standard doesn't mandate something is thread-safe, specifically because there's nothing about threading in the current standard.
See also here for a more detailed analysis of this sort of issue.
If you're using an environment where threads are supported, you can use a mutex: examine the pthread_mutex_* calls under POSIX threads, for example.
If you're coding for C++0x/C++11, use either a mutex or one of the atomic operations detailed in that standard.
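For example, a minimal C++11 sketch of the mutex-protected version of the question's code (std::mutex rather than raw pthread_mutex_* calls, but the idea is the same):
#include <cstdio>
#include <mutex>

int count = 0;
std::mutex count_mutex;

void thread1()
{
    std::lock_guard<std::mutex> lock(count_mutex);
    count++;
}

void thread2()
{
    std::lock_guard<std::mutex> lock(count_mutex);
    count--;
    if (count == 0) std::printf("Stuff thing\n");
}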
It will be thread safe only if you have a single CPU and ++ and -- are atomic operations on it.
If you want to make it thread safe this is the way for Windows:
LONG volatile count = 0;

void Thread1()
{
    ::InterlockedIncrement( &count );
}

void Thread2()
{
    if (::InterlockedDecrement( &count ) == 0)
    {
        printf("Stuf");
    }
}
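A portable C++11 equivalent of the above, using std::atomic instead of the Windows-specific interlocked intrinsics (a sketch, not a drop-in replacement for pre-C++11 toolchains):
#include <atomic>
#include <cstdio>

std::atomic<long> count{0};

void Thread1()
{
    count.fetch_add(1);   // atomic read-modify-write
}

void Thread2()
{
    // fetch_sub returns the previous value, so compare against 1
    // (InterlockedDecrement returns the new value, hence the different test).
    if (count.fetch_sub(1) == 1)
    {
        std::printf("Stuff\n");
    }
}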
You need two things to safely use an object concurrently by two threads or more: atomicity of operations and ordering guarantees.
Some people will pretend that on some platforms what you're attempting here is safe because e.g. operations on whatever type int stands for on those platforms are atomic (even incrementing or whatever). The problem with this is that you don't necessarily have ordering guarantees. So while you want and know that this particular variable is going to be accessed concurrently, the compiler doesn't. (And the compiler is right to assume that this variable is going to be used by only one thread at a time: you don't want every variable to be treated as being potentially shared. The performance consequences would be terrible.)
So don't use primitive types this way. You have no guarantees from the language and even if some platforms have their own guarantees (e.g. atomicity) you have no way of telling the compiler that the variable is shared with C++. Either use compiler extensions for atomic types, the C++0x atomic types, or library solutions (e.g. mutexes). And don't let the name mislead you: to be truly useful, an atomic type has to provide ordering guarantees along with the atomicity that comes with the name.
It is a global variable, and hence multiple threads can race to change it. It is not thread safe.
Use a mutex lock.
In general, no, you can't get away with that. In your trivial example case, it might happen to work but it's not reliable.

assignment in pthreads application

I have a Linux multithreaded application in C++.
In this application, the class App has a member variable Status:
class App {
    ...
    typedef enum { asStop=0, asStart, asRestart, asWork, asClose } TAppStatus;
    TAppStatus Status;
    ...
};
All threads frequently check Status by calling the GetStatus() function.
inline TAppStatus App::GetStatus() { return Status; }
Other functions of the application can assign different values to the Status variable by calling the SetStatus() function, and they do not use mutexes.
void App::SetStatus( TAppStatus aStatus ) { Status = aStatus; }
Edit: All threads use Status in switch operator:
switch ( App::GetStatus() ){ case asStop: ... case asStart: ... };
Is the assignment in this case, an atomic operation?
Is this correct code?
Thanks.
There is no portable way to implement synchronized variables in C99 or C++03, and the pthread library does not provide one either. You can:
Use the C++0x <atomic> header (or the C1x <stdatomic.h>); gcc supports it for C++ since version 4.4 when given the -std=c++0x or -std=gnu++0x option (see the sketch after this list).
Use the Linux-specific <linux/atomic.h> (this is implementation used by kernel, but it should be usable from userland as well).
Use GCC-specific __sync_* builtin functions.
Use some other library that provides atomic operations like glib.
Use locks, but that's orders of magnitude slower compared to the fast operation itself.
Note: As Martinho pointed out, while these are called "atomic", for store and load the hard-to-get but necessary property in this case is not atomicity (the operation cannot be interrupted and a load always sees either all of a store or none of it, which is usually true of 32-bit stores and loads) but ordering (if you store a and then b, nobody may get the new value of b and then the old value of a).
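With the first option (the C++0x/C++11 <atomic> header), the status in the question could simply be stored as an atomic enum; a sketch of how the class might look (acquire/release are used here to also give the ordering guarantee discussed in the note above):
#include <atomic>

class App {
public:
    typedef enum { asStop = 0, asStart, asRestart, asWork, asClose } TAppStatus;

    TAppStatus GetStatus() const { return Status.load(std::memory_order_acquire); }
    void SetStatus(TAppStatus aStatus) { Status.store(aStatus, std::memory_order_release); }

private:
    std::atomic<TAppStatus> Status{asStop};
};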
This depends entirely upon the enum representation chosen. For x86, I believe, all assignments of the operating system's word size (so 32-bit for x86 and 64-bit for x64) that are also aligned to that size are atomic, so a simple read and write is atomic.
Even assuming that it is the correct size and alignment, this doesn't mean that these functions are thread-safe; it depends on what the status is used for.
Edit: In addition, the compiler's optimizer may well wreak havoc if there are no uses of atomic operations or other volatile accesses.
Edit to your edit: No, that's not thread safe at all. If you converted it manually into a jump table then you might be thread-safe, I'll need to think about it for a little while.
On certain architectures this assignment might be atomic (by accident), but even if it is, this code is wrong. The compiler and hardware may perform various optimizations, which might break this "atomicity". Look at: http://video.google.com/videoplay?docid=-4714369049736584770#
Use locks or an atomic variable (http://www.stdthread.co.uk/doc/headers/atomic/atomic.html) to fix it.