C++ atomic with non-trivial type? - c++

Reading the docs on boost::atomic and on std::atomic leaves me confused as to whether the atomic interface is supposed to support non-trivial types?
That is, given a (value-)type that can only be written/read by enclosing the read/write in a full mutex, because it has a non-trivial copy-ctor/assignment operator, is this supposed to be supported by std::atomic (as boost clearly states that it is UB).
Am I supposed to provide the specialization the docs talk about myself for non-trivial types?
Note: I was hitting on this because I have a cross-thread callback object boost::function<bool (void)> simpleFn; that needs to be set/reset atomically. Having a separate mutex / critical section or even wrapping both in a atomic-like helper type with simple set and get seem easy enough, but is there anything out of the box?

Arne's answer already points out that the Standard requires trivially copyable types for std::atomic.
Here's some rationale why atomics might not be the right tool for your problem in the first place: Atomics are the fundamental building primitives for building thread-safe data structures in C++. They are supposed to be the lowest-level building blocks for constructing more powerful data structures like thread-safe containers.
In particular, atomics are usually used for building lock-free data structures. For locking data structures primitives like std::mutex and std::condition_variable are a way better match, if only for the fact that it is very hard to write blocking code with atomics without introducing lots of busy waiting.
So when you think of std::atomic the first association should be lock-free (despite the fact that most of the atomic types are technically allowed to have blocking implementations). What you describe is a simple lock-based concurrent data structure, so wrapping it in an atomic should already feel wrong from a conceptual point of view.
Unfortunately, it is currently not clear how to express in the language that a data structure is thread-safe (which I guess was your primary intent for using atomic in the first place). Herb Sutter had some interesting ideas on this issue, but I guess for now we simply have to accept the fact that we have to rely on documentation to communicate how certain data structures behave with regards to thread-safety.

The standard specifies (ยง29.5,1) that
The type of the template argument T shall be trivially copyable
Meaning no, you cannot use types with non-trivial copy-ctor or assignment-op.
However, like with any template in namespace std, you are free to specialize the template for any type that it has not been specialized for by the implementation. So if you really want to use std::atomic<MyNonTriviallyCopyableType>, you have to provide the specialization yourself. How that specialization behaves is up to you, meaning, you are free to blow off your leg or the leg of anyone using that specialization, because it's just outside the scope of the standard.

Related

Partial template specialization of std::atomic for smart pointers

Background
Since C++11, atomic operations on std::shared_ptr can be done via std::atomic_... methods found here, because the partial specialization as shown below is not possible:
std::atomic<std::shared_ptr<T>>
This is due to the fact that std::atomic only accepts TriviallyCopyable types, and std::shared_ptr (or std::weak_ptr) is not trivially copyable.
However, as of C++20, these methods have been deprecated, and got replaced by the partial template specialization of std::atomic for std::shared_ptr as described here.
Question
I am not sure of
Why std::atomic_... got replaced.
Techniques used to enable the partial template specialization of std::atomic for smart pointers.
Several proposals for atomic<shared_ptr> or something of that nature explain a variety of reasons. Of particular note is P0718, which tells us:
The C++ standard provides an API to access and manipulate specific shared_ptr objects atomically, i.e., without introducing data races when the same object is manipulated from multiple threads without further synchronization. This API is fragile and error-prone, as shared_ptr objects manipulated through this API are indistinguishable from other shared_ptr objects, yet subject to the restriction that they may be manipulated/accessed only through this API. In particular, you cannot dereference such a shared_ptr without first loading it into another shared_ptr object, and then dereferencing through the second object.
N4058 explains a performance issue with regard to how you have to go about implementing such a thing. Since shared_ptr is typically bigger than a single pointer in size, atomic access typically has to be implemented with a spinlock. So either every shared_ptr instance has a spinlock even if it never gets used atomically, or the implementation of those atomic functions has to have a lookaside table of spinlocks for individual objects. Or use a global spinlock.
None of these are problems if you have a type dedicated to being atomic.
atomic<shared_ptr> implementations can use the usual techniques for atomic<T> when T is too large to fit into a CPU atomic operation. They get to get around the TriviallyCopyable restriction by fiat: the standard requires that they exist and be atomic, so the implementation makes it so. C++ implementations don't have to play by the same rules as regular C++ programs.

Why doesn't C++11 standard provide other lock free atomic structure [duplicate]

From C++ Concurrency in Action:
difference between std::atomic and std::atomic_flag is that std::atomic may not be lock-free; the implementation may have to acquire a mutex internally in order to ensure the atomicity of the operations
I wonder why. If atomic_flag is guaranteed to be lock-free, why isn't it guaranteed for atomic<bool> as well?
Is this because of the member function compare_exchange_weak? I know that some machines lack a single compare-and-exchange instruction, is that the reason?
First of all, you are perfectly allowed to have something like std::atomic<very_nontrivial_large_structure>, so std::atomic as such cannot generally be guaranteed to be lock-free (although most specializations for trivial types like bool or int probably could, on most systems). But that is somewhat unrelated.
The exact reasoning why atomic_flag and nothing else must be lock-free is given in the Note in N2427/29.3:
Hence the operations must be address-free. No other type requires lock-free operations, and hence the atomic_flag type is the minimum hardware-implemented type needed to conform to this standard. The remaining types can be emulated with atomic_flag, though with less than ideal properties.
In other words, it's the minimum thing that must be guaranteed on every platform, so it's possible to implement the standard correctly.
The standard does not garantee atomic objects are lock-free. On a platform that doesn't provide lock-free atomic operations for a type T, std::atomic<T> objects may be implemented using a mutex, which wouldn't be lock-free. In that case, any containers using these objects in their implementation would not be lock-free either.
The standard provide an opportunity to check if an std::atomic<T> variable is lock-free: you can use var.is_lock_free() or atomic_is_lock_free(&var). For basic types such as int, there is also macros provided (e.g. ATOMIC_INT_LOCK_FREE) which specify if lock-free atomic access to that type is available.
std::atomic_flag is an atomic boolean type. Almost always for boolean type it's not needed to use mutex or another way for synchronization.

Use constructor in place of atomic.store() when atomicity is not currently needed

I use std::atomic for atomicity. Still, somewhere in the code, atomicity is not needed by program logic. In this case, I'm wondering whether it is OK, both pedantically and practically, to use constructor in place of store() as an optimization. For example,
// p.store(nullptr, std::memory_order_relaxed);
new(p) std::atomic<node*>(nullptr);
In accord with the standard, whether this works depends entirely on the implementation of std::atomic<T>. If it is lock-free for that T, then the implementation probably just stores a T. If it isn't lock-free, things get more complex, since it may store a mutex or some other thing.
The thing is, you don't know what std::atomic<T> stores. This matters because if it stores a const-qualified object or a reference type, then reusing the storage here will cause problems. The pointer returned by placement-new can certainly be used, but if a const or reference type is used, the original object name p cannot.
Why would std::atomic<T> store a const or reference type? Who knows; my point is that, because its implementation is not under your control, then pedantically you cannot know how any particular implementation behaves.
As for "practically", it's unlikely that this will cause a problem. Especially if the atomic<T> is always lock-free.
That being said, "practically" should also include some notion of how other users will interpret this code. While people experienced with doing things like reusing storage will be able to understand what the code is doing, they will likely be puzzled by why you're doing it. That means you'll need to either stick a comment on that line or make a (template) function non_atomic_reset.
Also, it should be noted that std::shared_ptr uses atomic increments/decrements for its reference counter. I bring that up because there is no std::single_threaded_shared_ptr that doesn't use atomics, or a special constructor that doesn't use atomics. So even in cases where you're using shared_ptr in pure single-threaded code, those atomics are still firing. This was considered a reasonable tradeoff by the C++ standards committee.
Atomics aren't cheap, but they're not that expensive (most of the time) that using unusual mechanisms like this to bypass an atomic store is a good idea. As always, profile to see if the code obfuscation is worth it.

Why atomic overloads for shared_ptr exist

Why are there are atomic overloads for shared_ptr as described here rather than there being a specialization for std::atomic which deals with shared_ptrs. Seems inconsistent with the object oriented patterns employed by the rest of the C++ standard library..
And just to make sure I am getting this right, when using shared_ptrs to implement the read copy update idiom we need to do all accesses (reads and writes) to shared pointers through these functions right?
Because:
std::atomic may be instantiated with any TriviallyCopyable type T.
Source: http://en.cppreference.com/w/cpp/atomic/atomic
And
std::is_trivially_copyable<std::shared_ptr<int>>::value == false;
Thus, you cannot instantiate std::atomic<> with std::shared_ptr<>. However, automatic memory management is useful in multi-threading, thus those overloads were provided. Those overloads are most likely not lock-free however (one of the big draws of using std::atomic<> in the first place); they probably use a lock to provide synchronicity.
As for your second question: yes.

How to do something like "is_atomically_assignable"?

Of course, there's no such thing in std, but I need equivalent functionality.
I have a lock-free data structure templated on a type T, where T is provided by the user, and what I need to statically assert is that T is a type that is atomically assignable on x86 or x86-64 (which includes all built-in integral constants and floating point types, and any typedef thereof, but I think is not necessarily limited to those). I'm guessing that merely checking that the type is trivially assignable and that its sizeof is <= 8 is not sufficient. What's the best way to do this? Forcing T to be an std::atomic<> and then checking is_lock_free() is out of the question.
"atomically assignable" is not sufficient condition for using a type to implement a lock-free data structure, so this idea is going down to the wrong path from the start.
Using std::atomic (and friends) is the only way in C++ to have both the atomicity and the ordering guarantees necessary to implement a lock-free data structure. Atomic assignment is useless if no other thread will ever see it.