Going through the standard documentation for std::transform, I noticed that until C++11 the functor argument was required not to have side effects, while from C++11 onwards the requirement has been made less restrictive - "op and binary_op shall not invalidate iterators or subranges, or modify elements in the ranges". See
http://en.cppreference.com/w/cpp/algorithm/transform
and section 25.3.4 of the standard. The webpage on cppreference.com mentions as well that "The intent of these requirements is to allow parallel or out-of-order implementations of std::transform".
I do not understand then if this snippet of code is legal or not in C++11:
std::vector<int> v(/* fill it with something */), v_transformed;
int foo = 0;
std::transform(v.begin(), v.end(), std::back_inserter(v_transformed),
               [&foo](const int &n) -> int {
                   foo += 1;
                   return n * 2;
               });
Clearly, if std::transform is parallelised behind the scenes, we will have multiple concurrent calls to foo += 1, which is going to be UB. But the functor itself does not seem to violate the requirements outlined in the standard.
This question can be asked for other standard algorithms (except I think std::for_each, which explicitly states that the iteration is to be performed in-order).
Did I misunderstand something?
As far as I understand the C++11 spec, all standard library functions have to perform all their operations sequentially if their effects are visible to the user. In particular, all "mutating sequence operations" have to be performed sequentially.
The relevant part of the standard is §17.6.5.9/8:
Unless otherwise specified, C++ standard library functions shall perform all operations solely within the current thread if those operations have effects that are visible (1.10) to users.
The way the algorithms are currently defined, they have to be executed sequentially unless the implementation can prove that executing them concurrently doesn't change the semantics. I could imagine a future addition of algorithms which are explicitly allowed to be executed concurrently, but they would be different algorithms.
So C++11 now allows std::transform to be parallelised, but that's not a guarantee that your own code will be made parallelisation-safe. Now, yes, I suppose you have to protect your data variables. I can imagine a lot of MT bugs arising from this, if implementations ever do actually parallelise std::transform.
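If an implementation ever were allowed to run the plain overload concurrently, the shared counter would be the first thing to protect. A minimal sketch, assuming an atomic counter is acceptable for the use case (note this only protects the counter itself, nothing else):
#include <algorithm>
#include <atomic>
#include <iterator>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4}, v_transformed;
    std::atomic<int> foo{0};  // increments stay well-defined even under concurrent calls

    std::transform(v.begin(), v.end(), std::back_inserter(v_transformed),
                   [&foo](const int &n) -> int {
                       foo.fetch_add(1, std::memory_order_relaxed);
                       return n * 2;
                   });
}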
Related
Are there any examples of the absence or presence of const affecting the concurrency of any C++ Standard Library algorithms or containers? If there aren't, is there any reason using the const-ness in this way is not permitted?
To clarify, I'm assuming concurrent const access to objects is data-race free, as advocated in several places, including GotW 6.
By analogy, the noexcept-ness of move operations can affect the performance of std::vector methods such as resize.
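As a concrete illustration of that analogy (the type names here are made up for the example): when a vector reallocates, it typically relocates its elements with std::move_if_noexcept, so a type whose move constructor is noexcept gets moved, while one whose move constructor may throw gets copied.
#include <iostream>
#include <vector>

struct NothrowMove {
    NothrowMove() = default;
    NothrowMove(const NothrowMove&) { std::cout << "copy\n"; }
    NothrowMove(NothrowMove&&) noexcept { std::cout << "move\n"; }  // noexcept: reallocation may move
};

struct ThrowingMove {
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) { std::cout << "copy\n"; }
    ThrowingMove(ThrowingMove&&) { std::cout << "move\n"; }         // not noexcept: reallocation falls back to copying
};

int main() {
    std::vector<NothrowMove> a(1);
    a.resize(100);   // the existing element is typically moved

    std::vector<ThrowingMove> b(1);
    b.resize(100);   // the existing element is copied to preserve the strong exception guarantee
}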
I skimmed through several of the C++17 parallel algorithms, hoping to find an example, but I didn't find anything. Algorithms like transform don't require the unary_op or binary_op function objects to be const. I did find that for_each takes a MoveConstructible function object for the original, non-execution-policy version, and a CopyConstructible function object for the C++17 execution-policy version; I thought this might be an example where the programmer is manually forced to select one or the other based on whether the function object can be safely copied (a const operation), but at the moment I suspect this requirement is just there to support the return type (UnaryFunction vs. void).
In C++17, parallel std algorithms were introduced (overloads with ExecutionPolicy arguments), where strict rules of execution order, interleaving and parallelization are defined, for example ([algorithm.parallel.exec/3]):
The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::sequenced_policy all occur in the calling thread of execution. [ Note: The invocations are not interleaved; see 4.6. — end note ]
(same thing in current draft)
The problem is that I can't find any such requirement for old, non-parallel overloads of these algorithms.
Question: does this mean that library implementers have been allowed, since C++11 (when the term thread of execution was introduced), to implement std::transform and std::generate using SIMD, multithreading, or something else? Is there a reason for this?
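For reference, a minimal sketch of the two kinds of overload side by side (assuming a toolchain that ships the C++17 parallel algorithms in <execution>):
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> in(1000, 1), out(1000);

    // Classic overload: no execution policy; the question is whether the
    // implementation may nevertheless parallelise this behind the scenes.
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int n) { return n * 2; });

    // C++17 overload: the caller explicitly permits parallel execution, and
    // [algorithm.parallel.exec] spells out the ordering and interleaving rules.
    std::transform(std::execution::par, in.begin(), in.end(), out.begin(),
                   [](int n) { return n * 2; });
}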
[res.on.data.races]/8 Unless otherwise specified, C++ standard library functions shall perform all operations solely within the current thread if those operations have effects that are visible (4.7) to users.
This precludes any kind of behind-the-scenes multithreading that touches any user-defined entities.
I suppose, in principle, something like std::sort working on a vector<int> could prove that no user-defined class is involved and send work to multiple threads. That's rather far-fetched; it's difficult to imagine any implementation doing this in practice.
Does std::sort() typically use threading to increase its performance? I realize that this could vary from implementation to implementation. If not, why not?
[res.on.data.races]/8 Unless otherwise specified, C++ standard library functions shall perform all operations solely within the current thread if those operations have effects that are visible (4.7) to users.
/9 [ Note: This allows implementations to parallelize operations if there are no visible side effects. —end note ]
std::sort could, in principle, use parallel execution when sorting elements of fundamental type (it wouldn't be observable whether it does), but not of user-defined type (unless explicitly given permission via an execution policy parameter, of course). The type's operator< may not be thread-safe.
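A minimal sketch of that opt-in (assuming <execution> is available; with the plain overload the implementation gets no such permission):
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> v{5, 3, 1, 4, 2};

    // Plain overload: any parallelism would have to be unobservable
    // ([res.on.data.races]/8), which is realistic only for fundamental types.
    std::sort(v.begin(), v.end());

    // C++17 overload: the caller explicitly permits parallel execution and
    // takes responsibility for the comparison being safe to run concurrently.
    std::sort(std::execution::par, v.begin(), v.end());
}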
I've seen people suggest that I should wrap standard containers such as std::queue and std::vector in a mutex lock or similar if I wish to use them.
I read that as needing a lock for each individual instance of a container being accessed by multiple threads, not one per type or for any use of the C++ standard library. But this assumes that the standard containers and the standard library are guaranteed to be re-entrant.
Is there such a guarantee in the language?
The standard says:
Except where explicitly specified in this standard, it is implementation-defined which functions in the Standard C++ library may be recursively reentered.
Then it proceeds to specify that a function must be reentrant in, if I count them correctly, zero cases.
If one is to strictly follow the standard in this regard, the standard library suddenly becomes rather limited in its usefulness. A huge number of library functions call user-supplied functions. Writers of these functions, especially if those are themselves released as a library, in general don't know where they will be called from.
It is completely reasonable to assume that e.g. any constructor may be called from emplace_back of any standard container; if the user wishes to eliminate any uncertainty, he must refrain from any calls to emplace_back in any constructor. Any copy constructor is callable from e.g. vector::resize or sort, so one cannot manage vectors or do sorting in copy constructors. And so on, ad libitum.
This includes calling any third party component that might reasonably be using the standard library.
All these restrictions together probably mean that a large part of the standard library cannot be used in real world programs at all.
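As a concrete illustration of the emplace_back scenario above (the Record type is hypothetical):
#include <string>
#include <vector>

// Hypothetical user type whose constructor itself uses a standard container.
struct Record {
    std::vector<std::string> tags;
    explicit Record(const std::string& t) {
        tags.emplace_back(t);   // a library call made from inside a constructor
    }
};

int main() {
    std::vector<Record> db;
    // emplace_back constructs a Record in place; the Record constructor in turn
    // calls emplace_back on a different vector. If the library were not
    // reentrant, this perfectly ordinary pattern would be in doubt.
    db.emplace_back("first");
}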
Update: this doesn't even start taking threads into consideration. With multiple threads, at least functions that deal with containers and algorithms must be reentrant. Imagine that std::vector::operator[] is not reentrant. This would mean that one cannot access two different vectors at the same time from two different threads! This is clearly not what the standard intends. I understand that this is your primary interest. To reiterate, no, I don't think there is reentrancy guarantee; and no, I don't think absence of such guarantee is reasonable in any way. --- end update.
My conclusion is that this is probably an oversight. The standard should mandate that all standard functions must be reentrant, unless otherwise specified.
I would
completely ignore the possibility of any standard function being non-reentrant, except when it is clear that the function cannot reasonably be made reentrant.
raise an issue with the standards committee.
[Answer left for historical purposes, but see n.m.'s answer. There's no requirement on individual functions, but there is a single global non-requirement]
Yes, the standard guarantees reentrancy of member functions of standard containers.
Let me define what (non)-reentrancy means for functions. A reentrant function can be called with well-defined behavior on a thread while it is already on the call stack of that thread, i.e. executing. Obviously, this can only happen if the control flow temporarily left the reentrant function via a function call. If the behavior is not well-defined, the function is not reentrant.
(Leaf functions can't be said to be reentrant or non-reentrant, as the flow of control can only leave a leaf function by returning, but this isn't critical to the analysis).
Example:
int fac(int n) { return n==0 ? 1 : n * fac(n-1); }
The behavior of fac(3) is to return 6, even while fac(4) is running. Therefore, fac is reentrant.
The C++ Standard does define the behavior of member functions of standard containers. It also defines all restrictions under which such behavior is guaranteed. None of the member functions of standard containers have restrictions with respect to reentrancy. Therefore, any implementation which would restrict reentrancy is non-conformant.
I have trouble finding any up-to-date information on this.
Do C++11 versions of STL containers have some level of thread safety guaranteed?
I expect that they don't, for performance reasons. But then again, that's why we have both std::vector::operator[] and std::vector::at.
Since the existing answers don't cover it (only a comment does), I'll just mention 23.2.2 [container.requirements.dataraces] of the current C++ standard specification which says:
implementations are required to avoid data races when the contents of the contained object in different elements in the same sequence, excepting vector<bool>, are modified concurrently.
i.e. it's safe to access distinct elements of the same container, so for example you can have a global std::vector<std::future<int>> of ten elements and have ten threads which each write to a different element of the vector.
Apart from that, the same rules apply to containers as to the rest of the standard library (see 17.6.5.9 [res.on.data.races]), as Mr.C64's answer says. Additionally, [container.requirements.dataraces] lists some non-const member functions of containers that can be called safely because they only return non-const references to elements; they don't actually modify anything (in general, any non-const member function must be considered a modification).
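A minimal sketch of the distinct-elements scenario described above (using plain int elements rather than std::future<int> to keep it short):
#include <thread>
#include <vector>

int main() {
    std::vector<int> results(10, 0);    // sized up front; the vector itself is not resized while threads run
    std::vector<std::thread> workers;

    for (int i = 0; i < 10; ++i) {
        // Each thread writes only to its own element; per
        // [container.requirements.dataraces] this is not a data race.
        workers.emplace_back([&results, i] { results[i] = i * i; });
    }
    for (auto& t : workers) {
        t.join();
    }
}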
I think STL containers offer the following basic thread-safety guarantee:
simultaneous reads of the same object are OK
simultaneous read/writes of different objects are OK
But you have to use some form of custom synchronization (e.g. a critical section) if you want to do something different, such as simultaneous writes to the same object.
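A minimal sketch of such custom synchronization: a hypothetical wrapper that serialises every operation on a shared std::queue with one mutex (std::optional requires C++17):
#include <mutex>
#include <optional>
#include <queue>

template <typename T>
class LockedQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(m_);   // one thread in the queue at a time
        q_.push(std::move(value));
    }
    std::optional<T> try_pop() {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return std::nullopt;
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }
private:
    std::mutex m_;
    std::queue<T> q_;
};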
No. Check out PPL or Intel TBB for thread-safe STL-like containers.
As others have noted, they have the usual "multiple reader" thread safety, but that was true even before C++11. Of course this doesn't mean a single writer with multiple readers. It means zero writers. :)