Restrictions on Fortran loops associated with the work-sharing or simd loop constructs

OpenMP for Fortran has several loop-related constructs, such as the work-sharing do loop construct and the simd loop construct. Each of these constructs has one or more associated Fortran loops; more than one loop is associated when a collapse or ordered clause is present. I am looking for the restrictions that the OpenMP constructs place on these Fortran loops, in particular whether the loops may be DO loops without loop control, DO WHILE loops, or DO CONCURRENT loops.
The OpenMP 4.5 standard states the following in Section 2.7.1, but I could not find a similar statement in later versions of the standard such as OpenMP 5.2. Has this been reworded or removed from the standard?
"The do-loop cannot be a DO WHILE or a DO loop without loop control."
The OpenMP 5.2 standard includes restrictions for DO CONCURRENT constructs. There are restrictions that prevent OpenMP directives from appearing in a WHERE, FORALL or DO CONCURRENT construct. But I could not find a restriction for DO CONCURRENT appearing as the associated loop of a work-sharing do loop construct or simd loop construct.
Could someone clarify this, or point me to the relevant section of the standard? Thanks in advance.
I have browsed the standard but could not find the relevant sections. It is possible I missed them; if so, apologies in advance.


Is combining std::execution and OpenMP advisable?

I have been using OpenMP for some time now. Recently, on a new project, I chose to use C++17 for some of its features.
Because of that, I have become interested in std::execution, which allows algorithms to be parallelized. That seems really powerful and elegant, but there are a lot of really useful OpenMP features that are not easy to use with algorithms (barriers, SIMD, critical sections, etc.).
So I am thinking of mixing std::execution::par (or par_unseq) with OpenMP. Is it a good idea, or should I stay with OpenMP only?
Unfortunately this is not officially supported. It may or may not work, depending on the implementation, but it is not portable.
Only the most recent version, OpenMP 5.0, even defines the interaction with C++11. In general, using anything from C++11 onward "may result in unspecified behavior". Specifically, the specification states: "While future versions of the OpenMP specification are expected to address the following features, currently their use may result in unspecified behavior:"
Alignment support
Standard layout types
Allowing move constructs to throw
Defining move special member functions
Concurrency
Data-dependency ordering: atomics and memory model
Additions to the standard library
Thread-local storage
Dynamic initialization and destruction with concurrency
C++11 library
While C++17 and its specific high-level parallelism support are not mentioned, it is clear from this list that they are unsupported.

How to implement user-defined reduction with OpenACC?

Is there a way to implement a user-defined reduction with OpenACC similar to declare reduction in OpenMP?
So that I could write something like
#pragma acc loop reduction(my_function:my_result)
Or what would be the appropriate way to implement efficient reduction without the predefined operators?
User-defined reductions aren't yet part of the OpenACC standard. While I'm not part of the OpenACC technical committee, I believe they have received requests for this, but I'm not sure whether it's something being considered for the 3.0 standard.
Since the OpenACC standard is largely user driven, I'd suggest you send a note to the OpenACC folks requesting this support. The more folks that request it, the more likely it is to be adopted in the standard.
Contact info for OpenACC can be found at the bottom of https://www.openacc.org/about

Mixing C++11 atomics and OpenMP

OpenMP has its own support for atomic access, however, there are at least two reasons for preferring C++11 atomics: they are significantly more flexible and they are part of the standard. On the other hand, OpenMP is more powerful than the C++11 thread library.
The standard specifies the atomic operations library and the thread support library in two distinct chapters. This makes me believe that the components for atomic access are somewhat orthogonal to the thread library used. Can I indeed combine C++11 atomics and OpenMP?
There is a very similar question on Stack Overflow; however, it has been basically unanswered for three years, since its answer does not answer the actual question.
Update:
OpenMP 5.0 defines the interactions with C++11 and later. Among other things, it says that using the following features may result in unspecified behavior:
Data-dependency ordering: atomics and memory model
Additions to the standard library
C++11 library
So clearly, mixing C++11 atomics and OpenMP 5.0 will result in unspecified behavior. At least the standard itself promises that "future versions of the OpenMP specification are expected to address [these] features".
Old discussion:
Interestingly, the OpenMP 4.5 standard (2.13.6) has a rather vague reference to C++11 atomics, or more specifically to std::memory_order:
The intent is that, when the analogous operation exists in C++11 or C11, a sequentially consistent atomic construct has the same semantics as a memory_order_seq_cst atomic operation in C++11/C11. Similarly, a non-sequentially consistent atomic construct has the same semantics as a memory_order_relaxed atomic operation in C++11/C11.
Unfortunately this is only a note, there is nothing that defines that they are playing nicely together. In particular, even the latest OpenMP 5.0 preview still refers to C++98 as the only normative reference for C++. So technically, OpenMP doesn't even support C++11 itself.
That aside, it will probably work most of the time in practice. I would agree that std::atomic has less potential for trouble when used together with OpenMP than C++11 threading does. But if there is any trouble, it may not be as obvious. The worst case would be an atomic that doesn't operate atomically, even though I have serious trouble imagining a realistic scenario where this might happen. At the end of the day, it may not be worth it, and the safest thing is to stick with pure OpenMP or pure C++11 threads/atomics.
Maybe Hristo has something to say about this; in the meantime, check out this answer for a more general discussion. While a bit dated, I'm afraid it still holds.
This is currently unspecified by OpenMP 4.5. In practice, you can use C++11 atomic operations with OpenMP threads in most compilers, but there is no formal guarantee that it will work.
Because of the unspecified behavior, GCC did not support C11 atomics (which are nearly identical in semantics to C++11 atomics) and OpenMP threads until recently. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65467 for details.
OpenMP 5.0 made an attempt to address this. The normative language references were updated to C11 and C++11. However, the atomics and memory model from these are "not supported", which means their behavior is implementation-defined. I wish OpenMP 5.0 said more, but it is extremely difficult to define the interaction of OpenMP and ISO language atomics.

ReentrantLock in OpenMP

I recently heard about ReentrantLock, which is available in Java. But I am trying to implement parallel data structures like priority queues using OpenMP and C++.
I am curious whether a similar equivalent exists in OpenMP and C++, or whether it can be implemented using pthreads. If such an equivalent exists, please explain how to use it.
See the description of omp_nest_lock on page 270 (PDF page 279) in the OpenMP 4.5 standard.
A meta-question is "Why are you doing this?"
Why aren't you simply using something like TBB's Concurrent Priority Queue?
Do you need to be using OpenMP for other reasons?
Is this for your own education?
If not, then TBB might be a simpler approach (it is now Apache Licensed).
(FWIW I work for Intel, who wrote TBB, but I work on OpenMP, not TBB :-))

Can C++ attributes be used to replace OpenMP pragmas?

C++ attributes provide a convenient and standardized way to markup code with extra information to give to the compiler and/or other tools.
Using OpenMP involves adding a lot of #pragma omp... lines into the source (such as to mark a loop for parallel processing). These #pragma lines seem to be excellent candidates for a facility such as generalized attributes.
For example, #pragma omp parallel for might become [[omp::parallel(for)]].
The often inaccurate cppreference.com uses such an attribute as an example here, which confirms it has at least been considered (by someone).
Is there a mapping of OpenMP pragmas to C++ attributes currently available and supported by any/all of the major compilers? If not, are there any plans underway to create one?
This is definitely a possibility and it's even something the OpenMP language committee is looking at. Take a look at OpenMP Technical Report 8 (https://www.openmp.org/wp-content/uploads/openmp-TR8.pdf) page 36, where a syntax for using OpenMP via attributes is proposed. Inclusion in TR8 doesn't guarantee its inclusion in version 5.1, but it shows that it's being discussed. This syntax is largely based on the work done in the original proposal for C++ attributes.
If you have specific feedback on this, I'd encourage you to provide feedback on this via the OpenMP forum (http://forum.openmp.org/forum/viewforum.php?f=26).