Are local variables in procedures automatically private when using OpenMP? - fortran

I am relatively new to using OpenMP with Fortran 90. I know that local variables in called subroutines are automatically private when using a parallel do loop. Is the same true for functions that are called from the parallel do loop? Are there any differences between external functions and functions defined in the main program?
I would assume that external functions behave the same as subroutines, but I am specifically curious about functions in main program. Thanks!

The local variables of a procedure (function or subroutine) called from an OpenMP parallel region are private to each thread if the procedure is recursive or an equivalent compiler option is enabled (most compilers turn it on automatically when OpenMP is enabled), provided the variable does not have the save attribute.
If it has the save attribute (explicit, or implicit from an initialization), it is shared between all invocations. It doesn't matter whether you call the procedure from a worksharing construct (omp do, omp sections, ...) or directly from the omp parallel region.
It also doesn't matter whether the procedure is external, a module procedure or internal (which you confusingly call "in the main program").
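The question is about Fortran, but the same rule can be sketched in C, the language of the other snippets in this thread. This is only an analogy, assuming that a C static local plays the role of a Fortran save variable and an automatic local plays the role of an ordinary local. The names square_counted and run_demo are made up for illustration.

```c
/* Hypothetical helper, not from the original answer. */
int square_counted(int x, int *calls_seen)
{
    static int calls = 0;  /* one instance for all threads (like save) */
    int result = x * x;    /* automatic: a separate instance per call */
    int seen;

    /* The shared counter needs synchronization; the local does not. */
    #pragma omp atomic capture
    seen = ++calls;

    *calls_seen = seen;
    return result;
}

/* Calls the helper from a parallel loop; returns the highest call count
   observed and stores the sum of squares in *sum_out. */
int run_demo(int n, long *sum_out)
{
    long sum = 0;
    int max_seen = 0;

    #pragma omp parallel for reduction(+:sum) reduction(max:max_seen)
    for (int i = 0; i < n; i++) {
        int seen;                          /* private to each iteration */
        sum += square_counted(i, &seen);
        if (seen > max_seen)
            max_seen = seen;
    }

    *sum_out = sum;
    return max_seen;
}
```

On its first invocation, run_demo(n, &s) reports n calls, because every thread increments the same static counter, while the per-call result values never interfere with each other.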

Related

Nested parallel regions in OpenMP

What does it mean in OpenMP that
Nested parallel regions are serialized by default
Does it mean threads do it continuously? I also cannot understand this part:
A throw executed inside a parallel region must cause execution to resume within
the dynamic extent of the same structured block, and it must be caught by the
same thread that threw the exception.
As explained here (scroll down to "17.1 Nested parallelism"), by default a nested parallel region is not parallelized and thus runs sequentially. Nested thread creation can be enabled with either OMP_NESTED=true (as an environment variable) or omp_set_nested(1) (in your code).
EDIT: also see this answer to a similar question.
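A minimal sketch of what "serialized" means in practice: the inner region still executes, but with a team of one thread unless nesting is enabled. The helper name inner_team_size is hypothetical. Note that OpenMP 5.0 deprecates omp_set_nested and OMP_NESTED in favor of omp_set_max_active_levels, so newer compilers may print a deprecation warning for this code.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: reports the team size seen by the inner region. */
int inner_team_size(int enable_nesting)
{
    int inner = 1;   /* fallback when compiled without OpenMP */

#ifdef _OPENMP
    /* Same switch as the OMP_NESTED environment variable. */
    omp_set_nested(enable_nesting);
#else
    (void)enable_nesting;
#endif

    #pragma omp parallel num_threads(2)
    {
        #pragma omp parallel num_threads(2)
        {
            #pragma omp single
            {
#ifdef _OPENMP
                int n = omp_get_num_threads();
                #pragma omp atomic write
                inner = n;
#endif
            }
        }
    }
    return inner;
}
```

With nesting disabled, the function returns 1. With nesting enabled, the runtime may give the inner region up to 2 threads, but it is allowed to deliver fewer.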

Usage of openMP Shared clause in C++

According to this
All variables defined outside a parallel construct become shared when the parallel region is encountered.
I am wondering what the use of the OpenMP shared clause would be when developing in C++.
Even if variables are shared by default, the default can be changed by the default() clause. When you have default(none) or default(private) you have to declare shared variables explicitly.
There are many uses for shared variables.
A large array is typically shared, with different threads operating on different parts of it.
Or a configuration parameter that you only read and never modify can be shared.
Or a global variable holding some state or a flag, even if you change it under some condition. You would keep it shared and change it in a critical or single section.
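As a sketch of the first two cases, the following assumes a hypothetical helper scale_array that uses default(none), so every variable used in the region has to be listed explicitly. The code is plain C and should also compile as C++.

```c
/* Hypothetical helper: scales every element of a shared array in place. */
void scale_array(double *data, int n, double factor)
{
    /* default(none) disables implicit sharing, so data, n and factor
       must be named in shared(). The loop index i is private because
       it is declared inside the loop. */
    #pragma omp parallel for default(none) shared(data, n, factor)
    for (int i = 0; i < n; i++) {
        data[i] *= factor;   /* each thread updates different elements */
    }
}
```

Here data is the large shared array that the threads split between them, and factor is the read-only configuration value.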

The behavior of omp critical with nested levels of parallelism

Considering the following scenario:
Function A creates a layer of OMP parallel region, and each OMP thread makes a call to a function B, which itself contains another layer of OMP parallel region.
Then, if there is an OMP critical region within the parallel region of function B, is that region critical "globally" with respect to all threads created by functions A and B, or only locally to function B?
And what if B is a pre-built function (e.g. from a statically or dynamically linked library)?
Critical regions in OpenMP have global binding and their scope extends to all occurrences of the critical construct that have the same name (in that respect all unnamed constructs share the same special internal name), no matter where they occur in the code. You can read about the binding of each construct in the corresponding Binding section of the OpenMP specification. For the critical construct you have:
The binding thread set for a critical region is all threads. Region execution is restricted to a single thread at a time among all the threads in the program, without regard to the team(s) to which the threads belong.
(HI: emphasis mine)
That's why it is strongly recommended that named critical regions should be used, especially if the sets of protected resources are disjoint, e.g.:
// This one located inside a parallel region in fun1
#pragma omp critical(fun1)
{
// Modify shared variables a and b
}
...
// This one located inside a parallel region in fun2
#pragma omp critical(fun2)
{
// Modify shared variables c and d
}
Naming the regions eliminates the chance that two unrelated critical constructs could block each other.
As to the second part of your question, to support the dynamic scoping requirements of the OpenMP specification, critical regions are usually implemented with named mutexes that are resolved at run-time. Therefore it is possible to have homonymous critical regions in a prebuilt library function and in your code, and it will work as expected as long as both codes are using the same OpenMP runtime, e.g. both were built using the same compiler suite. Cross-suite OpenMP compatibility is usually not guaranteed. Also, if there is an unnamed critical region in B(), it will interfere with all unnamed critical regions in the rest of the code, no matter whether they are part of the same library code or belong to the user code.

omp_set_max_active_levels() and function call

Does anyone know the scope of omp_set_max_active_levels()? Assume function A has an omp parallel region, and within that region each thread of A makes a call to library function B, which contains 2 levels of omp parallelism.
Then, if we set the active omp level in function A to 3 (1 in A and 2 in B), does that ensure that library function B's parallel regions work properly?
If omp_set_max_active_levels() is called from within an active parallel region, the effect of the call is implementation defined, so it may well be ignored.
According to the OpenMP 4.0 standard (section 3.2.15):
When called from a sequential part of the program, the binding
thread set for an omp_set_max_active_levels region is the encountering
thread. When called from within any explicit parallel region, the
binding thread set (and binding region, if required) for the
omp_set_max_active_levels region is implementation defined.
and later on:
This routine has the described effect only when called from a
sequential part of the program. When called from within an explicit
parallel region, the effect of this routine is implementation defined.
Therefore, if you set the maximum number of nested parallel regions in the sequential part of your program, you can be confident that everything will work as expected on any compliant OpenMP implementation.
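A rough sketch of the scenario from the question, with the call placed in the sequential part as recommended. The names function_a and function_b are hypothetical stand-ins for the asker's A and library function B, and the thread counts are arbitrary. On runtimes older than OpenMP 5.0, nested parallelism may additionally have to be switched on (OMP_NESTED or omp_set_nested) before the extra levels become active.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for library function B: two nested levels. */
static void function_b(int *leaf_runs)
{
    #pragma omp parallel num_threads(2)
    {
        #pragma omp parallel num_threads(2)
        {
            /* Count how many threads reach the innermost level. */
            #pragma omp atomic
            *leaf_runs += 1;
        }
    }
}

/* Hypothetical function A: sets the limit sequentially, then opens level 1. */
int function_a(void)
{
    int leaf_runs = 0;

#ifdef _OPENMP
    /* Called outside any parallel region: 1 level in A plus 2 in B. */
    omp_set_max_active_levels(3);
#endif

    #pragma omp parallel num_threads(2)
    {
        function_b(&leaf_runs);
    }
    return leaf_runs;
}
```

The return value is the number of innermost executions. It is at most 8 when all three levels are active with two threads each, and smaller when the runtime serializes a level or hands out fewer threads.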

Pragma omp parallel sections

Can I use pragma omp parallel sections to run two concurrent parts of my code that call the same function through its address?
In this case, does the function being called have variables that are common to both threads, so that no speedup happens?
Can I …?
Yes.
In this case, does the function being called have variables that are common to both threads, so that no speedup happens?
Hmm? Local variables in that function are local to the thread. Whether you call it via its address or directly is irrelevant. You get problems only if the function modifies global state.
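A small sketch of that situation, assuming a hypothetical worker sum_up_to that keeps all of its state in automatic variables and is called through a function pointer from two sections:

```c
/* Hypothetical worker: all of its state lives in automatic variables. */
static long sum_up_to(int n)
{
    long total = 0;                  /* one instance per call */
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Runs the same function, through its address, in two sections. */
void run_two_sections(long *first, long *second)
{
    long (*worker)(int) = sum_up_to; /* call via function pointer */
    long r1 = 0, r2 = 0;

    #pragma omp parallel sections
    {
        #pragma omp section
        r1 = worker(1000);

        #pragma omp section
        r2 = worker(2000);
    }

    *first = r1;
    *second = r2;
}
```

Each section writes only its own result variable, so the two calls cannot disturb each other. If sum_up_to kept total in a static or global variable instead, the two sections would race on it.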