difference between !dir$ and !$omp - fortran

While trying to offload part of my code to a MIC card, I encountered
some directives like !dir$ offload_transfer. In my own code
I was using !$omp target map directives. It seems that !dir$ directives
can do a similar job to OpenMP directives, for instance !dir$ simd
and !$omp simd. My questions:
1) What is the generic name of these directives (if there is one at all)?
Knowing this, I could search for information on the internet.
2) Do you know a site that lists all available !dir$ directives?
3) Can these directives do the same job for offloading to MICs as the
OpenMP (>4.0) ones?
4) Can I combine !dir$ directives with OpenMP directives?
Thanks.

Related

Manual SIMD vectorialization in Fortran

The question is simple but I still cannot find an answer:
How can I use SIMD Intrinsics in a Fortran code?
I don't mean using !$omp directives, as in this example post from Intel. From the same source, I understand that Fortran does not allow SIMD calls, at least with Intel's Fortran compiler, but that post is from 2006, so the information is quite old.
What I mean is to explicitly call SIMD functions just like I do in C and C++. For instance given:
__m128i a;
a = _mm_lddqu_si128 ((__m128i*)(ptr)); // with ptr defined previously
how can one do the same in Fortran?
I know I can write a wrapper in C and call it from Fortran; I will do this if there is no way of using just Fortran.

Should OpenACC pragmas or runtime routines be preferred?

OpenACC has some pragmas and runtime routines, which can be used to basically achieve the same thing.
For example, there is #pragma acc wait and acc_wait() or #pragma acc update [...] and acc_update_[...]().
I started to mostly use the runtime routines in my C++ code.
Is there a difference? Should I prefer one over the other or is it just a matter of style and personal preference?
In general, the pragmas are preferred since they will be ignored by other compilers and when compiling without OpenACC enabled. The runtime API calls would need to be guarded by a macro, like "#ifdef _OPENACC", to maintain portability.
That said, if you don't mind adding the macro guards or losing portability, then it's mostly a matter of style. Functionally, they are equivalent.
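To illustrate the portability point, here is a minimal C++ sketch. The function names sum_with_pragma and wait_for_device are made up for this example; acc_wait_all() and the _OPENACC macro come from the OpenACC specification. The directive version compiles unchanged without OpenACC, while the runtime call needs the guard:

```cpp
#include <cstddef>
#include <vector>

#ifdef _OPENACC
#include <openacc.h>  // only present when OpenACC is enabled
#endif

// Directive form: a compiler without OpenACC support ignores the pragma
// and runs the loop serially on the host.
double sum_with_pragma(const std::vector<double>& v) {
    const double* p = v.data();
    const std::size_t n = v.size();
    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum) copyin(p[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        sum += p[i];
    }
    return sum;
}

// Runtime-routine form: without the guard, a non-OpenACC build would fail
// because acc_wait_all() is not declared.
void wait_for_device() {
#ifdef _OPENACC
    acc_wait_all();
#endif
}
```

Without OpenACC enabled, sum_with_pragma simply runs on the host and wait_for_device does nothing.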

Disabling OpenMP pragma statements everywhere in my c++ project

In my c++ project, there are several #pragma omp parallel for private(i) statements. When I try to track down bugs in my code using valgrind, the OpenMP adornments result in "possibly lost" memory leak messages. I would like to totally disable all of the aforementioned #pragma statements so that I can isolate the problem.
However, I use omp_get_wtime() in my code, and I do not wish to disable these function calls. So I don't want to totally disable all OpenMP functionality in my project.
How can I simply turn off all the #pragma omp parallel for private(i) statements?
I use Eclipse CDT to automatically manage makefiles, and so I normally compile in release mode by: make all -C release. Ideally, I would like a solution to my problem that permits me to compile using a statement such as make all -C release -TURN_OFF_PARALLEL which would result in all the aforementioned #pragma statements being turned off.
The simplest solution is to:
1) disable OpenMP
2) link the OpenMP stub library functions
In case your OpenMP implementation doesn't provide stub functions, you can create your own by copying them from Appendix B of the OpenMP standard.
After some digging around an interesting question about non-working OpenMP code, it turns out that it is perfectly possible to get the equivalent of a stub OpenMP library with GCC simply by replacing -fopenmp with -lgomp. I doubt it was an intended feature, but it works out of the box nonetheless.
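If touching the sources is acceptable, another possible route (not taken from the answers here) is to funnel the directives through a macro that a build flag can blank out. In this sketch, OMP_PRAGMA and dot are hypothetical names, and TURN_OFF_PARALLEL is borrowed from the question as a preprocessor symbol rather than a make option; _Pragma is the standard operator form of #pragma:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical switch: build with -DTURN_OFF_PARALLEL to drop the directives.
#ifdef TURN_OFF_PARALLEL
#define OMP_PRAGMA(directive)                      // expands to nothing
#else
#define OMP_PRAGMA(directive) _Pragma(#directive)  // same as "#pragma directive"
#endif

// Example loop; it gives the same result with or without the directive.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    OMP_PRAGMA(omp parallel for reduction(+:sum))
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Building with -DTURN_OFF_PARALLEL while keeping -fopenmp removes the parallel regions but leaves omp_get_wtime() available.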
For GCC I don't see an option to use only the stubs. Appendix B of the OpenMP standard says:
double omp_get_wtime(void)
{
    /* This function does not provide a working
     * wallclock timer. Replace it with a version
     * customized for the target machine.
     */
    return 0.0;
}
That's useless if you actually want the time. With GCC, you either have to write your own time function or search for "#pragma omp" and replace it with "//#pragma omp".
Rather than changing the whole code base you could implement your own time function for GCC only.
See also: Computing time in Linux: granularity and precision
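A replacement time function along those lines might look like the following sketch. The name wall_time is hypothetical; it forwards to omp_get_wtime() when OpenMP is enabled and otherwise falls back to std::chrono::steady_clock, so the timing code keeps working when the project is built without -fopenmp:

```cpp
#include <chrono>

#ifdef _OPENMP
#include <omp.h>
#endif

// Returns wall-clock seconds measured from an arbitrary fixed starting point.
// Only differences between two calls are meaningful.
double wall_time() {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    // Reference point captured on the first call.
    static const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
#endif
}
```

Calls to omp_get_wtime() in the project would then be replaced by wall_time(), which is a one-time search and replace.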

What is the best way to use openmp with multiple subroutines in Fortran

I have a program written in Fortran with more than 100 subroutines, of which around 30 contain OpenMP code. I was wondering what the best procedure is for compiling these subroutines. When I compiled all the files at once, I found that the OpenMP build runs even slower than the one without OpenMP. Should I compile the subroutines containing OpenMP directives separately? What is the best practice under these conditions?
Thank you so much.
Best Regards,
Jdbaba
OpenMP-aware compilers look for the OpenMP directives (the $omp marker that follows a comment symbol at the beginning of the line, as in !$omp). Therefore, sources without OpenMP code compiled with an OpenMP-aware compiler should result in identical or nearly identical object files (and executable).
Edit: One should note that as stated by Hristo Iliev below, enabling OpenMP could affect the serial code, for example by using OpenMP versions of libraries that may differ in algorithm (to be more effective in parallel) and optimizations.
Most likely, the problem here is more related to your code algorithms.
Or perhaps you did not compile with the same optimization flags when comparing OpenMP and non-OpenMP versions.

Intermediate Code as a result of OpenMP pragmas

Is there a way to get my hands on the intermediate source code produced by the OpenMP pragmas?
I would like to see how each kind of pragma is translated.
Cheers.
OpenMP pragmas are part of a C/C++ compiler's implementation. Therefore, before using them, you need to ensure that your compiler supports them. If they are not supported, they are ignored, so you may get no errors at compilation, but multithreading won't work. In any case, as mentioned above, since they are part of the compiler's implementation, the best intermediate result that you can get is lower-level code. OpenMP is a language extension plus libraries, macros, etc., as opposed to Pthreads, which gives you purely libraries.
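To give a rough idea of what such lower-level code does, here is a hand-written approximation of a "#pragma omp parallel for" loop using std::thread. This is not actual compiler output: the names outlined_body and scale_in_parallel are invented, and a real implementation calls into its own runtime library instead. It only shows the general pattern of outlining the loop body into a function and handing each thread a slice of the iteration space:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Outlined loop body, standing in for what a compiler might generate for
//   #pragma omp parallel for
//   for (i = 0; i < n; ++i) a[i] = 2.0 * a[i];
// Each thread works on its own [begin, end) slice (a static schedule).
static void outlined_body(std::vector<double>& a, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        a[i] = 2.0 * a[i];
    }
}

// Stand-in for the runtime: fork a team of threads, run the outlined
// function on each, then join (the implicit barrier at the end of the loop).
void scale_in_parallel(std::vector<double>& a, unsigned num_threads) {
    assert(num_threads > 0);
    const std::size_t n = a.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;  // ceil(n / threads)
    std::vector<std::thread> team;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        team.emplace_back(outlined_body, std::ref(a), begin, end);
    }
    for (std::thread& th : team) {
        th.join();
    }
}
```

The slices do not overlap, so the threads never write to the same element and no locking is needed.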