In my c++ project, there are several #pragma omp parallel for private(i) statements. When I try to track down bugs in my code using valgrind, the OpenMP adornments result in "possibly lost" memory leak messages. I would like to totally disable all of the aforementioned #pragma statements so that I can isolate the problem.
However, I use omp_get_wtime() in my code, and I do not wish to disable these function calls. So I don't want to totally disable all OpenMP functionality in my project.
How can I simply turn off all the #pragma omp parallel for private(i) statements?
I use Eclipse CDT to automatically manage makefiles, and so I normally compile in release mode by: make all -C release. Ideally, I would like a solution to my problem that permits me to compile using a statement such as make all -C release -TURN_OFF_PARALLEL which would result in all the aforementioned #pragma statements being turned off.
The simplest solution is to:
disable OpenMP
link the OpenMP stub library functions
In case your OpenMP implementation doesn't provide stub functions, you can create your own by copying them from Appendix B of the standard.
Following some digging around an interesting question about non-working OpenMP code, it turns out that it is perfectly possible to get the equivalent of a stub OpenMP library with GCC by simply replacing -fopenmp with -lgomp. I doubt it is an intended feature, but it works out of the box nonetheless.
For GCC I don't see an option to use only the stubs. Appendix B of the OpenMP standard says
double omp_get_wtime(void)
{
    /* This function does not provide a working
     * wallclock timer. Replace it with a version
     * customized for the target machine.
     */
    return 0.0;
}
That's useless if you actually want the time. With GCC, you either have to write your own timing function or search for "#pragma omp" and replace it with "//#pragma omp".
Rather than changing the whole code base you could implement your own time function for GCC only.
Computing time in Linux: granularity and precision
OpenACC has some pragmas and runtime routines, which can be used to basically achieve the same thing.
For example, there is #pragma acc wait and acc_wait() or #pragma acc update [...] and acc_update_[...]().
I started to mostly use the runtime routines in my C++ code.
Is there a difference? Should I prefer one over the other or is it just a matter of style and personal preference?
In general, the pragmas are preferred, since they will be ignored by other compilers and when compiling without OpenACC enabled. The runtime API calls need to be guarded by a macro such as "#ifdef _OPENACC" to maintain portability.
Though if you don't mind adding the macro guards, or losing portability, then it's mostly a matter of style. Functionally, they are equivalent.
I want to write my own parallel code or at least try whether manually parallelizing some of my code is faster than having Eigen use its own internal parallel routines.
I have been following this guide and added at the top of a header file the following directive (but also tried it at the top of main):
#define EIGEN_DONT_PARALLELIZE
Yet, when I ask Eigen to print the number of threads it is using, via Eigen::nbThreads(), I consistently get two. I have tried to force the issue with the initParallel() method, which is designed for user-defined parallel regions, but to no avail. Could it be that I need to place my preprocessor token somewhere else? I am using gcc 8.1 and CLion with CMake. I have also tried to force the issue with setNbThreads(0). To eventually include OpenMP in my own code, I have followed the inclusion of OpenMP as recommended here, and added this to my CMakeLists.txt: target_link_libraries(OpenMP::OpenMP_CXX).
Or could it be that Eigen just tells me how many cores are available in principle? That doesn't sound like what is written in the documentation.
Edit
I am not sure if this is important, but CLion (the editor) complains that the macro EIGEN_DONT_PARALLELIZE is never used. I looked in Eigen/Core and saw that it is used only as a condition in an if statement, so I ignored this editor warning, but maybe I should not have?
I have now reproduced this behaviour with a much smaller example.
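One thing worth checking (an assumption on my part, but consistent with how Eigen/Core tests the macro): EIGEN_DONT_PARALLELIZE only has an effect if it is defined before the first Eigen header is parsed, in every translation unit. Defining it once on the compiler command line via CMake sidesteps any include-ordering problems; the target name myapp below is a placeholder:

```cmake
# Placeholder target name "myapp" -- substitute your own.
# Putting the macro on the command line guarantees it is visible
# before any #include <Eigen/...> in every source file.
target_compile_definitions(myapp PRIVATE EIGEN_DONT_PARALLELIZE)
target_link_libraries(myapp PRIVATE OpenMP::OpenMP_CXX)
```

With this in place, Eigen::nbThreads() should report 1 even though OpenMP is still linked for your own parallel regions.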
I want to know the behaviour of the VC++ compiler with /openmp. I'm using a third party library (OpenMVG) that comes with the cmakefilelist. So I generated the Visual Studio solution to compile it.
CMake recognizes the OpenMP capability of the compiler, and everything compiles fine in VS.
But when it comes to execution, I get different results every time I run the program. And if I run two instances of the program at the same time, the results are even worse.
So I looked a little inside the source code and found that OpenMP is used with list and map iterators:
#pragma omp parallel
for (Views::const_iterator iter = sfm_data.GetViews().begin(); iter != sfm_data.GetViews().end() && bContinue; ++iter)
{
    #pragma omp single nowait
    {
        ... process ...
    }
}
I searched the web, and it seems that Visual Studio only supports OpenMP 2.0. So does it support list iterators? Could this be the problem? How does OpenMP 2.0 behave with list iterators?
Thanks in advance for any answer.
The code doesn't do what you probably think it does. It creates a set of threads, each of which executes that same loop.
Note that OpenMP in Visual Studio doesn't really support C++; it treats the language as a C dialect. In particular, /openmp doesn't support iterator-based loops, since those are C++-only. It only supports (some) C-style for loops.
Also note that OpenMP is an old standard, predating even C++98. Since C++11, C++ has had native threading capabilities.
I have a program written in Fortran with more than 100 subroutines, of which around 30 contain OpenMP code. I was wondering what the best procedure to compile these subroutines is. When I compiled all the files at once, I found that the OpenMP-compiled code runs even slower than the version without OpenMP. Should I compile the subroutines containing OpenMP directives separately? What is the best practice under these conditions?
Thank you so much.
Best Regards,
Jdbaba
OpenMP-aware compilers look for the OpenMP sentinels (in Fortran, `!$omp` right after the comment symbol at the beginning of the line). Therefore, sources without OpenMP constructs compiled with an OpenMP-aware compiler should result in the same, or very similar, object files (and executable).
Edit: one should note that, as stated by Hristo Iliev below, enabling OpenMP could affect the serial code, for example by using OpenMP versions of libraries that may differ in algorithm (to be more effective in parallel) and in optimizations.
Most likely, the problem here is more related to your code algorithms.
Or perhaps you did not compile with the same optimization flags when comparing OpenMP and non-OpenMP versions.
Is there a way to get my hands on the intermediate source code produced by the OpenMP pragmas?
I would like to see how each kind of pragmas is translated.
Cheers.
OpenMP pragmas are part of a C/C++ compiler's implementation. Therefore, before using them, you need to ensure that your compiler supports the pragmas! If they are not supported, they are ignored, so you may get no errors at compilation, but multithreading won't work. In any case, since they are part of the compiler's implementation, the best intermediate result you can get is lower-level code. OpenMP is a language extension plus libraries, macros, etc., as opposed to Pthreads, which arms you purely with libraries!