Compatibility of C++11 and MPI library - c++

After installing gcc and the MPICH library on my Linux system, I can compile my code with the mpicxx compiler wrapper. Is it possible to use C++11 with the MPI library just by upgrading the gcc compiler?

Changing the compiler to a newer version should work in general, unless some strong code generation changes are observed (e.g. different data alignment or different ABIs). MPI is a library and as such it doesn't care which language constructs you use, as long as those constructs don't mess with its internals. Since you are going to use C++11 for the threading it provides, there are some things that you should be aware of.
First, multithreading doesn't always play nicely with MPI. Most MPI implementations are internally threaded themselves, but are not thread-safe by default.
Second, MPI defines four levels of threading support:
MPI_THREAD_SINGLE: no threading support - MPI would function safely only when used by a single-threaded application;
MPI_THREAD_FUNNELED: partial threading support - MPI can be used in a multithreaded application but only the main thread may call into MPI;
MPI_THREAD_SERIALIZED: partial threading support - MPI can be used in a multithreaded application but no concurrent calls in different threads are allowed. That is, each thread can call into MPI but a serialisation mechanism has to be in place;
MPI_THREAD_MULTIPLE: full threading support - MPI can be called freely from many threads.
Truth is, most MPI implementations support at most MPI_THREAD_FUNNELED out of the box, with many of them supporting only MPI_THREAD_SINGLE. Open MPI, for example, has to be compiled with a non-default option in order to provide full threading support.
Multithreaded applications should initialise the MPI library using MPI_Init_thread() instead of MPI_Init(), and the thread that makes the initialisation call becomes the main thread - the very same main thread that alone is allowed to call into MPI when the supported level is MPI_THREAD_FUNNELED. One gives MPI_Init_thread() the desired level of threading support and the function returns the actually supported level, which might be lower than desired. In the latter case, correct and portable programs are supposed to act accordingly and either switch to non-threaded operation or abort with an appropriate error message to the user.
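For illustration, here is a minimal sketch of such an initialisation (the abort is just one possible way to "act accordingly"):

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    int provided;
    // Request full threading support; MPI reports what it actually provides.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        // The implementation offers less than we asked for - abort with a message.
        std::fprintf(stderr, "Error: MPI does not provide the required threading level\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    // ... multithreaded code that calls into MPI ...

    MPI_Finalize();
    return 0;
}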
More information about how MPI works together with threads can be found in the MPI Standard v2.2.

No problem as far as I can think of, since you shouldn't be able to interfere with MPI's internals in any way, and other than that, MPI and C++11 concerns are orthogonal.
By the way, issuing mpic++ or mpicxx on my machine (gcc 4.6.3, MPICH2 1.4.1) simply translates into
c++ -Wl,-Bsymbolic-functions -Wl,-z,relro -I/usr/include/mpich2 -L/usr/lib -lmpichcxx -lmpich -lopa -lmpl -lrt -lcr -lpthread
You can check that on your own machine with mpic++ -show.

It is no problem to combine C++11 with MPI.
mpic++ and mpicxx are only wrappers and use either the standard compiler or a user-specified compiler. So you can make mpic++ and mpicxx use a compiler which is compatible with C++11.
I do not know the exact mechanism for MPICH. For Open MPI you need to set these environment variables:
export OMPI_CC='gcc-mp-4.7'
export OMPI_CXX='g++-mp-4.7'
In my case I use Open MPI 1.5.5 with the gcc 4.7 compiler from MacPorts.

Related

Compiling a Fortran 2003 program with MVAPICH2

Can you use MVAPICH2 to compile a Fortran 2003 program?
MVAPICH2 states that a) it provides its own compilers and b) it provides the mpif77 and mpif90 wrappers (which both point to e.g. /opt/mvapich2-2.3.1/bin/mpifort). I can't find any docs which help with this.
MPI implementations (MVAPICH is one of many) only provide wrappers around other compilers. They do not provide any compilers of their own. You can compile whichever Fortran standard your compiler supports.
The build of MVAPICH you download somewhere may already be compiled for use with a specific compiler, but that does not mean MVAPICH provides that compiler. Similarly, if you buy a compiler suite, it can come with an MPI library (like MVAPICH) pre-compiled.
It is customary to call the mpif90 wrapper to compile any modern Fortran but often the difference from mpif77 is very small, if any at all. Some compilers also provide mpifort or some other wrapper name, which does not explicitly contain any Fortran standard version.
Most modern compilers you will find support most, if not all, features of Fortran 2003. It depends on the exact version you have.

Multithreaded MKL + OpenMP compiled with GCC

My understanding, from reading the Intel MKL documentation and posts such as this--
Calling multithreaded MKL in from openmp parallel region --
is that combining OpenMP parallelization in your own code AND MKL's internal OpenMP threading for MKL functions such as DGESVD or DPOTRF is impossible unless building with the Intel compiler. For example, I have a large linear system I'd like to solve using MKL, but I'd also like to take advantage of parallelization to build the system matrix (my own code, independent of MKL), in the same binary executable.
Intel states in the MKL documentation that 3rd party compilers "may have to disable multithreading" for MKL functions. So the options are:
openmp parallelization of your own code (standard #pragma omp ... etc) and single-thread calls to MKL
multi-thread calls to MKL functions ONLY, and single-threaded code everywhere else
use the Intel compiler (I would like to use gcc, so not an option for me)
parallelize both your code and MKL with Intel TBB? (not sure if this would work)
Of course, MKL ships with its own OpenMP runtime, libiomp*, which gcc can link against. Is it possible to use this library to achieve parallelization of your own code in addition to the MKL functions? I assume some direct management of threads would be involved. However, as far as I can tell there are no iomp development headers included with MKL, which may answer that question (--> NO).
So it seems at this point like the only answer is Intel TBB (Thread Building Blocks). Just wondering if I'm missing something or if there's a clever workaround.
(Edit:) Another solution might be if MKL has an interface to accept custom C++11 lambda functions or other arbitrary code (e.g., containing nested for loops) for parallelization via whatever internal threading scheme is being used. So far I haven't seen anything like this.
Intel TBB will also enable better nested parallelism, which might help in some cases. If you want to enable GNU OpenMP with MKL, there are the following options:
Dynamically selecting the interface and threading layer: link against the mkl_rt library and then
set the environment variable MKL_THREADING_LAYER=GNU prior to loading MKL,
or call mkl_set_threading_layer(MKL_THREADING_GNU); (a minimal sketch of this approach follows below).
Linking with the threading libraries directly (though the linked page makes no explicit mention of GNU OpenMP): link against mkl_gnu_thread. This is not recommended when you are building a library, a plug-in, or an extension module (e.g. a Python package) which can be mixed with other components that might use MKL differently.
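A minimal sketch of the first approach, assuming you compile with g++ -fopenmp and link against -lmkl_rt (the exact MKL link line depends on your installation):

#include <mkl.h>
#include <omp.h>

int main()
{
    // Select GNU OpenMP (libgomp) as MKL's threading layer at run time;
    // equivalent to setting MKL_THREADING_LAYER=GNU in the environment.
    mkl_set_threading_layer(MKL_THREADING_GNU);

    #pragma omp parallel
    {
        // your own GNU OpenMP parallel code, e.g. building the system matrix
    }

    // ... MKL calls (e.g. DPOTRF via LAPACK) now thread via the same GNU runtime ...
    return 0;
}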

MinGW vs MinGW-W64 vs MSVC (VC++) in cross compiling

Let's put it like this: we are going to create a library that needs to be cross-platform, we choose GCC as the compiler, and it works awesomely on Linux; now we need to compile it on Windows, and we have MinGW to do the work.
MinGW tries to implement a native way to compile C++ on Windows, but it doesn't support some features such as std::mutex and std::thread.
We have MinGW-w64, which is a fork of MinGW that supports those features, and I was wondering which one to use, knowing that GCC is one of the most used C++ compilers. Or is it better to use MSVC (VC++) on Windows and GCC on Linux, and use CMake to handle the compiler differences?
Thanks in advance.
Personally, I prefer a MinGW-based solution that cross compiles on Linux, because there are lots of platform independent libraries that are nearly impossible (or a huge PITA) to build on Windows. (For example, those that use ./configure scripts to set up their build environment.) But cross compiling all those libraries and their dependencies is also annoying even on Linux, if you have to ./configure and make each of them yourself. That's where MXE comes in.
From the comments, you seem to worry about dependencies. They are costly in terms of build environment setup when cross compiling, if you have to cross compile each library individually. But there is MXE. It builds a cross compiler and a large selection of platform independent libraries (like Boost, Qt, and lots of less notable libraries). With MXE, Boost becomes a lot more attractive as a solution. I've used MXE to build a project that depends on Qt, Boost, and libexiv2 with nearly no trouble.
Boost threads with MXE
To do this, first install mxe:
git clone -b master https://github.com/mxe/mxe.git
Then build the packages you want (gcc and boost):
make gcc boost
C++11 threads with MXE
If you would still prefer C++11 threads, then that too is possible with MXE, but it requires a two stage compilation of gcc.
First, check out the master (development) branch of mxe (this is the normal way to install it):
git clone -b master https://github.com/mxe/mxe.git
Then build gcc and winpthreads without modification:
make gcc winpthreads
Now, edit mxe/src/gcc.mk. Find the line that starts with $(PKG)_DEPS := and add winpthreads to the end of the line. And find --enable-threads=win32 and replace it with --enable-threads=posix.
Now, recompile gcc and enjoy your C++11 threads.
make gcc
Note: You have to do this because the default configuration supports Win32 threads using the WINAPI instead of POSIX pthreads. But GCC's libstdc++, the library that implements std::thread and std::mutex, doesn't have code to use WINAPI threads, so the developers added a preprocessor block that strips std::thread and std::mutex from the library when Win32 threads are enabled. By using --enable-threads=posix and the winpthreads library, instead of having GCC try to interface with Win32 in its own libraries (which it doesn't fully support), we let winpthreads act as glue code that presents a normal pthreads interface for GCC to use, implemented on top of the WINAPI functions.
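As a quick sanity check, a program like the following sketch should compile and run with a posix-threads build of the cross compiler, but fail to compile (std::thread and std::mutex not declared) with a win32-threads build:

#include <thread>
#include <mutex>
#include <iostream>

std::mutex io_mutex;

void worker(int id)
{
    // Serialize access to std::cout across threads.
    std::lock_guard<std::mutex> lock(io_mutex);
    std::cout << "hello from thread " << id << "\n";
}

int main()
{
    std::thread t1(worker, 1), t2(worker, 2);
    t1.join();
    t2.join();
    return 0;
}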
Final note
You can speed these compilations up by adding -jm and JOBS=n to the make command. -jm, where m is a number, means build m packages concurrently; JOBS=n, where n is a number, means use n processes to build each package. In effect they multiply, so pick m and n such that m*n is not much more than the number of processor cores you have. E.g. if you have 8 cores, then m=3, n=4 would be about right.
Citations
http://blog.worldofcoding.com/2014_05_01_archive.html#windows
If you want portability, use the standard way: the <thread> library of C++11.
If you can't use C++11, pthreads can be a solution, although VC++ cannot compile it natively.
Don't want to use either of these? Then just write your own abstraction layer for threading. For example, you can write a class Thread, like this:
class Thread
{
public:
    explicit Thread(int (*pf)(void *arg));  // takes the thread entry point
    void run(void *arg);                    // start the thread with its argument
    int join();                             // wait for completion, return exit code
    void detach();                          // let the thread run on its own
    ...
private:
    int (*pf_)(void *);  // stored entry point
    void *handle_;       // opaque platform-specific thread handle
};
Then, write an implementation for each platform you want to support. For example:
+src
|---thread.h
|--+win
|--|---thread.cpp
|--+linux
|--|---thread.cpp
After that, configure your build script to compile win/thread.cpp on Windows, and linux/thread.cpp on Linux.
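For instance, linux/thread.cpp might look roughly like the following sketch, assuming the private members pf_ and handle_ shown in the header above (both are illustrative, not part of any real library):

// linux/thread.cpp - pthread-based implementation sketch
#include <pthread.h>
#include "thread.h"

// Adapter: pthread start routines return void*, but our callback returns int.
struct StartArg { int (*pf)(void *); void *arg; };

static void *trampoline(void *p)
{
    StartArg *s = static_cast<StartArg *>(p);
    int rc = s->pf(s->arg);
    delete s;
    return reinterpret_cast<void *>(static_cast<long>(rc));
}

Thread::Thread(int (*pf)(void *arg)) : pf_(pf), handle_(0) {}

void Thread::run(void *arg)
{
    pthread_t *tid = new pthread_t;
    StartArg *sa = new StartArg;
    sa->pf = pf_;
    sa->arg = arg;
    pthread_create(tid, 0, trampoline, sa);
    handle_ = tid;  // store the handle behind an opaque pointer
}

int Thread::join()
{
    pthread_t *tid = static_cast<pthread_t *>(handle_);
    void *ret = 0;
    pthread_join(*tid, &ret);
    delete tid;
    return static_cast<int>(reinterpret_cast<long>(ret));
}

void Thread::detach()
{
    pthread_t *tid = static_cast<pthread_t *>(handle_);
    pthread_detach(*tid);
    delete tid;
}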
You should definitely use Boost. It's really great and does everything.
Seriously, if you don't need some primitives that Boost.Thread doesn't support (such as std::async), take a look at the Boost library. Of course it's an extra dependency, but if you aren't scared of that, you will enjoy all the advantages of Boost, such as cross-compiling support.
Learn about differences between Boost.Thread and the C++11 threads here.
I think this is a fairly generic list of considerations when you need to choose multi-platform tools or sets of tools; for a lot of these you probably already have an answer:
Tool support: who are you going to get support from if something doesn't work, and how? How strong are the community and the vendor?
Native target support, how well does the tool understand the target platform?
Optimization potential?
Library support (now and in the intermediate future)?
Platform SDK support, if needed?
Build tools (although not directly asked here, do they work on both platforms; most of the popular ones do).
One thing I see that seems not to have really been dealt with is:
What is the target application expecting?
You mention you are building a library, so what application is going to use it and what does that application expect.
The constraint here is that the target application dictates the most fundamental aspect of the system, the very tool used to build it. How is the application going to use the library?
What API and what kind of API is needed by or for that application?
What kind of API do you want to offer (C-style, vs. C++ classes, or a combination)?
What runtime is it using, will it be the same, or will there be conflicts?
Given these, and the possibility that the target application may still be unknown, maintain as much flexibility as possible. In this case, endeavour to maintain compatibility with gcc, mingw-w64 and msvc. They all offer a broad spectrum of C++11 language support (true, some more than others) and are generally supported by other popular libraries (even if those other libraries are not needed right now).
I thought the comment by Hans Passant...
Do what works first
... really does apply here.
Since you mentioned it: the mingw-builds distribution of mingw-w64 supports std::thread etc. with the posix build on Windows, both 64-bit and 32-bit.

Difference between linking OpenMP with -fopenmp and -lgomp

I've been struggling with a weird problem for the last few days. We create some libraries using GCC 4.8 which link some of their dependencies statically, e.g. log4cplus or boost. For these libraries we have created Python bindings using boost-python.
Every time such a library used TLS (like log4cplus does in its static initialization, or libstdc++ does when throwing an exception - not only during the initialization phase) the whole thing crashed with a segfault - and every time the address of the thread-local variable was 0.
I tried everything: recompiling, ensuring -fPIC is used, ensuring -ftls-model=global-dynamic is used, etc. No success. Then today I found out that the reason for these crashes was the way we linked in OpenMP: we did it using "-lgomp" instead of just using "-fopenmp". Since I changed this, everything works fine - no crashes, nothing. Fine!
But I'd really like to know what the cause of the problem was. So what's the difference between these two ways of linking in OpenMP?
We have a CentOS 5 machine here where we have installed GCC 4.8 in /opt/local/gcc48, and we are also sure that the libgomp from /opt/local/gcc48 was used, as well as the libstdc++ from there (verified with LD_DEBUG).
Any ideas? Haven't found anything on Google - or I used the wrong keywords :)
OpenMP is an intermediary between your code and its execution. Each #pragma omp statement is converted to a call to the corresponding OpenMP library function, and that's all there is to it. The multithreaded execution (launching threads, joining and synchronizing them, etc.) is always handled by the operating system (OS). All OpenMP does is handle these low-level OS-dependent threading calls for us, portably, behind a short and sweet interface.
The -fopenmp flag is a high-level one that does more than link in GCC's OpenMP implementation (gomp). The gomp library requires further libraries to access the threading functionality of the OS. On POSIX-compliant OSes, OpenMP is usually based on pthreads, which needs to be linked in. It may also need the realtime extension library (librt) on some OSes, but not on others. When using dynamic linking, everything should be discovered automatically, but when you specify -static, I think you've fallen into the situation described by Jakub Jelinek here. But nowadays, pthreads (and rt if needed) should be linked in automatically when -static is used.
Aside from linking in dependencies, the -fopenmp flag also activates the processing of the OpenMP pragmas. You can see throughout the GCC code (as here and here) that without the -fopenmp flag (which isn't triggered by merely linking the gomp library), the pragmas won't be converted to the appropriate OpenMP function calls. I just tried with some example code, and both -lgomp and -fopenmp produce a working executable that links against the same libraries. The only difference in my simple example is that the -fopenmp version has a symbol that the -lgomp version doesn't have: GOMP_parallel@@GOMP_4.0 (code here), which is the function that initializes the parallel section, performing the forks requested by the #pragma omp parallel in my example code. Thus, the -lgomp version did not translate the pragma into a call to GCC's OpenMP implementation. Both produced a working executable, but only the -fopenmp flag produced a parallel executable in this case.
To wrap up, -fopenmp is needed for GCC to process all the OpenMP pragmas. Without it, your parallel sections won't fork any threads, which could wreak havoc depending on the assumptions under which your inner code was written.
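A tiny experiment along those lines: save a sketch like the following as test.cpp and build it once with g++ test.cpp -fopenmp and once with g++ test.cpp -lgomp to see the difference in behaviour:

#include <omp.h>
#include <cstdio>

int main()
{
    // With -fopenmp this pragma is translated into a GOMP_parallel call and
    // forks a team of threads; with -lgomp alone it is silently ignored and
    // the block runs once, on a single thread.
    #pragma omp parallel
    {
        std::printf("thread %d of %d\n",
                    omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}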

In g++ is C++ 11 thread model using pthreads in the background?

I am just trying my hand at g++ 4.6 and C++11 features.
Every time I compile a simple threading code using the -std=c++0x flag, it either crashes with a segmentation fault or just throws some weird exception.
I read some questions related to C++11 threads and realized that I also need to use the -pthread flag to compile the code properly. Using -pthread worked fine and I was able to run the threaded code.
My question is whether the C++11 multi-threading model uses pthreads in the background,
or whether it is written from scratch.
I don't know if any of the members are gcc contributors but I am just curious.
If you run g++ -v it will give you a bunch of information about how it was configured. One of those things will generally be a line that looks like
Thread model: posix
which means that it was configured to use pthreads for its threading library (std::thread in libstdc++), and that you also need to use any flags that might be required for pthreads on your system (-pthread on Linux).
This has nothing specific to do with the standard; it's just a detail of how the standard is implemented by g++.
C++ doesn't specify how threads are implemented. In practice C++ threads are generally implemented as thin wrappers around pre-existing system thread libraries (like pthreads or Windows threads). There is even a provision to access the underlying thread object via std::thread::native_handle().
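A small sketch of that provision on a "Thread model: posix" g++ on Linux (built with g++ -std=c++0x -pthread; pthread_setname_np is a glibc extension, used here purely for illustration):

#include <thread>
#include <iostream>
#include <pthread.h>

int main()
{
    std::thread t([]{ std::cout << "worker running\n"; });

    // native_handle() exposes the underlying pthread_t, which can be
    // passed straight to the pthreads API.
    pthread_setname_np(t.native_handle(), "worker");

    t.join();
    return 0;
}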
The reason that it crashes is that if you do not specify -pthread or -lpthread, a number of weakly defined pthreads stub functions from libc are linked in. These stub functions are enough to get your program to link without error. However, actually creating a pthread requires the full libpthread library, and when the dynamic linker tries to resolve those missing functions, you get a segmentation violation.