Sharing data between processes when using MPI on Windows - C++

I have done a significant amount of testing with Windows named shared memory between multiple independently run programs, across two MPI hosts. The result has been that processes launched by MPI, even with admin rights, lack the Windows privileges needed to access the Global\ shared memory namespace.
If MPI were to launch the EXEs, would they be considered child processes, and would Windows then allow them access to the shared memory?
One of the processes uses DirectX. It seemed like it would be messy to incorporate DirectX directly into an MPI program, so I have kept them as independent EXEs.
I previously asked about Windows privileges for Intel MPI on Intel's forums, but no solution has been found yet ( https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/635157 ). I am asking here in a more general sense to see if there are other approaches that I haven't been able to find.
Still looking for solutions to:
Gaining Windows privileges for programs run with mpiexec
Running a Direct3D app with mpiexec
I decided to go with WinSockets; it seems promising.

How MPI launches processes is highly implementation- and configuration-dependent. Any assumption about it will result in a poorly portable program. Evidently, you cannot even assume that the processes are launched on the same node.
I see no reason not to incorporate MPI into a DirectX program. One way to keep separate EXEs (although all of them using MPI) is to rely on MPMD (multiple program, multiple data) launching.
See the Open MPI FAQ; this should work similarly for Intel MPI or MPICH:
mpirun -np 2 app.exe : -np 1 dx-app.exe
Then use MPI communicators to separate the different kinds of processes and normal MPI facilities for communication (e.g. MPI-3 RMA, if you don't want messages).
You could even go for inter-communicators with MPI_Comm_spawn / MPI_Comm_connect, but I see little benefit given the way you describe the use case.
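A minimal sketch of the communicator idea, assuming the MPMD launch line above (so that the last world rank is the dx-app.exe process; adjust the color logic to however you actually identify process kinds):

#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank = 0, world_size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // With the MPMD launch above, the last world rank runs dx-app.exe;
    // give it a different color so it ends up in its own communicator.
    int color = (world_rank == world_size - 1) ? 1 : 0;
    MPI_Comm kind_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &kind_comm);

    // Processes of the same kind now talk over kind_comm, while
    // MPI_COMM_WORLD remains available for traffic between the kinds.

    MPI_Comm_free(&kind_comm);
    MPI_Finalize();
    return 0;
}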

Related

Running MPI in a cluster of hosts of different platforms

In my lab, we have several servers used for simulation programs, but they have worked independently. Now I want to combine them into a cluster using MPICH so that they can communicate. The problem is that these servers run different OSs: some are Red Hat and some are Ubuntu. On the MPICH homepage I saw that the downloads for these two operating systems are different, so is it possible to set up a cluster with different operating systems? If so, how?
The reason I don't want to reinstall these servers is that there is too much data on them and they are currently in use as I ask this question.
It is not feasible to get this working properly. You should be able to get the same version of an MPI implementation manually installed on different distributions. They might even talk to each other properly. But as soon as you try to run actual applications with dynamic libraries, you will get into trouble with different versions of shared libraries, glibc, etc. You will be tempted to link everything statically or to build different binaries for the different distributions. At the end of the day, you will just chase one issue after another.
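If you nevertheless want to try, the least fragile route is to build the exact same MPI release from source, with the same install prefix, on every machine. A rough sketch for MPICH (the version number and prefix are placeholder assumptions):

tar xzf mpich-3.2.tar.gz
cd mpich-3.2
./configure --prefix=/opt/mpich
make
make install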
As a side note, combining some servers together with MPI does not make a High Performance Computing cluster. For instance, an HPC system has sophisticated high-performance interconnects and a high-performance parallel file system.
Also note that your typical HPC application is going to run poorly on heterogeneous hardware (i.e., where each node has a different CPU/memory configuration).

Detecting that MPI is not being used when running with mpirun/mpiexec

I am writing a program (in C++11) that can optionally be run in parallel using MPI. The project uses CMake for its configuration, and CMake automatically disables MPI if it cannot be found and displays a warning message about it.
However, I am worried about a perfectly plausible use case whereby a user configures and compiles the program on an HPC cluster, forgets to load the MPI module, and does not notice the warning. That same user might then try to run the program, notice that mpirun is not found, load the MPI module, but forget to recompile. If the user then runs the program with mpirun, this will work, but the program will just run a number of times without any parallelization, as MPI was disabled at compile time. To prevent the user from thinking the program is running in parallel, I would like to make the program display an error message in this case.
My question is: how can I detect that my program is being run in parallel without using MPI library functions (as MPI was disabled at compile time)? mpirun just launches the program a number of times, but as far as I know it does not tell the processes it launches that they are being run in parallel.
I thought about letting the program write some test file, and then check if that file already exists, but apart from the fact that this might be tricky to do due to concurrency problems, there is no guarantee that mpirun will even launch the various processes on nodes that share a file system.
I also considered using an environment variable to communicate between the processes, but as far as I know there is no system-independent way of doing this (and again, this might cause concurrency issues, as there is no way to coordinate system calls between the various processes).
So at the moment, I have run out of ideas, and I would very much appreciate any suggestions that might help me achieve this. Preferred solutions should be operating-system independent, although a UNIX-only solution would already be of great help.
Basically, you want to detect, in your non-MPI code path, whether you are being run by mpirun etc. There is a very similar question, How can my program detect whether it was launched via mpirun, that already presents one non-portable solution.
Check for environment variables that are set by mpirun. See e.g.:
http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables
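A minimal check might look like the following; the variable names are ones set by current Open MPI, MPICH/Hydra, and Slurm launchers, but they vary across implementations and versions, so verify them for your target systems:

#include <cstdlib>
#include <cstdio>

// Returns true if an environment variable typical of MPI launchers is set.
bool probably_launched_by_mpirun() {
    const char* vars[] = {
        "OMPI_COMM_WORLD_SIZE",  // Open MPI
        "PMI_RANK",              // MPICH / Hydra
        "PMI_SIZE",
        "SLURM_PROCID"           // Slurm (srun)
    };
    for (const char* v : vars) {
        if (std::getenv(v) != nullptr) return true;
    }
    return false;
}

int main() {
    if (probably_launched_by_mpirun()) {
        std::fprintf(stderr, "Error: this binary was built without MPI support "
                             "but appears to have been started by an MPI launcher.\n");
        return 1;
    }
    // ... serial code path ...
    return 0;
}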
As another option, you could get the process id of the parent process and its process name, and compare it against a list of known MPI launcher binaries such as orted, slurmstepd, or hydra??¹. Everything about that is, unfortunately, again non-portable.
Since launching itself is not clearly defined by the MPI standard, there cannot be a standard way to detect it.
¹ From memory only; please don't take the list literally.
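A Linux-only sketch of that parent-process idea (it reads the parent's name from /proc; the launcher names are the same from-memory list as above, so treat them as placeholders):

#include <unistd.h>
#include <fstream>
#include <string>

// Read the parent process's executable name from /proc (Linux-specific).
std::string parent_process_name() {
    std::ifstream comm("/proc/" + std::to_string(getppid()) + "/comm");
    std::string name;
    std::getline(comm, name);
    return name;
}

bool parent_looks_like_mpi_launcher() {
    const std::string name = parent_process_name();
    // Placeholder names; check what your MPI installation actually uses.
    return name == "orted" || name == "slurmstepd"
        || name.find("hydra") != std::string::npos;
}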
From a user experience point of view, I would argue that always showing a clear message about how the program is being run, such as:
Running FancySimulator serially. If you see this as part of mpirun, rebuild FancySimulator with FANCYSIM_MPI=True.
or
Running FancySimulator in parallel with 120 MPI processes.
would "solve" the problem. A user getting 120 garbled messages will hopefully notice.

Is there an issue in Unix running one C++ binary multiple times and in parallel?

I have a compiled binary of a server written in C++ on my Unix system. I want to dynamically start servers using a small script. I can specify the port the server will be running on as an argument. My question is whether there is a problem with starting one binary multiple times and letting the instances run in parallel, instead of copying the binary. In my tests it worked as expected, but I want to be sure that there are no problems.
First, you can run multiple instances of the same binary on almost all operating systems; you do not need to copy it. However, there is a deeper issue.
It all depends on how the application is written. In a perfect world you would not have any issue, but the world isn't perfect. An application might use a system-wide resource and assume it has exclusive use of that resource. This is not unheard of for larger applications such as servers. You already mentioned one such resource, the port, which you can change, but are you sure that is the only one? If so, then you can run multiple instances without an issue. However, there may be other resources (files, for example) over which the application assumes exclusive use; if you run multiple copies, that assumption is broken, and the application will most likely not behave as expected.
Most operating systems allow you to run as many instances of the same program as you wish. It is the program's responsibility to enforce any limit on the number of instances, if such a limit exists.
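To illustrate that last point, here is a minimal sketch of how a server might enforce a single-instance limit itself, using an advisory flock() lock on a lock file (the path is an illustrative assumption):

#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Illustrative lock-file path; a real server would choose this carefully.
    int fd = open("/tmp/myserver.lock", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { std::perror("open"); return 1; }

    // Try to take an exclusive advisory lock without blocking.
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        std::fprintf(stderr, "Another instance appears to be running.\n");
        return 1;
    }

    // ... server runs; the lock is released automatically on exit ...
    close(fd);
    return 0;
}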

MPI Fundamentals

I have a basic question regarding MPI, to get a better understanding of it (I am new to MPI and multiple processes, so please bear with me on this one). I am using a simulation environment in C++ (RepastHPC) that makes extensive use of MPI (via the Boost libraries) to allow parallel operations. In particular, the simulation consists of multiple instances of the respective classes (i.e. agents) that are supposed to interact with each other, exchange information, etc. Now, given that this takes place across multiple processes (and given my rudimentary understanding of MPI), the natural question or fear I have is that agents on different processes no longer interact with each other because they cannot connect (I know, this contradicts the entire idea of MPI).
After reading the manual, my understanding is this: the available libraries of Boost.MPI (and also the libraries of the above-mentioned package) take care of all of the communication, sending messages back and forth between processes; i.e. each process has copies of the instances from other processes (I guess this is some form of call by value, because the original instance cannot be changed from a process that has only a copy), then an update takes place to ensure that the copies of the instances have the same information as the originals, and so on.
Does this mean that, in terms of the final outcomes of the simulation runs, I get the same result as if I were doing the entire thing on one process? Put differently, are the multiple processes just supposed to speed things up without changing the design of the simulation (so that I don't have to worry about it)?
I think you have a fundamental misunderstanding of MPI here. MPI is not an automatic parallelization library. It isn't a distributed shared memory mechanism. It doesn't do any magic for you.
What it does do is make it simpler to communicate between different processes on the same or different machines. Each process has its own address space, which does not overlap with those of the other processes (unless you're doing something else outside of MPI). Assuming you set up your MPI installation correctly, it handles all of the pain of setting up the communication channels between your processes for you. It also gives you some higher-level abstractions, such as collective communication.
When you use MPI, you compile your code differently than normal. Instead of g++ -o code code.cpp (or whatever your compiler is), you use mpicxx -o code code.cpp, which automatically links in everything MPI needs. Then, when you run your application, you use mpiexec -n <num_processes> ./code (other mpiexec arguments aren't shown here, but may be necessary for your setup). The num_processes argument tells MPI how many processes to launch; this isn't determined at compile/link time.
You will also have to rewrite your code to use MPI. MPI has lots of functions (the standard is available online, and there are lots of tutorials on the web that are easier to understand) that you can use. The basics are MPI_Send() and MPI_Recv(), but there is lots and lots more. You'll have to find a tutorial for that.
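For a first flavor of what that looks like, here is a minimal two-process example with MPI_Send()/MPI_Recv(); compile it with mpicxx and run it with mpiexec -n 2 as described above:

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;
        // Rank 0 sends one int to rank 1, with message tag 0.
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value = 0;
        // Rank 1 receives the matching message from rank 0.
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        std::printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}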

Possible to distribute an MPI (C++) program across the internet rather than within a LAN cluster?

I've written some MPI code which works flawlessly on large clusters. Each node in the cluster has the same CPU architecture and has access to a networked (i.e. 'common') file system (so that each node can execute the actual binary). But consider this scenario:
I have a machine in my office with a dual-core processor (Intel).
I have a machine at home with a dual-core processor (AMD).
Both machines run Linux, and both can successfully compile and run the MPI code locally (i.e. using 2 cores).
Now, is it possible to link the two machines together via MPI, so that I can utilise all 4 cores, bearing in mind the different architectures, and bearing in mind the fact that there are no shared (networked) filesystems?
If so, how?
Thanks,
Ben.
It's possible to do this. Most MPI implementations allow you to specify the location of the binary to be run on different machines; alternatively, make sure that it is in your path on both machines. Since both machines have the same byte order, that shouldn't be a problem. You will have to make sure that any input data the individual processes read is available in both locations.
There are lots of complications with doing this. You need to make sure that the firewalls between the systems will allow process startup and communication. Communication between the machines is going to be much slower, so if your code is communication-heavy or latency-intolerant, it will probably be quite slow. Most likely your execution time running on all 4 cores will be longer than just running with 2 cores on a single machine.
There is no geographical limitation on where the processes are located. And as KeithB said, there is no need to have a common path or even the same binary on both machines. Depending on which MPI implementation you are using, you don't even need the same endianness.
You can specify the exact path to the binary on each machine and have two independent binaries as well. However, you should note that the program will run slowly if the communication infrastructure between the two nodes is not fast enough.
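As an illustration, Open MPI's mpirun lets you give each host its own binary path on one command line (the hostnames and paths below are placeholders, and the exact flags vary between MPI implementations and versions):

mpirun -np 2 --host office-pc ./mpi_app : -np 2 --host home-pc /home/ben/mpi_app

Each colon-separated block is a separate application context, so the two machines can run independently built binaries of the same program.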