Fatal error in PMPI_Comm_rank: Invalid communicator - fortran

program main
use mpi
character * (MPI_MAX_PROCESSOR_NAME) processor_name
integer myid, numprocs, namelen, rc, ierr
integer comm
call MPI_INIT( ierr )
call MPI_COMM_RANK( comm, myid, ierr )
call MPI_COMM_SIZE( comm, numprocs, ierr )
call MPI_GET_PROCESSOR_NAME(processor_name, namelen, ierr)
write(*,*) "Hello World! Process ",myid," of ", numprocs, " on ", processor_name
call MPI_FINALIZE(rc)
end program main
This is an example from a textbook. The original, however, uses MPI_COMM_WORLD where my version has comm in MPI_COMM_RANK and MPI_COMM_SIZE. I made this change only because I found that the prototype says comm should be an integer. After making the change, I compiled with mpifort test_mpi.f90 to create a.out. When I run it with mpirun -n 4 ./a.out, it shows the following error.
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(110): MPI_Comm_rank(comm=0x0, rank=0x7ffd9b870564)
failed PMPI_Comm_rank(68).: Invalid communicator
I did some searching on SO and found someone saying that this error occurs when mpi.h comes from one MPI version while the binary libraries come from another. But I only installed MPICH once and have never used MPI before, so what is the problem here?

Your variable comm has never been initialized and has an undefined value.
You must give it a value. At the start, the global communicator is MPI_COMM_WORLD.
comm = MPI_COMM_WORLD
Of course, MPI_COMM_WORLD is an integer too; it is an integer constant.

Related

MPI_Finalize() won't finalize if stdout and stderr are redirected via freopen

I have a problem using MPI and redirection of stdout and stderr.
When launched with multiple processes, if both stdout and stderr are redirected to (two different) files, then every process gets stuck in MPI_Finalize(), waiting indefinitely. But if only stdout or only stderr is redirected, there is no problem and the program stops normally.
I'm working on Windows 7 with Intel MPI and Visual Studio 2013.
Thanks for your help!
Below is a simple program that fails on my computer with 2 processes (mpiexec -np 2 mpitest.exe):
#include <cstdio>   // printf, freopen
#include <iostream>
#include <string>
#include <mpi.h>
int main(int argc, char *argv[])
{
int ierr = MPI_Init(&argc, &argv);
int rank, size;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("[%d/%d] This is printed on screen\n", rank, size-1);
// redirect the outputs and errors if necessary
std::string log_file = "log_" + std::to_string(rank) + ".txt";
std::string error_file = "err_" + std::to_string(rank) + ".txt";
// If either of the following two lines is commented out, then everything works fine
freopen(log_file.c_str(), "w", stdout);
freopen(error_file.c_str(), "w", stderr);
printf("[%d/%d] This is printed on the logfile\n", rank, size - 1);
ierr = MPI_Finalize();
return 0;
}
EDIT: For those who are interested, we submitted this error to the Intel developer forum. They were able to reproduce the bug and are working on a fix. In the meantime, we redirect every stderr message to stdout (ugly but working).
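For reference, a minimal sketch of that workaround: only stdout is freopen()ed, and error text is routed to stdout instead of stderr. File names and messages here are illustrative, and this is untested against the Intel MPI bug itself.
#include <cstdio>
#include <string>
#include <mpi.h>
int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    // Redirect only stdout; stderr is left untouched so MPI_Finalize() is not affected
    std::string log_file = "log_" + std::to_string(rank) + ".txt";
    freopen(log_file.c_str(), "w", stdout);
    printf("[%d] normal output goes to the log file\n", rank);
    printf("[%d] ERROR: messages that used to go to stderr also go here\n", rank);
    MPI_Finalize();
    return 0;
}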

Copy large data file using parallel I/O

I have a fairly big data set, about 141 million lines in .csv format. I want to use MPI with C++ to copy and manipulate a few columns, but I'm a newbie at both C++ and MPI.
So far my code looks like this
#include <stdio.h>
#include "mpi.h"
using namespace std;
int main(int argc, char **argv)
{
int i, rank, nprocs, size, offset, nints, bufsize, N=4;
MPI_File fp, fpwrite; // File pointer
MPI_Status status;
MPI_Offset filesize;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
int buf[N];
for (i = 0; i < N; i++)
buf[i] = i;
size = nprocs; // number of ranks sharing the file
offset = rank * (N/size)*sizeof(int);
MPI_File_open(MPI_COMM_WORLD, "new.csv", MPI_MODE_RDONLY, MPI_INFO_NULL, &fp);
MPI_File_open(MPI_COMM_WORLD, "Ntest.csv", MPI_MODE_CREATE|MPI_MODE_WRONLY, MPI_INFO_NULL, &fpwrite);
MPI_File_get_size(fp, &filesize); // query the size only after the file has been opened
MPI_File_read(fp, buf, N, MPI_INT, &status);
// printf("\nrank: %d, buf[%d]: %d\n", rank, rank*bufsize, buf[0]);
printf("My rank is: %d\n", rank);
MPI_File_write_at(fpwrite, offset, buf, (N/size), MPI_INT, &status);
/* // repeat the process again
MPI_Barrier(MPI_COMM_WORLD);
printf("2/ My rank is: %d\n", rank); */
MPI_File_close(&fp);
MPI_File_close(&fpwrite);
MPI_Finalize();
}
I'm not sure where to start, and I've seen a few examples with lustre stripes. I would like to go that direction if possible. Additional options include HDF5 and T3PIO.
You are way too early to worry about Lustre stripes, aside from the fact that Lustre stripes are by default something ridiculously small for a "parallel file system". Increase the stripe size of the directory where you will write and read these files with lfs setstripe.
Your first challenge will be how to decompose this CSV file. What does a typical row look like? If the rows are of variable length, you're going to have a bit of a headache. Here's why:
consider a CSV file with 3 rows and 3 MPI processes.
Row 0 is aa,b,c (8 bytes).
Row 1 is aaaaaaa,bbbbbbb,ccccccc (24 bytes).
Row 2 is ,,c (4 bytes).
Rank 0 can read from the beginning of the file, but where will ranks 1 and 2 start? If you simply divide the total size (8+24+4=36 bytes) by 3, then the decomposition is:
rank 0 ends up reading aa,b,c\naaaaaa,
rank 1 reads a,bbbbbbb,ccc, and
rank 2 reads cccc\n,,c\n
There are two approaches to unstructured text input. One option is to index your file, either after the fact or as the file is being generated. This index would store the beginning offset of every row. Rank 0 reads the index, then broadcasts the offsets to everyone else; a sketch of that is shown below.
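Here is a minimal sketch of that indexing approach, assuming the row offsets were already written to a small binary side file (the name new.csv.index and its int64 layout are invented for illustration):
#include <mpi.h>
#include <cstdio>
#include <vector>
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    // Rank 0 reads the index: offsets[i] is the byte offset of row i,
    // offsets[nrows] is the total file size.
    long long nrows = 0;
    std::vector<long long> offsets;
    if (rank == 0) {
        FILE *idx = fopen("new.csv.index", "rb");   // hypothetical index file
        fread(&nrows, sizeof(long long), 1, idx);
        offsets.resize(nrows + 1);
        fread(offsets.data(), sizeof(long long), nrows + 1, idx);
        fclose(idx);
    }
    // Broadcast the index so every rank knows where rows start.
    MPI_Bcast(&nrows, 1, MPI_LONG_LONG, 0, MPI_COMM_WORLD);
    if (rank != 0) offsets.resize(nrows + 1);
    MPI_Bcast(offsets.data(), (int)(nrows + 1), MPI_LONG_LONG, 0, MPI_COMM_WORLD);
    // Each rank takes a contiguous block of whole rows.
    long long rows_per_rank = nrows / nprocs;
    long long first = rank * rows_per_rank;
    long long last  = (rank == nprocs - 1) ? nrows : first + rows_per_rank;
    MPI_Offset start = offsets[first], end = offsets[last];
    std::vector<char> chunk(end - start);
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "new.csv", MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
    MPI_File_read_at_all(fh, start, chunk.data(), (int)chunk.size(), MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    printf("rank %d read rows [%lld, %lld), %lld bytes\n", rank, first, last, (long long)chunk.size());
    MPI_Finalize();
    return 0;
}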
The second option is to do this initial decomposition by file size, then fix up the splits. In the above simple example, rank 0 would send everything after the newline to rank 1. Rank 1 would receive the new data and glue it to the beginning of its row and send everything after its own newline to rank 2. This is extremely fiddly and I would not suggest it for someone just starting MPI-IO.
HDF5 is a good option here! Instead of trying to write your own parallel CSV parser, have your CSV creator generate an HDF5 dataset. HDF5, among other features, will keep that index I mentioned for you, so you can set up hyperslabs and do parallel reading and writing.
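As a rough sketch of what a collective HDF5 read could look like, assuming the CSV producer wrote a 1-D integer dataset named "values" into data.h5 (both names are made up for the example), compiled with something like h5pcc or linked against a parallel HDF5 build:
#include <mpi.h>
#include <hdf5.h>
#include <cstdio>
#include <vector>
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    // Open the file with the MPI-IO driver so all ranks share it.
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fopen("data.h5", H5F_ACC_RDONLY, fapl);
    hid_t dset = H5Dopen2(file, "values", H5P_DEFAULT);
    // Find the dataset extent and carve out this rank's hyperslab.
    hid_t filespace = H5Dget_space(dset);
    hsize_t dims[1];
    H5Sget_simple_extent_dims(filespace, dims, NULL);
    hsize_t count = dims[0] / nprocs;
    hsize_t start = rank * count;
    if (rank == nprocs - 1) count = dims[0] - start;   // last rank takes the remainder
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &count, NULL);
    hid_t memspace = H5Screate_simple(1, &count, NULL);
    // Collective read of this rank's slice.
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    std::vector<int> buf(count);
    H5Dread(dset, H5T_NATIVE_INT, memspace, filespace, dxpl, buf.data());
    printf("rank %d read %llu values starting at %llu\n",
           rank, (unsigned long long)count, (unsigned long long)start);
    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}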

Passing non-NULL argv to MPI_Comm_spawn

Suppose that my program (let's call it prog_A) starts as a single MPI process.
And later I want program prog_A to spawn n MPI processes (let's call them prog_B) using MPI_Comm_spawn with the same arguments I passed to prog_A.
For example, if I run prog_A with the arguments 200 100 10
mpiexec -n 1 prog_A 200 100 10
I want prog_B to be provided with the same arguments 200 100 10.
How can I do this? I tried the following but it does not work.
char ** newargv = new char*[3];//create new argv for childs
newargv[0] = new char[50];
newargv[1] = new char[50];
newargv[2] = new char[50];
strcpy(newargv[0],argv[1]);//copy argv to newargv
strcpy(newargv[1],argv[2]);
strcpy(newargv[2],argv[3]);
MPI_Comm theother;
MPI_Init(&argc, &argv);
MPI_Comm_spawn("prog_B",newargv,numchildprocs,
MPI_INFO_NULL, 0, MPI_COMM_SELF, &theother,
MPI_ERRCODES_IGNORE);
MPI_Finalize();
Your problem is that you didn't NULL terminate your argv list. Here's the important part of the MPI standard (emphasis added):
The argv argument argv is an array of strings containing arguments
that are passed to the program. The first element of argv is the first
argument passed to command, not, as is conventional in some contexts,
the command itself. The argument list is terminated by NULL in C and
C++ and an empty string in Fortran. In Fortran, leading and trailing
spaces are always stripped, so that a string consisting of all spaces
is considered an empty string. The constant MPI_ARGV_NULL may be used
in C, C++ and Fortran to indicate an empty argument list. In C and
C++, this constant is the same as NULL.
You just need to add a NULL to the end of your list. Here's the corrected code (translated to C since I didn't have the C++ bindings installed on my laptop):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "mpi.h"
int main(int argc, char ** argv) {
char ** newargv = malloc(sizeof(char *)*4); // new argv for the children: 3 arguments + terminating NULL
int numchildprocs = 1;
MPI_Comm theother;
MPI_Init(&argc, &argv);
MPI_Comm_get_parent(&theother);
if (MPI_COMM_NULL != theother) {
fprintf(stderr, "SPAWNED!\n");
} else {
newargv[0] = (char *) malloc(sizeof(char)*50);
newargv[1] = (char *) malloc(sizeof(char)*50);
newargv[2] = (char *) malloc(sizeof(char)*50);
newargv[3] = NULL;
strncpy(newargv[0],argv[1], 50);//copy argv to newargv
strncpy(newargv[1],argv[2], 50);
strncpy(newargv[2],argv[3], 50);
fprintf(stderr, "SPAWNING!\n");
MPI_Comm_spawn("./prog_B",newargv,numchildprocs,
MPI_INFO_NULL, 0, MPI_COMM_SELF, &theother,
MPI_ERRCODES_IGNORE);
}
MPI_Comm_free(&theother);
MPI_Finalize();
}
You do not need to copy the argument vector at all. All you have to do is make use of the provisions of the C99 standard, which requires that argv be NULL-terminated:
MPI_Comm theother;
// Passing &argc and &argv here is a thing of the past (MPI-1)
MPI_Init(NULL, NULL);
MPI_Comm_spawn("prog_B", argv+1, numchildprocs,
MPI_INFO_NULL, 0, MPI_COMM_SELF, &theother,
MPI_ERRCODES_IGNORE);
MPI_Finalize();
Note the use of argv+1 in order to skip over the first argument (the program name). The benefit of that code is that it works with any number of arguments passed to the original program.

Calling Fortran from C++; String on return corrupted

I am calling a Fortran 77 subroutine from C++, passing a file handle, a string, and the string length. The file opens successfully and the Fortran subroutine returns. However, back in the C++ code the string that was passed to Fortran is corrupted, and when the bottom of the function openFile is reached the program crashes.
The crash only appears in release, not in debug. Examining the strings, I see that in release the variable fileNameToFortran is full of trash.
Thanks for your help
I use ifort with the following compiler flags in release (Windows 7 machine, 32-bit):
/names:lowercase /f77rtl /traceback /iface:cref /threads /recursive /LD
and in debug:
/names:lowercase /f77rtl /traceback /iface:cref /threads /recursive /LDd /Zi /debug:full /check:all /traceback
Here is the C++ code:
typedef void (FORTCALL *sn_openfile_func) (int *,
char[],
int *,
int);
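// openfile_func_handle below is assumed to be a function pointer of type sn_openfile_func,
// obtained elsewhere (e.g. when the Fortran DLL is loaded).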
void openFile(const int fileHandle, const std::string fileName)
{
int fileHandleToFortran = fileHandle;
char fileNameToFortran[20];
assert(fileName.size() < 20);
strcpy(fileNameToFortran, fileName.c_str());
int lstr = strlen(fileNameToFortran);
openfile_func_handle(&fileHandleToFortran, fileNameToFortran, &lstr, lstr);
}
Here is the Fortran Code:
SUBROUTINE SN_OPENFILE(FILENR,FILENAME,FSIZE)
!DEC$ ATTRIBUTES DLLEXPORT :: SN_OPENFILE
IMPLICIT NONE
INTEGER FILENR, FSIZE
CHARACTER FILENAME*FSIZE
OPEN (FILENR,FILE = FILENAME,
& ACCESS = 'SEQUENTIAL' , STATUS = 'REPLACE', ERR=222)
GOTO 333
222 WRITE(*,*) 'Error opening file'
333 END
OK, I found the answer myself.
The macro FORTCALL was defined as __stdcall.
Combined with /iface:cref, that only crashes in release, which is strange, but after I removed the macro it works in both release and debug.
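With the macro removed, the declaration uses the default cdecl convention that /iface:cref expects, roughly like this (the trailing int is the hidden string length that ifort passes by value for the CHARACTER argument, exactly as in the call above):
typedef void (*sn_openfile_func) (int *,   // FILENR
                                  char[],  // FILENAME
                                  int *,   // FSIZE
                                  int);    // hidden length of FILENAME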

I am not able to compile with MPI compiler with C++

I was trying to compile a very simple MPI hello_world:
#include <stdio.h>
#include <mpi.h>
int main(int argc, char *argv[]) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
printf("Process %d on %s out of %d\n", rank, processor_name, numprocs);
MPI_Finalize();
}
And I got the following error:
Catastrophic error: could not set locale "" to allow processing of multibyte characters
I really don't know how to figure it out.
Try defining the environment variables
LANG=en_US.utf8
LC_ALL=en_US.utf8
Assuming you're on Unix, also try man locale and locale -a at the command line, and google for "utf locale" and similar terms.
Re-defining the environment variable LANG solved the problem for me, as pointed out above (setting LANG=en_US.utf8).
I should mention that I'm connecting to a remote server, and that's where I get the problem when compiling code with the Intel compilers.