OpenMPI cross-node Allreduce fails and connection times out - c++

I am new to MPI programming and I wrote two simple programs to check whether I can do cross-node communication. I first ran the following program using my own hostfile:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);
    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);
    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);
    // Finalize the MPI environment.
    MPI_Finalize();
}
I then got the following results. It seems that everything works well and I can communicate across nodes. However, when I tried to do the Allreduce operation, I got an error message.
Hello world from processor lifs1.math.ust.hk, rank 4 out of 8 processors
Hello world from processor lifs1.math.ust.hk, rank 6 out of 8 processors
Hello world from processor lifs1.math.ust.hk, rank 5 out of 8 processors
Hello world from processor lifs1.math.ust.hk, rank 7 out of 8 processors
Hello world from processor lifs2.math.ust.hk, rank 0 out of 8 processors
Hello world from processor lifs2.math.ust.hk, rank 1 out of 8 processors
Hello world from processor lifs2.math.ust.hk, rank 2 out of 8 processors
Hello world from processor lifs2.math.ust.hk, rank 3 out of 8 processors
[lifs1.math.ust.hk:29387] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[lifs1.math.ust.hk:29387] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
The Allreduce program is:
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int total_process;
    int id;
    MPI_Comm_size(MPI_COMM_WORLD, &total_process);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    if (id == 0) {
        std::cout << "========== Testing MPI Across Nodes ==========" << std::endl;
    }
    int N = 1000;
    float* vec = new float[N];
    for (int i = 0; i < N; ++i) {
        vec[i] = 12.345;
    }
    float* global_sum = new float[N];
    MPI_Allreduce(vec, global_sum, N, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);
    std::cout << "Result " << global_sum[0] << std::endl;
    delete [] vec;
    delete [] global_sum;
    MPI_Finalize();
    return 0;
}
I would think 1000 single-precision floating-point values should not take much time to communicate. The error message is:
[lifs1.math.ust.hk:30146] 7 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[lifs1.math.ust.hk:30146] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[lifs2][[13004,1],0][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.207 failed: Connection timed out (110)
[lifs2][[13004,1],2][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.207 failed: Connection timed out (110)
[lifs2][[13004,1],1][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.207 failed: Connection timed out (110)
[lifs2][[13004,1],3][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.207 failed: Connection timed out (110)
[lifs1][[13004,1],4][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.208 failed: Connection timed out (110)
[lifs1][[13004,1],5][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] [lifs1][[13004,1],6][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.208 failed: Connection timed out (110)
[lifs1][[13004,1],7][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_complete_connect] connect() to 143.89.16.208 failed: Connection timed out (110)
connect() to 143.89.16.208 failed: Connection timed out (110)
My questions are:
Why can't I do the Allreduce operation across nodes?
My program is very simple, but I still get the message "7 more processes have sent help message help-mpi-btl-base.txt". What does this mean?

Related

PMPI_Waitall error

I have some problems with this piece of code.
NEIGHBOORDHOOD_FLAG = 1234;
int number_of_nodes = 90;
MPI_Request neig_request[2*number_of_nodes];
MPI_Status neig_status[2*number_of_nodes];
int neig_counter = 0;
int is_neighbour_send[number_of_nodes];
int is_neighbour_receive[number_of_nodes];
// tell each node whether it is one of my neighbours
for (int v = 2; v < number_of_nodes+2; ++v) {
    is_neighbour_send[v-2] = (util.check_targets(neighbours,v)==true) ? 1 : 0;
    this->communicate_neighborhood(&(is_neighbour_send[v-2]), v, &(neig_request[neig_counter]));
    ++neig_counter;
}
// receive the same information from the other nodes, to know whose neighbour I am
for (int v = 2; v < number_of_nodes+2; ++v) {
    this->receive_neighborhood(&(is_neighbour_receive[v-2]), v, &(neig_request[neig_counter]));
    ++neig_counter;
}
MPI_Waitall(neig_counter, neig_request, neig_status);
The two methods above are essentially two wrapped versions of the MPI send and receive:
void NodeAgent::communicate_neighborhood(int *neigh, int dest_node, MPI_Request *req) {
    MPI_Class::send_message(neigh, MPI_INT, 1, dest_node, NEIGHBOORDHOOD_FLAG, req);
}

void NodeAgent::receive_neighborhood(int *is_neighbour, int node_mitt, MPI_Request *req) {
    MPI_Class::receive_message(is_neighbour, MPI_INT, 1, node_mitt, NEIGHBOORDHOOD_FLAG, req);
}
with send and receive static methods:
void MPI_Class::send_message(void *content, MPI_Datatype datatype, int length, int dest, int tag, MPI_Request *request) {
    int np = MPI_Class::count();
    MPI_Isend(content, length, datatype, (dest % np), tag, MPI_COMM_WORLD, request);
}

void MPI_Class::receive_message(void *buffer, MPI_Datatype datatype, int length, int dest, int tag, MPI_Request *request) {
    int np = MPI_Class::count();
    MPI_Irecv(buffer, length, datatype, (dest % np), tag, MPI_COMM_WORLD, request);
}
The number of nodes is the number of processes minus 2 (there are two processes that I use for other operations), and I've used the "%" in send and receive to ensure that the sender and the recipient are always in the range I want.
The interesting thing is that this code executes properly even with a large number of processes (100). The problem arises when I launch the program on two machines or more...
It gives me: Fatal error in PMPI_Waitall: See the MPI_ERROR field in MPI_Status for the error code
I tried to print the MPI_ERROR of each status, but they are all zeroes...
Can someone help me?
Many thanks in advance.
Edit: here is the MCVE:
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int my_id;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Status *statuses = (MPI_Status *) malloc(2*world_size * sizeof(MPI_Status));
    MPI_Request *requestes = (MPI_Request *) malloc(2*world_size * sizeof(MPI_Request));
    int counter = 0;
    int *tosend = (int *) malloc((world_size - 2) * sizeof(int));
    int *toreceive = (int *) malloc((world_size - 2) * sizeof(int));
    for (int i = 0; i < world_size-2; ++i) {
        tosend[i] = 0;
    }
    if (my_id > 1) {
        for (int i = 2; i < world_size; ++i) {
            MPI_Isend(&tosend[i-2], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &requestes[counter]);
            ++counter;
        }
    }
    if (my_id > 1) {
        for (int i = 2; i < world_size; ++i) {
            MPI_Irecv(&(toreceive[i-2]), 1, MPI_INT, i, 0, MPI_COMM_WORLD, &(requestes[counter]));
            ++counter;
        }
    }
    MPI_Waitall(counter, requestes, statuses);
    free(statuses);
    free(requestes);
    free(tosend);
    free(toreceive);
    MPI_Finalize();
}
This code also executes correctly on a single machine. The problem arises when I launch it on 2 machines that are mutually authenticated via SSH. I just discovered that after the MPI_Waitall fatal error, if I try to connect via SSH from one machine to the other, it asks me for the password....
Why? At this point I think the problem is the communication via SSH and not the code..
Edit 2: Solved! It was an SSH problem that I fixed somehow (honestly I don't know how; I redid the authentication procedure ten times until it worked).
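For reference, a common way to set up the passwordless SSH login that Open MPI relies on between nodes is sketched below; the user and hostname are illustrative, and this may not be exactly what the poster did:
ssh-keygen -t rsa              # accept the defaults and leave the passphrase empty
ssh-copy-id user@other-machine # install the public key on the other machine (repeat in both directions)
ssh user@other-machine         # should now log in without prompting for a password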

MPI Client fails at looking up Server port (MPI_ERR_NAME: invalid name argument)

I'm currently trying to set up an MPI client connecting to a server which publishes a certain name, but it doesn't work and I have no clue why.
MPI is OpenMPI 1.6 using g++-4.7, where /usr/lib64/mpi/gcc/openmpi/etc/openmpi-default-hostfile contains 1 line:
MY_IP
The following "minimal" (I don't like questions using too much code but I think I should include it here) example illustrates the problem:
mpi_srv.cc
#include <iostream>
#include <mpi.h>

int main (void)
{
    int rank(0);
    MPI_Init(0, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &rank);
    std::cout << "Rank: " << rank << std::endl;
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Open_port(MPI_INFO_NULL, port_name);
    char publish_name[1024] = {'t','e','s','t','_','p','o','r','t','\0'};
    MPI_Publish_name(publish_name, MPI_INFO_NULL, port_name);
    std::cout << "Port: " << publish_name << " (" << port_name << ")" << std::endl;
    MPI_Comm client;
    std::cout << "Wating for Comm..." << std::endl;
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);
    std::cout << "Comm accepted" << std::endl;
    MPI_Comm_free(&client);
    MPI_Unpublish_name(publish_name, MPI_INFO_NULL, port_name);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 1;
}
compiled and executed via
mpic++ mpi_srv.cc -o mpi_srv.x
mpirun mpi_srv.x
prints
Rank: 1
Port: test_port (2428436480.0;tcp://MY_IP:33573+2428436481.0;tcp://MY_IP:43172:300)
Wating for Comm...
and blocks as required.
My client
mpi_client.cc
#include <iostream>
#include <mpi.h>

int main (void)
{
    int rank(0);
    MPI_Init(0, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &rank);
    std::cout << "Rank: " << rank << std::endl;
    char port_name[MPI_MAX_PORT_NAME];
    char publish_name[1024] = {'t','e','s','t','_','p','o','r','t','\0'};
    MPI_Lookup_name(publish_name, MPI_INFO_NULL, port_name);
    MPI_Comm client;
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);
    MPI_Comm_disconnect(&client);
    MPI_Finalize();
    return 1;
}
compiled and executed via
mpic++ mpi_client.cc -o mpi_client.x
mpirun mpi_client.x
prints
Rank: 1
[MY_HOST:24870] *** An error occurred in MPI_Lookup_name
[MY_HOST:24870] *** on communicator MPI_COMM_WORLD
[MY_HOST:24870] *** MPI_ERR_NAME: invalid name argument
[MY_HOST:24870] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
with the server still running.
I removed the error checking in the examples above, but the function return values indicate that the port name was published successfully in the server executable.
I found out that this problem can arise when the published port is invisible to the client because a different mpirun is used, but I used the same mpirun executable for both.
Why doesn't the client connect to the server as I'd expect here?
When you run two separate MPI sessions, e.g.:
$ mpirun mpi_server.x
...
and
$ mpirun mpi_client.x
...
the second (client) MPI session has to be told where the naming service that holds the name/port mapping is located. With Open MPI you have several choices of naming service:
an instance of the dedicated naming service daemon ompi-server, or
the mpirun process of the server session.
In both cases the client session has to be provided with the location of the naming service. See this question and my answer to it for more information on how to deal with this in Open MPI.
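As a rough sketch of the first option (the file name is illustrative; check the ompi-server and mpirun man pages of your Open MPI version for the exact flags), the naming service daemon writes its URI to a file and both sessions are pointed at it:
ompi-server --report-uri /tmp/ompi-server.uri &
mpirun --ompi-server file:/tmp/ompi-server.uri mpi_srv.x
mpirun --ompi-server file:/tmp/ompi-server.uri mpi_client.x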
Name publishing is a tricky thing and can behave a little differently from one implementation to the next. It's up to the implementation to decide what level of support it will provide. For Open MPI (https://www.open-mpi.org/doc/v1.5/man3/MPI_Publish_name.3.php), it appears that you can set an MPI_Info key to specify that the name should be published locally or globally. You should make sure that you're publishing globally if you won't be starting your clients via MPI_Comm_spawn (which you're not).
Beyond that, this isn't a feature that I've used a lot so it may be that there's something else going on here.
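As a minimal sketch, assuming the Open MPI-specific ompi_global_scope info key described in the man page linked above, the publish call in mpi_srv.cc could be changed along these lines:
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "ompi_global_scope", "true");  // request a globally visible name (Open MPI-specific key)
MPI_Publish_name(publish_name, info, port_name);
MPI_Info_free(&info);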

client/server application using MPI

I have two questions; the first one is:
I'm going to use MS-MPI, and by "only MPI" I meant that we mustn't use sockets. My application is a scalable distributed data structure: initially we have a server containing a file of variable size (the size can be increased by insertions and decreased by deletions), and when the size of the file exceeds a certain limit the file is split, one half remains on the first server and the other half is moved to a new server, and so on. The client needs to always know the address of the data it wants to retrieve, so it should keep an image of the split operations on the file. I hope that makes it clearer.
And the second one is:
I've tried to compile a simple client/server application (the source code is below) with MS-MPI or MPICH2 and it doesn't work; it gives me the error message "fatal error in mpi_open_port()" and other stack errors. So I installed Open MPI on Ubuntu 11.10 and tried to run the same example. It worked on the server side and gave me a port name, but on the client side it gave me the error message:
[user-Compaq-610:03833] [[39604,1],0] ORTE_ERROR_LOG: Not found in file ../../../../../../ompi/mca/dpm/orte/dpm_orte.c at line 155
[user-Compaq-610:3833] *** An error occurred in MPI_Comm_connect
[user-Compaq-610:3833] *** on communicator MPI_COMM_WORLD
[user-Compaq-610:3833] *** MPI_ERR_INTERN: internal error
[user-Compaq-610:3833] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 3833 on
node toufik-Compaq-610 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
So I'm confused about what the problem is, and I've spent a while trying to fix it.
I'd be grateful if anybody could help me with it. Thank you in advance.
The source code is here:
/* the server side */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int my_id;
    char port_name[MPI_MAX_PORT_NAME];
    MPI_Comm newcomm;
    int passed_num;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    passed_num = 111;

    if (my_id == 0)
    {
        MPI_Open_port(MPI_INFO_NULL, port_name);
        printf("%s\n\n", port_name); fflush(stdout);
    } /* endif */

    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &newcomm);

    if (my_id == 0)
    {
        MPI_Send(&passed_num, 1, MPI_INT, 0, 0, newcomm);
        printf("after sending passed_num %d\n", passed_num); fflush(stdout);
        MPI_Close_port(port_name);
    } /* endif */

    MPI_Finalize();
    exit(0);
} /* end main() */
And on the client side:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int passed_num;
    int my_id;
    MPI_Comm newcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_id);
    MPI_Comm_connect(argv[1], MPI_INFO_NULL, 0, MPI_COMM_WORLD, &newcomm);

    if (my_id == 0)
    {
        MPI_Status status;
        MPI_Recv(&passed_num, 1, MPI_INT, 0, 0, newcomm, &status);
        printf("after receiving passed_num %d\n", passed_num); fflush(stdout);
    } /* endif */

    MPI_Finalize();
    return 0;
    //exit(0);
} /* end main() */
How exactly do you run the application? It seems that the provided client and server codes are the same.
Usually the code is the same for all MPI processes and the program decides what to execute based on its rank, as in this snippet: if (my_id == 0) { ... }. The application is executed with mpiexec. For example, mpiexec -n 2 ./application would run two MPI processes with ranks 0 and 1 in one MPI_COMM_WORLD communicator. Where exactly the processes are executed (on the same node or on different ones) depends on the configuration.
Nevertheless, you should create a port with MPI_Open_port and then pass it to MPI_Comm_connect. Here is an example of how to use these functions: MPI_Comm_connect
Moreover, for every MPI_Recv there must be a corresponding MPI_Send. Otherwise the receiving process would wait forever.
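For example, one way to run the pair from the question would be roughly the following (the executable names are illustrative, and the exact port string is implementation specific):
mpiexec -n 1 ./server                                        # prints a long, implementation-specific port string
mpiexec -n 1 ./client "<port string printed by the server>"  # pass the string verbatim as argv[1]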

MPI communicator error

I had a problem with a program that uses MPI and I have just fixed it; however, I don't seem to understand what was wrong in the first place. I'm quite green with programming-related stuff, so please be forgiving.
The program is:
#include <iostream>
#include <cstdlib>
#include <mpi.h>
#define RNumber 3

using namespace std;

int main() {
    /*Initialize MPI*/
    int my_rank;        //My process rank
    int comm_sz;        //Number of processes
    MPI_Comm GathComm;  //Communicator for MPI_Gather
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    /*Initialize an array for results*/
    long rawT[RNumber];
    long * Times = NULL;  //Results from threads
    if (my_rank == 0) Times = (long*) malloc(comm_sz*RNumber*sizeof(long));

    /*Fill rawT with results at threads*/
    for (int i = 0; i < RNumber; i++) {
        rawT[i] = i;
    }

    if (my_rank == 0) {
        /*Main thread receives data from other threads*/
        MPI_Gather(rawT, RNumber, MPI_LONG, Times, RNumber, MPI_LONG, 0, GathComm);
    }
    else {
        /*Other threads send calculation results to main thread*/
        MPI_Gather(rawT, RNumber, MPI_LONG, Times, RNumber, MPI_LONG, 0, GathComm);
    }

    /*Finalize MPI*/
    MPI_Finalize();
    return 0;
}
On execution the program returns the following message:
Fatal error in PMPI_Gather: Invalid communicator, error stack:
PMPI_Gather(863): MPI_Gather(sbuf=0xbf824b70, scount=3, MPI_LONG,
rbuf=0x98c55d8, rcount=3, MPI_LONG, root=0, comm=0xe61030) failed
PMPI_Gather(757): Invalid communicator Fatal error in PMPI_Gather:
Invalid communicator, error stack: PMPI_Gather(863):
MPI_Gather(sbuf=0xbf938960, scount=3, MPI_LONG, rbuf=(nil), rcount=3,
MPI_LONG, root=0, comm=0xa6e030) failed PMPI_Gather(757): Invalid
communicator
After I remove GathComm altogether and substitute the default MPI_COMM_WORLD communicator for it, everything works fine.
Could anyone be so kind as to explain what I was doing wrong and how this adjustment made everything work?
That's because GathComm has not been assigned a valid communicator. "MPI_Comm GathComm;" only declares a variable to hold a communicator but doesn't create one.
You can use the default communicator (MPI_COMM_WORLD) if you simply want to include all procs in the operation.
Custom communicators are useful when you want to organise your procs in separate groups or when using virtual communication topologies.
To find out more, check out this article, which describes Groups, Communicators and Topologies.
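If you really do want a separate communicator for the gather, it has to be created from an existing one before use. A minimal sketch using MPI_Comm_dup, which simply duplicates MPI_COMM_WORLD, would be:
MPI_Comm GathComm;
MPI_Comm_dup(MPI_COMM_WORLD, &GathComm);  // GathComm is now a valid communicator
MPI_Gather(rawT, RNumber, MPI_LONG, Times, RNumber, MPI_LONG, 0, GathComm);
MPI_Comm_free(&GathComm);                 // release it before MPI_Finalize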

send() crashes my program

I'm running a server and a client, and I'm testing my program on my computer.
This is the function in the server that sends data to the client:
int sendToClient(int fd, string msg) {
    cout << "sending to client " << fd << " " << msg << endl;
    int len = msg.size()+1;
    cout << "10\n";
    /* send msg size */
    if (send(fd, &len, sizeof(int), 0) == -1) {
        cout << "error sendToClient\n";
        return -1;
    }
    cout << "11\n";
    /* send msg */
    int nbytes = send(fd, msg.c_str(), len, 0); //CRASHES HERE
    cout << "15\n";
    return nbytes;
}
When the client exits it sends "BYE" to the server, and the server replies with the function above. I connect the client to the server (it's done on one computer, in 2 terminals), and when the client exits the server crashes - it never prints the 15.
Any idea why? Any idea how to find out why?
Thank you.
EDIT: this is how I close the client:
void closeClient(int notifyServer = 0) {
    /** notify server before closing */
    if (notifyServer) {
        int len = SERVER_PROTOCOL[bye].size()+1;
        char* buf = new char[len];
        strcpy(buf, SERVER_PROTOCOL[bye].c_str()); //c_str - NEED TO FREE????
        sendToServer(buf, len);
        delete[] buf;
    }
    close(_sockfd);
}
By the way, if I skip this code, meaning I just leave the close(_sockfd) without notifying the server, everything is fine - the server doesn't crash.
EDIT 2: this is the end of strace.out:
5211 recv(5, "BYE\0", 4, 0) = 4
5211 write(1, "received from client 5 \n", 24) = 24
5211 write(1, "command: BYE msg: \n", 19) = 19
5211 write(1, "BYEBYE\n", 7) = 7
5211 write(1, "response = ALALA!!!\n", 20) = 20
5211 write(1, "sending to client 5 ALALA!!!\n", 29) = 29
5211 write(1, "10\n", 3) = 3
5211 send(5, "\t\0\0\0", 4, 0) = 4
5211 write(1, "11\n", 3) = 3
5211 send(5, "ALALA!!!\0", 9, 0) = -1 EPIPE (Broken pipe)
5211 --- SIGPIPE (Broken pipe) # 0 (0) ---
5211 +++ killed by SIGPIPE +++
A broken pipe can kill my program?? Why doesn't send() just return -1??
You may want to specify MSG_NOSIGNAL in the flags:
int nbytes = send(fd,msg.c_str(), msg.size(), MSG_NOSIGNAL);
You're getting SIGPIPE because of a "feature" in Unix that raises SIGPIPE when trying to send on a socket that the remote peer has closed. Since you don't handle the signal, the default signal-handler is called, and it aborts/crashes your program.
To get the behavior you want (i.e. make send() return with an error instead of raising a signal), add this to your program's startup routine (e.g. the top of main()):
#include <signal.h>

int main(int argc, char ** argv)
{
    [...]
    signal(SIGPIPE, SIG_IGN);
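Once the signal is ignored (or MSG_NOSIGNAL is passed), the broken connection shows up as an ordinary error return, which you can check via errno; a small sketch of how sendToClient could react (the log message is illustrative):
#include <cerrno>   // for errno and EPIPE

int nbytes = send(fd, msg.c_str(), msg.size(), MSG_NOSIGNAL);
if (nbytes == -1) {
    if (errno == EPIPE)
        cout << "client closed the connection\n";  // remote peer is gone
    return -1;
}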
Probably the client exits before the server has completed the send, thus breaking the socket between them and making send() crash the program.
link
This socket was connected but the connection is now broken. In this case, send generates a SIGPIPE signal first; if that signal is ignored or blocked, or if its handler returns, then send fails with EPIPE.
If the client exits before the second send from the server, and the connection is not disposed of properly, your server keeps hanging and this could provoke the crash.
Just a guess, since we don't know what server and client actually do.
I find the following line of code strange because you define int len = msg.size()+1;.
int nbytes = send(fd,msg.c_str(),len,0); //CRASHES HERE
What happens if you define int len = msg.size();?
If you are on Linux, try to run the server inside strace. This will write lots of useful data to a log file.
strace -f -o strace.out ./server
Then have a look at the end of the log file. Maybe it's obvious what the program did and when it crashed, maybe not. In the latter case: Post the last lines here.