MPI_Send to multiple POSIX threads running on the same process - c++

I start n POSIX threads on process #0 to listen for incoming calls from processes #0...#n. However, my code is not working; instead, I get a segfault. I think the problem might be overlapping buffers. I am new to C++. Can you suggest a solution?
void *listener(void *arg) {
    ...
    int status_switch;
    while (true) {
        MPI_Recv(&status_switch, 1, MPI_INT, fd->id, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        ...
    }
}

int main(int argc, char *argv[])
{
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    if (world_rank == root)
    {
        int nthreads = world_size;
        threads = (pthread_t *)malloc(nthreads * sizeof(threads));
        fd = (struct_t *)malloc(sizeof(struct_t) * nthreads);
        // Start listeners for each node on node 0
        for (int i = 0; i < world_size; i++) {
            fd[i].id = i;
            pthread_create(&threads[i], NULL, listener, (void *)(fd + i));
        }
    }

    int status_switch = 0;
    MPI_Send(&status_switch, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    ...
}

I'm not familiar with MPI, but you seem to be allocating the wrong amount of space here:

threads = (pthread_t *)malloc(nthreads * sizeof(threads));

sizeof(threads) is the size of a pointer, not of a pthread_t, so the buffer is too small, and writing nthreads thread handles into it runs past the end of the allocation. It is very likely that your code segfaults because of this.
If threads is a pthread_t *, you want to allocate nthreads * sizeof(*threads) - enough space to hold nthreads instances of pthread_t.
Also, in C you shouldn't cast the result of malloc; since this question is C++, the cast is actually required there, though new[] or std::vector would be more idiomatic.
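As a minimal sketch, the corrected allocations might look like this (threads, fd, and struct_t are assumed to be declared as in the question's elided code):

threads = (pthread_t *)malloc(nthreads * sizeof(*threads)); // room for nthreads pthread_t objects
fd = (struct_t *)malloc(nthreads * sizeof(*fd));            // room for nthreads struct_t objects
if (threads == NULL || fd == NULL) {
    // allocation failed; bail out before spawning any threads
}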

Related

How to create a new datatype in MPI and have it take effect in every scope?

I define a new MPI data type in the main function in my code, but it seems that it can't be used in other functions.
typedef struct {
    int row;
    int col;
    double val;
} unit;

void sendTest() {
    unit val;
    val.row = val.col = val.val = 1;
    MPI_Send(&val, 1, valUnit, 1, 0, MPI_COMM_WORLD);
}

void recvTest() {
    unit val;
    MPI_Recv(&val, 1, valUnit, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

int main(int argc, char* argv[]) {
    int comm_sz, my_rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    int blockcount[3] = {1, 1, 1};
    MPI_Aint offsets[3] = {offsetof(unit, row), offsetof(unit, col), offsetof(unit, val)};
    MPI_Datatype dataType[3] = {MPI_INT, MPI_INT, MPI_DOUBLE};
    MPI_Datatype valUnit;
    MPI_Type_create_struct(3, blockcount, offsets, dataType, &valUnit);
    MPI_Type_commit(&valUnit);

    if (my_rank == 0)
        sendTest();
    else
        recvTest();

    MPI_Finalize();
    return 0;
}
When I compile the program, I get an error:

error: ‘valUnit’ was not declared in this scope

How can I define the new MPI data type once so that it can be used in every scope?
Simply declare valUnit as a global variable (e.g. right after the typedef ... declaration), and build and commit it in main() before any rank uses it.
Also note that send() and recv() are glibc functions, so avoid reusing those names for your own subroutines, otherwise you might experience some really weird side effects.
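A minimal sketch of that fix, assuming the rest of the program stays as in the question: the handle moves to file scope and is committed in main() before either rank calls sendTest() or recvTest().

#include <stddef.h>
#include <mpi.h>

typedef struct {
    int row;
    int col;
    double val;
} unit;

// File scope: sendTest() and recvTest() can now see the handle.
MPI_Datatype valUnit;

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int blockcount[3] = {1, 1, 1};
    MPI_Aint offsets[3] = {offsetof(unit, row), offsetof(unit, col), offsetof(unit, val)};
    MPI_Datatype dataType[3] = {MPI_INT, MPI_INT, MPI_DOUBLE};
    MPI_Type_create_struct(3, blockcount, offsets, dataType, &valUnit);
    MPI_Type_commit(&valUnit);   // commit once, use anywhere afterwards

    // ... rank logic from the question (sendTest() / recvTest()) ...

    MPI_Type_free(&valUnit);     // release the committed type before finalizing
    MPI_Finalize();
    return 0;
}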

MPI - mpirun noticed that process... exited on signal 6

int proc_cnt, rank;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &proc_cnt);

if (rank == 0) {
    std::vector<int> segment_ids = read_segment_ids(argv[kParDataIx]);
    std::map<int, ParameterSet> computed_par_sets;
    int buf_send[kBufMsToSlSize];
    double buf_recv[kBufSlToMsSize];
    MPI_Status status;
    int curr_segment_ix = 0;
    int recv_par_sets = 0;

    // inits workers
    for (int i = 1; i < proc_cnt; i++) {
        buf_send[0] = segment_ids[curr_segment_ix++];
        MPI_Send(
            buf_send, kBufMsToSlSize * sizeof (int), MPI_INT,
            i, 0, MPI_COMM_WORLD);
    }

    // sends slaves what to do and receives answers
    while (recv_par_sets < segment_ids.size()) {
        // receives answer
        MPI_Recv(buf_recv, kBufSlToMsSize * sizeof (double), MPI_DOUBLE,
                 MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        recv_par_sets++;

        if (curr_segment_ix < segment_ids.size()) {
            // there are still segments to process
            buf_send[0] = segment_ids[curr_segment_ix++];
        } else {
            // there is no segment to process, sends the slave a termination value
            buf_send[0] = -1;
        }

        // sends back to the source which segment to process next
        MPI_Send(
            buf_send, kBufMsToSlSize * sizeof (int), MPI_INT,
            status.MPI_SOURCE, 0, MPI_COMM_WORLD);

        std::pair<int, ParameterSet> computed_seg_par_set = convert_array_to_seg_par_set(buf_recv);
        computed_par_sets.insert(computed_seg_par_set);
    }

    print_parameter_sets(computed_par_sets);
    std::cout << "[Master] was terminated" << std::endl;
} else {
    int bufToSl[kBufMsToSlSize];
    double bufToMs[kBufSlToMsSize];
    Bounds bounds = read_bounds_file(argv[kParBoundsIx]);
    Config config = read_config_file(kConfigFileName);

    while (true) {
        MPI_Recv(bufToSl, kBufMsToSlSize * sizeof (int), MPI_INT, 0,
                 MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUSES_IGNORE);
        int segment_id = bufToSl[0];
        if (segment_id == -1) {
            // termination value was found
            break;
        }
        Segment segment = read_segment(argv[kParDataIx], segment_id);
        std::map<int, Segment> segment_map;
        segment_map.insert(std::pair<int, Segment>(segment.GetId(), segment));
        SimplexComputer simplex_computer(segment_map, bounds, config);
        ParameterSet par_set = simplex_computer.ComputeSegment(&segment);
        convert_seg_par_set_to_array(segment_id, par_set, bufToMs);
        MPI_Send(
            bufToMs, kBufSlToMsSize * sizeof (double), MPI_DOUBLE,
            0, 0, MPI_COMM_WORLD);
    }
    std::cout << "[SLAVE] " << rank << " was terminated" << std::endl;
}
MPI_Finalize();
I just don't get it. When I run this with mpirun and a process count of 5, all processes finish and the control outputs saying that the master or a slave was terminated are printed, but at the end there is this statement:

mpirun noticed that process rank 0 with PID 1534 on node Jan-MacBook exited on signal 6 (Abort trap: 6).

What am I doing wrong? Thank you guys in advance.
According to both the Send and Recv definitions, the second parameter, count, is the number of elements you're sending or receiving, not a size in bytes. The datatype of those elements is specified as the third parameter of both calls:

int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

and the same goes for Recv:

int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

You can find the definitions here: Send and Recv.
Hope it helps.
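A sketch of the corrected master-side calls, keeping the question's buffer names (the worker's MPI_Recv and MPI_Send change the same way):

// count is kBufMsToSlSize elements of type MPI_INT, not a byte count
MPI_Send(buf_send, kBufMsToSlSize, MPI_INT, i, 0, MPI_COMM_WORLD);

// likewise, kBufSlToMsSize doubles on the receiving side
MPI_Recv(buf_recv, kBufSlToMsSize, MPI_DOUBLE,
         MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status);

With the sizeof multipliers in place, every call reads or writes several times past the ends of the stack buffers, which corrupts the stack and is the likely cause of the signal 6 abort.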

MPI code hangs at finalize with more than 2 nodes/processes

The following code is supposed to send a series of messages to each node and report the time each exchange takes. At the moment it exits fine with two processes, but if I run it with more than 2 processes it hangs on the last exchange.
I've put print statements in previous versions to check where it hangs; I am 90% sure that it is the MPI_Finalize statement, but I can't quite figure out why. Any ideas?
#include <stdio.h>
#include "/usr/include/mpich2/mpi.h"

#define ping 101
#define pong 101

float buffer[100000];

int main(int argc, char *argv[]) {
    int error, rank, size;        // mpi holders
    int i, j, k;                  // loops
    extern float buffer[100000];  // message buffer
    int length;                   // loop again
    double start, final, time;
    extern float buffer[100000];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank != 0) {
        MPI_Status status;
        for (i = 1; i < size; i++) {
            for (length = 1; length <= 30000; length += 1000) {
                for (j = 0; j < 100; j++) {
                    MPI_Recv(buffer, length, MPI_FLOAT, 0, ping, MPI_COMM_WORLD, &status);
                    MPI_Send(buffer, length, MPI_FLOAT, 0, pong, MPI_COMM_WORLD);
                }
            }
        }
    }

    if (rank == 0) {
        MPI_Status status;
        for (i = 1; i < size; i++) {
            for (length = 1; length <= 30000; length += 1000) {
                start = MPI_Wtime();
                for (j = 0; j < 100; j++) {
                    MPI_Send(buffer, length, MPI_FLOAT, i, ping, MPI_COMM_WORLD);
                    MPI_Recv(buffer, length, MPI_FLOAT, MPI_ANY_SOURCE, pong, MPI_COMM_WORLD, &status);
                }
                final = MPI_Wtime();
                time = final - start;
                printf("%s\t%d\t%f\n", "Node", i, time);
            }
        }
    }

    MPI_Finalize();
    return 0;
}
You have an extra loop in the non-zero ranks:

if (rank != 0) {
    MPI_Status status;
    for (i = 1; i < size; i++) {    <-----------
        for (length = 1; length <= 30000; length += 1000) {
            for (j = 0; j < 100; j++) {
                MPI_Recv(buffer, length, MPI_FLOAT, 0, ping, MPI_COMM_WORLD, &status);
                MPI_Send(buffer, length, MPI_FLOAT, 0, pong, MPI_COMM_WORLD);
            }
        }
    }                               <-----------
}

With 2 ranks that loop executes a single iteration, but with more than two ranks it executes size-1 iterations. Since rank 0 exchanges messages with each rank only once, you have to remove that loop.
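A sketch of the corrected worker branch, with the extra loop removed (everything else as in the question):

if (rank != 0) {
    MPI_Status status;
    // No outer for (i = 1; i < size; i++) wrapper: rank 0 exchanges
    // with this rank exactly once, so the worker must match that.
    for (length = 1; length <= 30000; length += 1000) {
        for (j = 0; j < 100; j++) {
            MPI_Recv(buffer, length, MPI_FLOAT, 0, ping, MPI_COMM_WORLD, &status);
            MPI_Send(buffer, length, MPI_FLOAT, 0, pong, MPI_COMM_WORLD);
        }
    }
}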

MPI_Sendrecv_replace() deadlock issue

I'm doing my homework with the following assignment:
Every process takes a double as input. Using the function
MPI_Sendrecv_replace(), swap all doubles with the process of opposite
rank (first & last, second & second to last, ...). In every process, output the received number.
So here is the code that I wrote.
#include "mpi.h"
#include <stdio.h>
#include "pt4.h"
int main(int argc, char *argv[])
{
MPI_Init(&argc,&argv);
int flag;
MPI_Initialized(&flag);
if (flag == 0)
return;
int rank, size;
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
double n;
pt >> n; // pt is a stream provided by side library (works perfectly fine)
int oppositeRank = (size - 1) - rank;
if (rank != oppositeRank)
{
MPI_Status status;
MPI_Sendrecv_replace(&n, 1, MPI_DOUBLE, oppositeRank, 0,
rank, 0, MPI_COMM_WORLD, &status);
}
pt << n;
MPI_Finalize();
return 0;
}
Although this code compiles without any problems, it never stops. So the question is: why? What am I doing wrong?
Replace this:

MPI_Sendrecv_replace(&n, 1, MPI_DOUBLE, oppositeRank, 0,
                     rank, 0, MPI_COMM_WORLD, &status);

with this:

MPI_Sendrecv_replace(&n, 1, MPI_DOUBLE, oppositeRank, 0,
                     oppositeRank, 0, MPI_COMM_WORLD, &status);
You may find this documentation page useful.
This function sends the buffer to one process (dest, the 4th argument) and receives into the same buffer from another (source, the 6th argument). To do a swap, you send to the other rank and receive from that same rank. In your case you were sending to the opposite rank but receiving from yourself; since no process ever sends to itself here, that message would never come, hence the deadlock.
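For completeness, a minimal self-contained sketch of the corrected swap; since pt4.h is not generally available, this version derives the value from the rank instead of reading it from the pt stream, which is purely an assumption of the example:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double n = (double)rank;  // stand-in for the value read from the pt stream
    int oppositeRank = (size - 1) - rank;

    if (rank != oppositeRank)
    {
        MPI_Status status;
        // Send to the opposite rank AND receive from that same rank.
        MPI_Sendrecv_replace(&n, 1, MPI_DOUBLE, oppositeRank, 0,
                             oppositeRank, 0, MPI_COMM_WORLD, &status);
    }
    printf("rank %d now holds %f\n", rank, n);

    MPI_Finalize();
    return 0;
}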

Errors while compiling MPI

I am trying to compile C++ code based on the code from: https://stackoverflow.com/questions/5953979/sending-and-receiving-array-using-mpi-part-2.
I use the following command to compile: mpiicpc -o <filename> xxxx.cc -lmpi
After I compile, all my errors seem to refer to the two functions I have defined in my source code to print output values and to do the MPI_Isend and MPI_Irecv. Specifically, I get two types of errors:
Error: Identifier "variable" is undefined
Error: Too few arguments in function call: MPI_Isend/MPI_Irecv and MPI_Waitall()
Finally, it exits with this message: compilation aborted for xxxx.cc (code 2).
Could you please point to what I must be doing wrong when defining the variables?
Here is an excerpt of my source code (the code in its entirety is available at https://stackoverflow.com/questions/5953979/sending-and-receiving-array-using-mpi-part-2):
int main (int argc, char *argv[])
{
    int my_rank;
    int p;
    int source;
    int dest;
    int tag = 0;

    // Allocating memory
    double *A = new double[Rows*sizeof(double)];
    double *B = new double[Rows*sizeof(double)];
    ....
    ....
    ....

    // MPI commands
    MPI_Status status;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &p);

    // For number of beats
    for (ibeat = 0; ibeat < beats; ibeat++)
    {
        for (i = 0; i < Cols/2; i++)
        {
            for (y = 0; y < Rows/2; y++)
            {
                if (my_rank == 0)
                    if (i < 48)
                        if (y < 48)
                            V[i][y] = 0;
                if (my_rank ==
                .....
                ....
                ....
            }
        }

        // Load the array with the edge values
        for (r = 0; r < Rows/2; y++)
        {
            if ((my_rank == 0) || (my_rank == 1))
            {
                A[r] = V[r][48];
                BB[r] = V[r][48];
            }
            if ((my_rank
            ...
            ...
        }

        prttofile();
        outputpass();
        ibeat = ibeat + 1;
    }
    MPI_Finalize();
}

void prttofile ()
{
    for (i = 0; i < Cols/2; i++)
    {
        for (y = 0; y < Rows/2; y++)
        {
            if (my_rank == 0)
                fout << V[i][y] << " ";
            ....
            ....
        }
    }
    if (my_rank == 0)
        fout << endl;
    ....
}

void outputpass ()
{
    int test = 2;
    if ((my_rank % test) == 0)
    {
        MPI_Isend(C, Rows, MPI_DOUBLE, my_rank+1, MPI_COMM_WORLD);           // non-blocking send
        MPI_Irecv(CC, Rows, MPI_DOUBLE, my_rank+1, MPI_COMM_WORLD, &status); // non-blocking recv
    }
    else if ((my_rank % test) == 1)
    ....
    ....
    MPI_Waitall ();
}
You're not declaring a lot of variables - in particular the loop counters. Declare them all at the top of your functions and you'll be fine.
According to the documentation, the signature of MPI_Isend() is:
int MPI_Isend(void *buf, int count, MPI_Datatype datatype, int dest,
              int tag, MPI_Comm comm, MPI_Request *request)
It has seven parameters - you are passing only five arguments. You'll need to correct that. The same goes for MPI_Irecv().
MPI_Isend() requires a lot more arguments than what you've supplied. Here's your line:
MPI_Isend(C, Rows, MPI_DOUBLE, my_rank+1, MPI_COMM_WORLD);
Where's the tag? Where's the request?
Similarly, your MPI_Waitall() doesn't have any arguments at all! You need the array of requests, the number of requests, and an array of statuses.
I suggest you read an example of non-blocking communication in MPI.
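To make the calling conventions concrete, here is a minimal, self-contained sketch of a correct non-blocking exchange between pairs of ranks. The buffer names C and CC echo the question; everything else (sizes, tag, even/odd pairing) is illustrative, and it assumes the program is run with an even number of ranks:

#include <mpi.h>

#define Rows 100

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int my_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    double C[Rows] = {0};  // outgoing data
    double CC[Rows];       // incoming data
    int tag = 0;
    int partner = (my_rank % 2 == 0) ? my_rank + 1 : my_rank - 1;

    MPI_Request requests[2];
    MPI_Status statuses[2];

    // Seven arguments each: buffer, count, datatype, dest/source, tag,
    // communicator, and the request handle that MPI_Waitall needs later.
    MPI_Isend(C, Rows, MPI_DOUBLE, partner, tag, MPI_COMM_WORLD, &requests[0]);
    MPI_Irecv(CC, Rows, MPI_DOUBLE, partner, tag, MPI_COMM_WORLD, &requests[1]);

    // MPI_Waitall takes the number of requests, the request array,
    // and an array of statuses (or MPI_STATUSES_IGNORE).
    MPI_Waitall(2, requests, statuses);

    MPI_Finalize();
    return 0;
}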