I've got problems receiving an array with MPI. I'm doing something like this:
int *b = new int[5];
for(int i = 0; i < 5; i++) {
    b[i] = i;
}
MPI_Send(&b[0], 5, MPI_INT, procesDocelowy, 0, MPI_COMM_WORLD);
This is how I send my array. I receive it like this:
int *b = new int[5];
MPI_Recv(&b, 5, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
My problem is that I can't receive arrays which were allocated dynamically. My process hangs just after MPI_Recv and I get:
job aborted:
rank: node: exit code: message
0: Majster: terminated
1: Majster: terminated
2: Majster: 0xc0000005: process exited without calling finalize
3: Majster: terminated
It's quite interesting, because if I declare my array statically, I mean
int b[5]; when receiving and
int b[] = {1,2,3,4,5}; while sending
everything works fine.
I can't declare the arrays statically; I have to allocate them dynamically. Any ideas how to resolve this problem?
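It's because you pass &b to MPI_Recv(). When b is a pointer to dynamically allocated memory, &b is the address of the pointer variable itself, not the address of the array it points to, so MPI_Recv() writes the incoming data over the pointer and whatever sits next to it. Pass b, or equivalently &b[0], as you already do in MPI_Send(). With a static array such as int b[5], &b and b refer to the same address, which is why that version works.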
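The difference between the two cases can be checked without MPI at all. The following is only a minimal sketch in plain C; the helper names (same_address, array_amp_matches_data, pointer_amp_matches_data) are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the two addresses are identical, 0 otherwise. */
static int same_address(const void *p, const void *q) {
    return (uintptr_t)p == (uintptr_t)q;
}

/* Static array: &arr and &arr[0] are the same address,
   so passing &arr as a buffer happens to work. */
int array_amp_matches_data(void) {
    int arr[5] = {0};
    return same_address(&arr, &arr[0]);
}

/* Heap pointer: &ptr is where the pointer variable lives,
   while ptr (== &ptr[0]) is where the data lives. */
int pointer_amp_matches_data(void) {
    int *ptr = malloc(5 * sizeof *ptr);
    assert(ptr != NULL);
    int same = same_address(&ptr, &ptr[0]);
    free(ptr);
    return same;
}
```

Calling array_amp_matches_data() should give 1 and pointer_amp_matches_data() should give 0, which is why only the dynamically allocated version breaks when &b is passed as the buffer.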
I am currently trying to fake shared memory using the MPI library in C++. I have an array A of size n+1, where n is given by the user, and processor 0 generates the integers for that array. I need to share the array that processor 0 created with all the other processes, so I Bcast it to the others. However, when I have each processor print out its array, I get a signal 11 (segmentation fault) from a processor that isn't zero. If I comment that section out, it runs with no problems. I would like to be able to see that my array was sent and stored correctly on all of the processors.
int *A=new int[n+1];
if (my_rank == 0) {
    for(int i=1; i<=n; i++){
        // ... rank 0 fills A[i] here (the loop body is not shown in the post) ...
    }
    MPI_Bcast(&A, n+1, MPI_INT, 0, MPI_COMM_WORLD);
}
else {
    MPI_Bcast(&A, n+1, MPI_INT, 0, MPI_COMM_WORLD);
}
cout<<"My rank is "<<my_rank<<" and this is my array:"<<endl;
for (int i=0; i<=n; i++)
{cout<<A[i]<<" "<<endl;}
You are incorrectly passing &A as the buffer address to MPI_Bcast. That is the address of the pointer; MPI needs the address of the data, i.e. A.
MPI_Bcast(A, n+1, MPI_INT, 0, MPI_COMM_WORLD);
Move that code outside of the if/else block. It is the same call for all ranks.
I am learning MPI, and trying to create examples of some of the functions. I've gotten several to work, but I am having issues with MPI_Gather. I had a much more complex fitting test, but I trimmed it down to the most simple code. I am still, however, getting the following error:
root#master:/home/sgeadmin# mpirun ./expfitTest5
Assertion failed in file src/mpid/ch3/src/ch3u_request.c at line 584: FALSE
memcpy argument memory ranges overlap, dst_=0x1187e30 src_=0x1187e40 len_=400
internal ABORT - process 0
I am running one master instance and two node instances through AWS EC2. I have all the appropriate libraries installed, as I've gotten other MPI examples to work. My program is:
int main()
int world_size, world_rank;
int nFits = 100;
double arrCount[100];
double *rBuf = NULL;
MPI_Comm_size(MPI_COMM_WORLD, &world_size);
MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
int nElements = nFits/(world_size-1);
for(int k = 0; k < nElements; k++)
arrCount[k] = k;
rBuf = (double*) malloc( nFits*sizeof(double));
MPI_Gather(arrCount, nElements, MPI_DOUBLE, rBuf, nElements, MPI_DOUBLE, 0, MPI_COMM_WORLD);
for(int i = 0; i < nFits; i++)
Is there something I am not understanding in malloc or MPI_Gather? I've compared my code to other samples, and can't find any differences.
The root process in a gather operation does participate in the operation, i.e. it sends data to its own receive buffer. That also means you must allocate memory for its part in the receive buffer. The receive buffer has to hold world_size*nElements elements, and with nElements = nFits/(world_size-1) that is more than the nFits doubles you allocate.
Now you could use MPI_Gatherv and specify a recvcounts[0]/sendcount of 0 at the root to follow your example closely. But usually you would prefer to write an MPI application in such a way that the root participates equally in the operation, i.e. int nElements = nFits/world_size.
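The buffer arithmetic can be sketched in plain C without MPI. The helper names gather_recv_elems and gather_buffer_fits are hypothetical, and the three-rank figure is an assumption based on the one master plus two nodes mentioned in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Elements the root's receive buffer must hold for MPI_Gather:
   every rank, the root included, contributes per_rank elements. */
size_t gather_recv_elems(int world_size, int per_rank) {
    assert(world_size > 0 && per_rank >= 0);
    return (size_t)world_size * (size_t)per_rank;
}

/* Returns 1 if a buffer of `allocated` elements is large enough. */
int gather_buffer_fits(size_t allocated, int world_size, int per_rank) {
    return allocated >= gather_recv_elems(world_size, per_rank);
}
```

With three ranks, nFits/(world_size-1) is 50, so gather_recv_elems(3, 50) is 150 and a 100-element rBuf does not fit. With nFits/world_size the required size can never exceed nFits, although integer division may leave a few elements unassigned when nFits is not a multiple of world_size.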
I'm currently working on a C program using MPI, and I've run into a roadblock regarding the MPI_Send() and MPI_Recv() functions, that I hope you all can help me out with. My goal is to send (with MPI_Send()), and receive (with MPI_Recv()), the address of "a[0][0]" (Defined Below), and then display the CONTENTS of that address after I've received it from MPI_Recv(), in order to confirm my send and receive is working. I've outlined my problem below:
I have a 2-d array, "a", that works like this:
a[0][0] Contains my target ADDRESS
*a[0][0] Contains my target VALUE
i.e. printf("a[0][0] Value = %3.2f, a[0][0] Address = %p\n", *a[0][0], a[0][0]);
So, I run my program and memory is allocated for a. Debug confirms that a[0][0] contains the address 0x83d6260, and the value stored at address 0x83d6260, is 0.58. In other words, "a[0][0] = 0x83d6260", and "*a[0][0] = 0.58".
So, I pass the address, "a[0][0]", as the first parameter of MPI_Send():
-> MPI_Send(a[0][0], 1, MPI_FLOAT, i, 0, MPI_COMM_WORLD);
// I put 1 as the second parameter because I only want to receive this one address
MPI_Send() executes and returns 0, which is MPI_SUCCESS, which means that it succeeded, and my Debug confirms that "0x83d6260" is the address passed.
However, when I attempt to receive the address by using MPI_Recv(), I get a segmentation fault:
MPI_Recv(a[0][0], 1, MPI_FLOAT, iNumProcs-1, 0, MPI_COMM_WORLD, &status);
The address 0x83d6260 was sent successfully using MPI_Send(), but I can't receive the same address with MPI_Recv(). My question is: why does MPI_Recv() cause a segmentation fault? I simply want to print the value contained in a[0][0] immediately after the MPI_Recv() call, but the program crashes.
MPI_Send(a[0][0], 1, MPI_FLOAT, ...) sends sizeof(float) bytes of memory starting at the address a[0][0].
So basically the value sent is *(reinterpret_cast<float*>(a[0][0])).
Therefore, if a[0][0] is 0x83d6260 and *a[0][0] is 0.58f, then MPI_Recv(&buff, 1, MPI_FLOAT, ...) will set buff (a float that the receiver must provide itself) to 0.58.
One important thing is that different MPI processes should NEVER share pointers (even if they run on the same node). They do not share a virtual address space, and even if you were able to access the address on one rank, the others would most likely segfault if they tried to access the same address in their own context.
This code works for me:
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"
int main(int argc, char* argv[])
{
    int size, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    switch (rank)
    {
        case 0:
        {
            /* a[0][0] holds an address; *a[0][0] holds the value */
            float*** a;
            a = malloc(sizeof(float**));
            a[0] = malloc(sizeof(float* ));
            a[0][0] = malloc(sizeof(float ));
            *a[0][0] = 0.58;
            MPI_Send(a[0][0], 1, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
            printf("rank 0 send done\n");
            free(a[0][0]);
            free(a[0] );
            free(a );
            break;
        }
        case 1:
        {
            /* the receiver supplies its own storage for the value */
            float buffer;
            MPI_Recv(&buffer, 1, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 recv done : %f\n", buffer);
            break;
        }
    }
    MPI_Finalize();
    return 0;
}
The results are:
mpicc mpi.c && mpirun -n 2 ./a.out
> rank 0 send done
> rank 1 recv done : 0.580000
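The point that the pointee, not the pointer, travels in the message can also be illustrated without MPI. This is only a sketch: copy_one_float is a hypothetical stand-in that mimics what a one-element MPI_FLOAT transfer does with its buffer argument, and it is not an MPI call.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a one-element float message: like a send with
   count 1 and MPI_FLOAT, it copies the bytes that buf points at. */
void copy_one_float(const void *buf, void *wire) {
    memcpy(wire, buf, sizeof(float));
}

/* "Sends" through a float*** shaped like the one in the question and
   returns what a receiver with its own float buffer would see. */
float roundtrip_value(void) {
    float value = 0.58f;
    float *p = &value;     /* plays the role of a[0][0] */
    float **row = &p;      /* plays the role of a[0] */
    float ***a = &row;     /* plays the role of a */
    float received = 0.0f; /* the receiver's own storage */
    copy_one_float(a[0][0], &received);
    return received;
}
```

roundtrip_value() returns the stored value, 0.58f; the address held in a[0][0] never leaves the sending side.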
I think the problem is that you're trying to put the value into the array of pointers (which is probably causing the segfault). Try making a new buffer to receive the value:
MPI_Send(a[0][0], 1, MPI_FLOAT, i, 0, MPI_COMM_WORLD);
float buff;
MPI_Recv(&buff, 1, MPI_FLOAT, iNumProcs-1, 0, MPI_COMM_WORLD, &status);
If I remember correctly the MPI_Send/Recv will dereference the pointer giving you the value, not the address.
You also haven't given us enough information to tell if your source/destination values are correct.
Any idea why the following would give me a segfault?
buf_int = new int[12];
buf_int[0] = stx1.min;
buf_int[1] = stx1.max;
buf_int[2] = stx2.min;
buf_int[3] = stx2.max;
buf_int[4] = sty1.min;
buf_int[6] = sty2.max;
MPI_Bcast(&buf_int, 12, MPI_INT, 0, MPI_COMM_WORLD);
stx1.min = buf_int[0];
If I comment out the final line, I do not get a segfault, but if I leave it in, I get an error which turns out to be a segmentation fault. If the error cannot be deduced from the code given, I can include more.
buf_int is declared as
int* buf_int;
Since the signature of MPI_Bcast is this:
int MPI_Bcast(
void *buffer,
int count,
MPI_Datatype datatype,
int root,
MPI_Comm comm
);
as taken from its documentation, you should call the function as:
MPI_Bcast(buf_int, 12, MPI_INT, 0, MPI_COMM_WORLD);
That is, pass buf_int as the first argument, instead of &buf_int.
You can see the example-code by scrolling down the page, and compare the usage.
The issue I am trying to resolve is the following:
The C++ serial code I have computes across a large 2D matrix. To optimize this process, I wish to split this large 2D matrix and run on 4 nodes (say) using MPI. The only communication that occurs between nodes is the sharing of edge values at the end of each time step. Every node shares the edge array data, A[i][j], with its neighbor.
Based on reading about MPI, I have the following scheme to be implemented.
if (myrank == 0)
for (i= 0 to x)
for (y= 0 to y)
MPI_SEND(A[x][0], A[x][1], A[x][2], Destination= 1.....)
MPI_RECEIVE(B[0][0], B[0][1]......Sender = 1.....)
if (myrank == 1)
for (i = x+1 to xx)
for (y = 0 to y)
MPI_SEND(B[x][0], B[x][1], B[x][2], Destination= 0.....)
MPI_RECEIVE(A[0][0], A[0][1]......Sender = 1.....)
I wanted to know if my approach is correct, and I would also appreciate any guidance on other MPI functions to look into for the implementation.
Just to amplify Joel's points a bit:
This goes much easier if you allocate your arrays so that they're contiguous (something C's "multidimensional arrays" don't give you automatically):
int **alloc_2d_int(int rows, int cols) {
    /* one block for all the data, plus one pointer per row into that block */
    int *data = (int *)malloc(rows*cols*sizeof(int));
    int **array= (int **)malloc(rows*sizeof(int*));
    for (int i=0; i<rows; i++)
        array[i] = &(data[cols*i]);
    return array;
}
int **A;
A = alloc_2d_int(N,M);
Then, you can do sends and receives of the entire NxM array with
MPI_Send(&(A[0][0]), N*M, MPI_INT, destination, tag, MPI_COMM_WORLD);
and when you're done, free the memory with
free(A[0]);
free(A);
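For reference, the whole allocate/check/free cycle can be sketched without MPI as follows. The allocator follows the one in this answer, with error handling added; free_2d_int and rows_are_contiguous are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate rows*cols ints as one block plus a table of row pointers. */
int **alloc_2d_int(int rows, int cols) {
    assert(rows > 0 && cols > 0);
    int *data = malloc((size_t)rows * (size_t)cols * sizeof *data);
    int **array = malloc((size_t)rows * sizeof *array);
    if (data == NULL || array == NULL) {
        free(data);
        free(array);
        return NULL;
    }
    for (int i = 0; i < rows; i++)
        array[i] = &data[(size_t)cols * i];
    return array;
}

/* Release both allocations; array[0] is the start of the data block. */
void free_2d_int(int **array) {
    if (array != NULL) {
        free(array[0]);
        free(array);
    }
}

/* Returns 1 if every row begins exactly where the previous one ends. */
int rows_are_contiguous(int **array, int rows, int cols) {
    for (int i = 0; i + 1 < rows; i++)
        if (array[i] + cols != array[i + 1])
            return 0;
    return 1;
}
```

Because the rows sit back to back, &(A[0][0]) with a count of N*M describes the entire array, which is what lets a single send or receive cover it.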
Also, MPI_Recv is a blocking receive, and MPI_Send can be a blocking send. One thing that means, as per Joel's point, is that you definitely don't need barriers. Further, it means that if you have a send/receive pattern as above, you can get yourself into a deadlock situation: everyone is sending, no one is receiving. Safer is:
if (myrank == 0) {
    MPI_Send(&(A[0][0]), N*M, MPI_INT, 1, tagA, MPI_COMM_WORLD);
    MPI_Recv(&(B[0][0]), N*M, MPI_INT, 1, tagB, MPI_COMM_WORLD, &status);
} else if (myrank == 1) {
    MPI_Recv(&(A[0][0]), N*M, MPI_INT, 0, tagA, MPI_COMM_WORLD, &status);
    MPI_Send(&(B[0][0]), N*M, MPI_INT, 0, tagB, MPI_COMM_WORLD);
}
Another, more general, approach is to use MPI_Sendrecv:
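int *sendptr, *recvptr;
int sendtag, recvtag;
int neigh = MPI_PROC_NULL;
/* Each rank's send tag must match its neighbour's receive tag. */
if (myrank == 0) {
    sendptr = &(A[0][0]);
    recvptr = &(B[0][0]);
    sendtag = tagA;
    recvtag = tagB;
    neigh = 1;
} else {
    sendptr = &(B[0][0]);
    recvptr = &(A[0][0]);
    sendtag = tagB;
    recvtag = tagA;
    neigh = 0;
}
MPI_Sendrecv(sendptr, N*M, MPI_INT, neigh, sendtag, recvptr, N*M, MPI_INT, neigh, recvtag, MPI_COMM_WORLD, &status);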
or nonblocking sends and/or receives.
First, you don't need that many barriers.
Second, you should really send your data as a single block, as multiple blocking sends and receives will result in poor performance.
This question has already been answered quite thoroughly by Jonathan Dursi; however, as Jonathan Leffler has pointed out in his comment to Jonathan Dursi's answer, C's multi-dimensional arrays are a contiguous block of memory. Therefore, I would like to point out that for a not-too-large 2d array, a 2d array could simply be created on the stack:
int A[N][M];
Since the memory is contiguous, the array can be sent as it is:
On the receiving side, the array can be received into a 1d array of size N*M (which can then be copied into a 2d array if necessary):
int A_1d[N*M];
MPI_Recv(A_1d, N*M, MPI_INT,0,tagA, MPI_COMM_WORLD,&status);
// copying the array to a 2d array
int A_2d[N][M];
for (int i = 0; i < N; i++){
    for (int j = 0; j < M; j++){
        A_2d[i][j] = A_1d[(i*M)+j];
    }
}
Copying the array does cause twice the memory to be used, so it would be better to simply use A_1d by accessing its elements through A_1d[(i*M)+j].
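The index mapping can be wrapped in a small helper. This is a plain C sketch without MPI; idx2d and flat_view_matches are hypothetical names, and the memcpy merely stands in for the bytes that would arrive in the 1d receive buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Row-major offset of element (i, j) in an array with `cols` columns. */
size_t idx2d(size_t i, size_t j, size_t cols) {
    assert(j < cols);
    return i * cols + j;
}

/* Fills a small stack 2d array, copies its bytes into a 1d array, and
   returns 1 if flat[idx2d(i, j, M)] equals A[i][j] everywhere. */
int flat_view_matches(void) {
    enum { N = 3, M = 4 };
    int A[N][M];
    int flat[N * M];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            A[i][j] = 10 * i + j;
    memcpy(flat, A, sizeof A); /* stand-in for the received data */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            if (flat[idx2d((size_t)i, (size_t)j, M)] != A[i][j])
                return 0;
    return 1;
}
```

Using A_1d[idx2d(i, j, M)] in place of A_2d[i][j] avoids the second copy while keeping the indexing readable.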