What I am trying to achieve in this simplified code is:
2 types of processes (root and children; ranks 10 and 0-9 respectively)
init:
  root listens for each child's "completed" message
  children listen for the root's notification that everyone has completed
while there is no winner (not all done yet):
  each child has a 20% chance of being done (and notifies the root that it is done)
  root checks whether all children are done
  if all are done: it sends a "winner" notification to the children
I have code like:
int numprocs, id, arr[10], winner = -1;
bool stop = false;
MPI_Request reqs[10], winnerNotification;

MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &id);

for (int half = 0; half < 1; half++) {
    for (int round = 0; round < 1; round++) {
        if (id == 10) { // root
            // keeps track of who has "completed"
            fill_n(arr, 10, -1);
            for (int i = 0; i < 10; i++) {
                MPI_Irecv(&arr[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
            }
        } else if (id < 10) { // children
            // listen for the root's winner notification/indication to stop
            MPI_Irecv(&winner, 1, MPI_INT, 10, 1, MPI_COMM_WORLD, &winnerNotification);
        }

        while (winner == -1) {
            //cout << id << " is in loop" << endl;

            if (id < 10 && !stop && ((rand() % 10) + 1) < 3) {
                // each child has a 20% chance to stop (finish work)
                MPI_Send(&id, 1, MPI_INT, 10, 0, MPI_COMM_WORLD);
                cout << id << " sending to root" << endl;
                stop = true;
            } else if (id == 10) {
                // root checks the number of children completed
                int numDone = 0;
                for (int i = 0; i < 10; i++) {
                    if (arr[i] >= 0) {
                        //cout << "root knows that " << i << " has completed" << endl;
                        numDone++;
                    }
                }
                cout << "numDone = " << numDone << endl;

                // if all done, send notification to players to stop
                if (numDone == 10) {
                    winner = 1;
                    for (int i = 0; i < 10; i++) {
                        MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
                    }
                    cout << "root sent notification of winner" << endl;
                }
            }
        }
    }
}

MPI_Finalize();
Output from the debugging couts looks like this; the problem seems to be that root is not receiving all the children's notifications that they have completed:
2 sending to root
3 sending to root
0 sending to root
4 sending to root
1 sending to root
8 sending to root
9 sending to root
numDone = 1
numDone = 1
... // many numDone = 1, but why 1 only?
7 sending to root
...
I thought perhaps I can't receive into an array, but I tried:
if (id == 1) {
    int x = 60;
    MPI_Send(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
} else if (id == 0) {
    MPI_Recv(&arr[1], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    cout << id << " received " << arr[1] << endl;
}
Which works.
UPDATE
This seems to be resolved if I add an MPI_Barrier(MPI_COMM_WORLD) before the end of the while loop, but why? Even if the processes run out of sync, eventually the children will send to root that they have completed, and root should "listen" for that and process it accordingly. What seems to be happening is that root keeps running and hogs all the resources, so the children don't get to execute at all? Or what is happening here?
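A nonblocking receive's buffer is only guaranteed to contain the message after the request has been completed through the MPI_Test/MPI_Wait family, and calling into the library also lets MPI make progress. A minimal sketch of the root's loop polling its outstanding requests instead of reading arr[i] directly (same ranks, tags, and variables as the code above; this is an illustration of the idea, not tested code):
while (winner == -1) {
    if (id == 10) { // root
        int allDone = 0;
        // returns allDone != 0 only once all 10 receives posted above have completed
        MPI_Testall(10, reqs, &allDone, MPI_STATUSES_IGNORE);
        if (allDone) {
            winner = 1;
            for (int i = 0; i < 10; i++) {
                MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
            }
        }
    }
    // ... child branch unchanged ...
}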
UPDATE 2: some children not getting notification from root
OK, now that the problem of root not receiving the children's completion notifications is resolved by @MichaelSh's answer, I focus on the children not receiving from the root. Here's code that reproduces that problem:
int numprocs, id, arr[10], winner = -1;
bool stop = false;
MPI_Request reqs[10], winnerNotification;

MPI_Init(NULL, NULL);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &id);
srand(time(NULL) + id);

if (id < 10) {
    MPI_Irecv(&winner, 1, MPI_INT, 10, 0, MPI_COMM_WORLD, &winnerNotification);
}
MPI_Barrier(MPI_COMM_WORLD);

while (winner == -1) {
    cout << id << " is in loop ..." << endl;
    if (id == 10) {
        if (((rand() % 10) + 1) < 2) {
            winner = 2;
            for (int i = 0; i < 10; i++) {
                MPI_Send(&winner, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
            }
            cout << "winner notifications sent" << endl;
        }
    }
}

cout << id << " b4 MPI_Finalize. winner is " << winner << endl;
MPI_Finalize();
Output looks like:
# 1 run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
9 b4 MPI_Finalize. winner is 2
0 b4 MPI_Finalize. winner is 2
# another run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
8 b4 MPI_Finalize. winner is 2
Notice that some processes don't seem to get the notification from the parent? Why is that? Would an MPI_Wait in the child processes just hang them? So how do I resolve this?
Also
All MPI_Barrier does in your case is wait for the child responses to complete. Please check my answer for a better solution.
If I don't do this, I suppose each child's response will just take a few ms? So even if I don't wait/barrier, I'd expect the receive to still happen soon after the send? Unless processes end up hogging resources and the other processes don't get to run at all?
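For the repro above, the likely missing piece is that a child never completes its nonblocking receive: spinning on winner == -1 only inspects the buffer, which MPI does not promise to fill until the request has been completed. A minimal sketch of the child side with MPI_Test added (same variables as the repro; an assumption about the fix, not tested code):
while (winner == -1) {
    if (id < 10) { // child
        int notified = 0;
        // completes the receive if the root's message has arrived;
        // winner is only guaranteed to be set once notified != 0
        MPI_Test(&winnerNotification, &notified, MPI_STATUS_IGNORE);
    } else {
        // ... root branch unchanged ...
    }
}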
Please try this block of code (error checking omitted for simplicity):
...
// root checks number of children completed
int numDone = 0;
MPI_Status statuses[10];
MPI_Waitall(10, reqs, statuses);
for (int i = 0; i < 10; i++) {
...
Edit: A better solution:
Each child initiates the receive of the root's winner notification and sends its own "completed" notification to the root.
The root initiates the winner-notification receives into the array, waits for all of the notifications to be received, and then sends the winner's id to the children.
Insert the code below after for (int round = 0; round < 1; round++):
if (id == 10)
{ // root
    // keeps track of who has "completed"
    memset(arr, -1, sizeof(arr));
    for (int i = 0; i < 10; i++)
    {
        MPI_Irecv(&arr[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
    }
}
else if (id < 10)
{ // children
    // listen for the root's winner notification/indication to stop
    MPI_Irecv(&winner, 1, MPI_INT, 10, 1, MPI_COMM_WORLD, &winnerNotification);
}

if (id < 10)
{
    // simulate work: each draw has a 20% chance to keep the child busy
    while (((rand() % 10) + 1) < 3) ;
    MPI_Send(&id, 1, MPI_INT, 10, 0, MPI_COMM_WORLD);
    std::cout << id << " sending to root" << std::endl;
    // receive winner notification
    MPI_Status status;
    MPI_Wait(&winnerNotification, &status);
    // Process winner notification
}
else if (id == 10)
{
    MPI_Status statuses[10];
    MPI_Waitall(10, reqs, statuses);
    // all done, send notification to players to stop
    {
        winner = 1;
        for (int i = 0; i < 10; i++)
        {
            MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
        }
        std::cout << "root sent notification of winner" << std::endl;
    }
}
Related
I am trying to build an MPI version of the game of life. However, when I send true, I receive false. Since there is no MPI_BOOL, I used MPI_C_BOOL. The code snippet is below.
vector<bool> send_data_top;
vector<bool> recv_data_top(n);
vector<bool> send_data_down;
vector<bool> recv_data_down(n);
request = new MPI_Request[4];

for (int j = 0; j < n; j++) {
    send_data_top.push_back(neigh_grid[n + j + 3]);
    send_data_down.push_back(neigh_grid[last_row_index * (n + 2) + j + 1]);
    cout << "send data: " << send_data_down[j] << endl;
    cout << "send data top: " << send_data_top[j] << endl;
}

MPI_Isend(&send_data_top[0], n, MPI_C_BOOL, id - 1, 0, MPI_COMM_WORLD, &request[0]);
MPI_Isend(&send_data_down[0], n, MPI_C_BOOL, 0, 0, MPI_COMM_WORLD, &request[1]);
MPI_Irecv(&recv_data_top[0], n, MPI_C_BOOL, id - 1, 0, MPI_COMM_WORLD, &request[2]);
MPI_Irecv(&recv_data_down[0], n, MPI_C_BOOL, 0, 0, MPI_COMM_WORLD, &request[3]);
MPI_Waitall(4, request, MPI_STATUSES_IGNORE); // note: arrays of requests take MPI_STATUSES_IGNORE
delete[] request;

// assign received values to neigh_grid
for (int j = 0; j < n; j++) {
    neigh_grid[j + 1] = recv_data_top[j];
    neigh_grid[(last_row_index + 1) * (n + 2) + j + 1] = recv_data_down[j];
    cout << recv_data_top[j] << endl;
    cout << recv_data_down[j] << endl;
}
The output is
send data: 1
send data top: 1
0
0
I have tried changing the type to MPI_CXX_BOOL and MPI_INT, but neither works, and I cannot find a similar situation online. Can anyone figure out what caused this?
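One thing worth ruling out: std::vector<bool> is a bit-packed specialization, so its elements are not stored as contiguous bool objects, and &send_data_top[0] does not point at n bools that MPI could read or write. A minimal sketch of a workaround using a byte-sized element type instead (same names as the snippet above; the surrounding setup is assumed):
// std::vector<char> stores real, contiguous bytes, unlike vector<bool>,
// so its data() pointer can be handed to MPI directly.
std::vector<char> send_buf_top(n), recv_buf_top(n);
for (int j = 0; j < n; j++) {
    send_buf_top[j] = neigh_grid[n + j + 3] ? 1 : 0; // same indexing as above
}
MPI_Isend(send_buf_top.data(), n, MPI_CHAR, id - 1, 0, MPI_COMM_WORLD, &request[0]);
MPI_Irecv(recv_buf_top.data(), n, MPI_CHAR, id - 1, 0, MPI_COMM_WORLD, &request[2]);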
I'm new to MPI and I want to solve a problem where I have two arrays, A and B, with 15 usable elements each, and 16 processes, where each process represents one element of the arrays (I don't use process zero). Array A has its input data at positions 8...15; these positions represent the leaves of a tree. In the first step I compress the array: each leaf sends its number to its parent, and each parent receives from both of its sons, adds the numbers, and sends the sum on to its own father. The compression finishes at process 1, which holds the sum of all the elements of the array. In the second step I do the prefix calculations, starting from process 0 and finishing at the leaves.
To calculate array B, all the other processes need to wait for process 1 to finish its work, and for that I use an MPI_Barrier, but I get an error when I run the code.
int m = 3;
int n = (int)pow(2, m);
int *A = (int*)malloc(2 * n * sizeof(int));
int *B = (int*)malloc(2 * n * sizeof(int));
int id;
MPI_Status status;

A[8] = 4;  A[9] = 8;  A[10] = 5; A[11] = 2;
A[12] = 10; A[13] = 6; A[14] = 9; A[15] = 11;

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &id);

if (id == 1)
{
    int nr;
    int suma = 0;
    MPI_Recv(&nr, 1, MPI_INT, 2 * id, 99, MPI_COMM_WORLD, &status);
    suma += nr;
    MPI_Recv(&nr, 1, MPI_INT, 2 * id + 1, 99, MPI_COMM_WORLD, &status);
    suma += nr;
    A[id] = suma;
    printf("A[%d]=%d\n", id, A[id]);
    B[id] = A[id];
    printf("B[%d]=%d\n", id, B[id]);
    MPI_Barrier(MPI_COMM_WORLD);
}
else
{
    if (id != 0)
    {
        if (id >= 8)
        {
            MPI_Send(&A[id], 1, MPI_INT, id / 2, 99, MPI_COMM_WORLD);
            printf("%d sent %d to %d\n", id, A[id], id / 2);
            MPI_Barrier(MPI_COMM_WORLD);
        }
        else
        {
            int nr;
            int suma = 0;
            MPI_Recv(&nr, 1, MPI_INT, 2 * id, 99, MPI_COMM_WORLD, &status);
            suma += nr;
            MPI_Recv(&nr, 1, MPI_INT, 2 * id + 1, 99, MPI_COMM_WORLD, &status);
            suma += nr;
            A[id] = suma;
            MPI_Send(&A[id], 1, MPI_INT, id / 2, 99, MPI_COMM_WORLD);
            printf("%d sent %d to %d\n", id, A[id], id / 2);
            MPI_Barrier(MPI_COMM_WORLD);
        }
        if (id % 2 == 1)
        {
            B[id] = B[(id - 1) / 2];
            printf("B[%d]=%d\n", id, B[id]);
        }
        else
        {
            B[id] = B[id / 2] - A[id + 1];
            printf("B[%d]=%d\n", id, B[id]);
        }
    }
}

MPI_Finalize();
free(A);
free(B);
return 0;
And I receive the following error:
[15]fatal error Fatal error in MPI_Barrier:Other MPI error,
error stack: MPI_Barrier(MPI_COMM_WORLD) failed failed to
attach to a bootstrap queue - 5064:344
What can I do to make the program work?
MPI_Barrier() is a collective operation, and it completes only once it has been invoked by all the MPI tasks of the communicator.
If I read your code correctly, task 0 does not invoke MPI_Barrier(MPI_COMM_WORLD), so your program will deadlock unless some mechanism in the MPI library aborts it.
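A minimal sketch of one way to restructure the code so that every rank reaches the barrier exactly once (branch bodies elided; this illustrates the shape of the fix, not the full program):
// Do the rank-specific sends/receives first...
if (id == 1) {
    // ... receive from both sons and compute the total ...
} else if (id >= 8) {
    // ... leaf: send A[id] to its parent ...
} else if (id != 0) {
    // ... inner node: receive, sum, forward to parent ...
}

// ...then synchronize. Every rank in MPI_COMM_WORLD, including rank 0,
// executes this exact call, so the barrier can complete.
MPI_Barrier(MPI_COMM_WORLD);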
I'm writing a program for testing whether numbers are prime. At the beginning I calculate how many numbers to assign to each process, then I send that count to the processes. Next, the calculations are performed and the data is sent back to process 0, which saves the results. The code below works, but when I increase the number of processes my program doesn't speed up. It seems to me that my program doesn't work in parallel. What's wrong? This is my first program in MPI, so any advice is welcome.
I use MPICH2 and I test my program on an Intel Core i7-950.
main.cpp:
if (rank == 0) {
    int workers = (size-1);
    readFromFile(path);
    int elements_per_proc = (N + (workers-1)) / workers;
    int rest = N % elements_per_proc;
    for (int i=1; i <= workers; i++) {
        if ((i == workers) && (rest != 0))
            MPI_Send(&rest, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
        else
            MPI_Send(&elements_per_proc, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
    }
    int it = 1;
    for (int i=0; i < N; i++) {
        if ((i != 0) && ((i % elements_per_proc) == 0))
            it++;
        MPI_Isend(&input[i], 1, MPI_INT, it, 0, MPI_COMM_WORLD, &send_request);
    }
}

if (rank != 0) {
    int count;
    MPI_Recv(&count, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    for (int j=0; j < count; j++) {
        MPI_Recv(&number, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        result = test(number, k);
        send_array[0] = number;
        send_array[1] = result;
        MPI_Send(send_array, 2, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }
}

if (rank == 0) {
    for (int i=0; i < N; i++) {
        MPI_Recv(rec_array, 2, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        // save results
    }
}
Your implementation probably doesn't scale well to many processes, since you communicate in every step. You currently communicate the numbers and results for each single input, which incurs a large latency overhead. Instead, you should think about communicating the input in bulk (i.e., using a single message).
Furthermore, using MPI collective operations (MPI_Scatter/MPI_Gather) instead of loops of MPI_Send/MPI_Recv might increase your performance further.
Additionally, you can utilize the master process to work on a chunk of the input as well.
A much more scalable implementation might then look as follows:
// tell everybody how many elements there are in total
MPI_Bcast(&N, 1, MPI_INT, 0, MPI_COMM_WORLD);

// everybody determines how many elements it will work on
// (include the master process); the first N % size ranks get one extra
int num_local_elements = N / size + (rank < N % size ? 1 : 0);

// allocate local size
int* local_input = (int*) malloc(sizeof(int)*num_local_elements);

// distribute the input from master to everybody using MPI_Scatterv
int* counts; int* displs;
if (rank == 0) {
    counts = (int*)malloc(sizeof(int) * size);
    displs = (int*)malloc(sizeof(int) * size);
    for (int i = 0; i < size; i++) {
        counts[i] = N / size + (i < N % size ? 1 : 0);
        displs[i] = (i > 0) ? displs[i-1] + counts[i-1] : 0;
    }
    // scatter from master
    MPI_Scatterv(input, counts, displs, MPI_INT, local_input, num_local_elements, MPI_INT, 0, MPI_COMM_WORLD);
} else {
    // receive scattered numbers
    MPI_Scatterv(NULL, NULL, NULL, MPI_DATATYPE_NULL, local_input, num_local_elements, MPI_INT, 0, MPI_COMM_WORLD);
}

// perform prime testing
int* local_results = (int*) malloc(sizeof(int)*num_local_elements);
for (int i = 0; i < num_local_elements; ++i) {
    local_results[i] = test(local_input[i], k);
}

// gather results back to master process
int* results;
if (rank == 0) {
    results = (int*)malloc(sizeof(int)*N);
    MPI_Gatherv(local_results, num_local_elements, MPI_INT, results, counts, displs, MPI_INT, 0, MPI_COMM_WORLD);
    // TODO: save results on master process
} else {
    MPI_Gatherv(local_results, num_local_elements, MPI_INT, NULL, NULL, NULL, MPI_INT, 0, MPI_COMM_WORLD);
}
I have a TCP client running on Raspbian and a TCP server on Ubuntu, built with C++ sockets. I am sending 3 MB every second (the payloads are images), and it works perfectly for about 2 hours (2h26m, 2h16m, 2h18m...), by which point it has sent roughly 7100 files (7147, 7071, 7471... it is not exactly one per second). After that the connection stalls, but there are no errors; it just seems to go to sleep. After 36 minutes (always 36 minutes) it resumes from the same instruction where it stopped, and the code keeps going with no problem.
I did this: I open one socket and keep it open. The send and receive functions are inside an infinite while loop (in both client and server), like this:
Client:
while (1)
{
    send();
    cout << "Hi 1" << endl;
    recv();
    send();
    recv();
    cout << "Hi 2" << endl;
}
Server:
while (1)
{
    recv();
    cout << "Hi 1" << endl;
    send();
    recv();
    cout << "Hi 2" << endl;
    send();
}
When it stops, it does so in the second or the third call (I know because the first cout prints on screen). When this happens I check the CPU usage (there is no busy process, it is at 5-10%) and the memory is not filled up at all, so the computers are not blocked: I can move the mouse, open programs, type on the keyboard...
It's on a LAN, with a router (I have tried another one, with the same results)... I have changed tcp_keepalive_time (7200, 86400 and 300), but got the same results...
The program doesn't crash...
And after 36 minutes it carries on from the same instruction, with no data lost and no errors...
I would like to keep the connection open and send images every second for a long, long time (days), but I have no idea why what I just explained (it takes a break for a while) is happening.
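As an aside on the tcp_keepalive_time attempts: those sysctls only take effect if keepalive is actually enabled on the socket. A minimal sketch of turning it on per socket on Linux (the descriptor name sock is assumed; this helps detect a dead connection, it does not necessarily explain the stall itself):
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical sketch: enable keepalive on an existing, connected socket
// descriptor so the kernel probes an idle peer instead of waiting forever.
int on = 1;
setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));

int idle = 60;      // seconds of idleness before the first probe
setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
int interval = 10;  // seconds between unanswered probes
setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
int count = 5;      // unanswered probes before the connection is dropped
setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));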
This is the server code (the while is inside a thread):
while (1)
{
    clock_t start = clock();
    cout << "Waiting for image..." << contador << endl;
    int bytes_recieved, si = 1;
    int long rebut;
    char send_data[35]="", FI[5]="1";
    char INFO[50], NOM1[8]="", NOM[12]="", DISP[7]="", W[5]="", H[5]="", B[2]="", R[5]="", X0[5]="", Y0[5]="", T[9]="";

    // ... 2 .. Receive
    if (recv(*csock1, INFO, sizeof(INFO), 0)==-1) {
        fprintf(stderr, "Error receiving data %d\n", errno);
        goto FINISH;
    }
    for (int k = 0; k < sizeof(INFO); k++)
    {
        if (k>=0 && k <= 6) DISP[k]=INFO[k];
        DISP[7]='\0';
        if (k>=8 && k <= 13) NOM1[k-8]=INFO[k];
        NOM1[6]='\0';
        if (k>=15 && k <= 18) W[k-15]=INFO[k];
        W[4]='\0';
        if (k>=20 && k <= 23) H[k-20]=INFO[k];
        H[4]='\0';
        if (k==25) B[0]=INFO[k];
        B[2]='\0'; //
        if (k>=27 && k <= 30) R[k-27]=INFO[k];
        R[4]='\0';
        if (k>=32 && k <= 35) X0[k-32]=INFO[k];
        X0[4]='\0';
        if (k>=37 && k <= 40) Y0[k-37]=INFO[k];
        Y0[4]='\0';
        if (k>=42 && k <= 48) T[k-42]=INFO[k];
        T[7]='\0';
    }
    strcat(NOM, DISP);
    strcat(NOM, NOM1);
    if (format_image ==1) strcat(NOM, ".raw");
    if (format_image ==2) strcat(NOM, ".jpg");
    //cout << INFO << endl;
    //cout << "Name: " << NOM << endl;
    ample = atoi(W);
    alt = atoi(H);
    BYTES = atoi(B);
    TAM = ample*alt*BYTES;
    printf("Image: %s - Size: (%ix%i) x %iBytes = %i Bytes\n", NOM, ample, alt, BYTES, TAM);
    if (strcmp(DISP, "RASP01_")==0)
    {
        XP=0;
        YP=0;
    }
    if (strcmp(DISP, "RASP00_")==0)
    {
        XP=450;
        YP=0;
    }
    if (strcmp(DISP, "RASP02_")==0)
    {
        XP=0;
        YP=400;
    }
    window_name = DISP;
    r = atoi(R);
    x0 = atoi(X0);
    y0 = atoi(Y0);
    tam = atoi(T);
    printf("Circle r=%i, x0=%i, y0=%i, %i Bytes\n", r, x0, y0, tam);
    if (reduccio==true) tam_reb = tam;
    else tam_reb = TAM;
    image_reb = (char*) malloc (tam_reb);
    data = (char*) malloc (TAM);

    // ... 3 .. Send confirmation
    if (send(*csock1, "Conectat a CERVELL", 46, 0)==-1) {
        fprintf(stderr, "Error sending data %d\n", errno);
        goto FINISH;
    }
    cout << "3" << endl;

    // ... 4 .. Receive the IMAGE
    REP = 0;
    do {
        rebut = recv(*csock1, &image_reb[REP], tam_reb-REP, 0);
        if (rebut<0) goto FINISH; //rebut=0
        REP = REP + (int)rebut;
        cout << ".";
    } while (rebut>0 && REP<=tam_reb);
    cout << "Image received: " << REP << endl;
    //printf("Reception time: %fs\n", ((double)clock() - start)/CLOCKS_PER_SEC);

    // ... 5 .. Send confirmation
    if (send(*csock1, "1", 5, 0)==-1) {
        fprintf(stderr, "Error sending data %d\n", errno);
        goto FINISH;
    }

    // Reconstruct image (generate circle of vision)
    if (reduccio==true) //reduccio==true)
    {
        int xs, cx, k2=0, N, Xc, XN, N1, j, i, r2=pow(r,2);
        N = y0-r;
        if (N<0) N=0;
        i = N*ample;
        for (j=0; j<alt; j++)
        {
            xs=(int)floor(x0-abs(sqrt(r2-pow(r-j,2))));
            cx=(int)floor(abs(sqrt(r2 - pow(r-j,2))));
            N=j*ample;
            N1=N+ample;
            XN=N+xs;
            Xc=XN+2*cx;
            //for(i=i; i<XN; i++) data[i] = 255;
            memset(&data[i], 255, XN-i);
            i = XN -1;
            memcpy(&data[i], &image_reb[k2], Xc-XN);
            i=i+Xc-XN;
            k2=k2+Xc-XN;
            //for(i=Xc; i<N1; i++) data[i] = 255;
            memset(&data[i], 255, N1-i);
            i = N1 - 1;
        }
        printf("\nSize of reconstructed message: %i\n", i);
    }

    // Convert to MATRIX and SAVE
    if (1) //REP == tam_reb
    {
        //string format;
        //if (BYTES == 4) format = 'CV_8UC4';; //Scalar(128,128,128)); CV_8UC3 CV_32F Size(1920, 1080), CV_8UC3
        //if (BYTES == 1) format = 'CV_8UC1';
        //if(reduccio==true)
        //  Mat M(Size(ample,alt), CV_8UC1, data);
        //if(reduccio==false)
        Mat M(Size(ample,alt), CV_8UC1, data); //image_reb
        char ruta[100] = "/home/jk/Documents/02_Imagens/";
        strcat(ruta, NOM);

        // Save as RAW
        if (format_image ==1)
        {
            FILE *f = fopen(ruta, "w");
            if (f == 0) {
                printf("************************Could not open %s\n", NOM);
            } else {
                fwrite(data, 1, TAM, f);
                fclose(f);
            }
        }
        // Save as JPG
        if (format_image ==2)
        {
            if (imwrite(ruta, M, compression_params)<1)
                cout << "Error while saving" << endl;
        }

        // Show IMAGE
        namedWindow(window_name, 0); // WINDOW_OPENGL //WINDOW_NORMAL //CV_WINDOW_KEEPRATIO //CV_WINDOW_AUTOSIZE
        //cvNamedWindow(window_name2, CV_WINDOW_OPENGL);
        //resizeWindow(window_name, 680, 480);
        moveWindow(window_name, XP, YP);
        //setOpenGlContext(window_name);
        //updateWindow(window_name);
        imshow(window_name, M);
        cvWaitKey(15);
    } else printf("\nImage size does not match\n");

    contador++;
    free(image_reb);
    free(data);
    printf("Elapsed time: %f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
}
I am trying to do some MPI parallel work. I am able to run this on any number of processors. The issue is that each processor takes one job, executes it, and sends it back; then the program is done. I want to be able to send a processor another job when it has finished, but I'm not sure how to implement this. Basically, I am trying to send each core 10 jobs.
if (myRank == 0) {
    int numCores = MPI::COMM_WORLD.Get_size();
    for (int rank = 1; rank < numCores; rank++) {
        MPI::COMM_WORLD.Send(&yarray[0], imax, MPI::DOUBLE, rank, 0);
        MPI::COMM_WORLD.Send(&tarray[0], imax, MPI::DOUBLE, rank, 0);
        MPI::COMM_WORLD.Recv(&ans, imax, MPI::DOUBLE, MPI::ANY_SOURCE, MPI_ANY_TAG, mystatus);
        answers[counter] = ans;
        counter++;
    }
}
else
{
    MPI::COMM_WORLD.Recv(&yarray1, imax, MPI::DOUBLE, MPI::ANY_SOURCE, MPI_ANY_TAG, mystatus);
    MPI::COMM_WORLD.Recv(&tarray1, imax, MPI::DOUBLE, MPI::ANY_SOURCE, MPI_ANY_TAG, mystatus);
    double floor = 0.5, ceiling = 3.5, range = (ceiling - floor);
    double rnd = floor + double((range * rand()) / (RAND_MAX + 1.0));
    yarray[0] = rnd;
    yarray1[0] = rnd;
    double temp = 0;
    for (int k = 0; k < imax; k++) {
        tarray1[k+1] = tarray1[k] + h;
        yarray1[k+1] = yarray1[k] + h * (2 * yarray1[k] - 2 * tarray1[k] * tarray1[k] - 3);
    }
    temp = yarray1[int(imax)];
    //cout << "Rank = " << myRank << " Solution = " << temp << endl;
    MPI::COMM_WORLD.Send(&temp, 1, MPI::DOUBLE, 0, 0);
}
Update: within myRank == 0:
while (counter != jobs) {
    MPI::COMM_WORLD.Recv(&ans, imax, MPI::DOUBLE, MPI::ANY_SOURCE, MPI_ANY_TAG, mystatus);
    answers[counter] = ans;
    counter++;
}
You need to have some sort of feedback from rank 0 to the other ranks. After the other ranks return their work to rank 0, they should receive a new message back that tells them either their next job or that there is no more work to be completed. The ranks should continue looping until there is no more work to be done.
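A minimal sketch of that pattern, using the plain C API rather than the deprecated C++ bindings above. TAG_WORK/TAG_STOP/TAG_RESULT, totalJobs, jobs, answers, and doWork are made-up names standing in for the real payload and computation, and the sketch assumes there is at least one job per worker:
const int TAG_WORK = 1, TAG_STOP = 2, TAG_RESULT = 3;

if (myRank == 0) {
    int jobsSent = 0, jobsDone = 0;
    // prime every worker with one job first
    for (int rank = 1; rank < numCores && jobsSent < totalJobs; rank++)
        MPI_Send(&jobs[jobsSent++], 1, MPI_DOUBLE, rank, TAG_WORK, MPI_COMM_WORLD);

    while (jobsDone < totalJobs) {
        double ans;
        MPI_Status st;
        MPI_Recv(&ans, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_RESULT, MPI_COMM_WORLD, &st);
        answers[jobsDone++] = ans;
        double dummy = 0;
        if (jobsSent < totalJobs)  // the sender is now idle: hand it the next job
            MPI_Send(&jobs[jobsSent++], 1, MPI_DOUBLE, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
        else                       // nothing left: tell it to stop
            MPI_Send(&dummy, 0, MPI_DOUBLE, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
    }
} else {
    while (true) {
        double job;
        MPI_Status st;
        MPI_Recv(&job, 1, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
        if (st.MPI_TAG == TAG_STOP) break;   // no more work: leave the loop
        double result = doWork(job);         // hypothetical per-job computation
        MPI_Send(&result, 1, MPI_DOUBLE, 0, TAG_RESULT, MPI_COMM_WORLD);
    }
}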