What does this sentence in the OpenMPI documentation mean? - c++

In the OpenMPI documentation, the "When communicator is an Inter-communicator" section contains the following sentence:
The send buffer argument of the processes in the first group must be consistent with the receive buffer argument of the root process in the second group.
This section only appears in the documentation of non-blocking functions. In my case this is MPI_Igatherv.
I have an Inter-communicator connecting two groups. The first group contains only one process, the master (distributing and collecting data). The second group contains one or more worker processes (receiving data, doing work, and sending results back). All the workers run the same code and the master has its own separate code. The master starts the workers with MPI_Comm_spawn.
However, I am concerned that I am not using the function correctly.
As the master tries to receive data, I use the following code:
MPI_Igatherv(nullptr, 0, MPI_DOUBLE, recv_buf, sizes, offsets, MPI_DOUBLE, MPI_ROOT, inter_comm, &mpi_request);
The master does not contribute any data, so the send buffer here is a nullptr with zero size.
On the other hand, all workers send data like this:
MPI_Igatherv(send_buf, size, MPI_DOUBLE, nullptr, nullptr, nullptr, MPI_DOUBLE, 0, inter_comm, &mpi_request);
The workers do not receive any data, so the receive buffer is a nullptr with no sizes or offsets.
Is this the correct way?
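For context, a minimal sketch of the setup described above; the executable name "worker", the worker count, and the per-worker element counts are placeholders, not taken from the question.

// master.cpp -- spawns the workers and collects their results (sketch)
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    // Spawning creates the inter-communicator: the master group is local,
    // the worker group is remote.
    MPI_Comm inter_comm;
    const int n_workers = 4;                        // placeholder
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, n_workers, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &inter_comm, MPI_ERRCODES_IGNORE);

    std::vector<int> sizes(n_workers, 100), offsets(n_workers, 0);  // placeholders
    for (int i = 1; i < n_workers; ++i) offsets[i] = offsets[i - 1] + sizes[i - 1];
    std::vector<double> recv_buf(offsets.back() + sizes.back());

    // The root of the inter-communicator gather passes MPI_ROOT; its send
    // arguments are not used, hence nullptr and count 0.
    MPI_Request mpi_request;
    MPI_Igatherv(nullptr, 0, MPI_DOUBLE,
                 recv_buf.data(), sizes.data(), offsets.data(), MPI_DOUBLE,
                 MPI_ROOT, inter_comm, &mpi_request);
    MPI_Wait(&mpi_request, MPI_STATUS_IGNORE);

    MPI_Finalize();
}

// worker.cpp -- obtains the parent inter-communicator and contributes data
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    MPI_Comm inter_comm;
    MPI_Comm_get_parent(&inter_comm);               // connects back to the master group

    std::vector<double> send_buf(100, 1.0);         // placeholder result

    // Non-root group: pass the rank of the root within the master group (0);
    // the receive arguments are ignored on this side.
    MPI_Request mpi_request;
    MPI_Igatherv(send_buf.data(), static_cast<int>(send_buf.size()), MPI_DOUBLE,
                 nullptr, nullptr, nullptr, MPI_DOUBLE,
                 0, inter_comm, &mpi_request);
    MPI_Wait(&mpi_request, MPI_STATUS_IGNORE);

    MPI_Finalize();
}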

Related

Consecutive MPI non-blocking calls

I have been wondering how the MPI runtime differentiates messages between multiple non-blocking calls (inside the same communicator).
For example, say we have multiple Iallgather operations:
...
auto res1 = MPI_Iallgather(... , MPI_COMM_WORLD, &req[0]);
auto res2 = MPI_Iallgather(... , MPI_COMM_WORLD, &req[1]);
MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
...
In the Isend/Irecv routines there is an int tag parameter, but the other non-blocking calls have no tag parameter.
When an MPI_Request object is created, does it get a unique tag?
Since, as you observe, there is no tag, there may be a problem if two processes issue the Iallgathers in different orders. Therefore all processes need to issue their non-blocking collectives in the same order. The request object offers no help here, because the first request simply corresponds to whatever you issue first on whatever process, so you can still have mismatches.
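A related point, not from the question: collectives are matched per communicator, so if you really need two logically independent Iallgathers whose issue order might differ between ranks, the usual approach is to give each its own communicator, e.g. via MPI_Comm_dup. A minimal sketch, with placeholder buffer sizes:

#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int nprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    // Each duplicated communicator has its own matching context, so the two
    // collectives can never be matched against each other. Within a single
    // communicator, all ranks must still issue collectives in the same order.
    MPI_Comm comm_a, comm_b;
    MPI_Comm_dup(MPI_COMM_WORLD, &comm_a);
    MPI_Comm_dup(MPI_COMM_WORLD, &comm_b);

    const int n = 4;                                 // elements per rank (placeholder)
    std::vector<int> send1(n, 1), send2(n, 2);
    std::vector<int> recv1(n * nprocs), recv2(n * nprocs);

    MPI_Request req[2];
    MPI_Iallgather(send1.data(), n, MPI_INT, recv1.data(), n, MPI_INT, comm_a, &req[0]);
    MPI_Iallgather(send2.data(), n, MPI_INT, recv2.data(), n, MPI_INT, comm_b, &req[1]);
    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

    MPI_Comm_free(&comm_a);
    MPI_Comm_free(&comm_b);
    MPI_Finalize();
}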

How does one send custom MPI_Datatype over to a different process?

Suppose that I create custom MPI_Datatypes for subarrays of different sizes on each of the MPI processes allocated to a program. Now I wish to send these subarrays to the master process and assemble them into a bigger array block by block. The master process is unaware of the individual datatypes (defined by the local sizes) on the other processes. Naively, therefore, I might attempt to send over these custom datatypes to the master process in the following manner.
MPI_Datatype localarr_type;
MPI_Type_create_subarray( NDIMS, array_size, local_size, box_corner, MPI_ORDER_C, MPI_FLOAT, &localarr_type );
MPI_Type_commit(&localarr_type);
if (rank == master)
{
    for (int id = 1; id < nprocs; ++id)
    {
        // attempt to receive the datatype itself, then the data it describes
        MPI_Recv( &localarr_type, 1, MPI_Datatype, id, tag1[id], comm_cart, MPI_STATUS_IGNORE );
        MPI_Recv( big_array, 1, localarr_type, id, tag2[id], comm_cart, MPI_STATUS_IGNORE );
    }
}
else
{
    // attempt to send the datatype itself, then the local data
    MPI_Send( &localarr_type, 1, MPI_Datatype, master, tag1[rank], comm_cart );
    MPI_Send( local_box, 1, localarr_type, master, tag2[rank], comm_cart );
}
However, this results in a compilation error: the first message below is from the GNU and Clang compilers, and the second is from the Intel compiler.
/* GNU OR CLANG COMPILER */
error: unexpected type name 'MPI_Datatype': expected expression
/* INTEL COMPILER */
error: type name is not allowed
This means that either (1) I am attempting to send a custom MPI_Datatype over to a different process in the wrong way or that (2) this is not possible at all. I would like to know which it is, and if it is (1), I would like to know what the correct way of communicating a custom MPI_Datatype is. Thank you.
Note.
I am aware of other ways of solving the above problem without needing to communicate MPI_Datatypes. For example, one could communicate the local array sizes and manually reconstruct the MPI_Datatype from other processes inside the master process before using it in the subsequent communication of subarrays. This is not what I am looking for.
I wish to communicate the custom MPI_Datatype itself (as shown in the example above), not something that is an instance of the datatype (which is doable, as also shown in the example code above).
First of all: you cannot send a datatype like that. MPI_Datatype is a type name, not a value of type MPI_Datatype. (It's a cute idea, though.) You could send the parameters with which the datatype is constructed and then reconstruct it on the receiving side.
However, you are probably misunderstanding the nature of MPI. In your code, with the same datatype on workers and manager, you are sort of assuming that everyone has data of the same size/shape. That is not compatible with the manager gathering everything together.
If you're gathering data on a manager process (usually not a good idea: are you really sure you need that?), then each contributing process has its data in a small array, say at indices 0..99, so it can be sent as an ordinary contiguous buffer. The "manager" has a much larger array and places all the contributions in disjoint locations. So at most the manager needs to create subarray types to indicate where the received data goes in the big array.
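A sketch of that suggestion, reusing the question's variables (NDIMS, array_size, local_size, box_corner, comm_cart, big_array); the remote_* names, the local_data buffer, and the literal tags are mine, for illustration only:

// Instead of sending the MPI_Datatype handle, send the parameters it was
// built from and rebuild an equivalent subarray type on the manager.
if (rank == master)
{
    for (int id = 1; id < nprocs; ++id)
    {
        int remote_size[NDIMS], remote_corner[NDIMS];
        MPI_Recv(remote_size,   NDIMS, MPI_INT, id, 0, comm_cart, MPI_STATUS_IGNORE);
        MPI_Recv(remote_corner, NDIMS, MPI_INT, id, 1, comm_cart, MPI_STATUS_IGNORE);

        // Subarray type describing where id's block lives inside big_array.
        MPI_Datatype remote_type;
        MPI_Type_create_subarray(NDIMS, array_size, remote_size, remote_corner,
                                 MPI_ORDER_C, MPI_FLOAT, &remote_type);
        MPI_Type_commit(&remote_type);

        MPI_Recv(big_array, 1, remote_type, id, 2, comm_cart, MPI_STATUS_IGNORE);
        MPI_Type_free(&remote_type);
    }
}
else
{
    MPI_Send(local_size, NDIMS, MPI_INT, master, 0, comm_cart);
    MPI_Send(box_corner, NDIMS, MPI_INT, master, 1, comm_cart);

    // As noted above, the contributor's data can simply be a contiguous
    // buffer (local_data) holding prod(local_size) floats.
    int local_count = 1;
    for (int d = 0; d < NDIMS; ++d) local_count *= local_size[d];
    MPI_Send(local_data, local_count, MPI_FLOAT, master, 2, comm_cart);
}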

MPI - how to send a value to a specific position in an array

I want to send a value to a position in the array of another process, so:
1st process: MPI_Isend (&val..., process, ..)
2nd process: MPI_Recv (&array[i], ..., process, ...)
I know the index i on the first process. I also know that I can't use a variable and first send i and then val, because other processes can change i (the 2nd process accepts messages from many others).
First of all, other sends/receives should not, and need not, overwrite i; keep your messages cleanly separated. That's what the tag is for! Also, rank_2 can tell which rank sent the data, so you can keep one i for every rank you expect a message from.
Finally, you might want to check out one-sided MPI communication (MPI_Win). With that technique rank_1 can 'drop' the message directly into rank_2's array at a position known only to rank_1.
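A minimal sketch of that one-sided approach; the window size, the ranks, and the index are placeholders (run with at least 3 processes for these particular ranks):

#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int N = 100;
    double array[N] = {0.0};

    // Every rank exposes its array in a window; displacement unit = sizeof(double),
    // so target displacements are plain array indices.
    MPI_Win win;
    MPI_Win_create(array, N * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 1) {                    // "rank_1" in the answer above
        double val = 3.14;
        int i = 42;                     // index known only to the origin
        MPI_Put(&val, 1, MPI_DOUBLE, 2 /* target rank ("rank_2") */, i,
                1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);              // after this fence, array[i] on rank 2 holds val

    MPI_Win_free(&win);
    MPI_Finalize();
}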

Retrieve buffer with multiple overlapped I/O requests

There is something I'd like to know about overlapped I/O under Windows, both with and without I/O completion ports.
I know in advance how many packets I will be receiving after using WSASend().
So I'd like to do this:
for (int i = 0; i < n; i++)
    WSARecv(sock, &buffer_array[i], 1, NULL, 0, &overlapped, completion_routine);
My problem is: how can I know which buffer has been filled when the completion notification arrives? I mean, without guessing from the order of the calls (buffer[0], buffer[1], buffer[2], etc.).
I would find an alternative solution that gives me the buffer pointer at the time of the notification much cleaner, and more easily adaptable as the design of my application evolves.
Thanks.
Right now you are starting n concurrent receive operations. Instead, start them one after the other. Start the next one when the previous one has completed.
When using a completion routine, the hEvent field in the OVERLAPPED block is unused and can be used to pass context info into the completion routine. Typically, this would be a pointer to a buffer class instance or an index into an array of buffer instances. Often, the OVERLAPPED block would be a struct member of that instance, since you need a separate OVERLAPPED per call.
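A sketch of that idea; the struct, the routine, and the helper names are mine, not from the question. Each outstanding WSARecv gets its own context object containing its own OVERLAPPED, and hEvent carries a pointer back to that context:

#include <winsock2.h>
#include <cstdio>

struct IoContext {
    WSAOVERLAPPED ovl;      // one OVERLAPPED per outstanding call
    WSABUF        wsabuf;
    char          data[4096];
    int           index;    // or a pointer to your buffer class instance
};

void CALLBACK OnRecvComplete(DWORD error, DWORD bytes,
                             LPWSAOVERLAPPED lpOverlapped, DWORD /*flags*/)
{
    // hEvent is ignored by WSARecv when a completion routine is supplied,
    // so it can carry the context pointer (CONTAINING_RECORD would also work).
    IoContext* ctx = static_cast<IoContext*>(lpOverlapped->hEvent);
    std::printf("buffer %d completed with %lu bytes (error %lu)\n",
                ctx->index, bytes, error);
    // ...process ctx->data here, and optionally post the next WSARecv...
}

void post_recv(SOCKET sock, IoContext* ctx)
{
    ZeroMemory(&ctx->ovl, sizeof(ctx->ovl));
    ctx->ovl.hEvent = ctx;              // context for the completion routine
    ctx->wsabuf.buf = ctx->data;
    ctx->wsabuf.len = sizeof(ctx->data);
    DWORD flags = 0;
    WSARecv(sock, &ctx->wsabuf, 1, NULL, &flags, &ctx->ovl, OnRecvComplete);
}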

Send the same data to multiple kernels in OpenCL

I have multiple kernels. To the first one I send some inputs; the output I get from the first kernel is the input for the next. My queue of kernels repeats this behavior 8 times, until the last kernel produces the actual output I need.
This is an example of what I did:
cl::Kernel kernel1 = cl::Kernel(OPETCL::program, "forward");
// set the kernel arguments
kernel1.setArg(0, cl_rowCol);
kernel1.setArg(1, cl_data);
kernel1.setArg(2, cl_x);
kernel1.setArg(3, cl_b);
kernel1.setArg(4, sizeof(int), &largo);
// run the kernel
OPETCL::queue.enqueueNDRangeKernel(kernel1, cl::NullRange, global, local, NULL, &profilingApp);
/********************************/
/**    run the X symmetries   **/
/********************************/
cl::Kernel kernel2 = cl::Kernel(OPETCL::program, "forward_symmX");
// set the kernel arguments
kernel2.setArg(0, cl_rowCol);
kernel2.setArg(1, cl_data);
kernel2.setArg(2, cl_x);
kernel2.setArg(3, cl_b);
kernel2.setArg(4, cl_symmLOR_X);
kernel2.setArg(5, cl_symm_Xpixel);
kernel2.setArg(6, sizeof(int), &largo);
// run the kernel
OPETCL::queue.enqueueNDRangeKernel(kernel2, cl::NullRange, global, local, NULL, &profilingApp);
OPETCL::queue.finish();
OPETCL::queue.enqueueReadBuffer(cl_b, CL_TRUE, 0, sizeof(float) * lors, b, NULL, NULL);
In this case cl_b is the output I need.
My question concerns the arguments: the arguments I pass are the same for all the kernels, except one.
Is the way I set the arguments correct?
Are the arguments kept on the device during the execution of all the kernels?
Since you are using the same queue and OpenCL context, this is OK: your kernels can use the data (arguments) calculated by the previous kernel, and that data will be kept on the device.
I suggest you use clFinish after each kernel execution to ensure the previous kernel has finished its calculation before the next one starts. Alternatively, you can use events to ensure that.
I think you get this behaviour for free, as long as you don't specify CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE when you create your command queue.
It looks like you're doing it correctly. In general, this is the process:
create your buffer(s)
queue a buffer copy to the device
queue the kernel execution
repeat #3 for as many kernels as you need to run, passing the buffer as the correct parameter. Use setArg to change/add params. The buffer will still exist on the device and will have been modified by the previous kernels
queue a copy of the buffer back to the host
If you do specify CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, you will have to use events to control the execution order of the kernels. This seems unnecessary for your example though.
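For illustration, a small sketch of the event-based ordering mentioned above, using the same C++ wrapper calls as the question (kernel, buffer, and queue names are reused from the question; the events are the only addition):

// Enforce kernel1 -> kernel2 -> read-back ordering with events instead of
// clFinish. With the default in-order queue this ordering is implicit.
cl::Event ev_k1, ev_k2;
std::vector<cl::Event> wait_k2, wait_read;

OPETCL::queue.enqueueNDRangeKernel(kernel1, cl::NullRange, global, local,
                                   nullptr, &ev_k1);
wait_k2.push_back(ev_k1);

// kernel2 reuses the buffers already resident on the device (cl_rowCol,
// cl_data, cl_x, cl_b, ...) and starts only after kernel1 has completed.
OPETCL::queue.enqueueNDRangeKernel(kernel2, cl::NullRange, global, local,
                                   &wait_k2, &ev_k2);
wait_read.push_back(ev_k2);

// Blocking read of the final result once kernel2 is done.
OPETCL::queue.enqueueReadBuffer(cl_b, CL_TRUE, 0, sizeof(float) * lors, b,
                                &wait_read, nullptr);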