MPI Sending Dynamically Sized Array using MPI_Alltoallv - c++

Given p processors, each with a unique input of size i, where every input value is a unique number. Each processor is assigned an integer range.
Goal: have each processor end up holding only the integers in its range.
I am currently running into issues with the following approach:
Each processor wants to export the values that are not in its range.
Each processor collects the values from its input that are not in its range into a bucket called overflow.
Every processor broadcasts the size of its overflow.
Take the sum of all overflow sizes, including the local overflow size, and create an array of size total_overflow_size.
Each processor now broadcasts its bucket using MPI_Alltoallv.
The broadcast buckets are stored in the globalBucket array.
This is the MPI_ALLTOALLV call I am using:
int *local_overflow = &overflow;
//buckets = local buckets <- contains values not in local range, size of local overflow.
//globalBucket <- size of all overflows from all processors
//offset = Sum of all rank-1 processor's overflow.
MPI_Alltoallv(&buckets,local_overflow,off,MPI_INTEGER,
&globalBucket,offest,local_overflow,
MPI_INTEGER,MPI_COMM_WORLD);
I believe my issue lies in offsetting the values appropriately, which corresponds to the 3rd and 7th parameters.
For example, if processor 0 has a bucket of size 5 and processor 1 has a bucket of size 12, I want proc 0's bucket to occupy the first 5 slots of globalBucket and proc 1's bucket to occupy the next twelve.
I receive error messages such as
*** Process received signal ***
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0xc355bb0
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x1733ae0
MPI_ALLTOALLV is an uncommon call, more info available at: http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Alltoallv.html
EDIT: I have now calculated my offsets correctly -> each offset is the sum of the overflow sizes of all lower-ranked processors, but I am still receiving the same error as above.

local_overflow should be an array of int. Because you want to send the same number of values from this rank to all of the other ranks, all of the elements of local_overflow should have the same value:
int *local_overflow = new int[p];
for (int i = 0; i < p; ++i) {
    local_overflow[i] = overflow;
}
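The receive counts and both displacement arrays need to be per-rank arrays as well. The following is a minimal sketch, not the asker's actual code, of how the whole exchange could be wired up, assuming the local bucket lives in a std::vector<int> and every rank sends its entire bucket to every other rank; the function name exchange_overflow and the use of MPI_Allgather to learn the receive counts are illustrative choices:

#include <mpi.h>
#include <vector>

// Minimal sketch: every rank sends its whole overflow bucket to every other
// rank and collects all buckets into globalBucket.
// MPI_INT is the C datatype for int (MPI_INTEGER is the Fortran one).
void exchange_overflow(std::vector<int> &bucket, int p, MPI_Comm comm)
{
    int overflow = static_cast<int>(bucket.size());

    // Send side: 'overflow' ints go to every rank, always starting at offset 0.
    std::vector<int> sendcounts(p, overflow), sdispls(p, 0);

    // Receive side: first learn how many ints every rank will send us.
    std::vector<int> recvcounts(p);
    MPI_Allgather(&overflow, 1, MPI_INT, recvcounts.data(), 1, MPI_INT, comm);

    // rdispls is a prefix sum of recvcounts, so rank 0's bucket lands first,
    // rank 1's bucket right after it, and so on.
    std::vector<int> rdispls(p, 0);
    for (int r = 1; r < p; ++r)
        rdispls[r] = rdispls[r - 1] + recvcounts[r - 1];

    std::vector<int> globalBucket(rdispls[p - 1] + recvcounts[p - 1]);

    MPI_Alltoallv(bucket.data(), sendcounts.data(), sdispls.data(), MPI_INT,
                  globalBucket.data(), recvcounts.data(), rdispls.data(), MPI_INT,
                  comm);
    // globalBucket now holds proc 0's bucket in the first recvcounts[0] slots,
    // proc 1's bucket in the next recvcounts[1] slots, and so on.
}

Note also that the buffers are passed as bucket.data() and globalBucket.data() here: if buckets and globalBucket in the original code are pointers (e.g. allocated with new), then passing &buckets hands MPI the address of the pointer itself, which can produce exactly the kind of "Address not mapped" segfault shown above.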

Related

MPI receiving data from an unknown number of ranks

I have a list of indices for which I do not know their corresponding entries in a vector, because the vector is distributed among the ranks. I have to send these indices to the ranks in charge to get the data.
On the other hand, "my" rank also gets lists of indices from an unknown number of ranks. After receiving such a list, "my" rank has to send the corresponding data to the requesting ranks.
I think I have to work with a mixture of MPI_Probe and MPI_Gather. But at the moment I cannot see how to receive lists from an unknown number of ranks.
I think it has to look something like this, but how can I receive the data from a larger, unknown number of ranks? Or do I have to loop over all possible ranks that could send me something?
MPI_Status status;
int nbytes;  // actually a count of MPI_UINT64_T elements, not bytes
std::vector<Size> indices;
MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &status);
MPI_Get_count(&status, MPI_UINT64_T, &nbytes);
if (nbytes != MPI_UNDEFINED) {
    indices.resize(nbytes);  // resize, not reserve, so the elements exist
    MPI_Recv(indices.data(), nbytes, MPI_UINT64_T, status.MPI_SOURCE,
             status.MPI_TAG, comm, &status);
}
This closely resembles what I did a few years ago for parallel I/O.
One option (see the sketch below):
On each sender, determine the size that you need to send to each other rank.
Send the sizes (Allgather if all ranks can be senders, otherwise sends/receives).
Do an (all)gatherv that retrieves the data on each receiver.
You can use non-blocking sends/receives as well as gatherv (MPI-3), and this scales well (depending on the hardware) to 500 cores for 8 senders.
The way we did it was to go through the vector in chunks of several MB and send the data chunk by chunk. Of course, the bigger the chunks the better, but the more memory you also need on each sender rank to hold the data.
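This is not the answerer's exact scheme, but a minimal sketch of the "exchange the sizes first" idea using collectives: an MPI_Alltoall distributes the per-destination counts, after which an MPI_Alltoallv moves the index lists themselves, so no probing is needed. The function name exchange_index_requests and the per-destination layout of requests are assumptions made for the example.

#include <mpi.h>
#include <algorithm>
#include <cstdint>
#include <vector>

// Sketch: requests[d] holds the indices this rank wants from rank d.
// Returns all indices that other ranks requested from this rank,
// grouped by requesting rank.
std::vector<std::uint64_t> exchange_index_requests(
    const std::vector<std::vector<std::uint64_t>> &requests, MPI_Comm comm)
{
    int p;
    MPI_Comm_size(comm, &p);

    // Step 1: tell every rank how many indices we will send it.
    std::vector<int> sendcounts(p), recvcounts(p);
    for (int d = 0; d < p; ++d)
        sendcounts[d] = static_cast<int>(requests[d].size());
    MPI_Alltoall(sendcounts.data(), 1, MPI_INT,
                 recvcounts.data(), 1, MPI_INT, comm);

    // Step 2: build displacements and flatten the send-side buffers.
    std::vector<int> sdispls(p, 0), rdispls(p, 0);
    for (int d = 1; d < p; ++d) {
        sdispls[d] = sdispls[d - 1] + sendcounts[d - 1];
        rdispls[d] = rdispls[d - 1] + recvcounts[d - 1];
    }
    std::vector<std::uint64_t> sendbuf(sdispls[p - 1] + sendcounts[p - 1]);
    for (int d = 0; d < p; ++d)
        std::copy(requests[d].begin(), requests[d].end(),
                  sendbuf.begin() + sdispls[d]);

    // Step 3: exchange the index lists; every receiver already knows
    // exactly how much to expect from each rank.
    std::vector<std::uint64_t> recvbuf(rdispls[p - 1] + recvcounts[p - 1]);
    MPI_Alltoallv(sendbuf.data(), sendcounts.data(), sdispls.data(), MPI_UINT64_T,
                  recvbuf.data(), recvcounts.data(), rdispls.data(), MPI_UINT64_T,
                  comm);
    return recvbuf;
}

The owning ranks can then answer each request with a matching MPI_Alltoallv (or point-to-point sends) in the reverse direction, reusing the same counts and displacements.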

How to mine 1st blocks after genesis (PIVX fork)?

I have generated the genesis block and related hashes, daemon runs fine.
I'm trying to mine the 1st block (block 1) using 'setgenerate true 1'
I've changed the related params in chainparams.cpp, but any time I run the command I get a segmentation fault.
debug log shows
2018-06-25 19:30:54 keypool reserve 2
2018-06-25 19:30:54 CreateNewBlock(): total size 1000
Using latest master branch.
The first thing you need to do is check the debug.log in the .pivx folder.
The second thing: what data have you put in pivx.conf?
For mine I use the following:
rpcuser=user
rpcpassword=password
rpcallowip=127.0.0.1
listen=1
server=1
daemon=1
logtimestamps=1
maxconnections=256
staking=1
txindex=1
And your segmentation fault error is caused by miner.cpp. In src/miner.cpp there is this line:
uint256 hashBlockLastAccumulated = chainActive[nHeight - (nHeight % 10) - 10]->GetBlockHash();
Here nHeight is the blockchain's last block number (which for an empty blockchain is 0) plus 1, i.e. 1, so the expression indexes chainActive at -10; accessing a negative index of the array causes the segmentation fault.
So you need to edit this code to get the mining process to run.
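As a rough illustration only (this is not the upstream PIVX patch, and whether falling back to the genesis block is acceptable depends on your fork's accumulator logic), the edit could take the shape of a clamp on the lookback index:

// Hypothetical guard, not the official fix: clamp the accumulator lookback
// so that early heights do not index a negative position in chainActive.
int nLookback = nHeight - (nHeight % 10) - 10;
if (nLookback < 0)
    nLookback = 0;  // fall back to the genesis block for the first blocks
uint256 hashBlockLastAccumulated = chainActive[nLookback]->GetBlockHash();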

Scattering to more nodes than there is data

What happens if I call MPI::Scatter and there are more nodes in the communicator than there is data?
Suppose I have a 4 × 4 array and I'm sending one row to every processor, but I have 8 processors. What happens? Will ranks 0–3 receive data and ranks 4–7 get nothing?
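One way to make "ranks beyond the data receive nothing" explicit, regardless of what plain MPI::Scatter does, is MPI_Scatterv with zero counts for the extra ranks. The sketch below just sets up the 4 × 4 scenario that way; all names and the fill value are illustrative:

#include <mpi.h>
#include <vector>

// Sketch of the 4 x 4 scenario on 8 ranks, written with MPI_Scatterv so the
// extra ranks are explicitly given a count of 0 and receive nothing.
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int rows = 4, cols = 4;
    std::vector<int> matrix;
    if (rank == 0)
        matrix.assign(rows * cols, 42);   // only the root holds the full array

    // Ranks 0..3 get one row each; any further ranks get a count of 0.
    std::vector<int> counts(size, 0), displs(size, 0);
    for (int r = 0; r < size && r < rows; ++r) {
        counts[r] = cols;
        displs[r] = r * cols;
    }

    std::vector<int> row(cols);
    MPI_Scatterv(rank == 0 ? matrix.data() : nullptr,
                 counts.data(), displs.data(), MPI_INT,
                 row.data(), counts[rank], MPI_INT,
                 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}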

mpi_waitall in mpich2 with null values in array_of_requests

I get the following error with MPICH-2.1.5 and the PGI compiler:
Fatal error in PMPI_Waitall: Invalid MPI_Request, error stack:
PMPI_Waitall(311): MPI_Waitall(count=4, req_array=0x2ca0ae0, status_array=0x2c8d220) failed
PMPI_Waitall(288): The supplied request in array element 0 was invalid (kind=0)
in the following example Fortran code for a stencil-based algorithm:
Subroutine data_exchange
  ! data declaration
  integer request(2*neighbor), status(MPI_STATUS_SIZE,2*neighbor)
  integer n(neighbor), iflag(neighbor)
  integer itag(neighbor), neigh(neighbor)
  ! Data initialization
  request = 0; n = 0; iflag = 0;
  ! Create data buffers to send and recv
  ! Define values of n,iflag,itag,neigh based on boundary values
  ! Isend/Irecv look like this
  ir = 0
  do i = 1, neighbor
    if (iflag(i).eq.1) then
      ir = ir + 1
      call MPI_Isend(buf_send(i),n(i),MPI_REAL,neigh(i),itag(i),MPI_COMM_WORLD,request(ir),ierr)
      ir = ir + 1
      call MPI_Irecv(buf_recv(i),nsize,MPI_REAL,neigh(i),MPI_ANY_TAG,MPI_COMM_WORLD,request(ir),ierr)
    endif
  enddo
  ! Calculations
  call MPI_Waitall(2*neighbor,request,status,ierr)
end subroutine
The error occurs when the array_of_requests passed to MPI_Waitall contains a null value (request(i)=0). The null value in array_of_requests comes up when the condition iflag(i).eq.1 is not satisfied. The straightforward solution is to comment out the conditional, but that would introduce the overhead of sending and receiving zero-size messages, which is not feasible for large-scale systems (1000s of cores).
As per the MPI-forum link, the array_of_requests list may contain null or inactive handles.
I have tried the following:
not initializing array_of_requests,
resizing array_of_requests to match the MPI_Isend + MPI_Irecv count,
assigning dummy values to array_of_requests.
I also tested the very same code with MPICH-1 as well as OpenMPI 1.4, and the code works without any issue.
Any insights would be really appreciated!
You could just move the first increment of ir into the conditional as well. Then you would have all the handles in request(1:ir) at the end of the loop and could issue:
call MPI_Waitall(ir,request(1:ir),status(:,1:ir),ierr)
This would make sure all requests are initialized properly.
Another thing: does n(i) in MPI_Isend hold the same value as nsize in the corresponding MPI_Irecv?
EDIT:
After consulting the MPI Standard (3.0, Ch. 3.7.3), I think you need to initialize the request array to MPI_REQUEST_NULL if you want to give the whole request array to MPI_Waitall.

fortran 77 direct access max record number

I'm trying to store double-precision data from different blocks into a direct-access file; the data is g(m,n) for one block, and all blocks have the same size. Here's the code I wrote:
OPEN(3,FILE='a.TMP',ACCESS='DIRECT',RECL=8*m*n)
WRITE(3,REC=I) ((g(K,L),K=1,m),L=1,n) ! here "I" is the block number
I have 200 blocks of this kind. However, I got the following error after writing the 157th block's data to the file:
severe (66): output statement overflows record, unit 3
I believe that means the record size is too large.
Is there any way to handle this? I wonder if the record number has a maximum value.