mpi all_gatherv weird behaviour? - c++

I'm trying to gather 2 matrices computed on different processes. I'm using the MPI all_gatherv function to do so, but I'm surprised by the results.
In the example I'm working on, each matrix has shape (20,80) and I would like to gather them into one matrix of shape (40,80), with the matrix computed by process 0 stored in the first 20 rows.
Here is a sample of the code I'm using (nbRegimes is 0 in the example):
boost::mpi::communicator world;
int rank = world.rank();
std::vector< ArrayXXd> phiOutLoc(nbRegimes);
for (int iReg = 0; iReg < nbRegimes; ++iReg)
phiOutLoc[iReg].resize(nScenarioCur * nDimension, nbPoints);
... computation and storage into phiOutLoc[iReg] ...
//gathering results
ArrayXXd storeGlob(nScenarios * nDimension, nbPoints);
for (int iReg = 0; iReg < nbRegimes; ++iReg)
{
boost::mpi::all_gatherv<double>(world, phiOutLoc[iReg].data(), phiOutLoc[iReg].size(), storeGlob.data());
}
For instance, here is the beginning of the first row of phiOutLoc[iReg] computed on process 0 and process 1 :
rank 0
0 -353509 -366699 -379888 -393077 -406267 -419456 -432646 ...
rank 1
0 -399021 -413908 -428795 -443683 -458570 -473457 -488345 ...
If I understood the all_gatherv function's behaviour correctly, those rows should be stored in row 0 and row 20 of storeGlob respectively.
However, row 0 of storeGlob looks like this:
storeGlob row 0:
0 -366699 -393077 -419456 ... 0 -413908 -443683 -473457
and the index 20 row of storeGlob :
storeGlob row 20:
-353509 -379888 -406267 -432646 ... -399021 -428795 -458570 -488345
Row 0 of storeGlob is filled with the even-indexed entries of the first rows of the phiOutLoc[iReg] matrices.
The odd-indexed entries are stored in row 20.
I can't really understand why the gathering is done that way.
Is this behaviour normal, and is there a way to gather the two matrices the way I would like?
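One possible explanation, offered as an assumption rather than a confirmed diagnosis: Eigen's ArrayXXd is column-major by default, and a gather concatenates the raw buffers of the ranks in rank order. Read back as a column-major (40,80) matrix, the concatenated buffer packs two consecutive source columns into each destination column, which would give exactly the even/odd pattern described above. The sketch below reproduces the effect with plain std::vector buffers standing in for the Eigen arrays and for MPI. All helper names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Column-major element access: entry (r, c) of a matrix with `rows` rows.
double colMajorAt(const std::vector<double>& buf, std::size_t rows,
                  std::size_t r, std::size_t c) {
    return buf[c * rows + r];
}

// Hypothetical stand-in for the gather: appends each rank's raw buffer in rank order.
std::vector<double> concatBuffers(const std::vector<std::vector<double>>& perRank) {
    std::vector<double> out;
    for (const auto& b : perRank) out.insert(out.end(), b.begin(), b.end());
    return out;
}

// Builds a column-major rows x cols matrix whose entry (r, c) is base + 10*r + c.
std::vector<double> makeLocal(std::size_t rows, std::size_t cols, double base) {
    std::vector<double> m(rows * cols);
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            m[c * rows + r] = base + 10.0 * r + c;
    return m;
}

// Transposes a column-major rows x cols matrix into a column-major cols x rows one.
std::vector<double> transpose(const std::vector<double>& m,
                              std::size_t rows, std::size_t cols) {
    std::vector<double> t(rows * cols);
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            t[r * cols + c] = m[c * rows + r];
    return t;
}
```

With two 2x4 local blocks, reading the concatenated buffer as a column-major 4x4 matrix puts the even-indexed entries of each block's first row in row 0 and the odd-indexed ones in row 2. Concatenating the transposed blocks instead yields the transpose of the stacked matrix, so each rank's rows stay contiguous. In Eigen, a row-major array type such as Eigen::Array<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> for both the local and the global arrays should have the same effect, though that is untested here.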

Related

Eigen3 (cpp) select column given mask and sum where true

I have an Eigen::Matrix2Xf whose rows are the X and Y positions and whose columns act as a list index.
I would like the row-wise sum of the columns for which some per-column condition is true. Here is some example code:
Eigen::Vector2f computeStuff(Eigen::Matrix2Xf & values, const float max_norm){
const auto mask = values.colwise().norm().array() < max_norm;
return mask.select(values.colwise(), Eigen::Vector2f::Zero()).rowwise().sum();
}
But this code does not compile; the compiler complains about the types of the if/else matrices. What is the correct (and computationally fastest) way to do it?
I know there are similar questions with answers, but they create a new Eigen::Matrix2Xf holding the values filtered by the mask. This code is meant to run inside a #pragma omp parallel for, so the basic idea is to avoid creating a new matrix in order to maintain cache coherency.
Thanks
The main problem with your code is that .select( ... ) needs at least one of its arguments to have the same shape as the mask. The arguments can be two matrices or a matrix and a scalar or vice-versa, but in all cases the matrices have to be shaped like the mask.
In your code mask is a row vector but values is a 2 by x matrix. One way to handle this is to just replicate the row vector into a two row matrix:
#include <Eigen/Dense>
#include <iostream>
Eigen::Vector2f computeStuff(Eigen::Matrix2Xf& values, const float max_norm) {
auto mask = (values.colwise().norm().array() < max_norm).replicate(2, 1);
return mask.select(values, 0).rowwise().sum();
}
int main() {
Eigen::Matrix2Xf mat(2,4);
mat << 1, 4, 3, 2,
1, 2, 4, 3;
auto val = computeStuff(mat, 5);
std::cout << val;
return 0;
}
In the above mask will be:
1 1 0 1
1 1 0 1
i.e. the row 1 1 0 1 duplicated once. Then mask.select(values, 0) yields
1 4 0 2
1 2 0 3
so the result will be
7
6
which I think is what you want, if I am understanding the question correctly.
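If the main aim is to avoid temporaries, the same masked sum can also be written as an explicit single pass over the columns. The sketch below uses std::array and std::vector as stand-ins for the Eigen types so that it is self-contained. The names are hypothetical and it has not been profiled against the select-based version.

```cpp
#include <array>
#include <vector>

using Point = std::array<float, 2>;  // one column: {x, y}

// Sums the columns whose Euclidean norm is strictly below maxNorm.
// Compares squared norms to skip the square root; assumes maxNorm >= 0.
Point sumColumnsBelowNorm(const std::vector<Point>& columns, float maxNorm) {
    Point total{0.0f, 0.0f};
    const float limitSquared = maxNorm * maxNorm;
    for (const Point& p : columns) {
        if (p[0] * p[0] + p[1] * p[1] < limitSquared) {
            total[0] += p[0];
            total[1] += p[1];
        }
    }
    return total;
}
```

For the 2 by 4 example matrix above and a limit of 5, the third column (3, 4) has norm exactly 5 and is skipped, so the sum is again (7, 6).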

Convert vector to OpenCV Mat1d in c++

I have a variable M that is a cv::Mat1d matrix. It is made like this:
cv::Mat1d M; // cv::Mat1d is already a typedef for cv::Mat_<double>
It is populated with a bunch of values in some other code that is probably not necessary and it looks like this when I print it:
[-0.9344576352096885;
-0.9344576352096885;
-0.5766199600499906;
0.2846686026510846;
0.9589011777015718;
0.9285453673591227;
0.3137239980297359;
-0.2302718892981206;
-0.2921750731112262;
-0.2206633656711884;
-0.2175072323850435;
-0.1725991485554647;
-0.2140556050785325;
-0.4148403958730175;
-0.4036417215304363;
-0.06016889338878993;
0.3028103268622913;
0.4454375499811856;
0.3803583582813156;
0.3188387192279333;
0.3914868895364941;
0.4488724871465618;
0.2694705005556897;
-0.05248136866304744;
-0.2971598882254832;
-0.3545797186279719;
-0.2294426230118397;
-0.1673776410980104;
-0.2768386357175945;
-0.3276029287776189;
-0.2361695287135101;
-0.06139424097086685;
0.1621769468562924;
0.3275221571852822;
0.3153071221383847;
0.1341365194415481;
-0.04596232030582767;
-0.08961855126383761;
-0.02999823421387905;
-0.03225119207066054]
It is size [1 x 40].
I had to convert M to a vector X to do some calculations on it. I converted it like this:
vector<ld> X;
X.push_back(M.at<double>(0, 0));
for (int i = 1; i < M.rows; i++) {
X.push_back(M.at<double>(i, 0));
}
Now when I print it (After the calculations I did which are irrelevant), it looks like this:
[0 0 0 0 0 -1 -1 0 1 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0 0 0 0 1 0 0]
and is size 40.
How do I convert X back to a [1x40] cv::Mat1d so that it is the same type as M?
Note: If my question/syntax/anything else is appalling, my apologies. I am just now leaving the relative safety of python for c++...
Edit
I think I solved it. In case anyone is curious OR if my solution needs correcting, here it is:
Mat1d test;
for (int i = 0; i < X.size(); i++) {
double new_val = (double) X[i];
test.push_back(new_val);
}
I'm not sure I quite understand what you need. You are doing two things here,
dealing with type conversion between integer and floating point types
reshaping the arrays between row and column vector
cv::Mat::reshape is a little different from numpy's reshape but does the same thing. You could have two Mats sharing the same data, using one as a row view and the other as a column view.
cv::Mat m1_column, m2_row;
m2_row = m1_column.reshape(0, 1);
// first arg: number of channels. 0 means same.
// second arg: number of rows. make it a row vector.
cv::Mat::convertTo is roughly equivalent to numpy's ndarray.astype. it copies the values, converted to the desired type, into another cv::Mat.
cv::Mat m1_floats, m2_ints, m3_floats;
m1_floats.convertTo(m2_ints, CV_8U);
m2_ints.convertTo(m3_floats, CV_32F);
Combine both as you need.
You don't even need to move between column and row format. Just use a single index in Mat::at<>(). It'll run through the matrix as if it were flattened.
You can construct a Mat from a std::vector so that it shares the vector's memory:
std::vector<int> X; // contains values from somewhere, ints should be 32 bit
cv::Mat X_as_a_Mat(1, X.size(), CV_32S, X.data()); // a single-row Mat sharing the std::vector's data
cv::Mat X_as_doubles;
X_as_a_Mat.convertTo(X_as_doubles, CV_64F); // converts data type
This code assumes that int is a 32-bit signed type (CV_32S). I don't know what element type your vector<ld> is supposed to have.

Map matrix elements to {0, 1} values in EJML

I would like to turn a matrix of non-negative integers to a binary matrix. For example, given the following input matrix:
2 3
0 1
It should produce the following output matrix:
1 1
0 1
I think this is similar to a map operation, so pseudocode-wise this operation is equivalent to mapElements(x -> (x > 0) ? 1 : 0) or simply mapNonZeroes(x -> 1).
A possible approach is to unfurl the non-zero elements of the matrix to triplets with the value set to 0/1 and rebuild the matrix from the triplets. Is there a better way to do this?
For me what worked is to directly access the nz_values storage field, and map the values myself.
public void normalizeMatrix(DMatrixSparseCSC m) {
for (int i = 0; i < m.nz_length; i++) {
m.nz_values[i] = Math.min(m.nz_values[i], 1.0);
}
}

Logical Question

Consider a [4x8] matrix "A" and a [1x8] matrix "B". I need to check whether there exists an "X" such that
[A]^T * [X] = [B]^T with X >= 0 { X is a [4x1] matrix, T = transpose }
Now here is the trick/tedious part. The matrix A always has 1 as its diagonal. A11,A22,A33,A44 = 1 This matrix can be considered as two halves with first half being the first 4 columns and the second half being the second 4 columns like something below :
A =
 1 -1 -1 -1  1  0  0  1
-1  1 -1  0  0  1  0  0
-1 -1  1  0  1  0  0  0
-1 -1 -1  1  1  1  0  0
Each row in the first half can have either two or three -1's. If a row has two -1's, the corresponding row in the second half should have one 1; if it has three -1's, the corresponding row in the second half should have two 1's. The overall objective is for each row to sum to 0.
Now B is a [1x8] matrix which can also be considered as two halves as follows:
B = -1 -1 0 0 0 0 1 1
Here there can be one, two, three or four -1's in the first half, and there should be an equal number of 1's in the second half. All combinations should be covered. For example, if there are two -1's in the first half, they can be placed in 4 choose 2 = 6 ways, and for each of them there are 6 ways to place the 1's in the second half, for a total of 6*6 = 36 ways, i.e. 36 different B's if there are two -1's in the first half. The 1's in the matrix A should be placed the same way. The way I could think of doing this is to use a valarray or something of that sort to build the matrices A and B, but I don't know how to proceed.
Now for every A, I've to test it with every combinations of B to see if there exists
[A]^T * [X] = [B]^T
I'm trying to prove a result that I got, so I need to know whether such an X exists or not. I'm very confused about how to implement this. Any suggestions are welcome. This would come under the linear programming concept in math. I would like it in either C++ or Matlab. Other languages are also acceptable, but I'm familiar with only these two. Thanks in advance.
Update:
Here is my answer for this problem :
clear;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%# Generating all possible values of vector B
%# permutations using dec2bin (start from 17 since it's the first solution)
vectorB = str2double(num2cell(dec2bin(17:255)));
%# changing the sign in the first half, then check that the total is zero
vectorB(:,1:4) = - vectorB(:,1:4);
vectorB = vectorB(sum(vectorB,2)==0,:);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%# generate all possible variation of first/second halves
z = -[0 1 1; 1 0 1; 1 1 0; 1 1 1]; n = -sum(z,2);
h1 = {
[ ones(4,1) z(:,1:3)] ;
[z(:,1:1) ones(4,1) z(:,2:3)] ;
[z(:,1:2) ones(4,1) z(:,3:3)] ;
[z(:,1:3) ones(4,1) ] ;
};
h2 = arrayfun(@(i) unique(perms([zeros(1,4-i) ones(1,i)]),'rows'), (1:2)', ...
'UniformOutput',false);
%'# generate all possible variations of complete rows
rows = cell(4,1);
for r=1:4
rows{r} = cell2mat( arrayfun( ...
@(i) [ repmat(h1{r}(i,:),size(h2{n(i)-1},1),1) h2{n(i)-1} ], ...
(1:size(h1{r},1))', 'UniformOutput',false) );
end
%'# generate all possible matrices (pick one row from each to form the matrix)
sz = cellfun(@(M)1:size(M,1), rows, 'UniformOutput',false);
[X1 X2 X3 X4] = ndgrid(sz{:});
matrices = cat(3, ...
rows{1}(X1(:),:), ...
rows{2}(X2(:),:), ...
rows{3}(X3(:),:), ...
rows{4}(X4(:),:) );
matrices = permute(matrices, [3 2 1]); %# 4-by-8-by-104976
A = matrices;
clear matrices X1 X2 X3 X4 rows h1 h2 sz z n r
options = optimset('LargeScale','off','Display','off');
for i = 1:size(A,3),
for j = 1:size(vectorB,1),
X = linprog([],[],[],A(:,:,i)',vectorB(j,:)');
if(size(X,1)>0) %# To check that it's not an empty matrix
if((size(find(X < 0),1)== 0)) %# to check the condition X>=0
if (A(:,:,i)'* X == (vectorB(j,:)'))
X
end
end
end
end
end
I got it with the help of stackoverflow folks. The only problem is that the linprog function throws a lot of exceptions in every iteration along with the answers produced. The exceptions are:
(1) Exiting due to infeasibility: an all-zero row in the constraint matrix does not have a zero in corresponding right-hand-side entry.
(2) Exiting: One or more of the residuals, duality gap, or total relative error has stalled: the primal appears to be infeasible (and the dual unbounded). (The dual residual < TolFun=1.00e-008.)
What does this mean? How can I overcome this?
It is not clear from your question whether you are familiar with systems of linear equations and their solution, or whether that is what you are trying to "invent". See also here for a Matlab-specific explanation.
If you are familiar with that, you should be more clear in your question about what makes your problem different.
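One detail in the posted loop that may be worth a second look: the test A(:,:,i)'* X == vectorB(j,:)' compares floating-point results exactly, and a numerical solver will rarely reproduce the right-hand side bit for bit. A residual check with a tolerance is a common alternative. The sketch below shows the idea in C++ with plain vectors. The function name and the tolerance value are assumptions chosen for illustration, not part of the original code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major: m[row][col]

// Returns true when every entry of x is >= 0 (within tol) and
// A^T * x matches b within tol. A^T * x has one entry per column of A.
bool satisfiesSystem(const Matrix& A, const std::vector<double>& x,
                     const std::vector<double>& b, double tol) {
    for (double xi : x)
        if (xi < -tol) return false;
    for (std::size_t c = 0; c < b.size(); ++c) {
        double sum = 0.0;
        for (std::size_t r = 0; r < A.size(); ++r) sum += A[r][c] * x[r];
        if (std::fabs(sum - b[c]) > tol) return false;
    }
    return true;
}
```

A candidate X returned by the solver would then be accepted if satisfiesSystem(A, X, B, 1e-9) holds, where the tolerance should be chosen to match the solver's own termination tolerance.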

generate a truth table given an input?

Is there a smart algorithm that takes a number of probabilities and generates the corresponding truth table inside a multi-dimensional array or container?
Ex :
n = 3
N : [0 0 0
0 0 1
0 1 0
...
1 1 1]
I can do it with for loops and ifs, but I know my way will be slow and time consuming. So I am asking whether there is an advanced feature that I can use to do that as efficiently as possible.
If we're allowed to fill the table with all zeroes to start, it should be possible to then perform exactly 2^n - 1 fills to set the 1 bits we desire. This may not be faster than writing a manual loop though, it's totally unprofiled.
EDIT:
The line std::vector<std::vector<int> > output(n, std::vector<int>(1 << n)); declares a vector of vectors. The outer vector is length n, and the inner one is 2^n (the number of truth results for n inputs) but I do the power calculation by using left shift so the compiler can insert a constant rather than a call to, say, pow. In the case where n=3 we wind up with a 3x8 vector. I organize it in this way (rather than the customary 8x3 with row as the first index) because we're going to take advantage of a column-based pattern in the output data. Using the vector constructors in this way also ensures that each element of the vector of vectors is initialized to 0. Thus we only have to worry about setting the values we want to 1 and leave the rest alone.
The second set of nested for loops is just used to print out the resulting data when it's done, nothing special there.
The first set of for loops implements the real algorithm. We're taking advantage of a column-based pattern in the output data here. For a given truth table, the left-most column will have two pieces: The first half is all 0 and the second half is all 1. Since we pre-filled zeroes, a single fill of half the column height starting halfway down will apply all the 1s we need. The second column will have rows 1/4th 0, 1/4th 1, 1/4th 0, 1/4th 1. Thus two fills will apply all the 1s we need. We repeat this until we get to the rightmost column in which case every other row is 0 or 1.
We start out saying "I need to fill half the rows at once" (unsigned num_to_fill = 1U << (n - 1);). Then we loop over each column. The first column starts at the position to fill, and fills that many rows with 1. Then we move on to the next column and reduce the fill size by half (now we're filling 1/4th of the rows at once, but we then skip blank rows and fill a second time).
For example:
#include <algorithm> // std::fill_n
#include <iostream>
#include <vector>
int main()
{
const unsigned n = 3;
std::vector<std::vector<int> > output(n, std::vector<int>(1 << n));
unsigned num_to_fill = 1U << (n - 1);
for(unsigned col = 0; col < n; ++col, num_to_fill >>= 1U)
{
for(unsigned row = num_to_fill; row < (1U << n); row += (num_to_fill * 2))
{
std::fill_n(&output[col][row], num_to_fill, 1);
}
}
// These loops just print out the results, nothing more.
for(unsigned x = 0; x < (1U << n); ++x)
{
for(unsigned y = 0; y < n; ++y)
{
std::cout << output[y][x] << " ";
}
std::cout << std::endl;
}
return 0;
}
You can split this problem into two sections by noticing that each of the rows in the matrix represents an n-bit binary number, where n is the number of probabilities[sic].
so you need to:
iterate over all n bit numbers
convert each number into a row of your 2d container
edit:
If you are only worried about runtime, then for constant n you could always precompute the table, but I think you are stuck with O(2^n) complexity whenever it is computed.
You want to write the numbers from 0 to 2^N - 1 in binary numeral system. There is nothing smart in it. You just have to populate every cell of the two dimensional array. You cannot do it faster than that.
You can do it without iterating directly over the numbers. Notice that the rightmost column repeats 0 1, the next column repeats 0 0 1 1, the next one 0 0 0 0 1 1 1 1, and so on. Every column repeats 2^columnIndex zeros followed by 2^columnIndex ones, where columnIndex is counted from 0 starting at the rightmost column.
To elaborate on jk's answer...
If you have n boolean values ("probabilities"?), then you need to
create a truth table array that's 2^n by n (one row per combination of inputs)
loop i from 0 to (2^n-1)
inside that loop, loop j from 0 to n-1
inside THAT loop, set truthTable[i][j] = jth bit of i (e.g. (i >> j) & 1)
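The steps above can be sketched as follows. This is a minimal version assuming n is small enough that 2^n rows fit in memory. The function name makeTruthTable is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Builds a truth table with 2^n rows and n columns.
// Column j of row i holds bit j of i, so column 0 is the least significant bit.
std::vector<std::vector<int>> makeTruthTable(unsigned n) {
    const std::size_t rowCount = std::size_t{1} << n;
    std::vector<std::vector<int>> table(rowCount, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < rowCount; ++i)
        for (unsigned j = 0; j < n; ++j)
            table[i][j] = static_cast<int>((i >> j) & 1U);
    return table;
}
```

Note that this puts the least significant bit in column 0, so row 1 reads 1 0 0 for n = 3. To match the layout in the question, where the rightmost column changes fastest, store the bit at column n - 1 - j instead.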
Karnaugh map or Quine-McCluskey
http://en.wikipedia.org/wiki/Karnaugh_map
http://en.wikipedia.org/wiki/Quine%E2%80%93McCluskey_algorithm
That should head you in the right direction for minimizing the resulting truth table.