Assigning in high-dimensional Xtensor arrays - c++

I am using the Xtensor library for C++.
I have an xt::zeros({n, n, 3}) array and I would like to assign an xt::xarray{ , , } to its (i, j) element, so that it stores a 3-dimensional vector at each (i, j). However, the documentation does not mention assigning values, and in general I am unable to figure out from the documentation how arrays with multiple coordinates work.
What I have been trying is this:
xt::xarray<double> force(Body body1, Body body2){
// Function to calculate the vector force on body2 from
// body1
xt::xarray<double> pos1 = body1.get_position();
xt::xarray<double> pos2 = body2.get_position();
// If the positions are equal return the zero-vector
if(xt::all(xt::equal(pos1, pos2))) {
return xt::zeros<double>({1, 3});
}
xt::xarray<double> r12 = pos2 - pos1;
double dist = xt::linalg::norm(r12);
return -6.67259e-11 * body1.get_mass() * body2.get_mass()/pow(dist, 3) * r12;
}
xt::xarray <double> force_matrix(){
// Initialize the matrix that will hold the force vectors
xt::xarray <double> forces = xt::zeros({self_n, self_n, 3});
// Enter the values into the force matrix
for (int i = 0; i < self_n; ++i) {
for (int j = 0; j < self_n; ++j)
forces({i, j}) = force(self_bodies[i], self_bodies[j]);
}
}
Here I'm trying to assign the output of the force function to the (i, j) entry of the forces array, but that does not seem to work.

In xtensor, assigning and indexing into multidimensional arrays is quite simple. There are two main ways:
Either index with round brackets:
xt::xarray<double> a = xt::zeros<double>({3, 3, 5});
a(0, 1, 3) = 10;
a(1, 1, 0) = -100; ...
or use the xindex type (which is a std::vector at the moment) with square brackets:
xt::xindex idx = {0, 1, 3};
a[idx] = 10;
idx[0] = 1;
a[idx] = -100; ...
Hope that helps.

You can also use view to achieve that.
In the inner loop, you could do:
xt::view(forces, i, j, xt::all()) = a_xarray_with_proper_size;
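To picture what that view assignment addresses, here is a minimal sketch in plain C++ (deliberately not xtensor code). It assumes a row-major layout, where the last index varies fastest, so the three components stored at (i, j) sit next to each other in memory. ForceGrid and its members are hypothetical names made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an (n, n, 3) array kept in one flat buffer.
// Assumption: row-major order, i.e. the last index varies fastest.
struct ForceGrid {
    std::size_t n;
    std::vector<double> data;  // n * n * 3 doubles, zero-initialised

    explicit ForceGrid(std::size_t n_) : n(n_), data(n_ * n_ * 3, 0.0) {}

    // Flat offset of element (i, j, k) in row-major order.
    std::size_t offset(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < n && j < n && k < 3);
        return (i * n + j) * 3 + k;
    }

    // Store a 3-vector at (i, j): fills the slice (i, j, 0..2).
    void set(std::size_t i, std::size_t j, const std::array<double, 3>& v) {
        for (std::size_t k = 0; k < 3; ++k) data[offset(i, j, k)] = v[k];
    }

    // Read a single component.
    double at(std::size_t i, std::size_t j, std::size_t k) const {
        return data[offset(i, j, k)];
    }
};
```

Calling set(i, j, v) here plays the role of assigning a 3-element array to the (i, j, all) slice: one (i, j) pair selects three consecutive values, not a single scalar.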

Related

C++. Low level graphics. Polygon mesh data reading via classes. Need elaboration on the code of the constructor

I have been studying C++ and low-level graphics and currently have questions on this code from https://www.scratchapixel.com/lessons/3d-basic-rendering/ray-tracing-polygon-mesh.
class PolygonMesh : public Object
{
public:
PolygonMesh(uint32_t nfaces, int *fi, int *vi, Vec3f *p) :
numFaces(nf), faceIndex(NULL), vertexIndex(NULL), P(NULL)
{
// compute vertArraySize and maxVertexIndex
uint32_t vertArraySize = 0;
uint32_t maxVertexIndex = 0, index = 0;
for (uint32_t i = 0; i < numFaces; ++i) {
vertArraySize += nv[i];
for (uint32_t j = 0; j < fi[i]; ++j)
if (vi[index + j] > maxVertexIndex)
maxVertexIndex = vi[index + j];
index += fi[i];
}
maxVertexIndex += 1;
pts = std::unique_ptr<Vec3f []>(new point[maxVertexIndex]);
for (uint32_t i = 0; i < maxVertexIndex; ++i) P[i] = p[i];
vertexIndex = std::unique_ptr<uint32_t []>(new int[maxVertexIndex]);
for (uint32_t i = 0; i < maxVertexIndex; ++i) vertexIndex [i] = vi[i];
faceIndex = std::unique_ptr<uint32_t>(new int[numFaces]);
for (uint32_t i = 0; i < numFaces; ++i) faceIndex[i] = fi[i];
};
~PolygonMesh() { /* release memory */ ... }
bool intersect(...) const { ... }
void getSurfaceData(...) const { ... }
uint32_t numFaces; //number of faces
std::unique_ptr<uint32_t []> faceIndex; //face index
std::unique_ptr<uint32_t []> vertexIndex; //vertex index
std::unique_ptr<Vec3f []> P;
};
int main(...)
{
...
uint32_t numFaces = 2;
uint32_t faceIndex[2] = {4, 4};
uint32_t vertexIndex[8] = {0, 1, 2, 3, 0, 3, 4, 5};
Vec3f P[6] = {
Vec3f (-5, -5, 5), Vec3f ( 5, -5, 5),
Vec3f ( 5, -5, -5), Vec3f (-5, -5, -5),
Vec3f (-5, 5, -5), Vec3f (-5, 5, 5),
};
PolygonMesh *mesh = new PolygonMesh(numFaces, faceIndex, vertexIndex, P);
...
}
The website (the author) says :
The point list, face and vertex index array are passed to the constructor of the MeshPolygon class as well as the number of faces. However we don't know how many points are in the point array. To find out, we look for the vertex with the maximum index value in the vertex index array (lines 13-14). The first element of an array in C++ start at 0, therefore the total number of vertices in the point list is the maximum index value plus 1 (line 17).
So, this is about reading data from a polygon mesh. Several questions puzzle me:
Above it says, "...we don't know how many points are in the point array...". Why not just read the size of the P array that is passed from main()? Isn't that the size of the array?
If my understanding is correct, the PolygonMesh(..) constructor makes a deep copy of the data behind the pointers (all the values stored at those addresses). Is that right? I am asking because I have only recently read about smart pointers, std::move and r-value references. Also, the code doesn't std::move the pointers from main into the class object because we want to keep the original data (pointers), right?
Is it correct that, to find the vertex array size for the class object in the above code, we could just read the maximum value in uint32_t vertexIndex[8], i.e. that the maximum vertex index gives the total number of vertices?
I assume that line 11 should be vertIndexArraySize += fi[i]; instead of vertArraySize += nv[i]; because I have no idea where nv comes from or what it means...
Thank you all for your genuine help !

Eigen compound addition gives wrong result

I declared two Eigen::RowVectorXd variables in the program as below. I get wrong results in the compound addition statement sdf_grad+=gradval. Only the first two elements are added and the rest of elements in the sdf_grad vector become 1e19. I don't have any clue why it is happening. Please Help.
Eigen::RowVectorXd sdf_grad(24);
Eigen::VectorXd stress_dof = get_stress_dof();
Eigen::VectorXd strain_dof = get_strain_dof();
for(unsigned int i=0;i!=qn.size(); i++)
{
for(unsigned int j=0; j!=qn.size();j++)
{
double sval = qn[i];
double tval = qn[j];
if(!m_shape->m_set_coordinate)
m_shape->add_coordinates(this->get_xcoords(),this->get_ycoords());
m_shape->update_shapefn(sval,tval);
Eigen::MatrixXd Bs = get_bsmat_local(i,j);
Eigen::Vector3d stress = Bs*stress_dof;
Eigen::MatrixXd Bd = get_bmat(sval,tval);
Eigen::Vector3d strain = Bd* strain_dof;
Eigen::Vector3d cnfn = m_material->get_constitutive_function(stress,strain);
auto WxJ = qw[i] * qw[j] * m_shape->get_detJ();
double delval=cnfn.norm();
objval+=delval*WxJ;
//SETTING GRADIENT OF STRESS DOF
Eigen::MatrixXd CxBs = m_material->get_cmat()*Bs;
Eigen::MatrixXd Bstrans = CxBs.transpose();
Eigen::RowVectorXd gradval= (-WxJ/delval)*Bstrans*cnfn;
sdf_grad+= gradval ; // Wrong Result.
}
}
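You did not zero-initialize your vector. Eigen::RowVectorXd sdf_grad(24); only allocates 24 coefficients and leaves them uninitialized, so += accumulates onto garbage values. Write this instead of the first line: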
Eigen::RowVectorXd sdf_grad = Eigen::RowVectorXd::Zero(24);

Assigning the function output to a variable

I have a function which returns the address of a 4x2 matrix whose name is 'a'.
This function computes the elements of 'a' matrix inside and returns the address of the matrix. When I use that function, I want to assign its output to a matrix called 'a1' but when I do so, 'a1' becomes a zero matrix. However, when I assign the output to the same 'a' matrix, everything works fine. Can anyone help me? The code is written on Arduino IDE.
double a[4][2], a1[4][2];
double T0E[4][4]={
{0.1632, -0.3420, 0.9254, 297.9772},
{0.0594, 0.9397, 0.3368, 108.4548},
{-0.9848, 0, 0.1736, -280.5472},
{0, 0, 0, 1}
};
const int axis_limits[4][2]=
{
{ -160, 160 },
{ -135, 60 },
{ -135, 135 },
{ -90, 90 }
};
const unsigned int basex = 50, basez = 100, link1 = 200, link2 = 200, link3=30, endeff=link3+50;
double *inversekinematic(double target[4][4])
{
// angle 1
a[0][0] = -asin(target[0][1]);
a[0][1] = a[0][0];
if (a[0][0]<axis_limits[0][0] || a[0][0]>axis_limits[0][1] || isnan(a[0][0]))
{
bool error=true;
}
// angle 2
double A = sqrt(pow(target[0][3]-cos(a[0][0])*endeff*target[2][2], 2) + pow(target[1][3]-sin(a[0][0])*endeff*target[2][2], 2));
double N = (A - basex) / link1;
double M = -(target[2][3]-endeff*target[2][0] - basez) / link2;
double theta = acos(N / sqrt(pow(N, 2) + pow(M, 2)));
a[1][0] = theta + acos(sqrt(pow(N, 2) + pow(M, 2)) / 2);
a[1][1] = theta - acos(sqrt(pow(N, 2) + pow(M, 2)) / 2);
// angle 3
for (int i = 0; i <= 1; i++)
{
a[2][i] = {asin(-(target[2][3]-endeff*target[2][0]-basez)/link2-sin(a[1][i]))-a[1][i]};
}
// angle 4
for(int i = 0; i <=1; i++)
{
a[3][i] = {-asin(target[2][0])-a[1][i]-a[2][i]};
}
return &a[4][2];
}
void setup(){
Serial.begin(9600);
}
void loop() {
a1[4][2]={*inversekinematic(T0E)};
}
When you type return &a[4][2]; you are returning the address of the 3rd element of the 5th row. This is out of bounds, since C++ uses zero-based indexing and the array was declared as double a[4][2];. I think what you want is to return the address of the first element, e.g. return &a[0][0];, so that the returned double* points at the start of the matrix.
Also, you're doing lots of strange things, like declaring the parameter double target[4][4] with a size and using initializer lists to assign single elements, which look unusual to me.
I'll try to be a little more detailed. In C/C++, built-in arrays cannot be assigned: in most expressions an array name decays to a pointer to its first element, and a = b; does not compile for two arrays. Likewise, a1[4][2] = ... in loop() writes to a single (out-of-bounds) element, not to the whole matrix. If you work with pointers instead, assigning one to the other just makes both point to the same data in memory. To copy the contents you have to copy the elements with loops, or use memcpy(dest, src, size). For example, to copy the contents of double a[4][2] to double b[4][2], you would use something like memcpy(b, a, sizeof(double) * 8);.
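As a minimal sketch of the memcpy approach, assuming both matrices really are double[4][2] (copy_mat42 is a hypothetical helper name, not something from the question's code):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copy every element of one 4x2 matrix into another.
// Taking the arrays by reference keeps their size in the type, so
// sizeof(src) is the size of the whole matrix (8 doubles), not of a pointer.
void copy_mat42(double (&dst)[4][2], const double (&src)[4][2]) {
    // memcpy must not be used when source and destination overlap.
    assert(&dst[0][0] != &src[0][0]);
    std::memcpy(dst, src, sizeof(src));
}
```

With this, copy_mat42(a1, a); leaves a1 holding its own copy of the values in a, and later changes to a do not affect a1.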
Two points:
1. your code says the function inversekinematic() returns a pointer to a double, not an array.
2. you return a pointer to a double, but it's always the same address.
Maybe typedefs will help simplify the code?
typedef double Mat42[4][2];
Mat42 a, a1;
Mat42 *inversekinematic(double target[4][4])
{
// ...
return &a;
}
But, for the code you've shown, I don't see why you need to return the address of a fixed global value. Perhaps your real code might return the address of 'a' or 'a1', but if it doesn't ...
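Another option, sketched here under the assumption that wrapping the matrix in a struct is acceptable for your sketch: a struct that contains the array can be returned and assigned by value, so no pointer to a global is needed. JointAngles, example_solution and element are hypothetical names, and the values filled in are placeholders, not the inverse kinematics from the question.

```cpp
#include <cassert>

// Hypothetical wrapper: unlike a bare array, a struct holding the
// 4x2 array can be returned from a function and copied with '='.
struct JointAngles {
    double v[4][2];
};

// Placeholder computation standing in for the real inverse kinematics:
// row r gets the pair (offset + r, offset - r).
JointAngles example_solution(double offset) {
    JointAngles m{};
    for (int r = 0; r < 4; ++r) {
        m.v[r][0] = offset + r;
        m.v[r][1] = offset - r;
    }
    return m;  // returned by value; the caller receives its own copy
}

// Bounds-checked read; valid indices are rows 0..3 and columns 0..1.
double element(const JointAngles& m, int r, int c) {
    assert(r >= 0 && r < 4 && c >= 0 && c < 2);
    return m.v[r][c];
}
```

The call site then becomes JointAngles a1 = example_solution(10.0);, and a1 keeps its values regardless of what happens to any other matrix afterwards.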

how could I access the element of a high-dimensional matrix in OpenCV?

I am trying to use a 4-d matrix in OpenCV, the initialization part looks like this:
int feature_points_size[] = {bincellDim.x , bincellDim.y , bincellDim.z , 100};
feature_points.create(4 , feature_points_size , CV_64F);
But the library doesn't allow me to access the elements of feature_points with 'at' like this:
feature_points.at<double>(k , j , i , l) = stickfea_code.at<double>(l , 0);
feature_points.at<double>(k , j , i , l + 50) = countfea_code.at<double>(l , 0);
It seems Mat::at<> doesn't have a version for 4 inputs.
What's the best practice for accessing its elements?
Thanks in advance!
cv::Mat::at<> does, in fact, have an n-dim overload: you will need to use the T& Mat::at(const int* idx) version.
Alternatively, just write your own (external) wrapper for it that you might use like this:
at4D<double>(feature_points,k,j,i,l);
Just an example following the answer of @Adi-Shavit:
std::vector<int> dims = {1, 3, 700, 400};
cv::Mat mat4d(dims, CV_32FC1); // 4-D matrix with the sizes given in dims
int p[4]; // index array passed to at<>()
p[0] = 0;
for (int hi = 0; hi < dims[3]; hi++) {
    p[3] = hi;
    for (int wi = 0; wi < dims[2]; wi++) {
        p[2] = wi;
        for (int ci = 0; ci < dims[1]; ci++) {
            p[1] = ci;
            float value = mat4d.at<float>(&p[0]); // read
            mat4d.at<float>(&p[0]) = some_new_value; // write
        }
    }
}

Tensor Product Algorithm Optimization

double data[12] = {1, z, z^2, z^3, 1, y, y^2, y^3, 1, x, x^2, x^3};
double result[64] = {1, z, z^2, z^3, y, zy, (z^2)y, (z^3)y, y^2, z(y^2), (z^2)(y^2), (z^3)(y^2), y^3, z(y^3), (z^2)(y^3), (z^3)(y^3), x, zx, (z^2)x, (z^3)x, yx, zyx, (z^2)yx, (z^3)yx, (y^2)x, z(y^2)x, (z^2)(y^2)x, (z^3)(y^2)x, (y^3)x, z(y^3)x, (z^2)(y^3)x, (z^3)(y^3)x, x^2, z(x^2), (z^2)(x^2), (z^3)(x^2), y(x^2), zy(x^2), (z^2)y(x^2), (z^3)y(x^2), (y^2)(x^2), z(y^2)(x^2), (z^2)(y^2)(x^2), (z^3)(y^2)(x^2), (y^3)(x^2), z(y^3)(x^2), (z^2)(y^3)(x^2), (z^3)(y^3)(x^2), x^3, z(x^3), (z^2)(x^3), (z^3)(x^3), y(x^3), zy(x^3), (z^2)y(x^3), (z^3)y(x^3), (y^2)(x^3), z(y^2)(x^3), (z^2)(y^2)(x^3), (z^3)(y^2)(x^3), (y^3)(x^3), z(y^3)(x^3), (z^2)(y^3)(x^3), (z^3)(y^3)(x^3)};
What is the fastest way (fewest operations) to produce result given data? Assume that data is variable in size, but its size is always a multiple of 4 (e.g., 4, 8, 12, etc.).
No Boost. I am trying to keep my dependencies small. STL algorithms are OK.
HINT: the result array size should always be 4^(data size / 4) (e.g., 4, 16, 64, etc.).
BONUS: If you can compute result just given x, y, z
Additional examples:
double data[4] = {1, z, z^2, z^3};
double result[4] = {1, z, z^2, z^3};
double data[8] = {1, z, z^2, z^3, 1, y, y^2, y^3};
double result[16] = { ... };
I chose the accepted answer code after running this benchmark: https://gist.github.com/1232406. Basically, the top two codes were run and the one with the smallest execution time won.
#include <array>
#include <vector>
void Tensor(std::vector<double>& result, double x, double y, double z) {
    result.resize(64); // almost a no-op if already the right size
    double tz = z*z;
    double ty = y*y;
    double tx = x*x;
    // powers of z, then of y, then of x (same layout as data in the question)
    std::array<double, 12> data = {1, z, tz, tz*z, 1, y, ty, ty*y, 1, x, tx, tx*x};
    std::vector<double>::iterator iter = result.begin();
    for (int xi = 0; xi < 4; ++xi) {
        for (int yi = 0; yi < 4; ++yi) {
            double xy = data[4+yi]*data[8+xi];
            *iter = xy; // a smart compiler can do these four in parallel
            *(++iter) = z*xy;
            *(++iter) = data[2]*xy;
            *(++iter) = data[3]*xy;
            ++iter; // move on to the start of the next group of four
        }
    }
}
There's probably at least one bug in here somewhere, but it should be fast, with no dependencies (outside of std::vector/std::array), and it just takes x, y, z. I avoided recursion though, so it only works for 3 in/64 out. The concept can be applied to any number of parameters though. You just have to instantiate it yourself.
A good compiler will autovectorize this; I guess none of my compilers are good:
void tensor(const double *restrict data,
int dimensions,
double *restrict result) {
result[0] = 1.0;
for (int i = 0; i < dimensions; i++) {
for (int j = (1 << (i * 2)) - 1; j > -1; j--) {
double alpha = result[j];
{
double *restrict dst = &result[j * 4];
const double *restrict src = &data[(dimensions - 1 - i) * 4];
for (int k = 0; k < 4; k++) dst[k] = alpha * src[k];
}
}
}
}
You should use a dynamic-programming approach, that is, reuse previous results. For example, keep the y^2 result and use it when computing (y^2)z instead of computing it again.
#include <vector>
#include <cstddef>
#include <cmath>
void Tensor(std::vector<double>& result, const std::vector<double>& variables, size_t index)
{
double p1 = variables[index];
double p2 = p1*p1;
double p3 = p1*p2;
if (index == variables.size() - 1) {
result.push_back(1);
result.push_back(p1);
result.push_back(p2);
result.push_back(p3);
} else {
Tensor(result, variables, index+1);
ptrdiff_t size = result.size();
for(int j=0; j<size; ++j)
result.push_back(result[j]*p1);
for(int j=0; j<size; ++j)
result.push_back(result[j]*p2);
for(int j=0; j<size; ++j)
result.push_back(result[j]*p3);
}
}
std::vector<double> Tensor(const std::vector<double>& params) {
std::vector<double> result;
size_t rsize = size_t(1) << (2*params.size()); // 4^n entries for n parameters
result.reserve(rsize);
Tensor(result, params, 0); // start the recursion at the first parameter
return result;
}
int main() {
std::vector<double> params;
params.push_back(3.1415926535);
params.push_back(2.7182818284);
params.push_back(42);
params.push_back(65536);
std::vector<double> result = Tensor(params);
}
I verified that this one compiles and runs (http://ideone.com/IU1eQ). It runs fast, with no dependencies (outside of std::vector). It also takes any number of parameters. Since calling the recursive form is awkward, I made a wrapper. It makes one function call for each parameter, and one dynamic memory allocation (in the wrapper).
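For comparison, the same idea can be written without recursion. This is only a sketch under the same assumptions as the recursive version above (the last parameter varies fastest in the output); tensor_powers is a hypothetical name, and it has not been benchmarked against the other answers.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical iterative variant: builds all products x^a * y^b * z^c ...
// with exponents 0..3, reusing the entries computed so far.
std::vector<double> tensor_powers(const std::vector<double>& vars) {
    // Guard the shift below against overflow for absurdly long inputs.
    assert(2 * vars.size() < sizeof(std::size_t) * 8);

    std::vector<double> result;
    result.reserve(std::size_t{1} << (2 * vars.size()));  // 4^n entries
    result.push_back(1.0);

    // Walk from the last variable (fastest-varying) to the first.
    for (std::size_t v = vars.size(); v-- > 0;) {
        const double p1 = vars[v];
        const double p2 = p1 * p1;
        const double p3 = p2 * p1;
        const std::size_t block = result.size();  // entries built so far
        // Append the existing block scaled by p1, then by p2, then by p3.
        for (double p : {p1, p2, p3}) {
            for (std::size_t j = 0; j < block; ++j) {
                result.push_back(result[j] * p);
            }
        }
    }
    return result;
}
```

Called as tensor_powers({x, y, z}), the entry at index 16*a + 4*b + c holds x^a * y^b * z^c, which matches the ordering of result in the question.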
You should look for Pascal's pyramid to get fast solution. Useful link 1, useful link 2, useful link 3 and useful link 4.
One more thing: as I see it, this would be the basis of a finite element solver. Writing your own BLAS routines is usually not a good idea. Do not reinvent the wheel! I think you should use a BLAS library like Intel MKL or a CUDA-based BLAS.