Inlined function to return nested array value not performing as expected - c++

I want to inline the function MyClass::at(), but the performance isn't what I expect.
MyClass.h:
#include <algorithm>
#include <chrono>
#include <iostream>
#include <vector>
#include <string>
// Making this a lot shorter than in my actual program
std::vector<std::vector<int>> arrarr =
{
{ 1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48},
{24, 47, 32, 60, 99, 3, 45, 2, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50},
{32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70},
{67, 26, 20, 68, 2, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21},
{24, 55, 58, 5, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72},
{21, 36, 23, 9, 75, 0, 76, 44, 20, 45, 35, 14, 0, 61, 33, 97, 34, 31, 33, 95},
{78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 3, 80, 4, 62, 16, 14, 9, 53, 56, 92},
{16, 39, 5, 42, 96, 35, 31, 47, 55, 58, 88, 24, 0, 17, 54, 24, 36, 29, 85, 57},
{86, 56, 0, 48, 35, 71, 89, 7, 5, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58},
{19, 80, 81, 68, 5, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 4, 89, 55, 40},
{ 4, 52, 8, 83, 97, 35, 99, 16, 7, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66},
{88, 36, 68, 87, 57, 62, 20, 72, 3, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69},
{ 4, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36},
{20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 4, 36, 16},
{20, 73, 35, 29, 78, 31, 90, 1, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 5, 54},
{ 1, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 1, 89, 19, 67, 48},
};
class MyClass
{
public:
MyClass(std::vector<std::vector<int>> arr) : arr(arr)
{
rows = arr.size();
cols = arr.at(0).size();
}
inline auto at(int row, int col) const { return arr[row][col]; }
void arithmetic(int n) const;
private:
std::vector<std::vector<int>> arr;
int rows;
int cols;
};
MyClass.cpp:
void MyClass::arithmetic(int n) const
{
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::duration;
using std::chrono::milliseconds;
auto t1 = high_resolution_clock::now();
int highest_product = 0;
for (auto y = 0; y < rows; ++y)
{
for (auto x = 0; x < cols; ++x)
{
// Horizontal product
if (x + n < cols)
{
auto product = 1;
for (auto i = 0; i < n; ++i)
{
product *= at(y, x + i);
}
highest_product = std::max(highest_product, product);
}
}
}
auto t2 = high_resolution_clock::now();
duration<double, std::milli> ms_double = t2 - t1;
std::cout << ms_double.count() << "ms\n";
// arithmetic() returns void, so print the result instead of returning it
std::cout << highest_product << "\n";
}
What I want to know is why I get better performance when I replace product *= at(y, x + i); with product *= arr[y][x+i];. When I test the first case, the timing on my large array is roughly 6.7ms, while the second case takes 5.3ms. I thought that once the function was inlined, it would be the same implementation as the second case.

Member functions defined directly in the class definition (typically in a header file) are implicitly inline, so the inline keyword is redundant here. Moreover, inline does not guarantee that a call is inlined; it is only a hint to the compiler. The keyword mainly matters at link time, where it avoids multiple-definition errors. Functions that are not marked inline can still be inlined if the compiler can see the body of the target function (i.e. it is in the same translation unit, or link-time optimization is applied). For more information about this, please read Why are class member functions inlined?
Note that inlining is typically performed in the optimization step of compilers (e.g. -O1 for GCC/Clang or /O1 for MSVC). Without optimizations, most compilers will not inline the function.
Using std::vector<std::vector<int>> is not efficient because it is not a contiguous data structure and it requires two indirections to access an item. Two sub-vectors that are logically adjacent can be stored far apart in memory, likely causing more cache misses (and/or thrashing due to alignment). Please consider using one big flattened array and accessing items with y*cols+x, where cols is the size of the sub-vectors (20 here). Alternatively, an int[16][20] array should do the job well if the size is fixed at compile time.
MyClass(std::vector<std::vector<int>> arr) causes the argument to be copied (along with all the sub-vectors). Please consider taking a const std::vector<std::vector<int>>& instead.
While at is convenient for checking bounds at runtime, this feature can strongly decrease performance. Consider using operator[] if you do not need the check. You can combine assertions with a flattened array to get fast code in release builds and safe code in debug builds (assertions are disabled by defining the NDEBUG macro).
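As a rough illustration of the flattened-array suggestion, here is a minimal sketch. The names FlatGrid and highest_horizontal_product are hypothetical (they do not come from the question), and the loop bound x + n <= cols is my own choice, not necessarily the asker's intent. The assertions compile away when NDEBUG is defined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class: a row-major grid stored in one contiguous buffer.
class FlatGrid {
public:
    FlatGrid(std::size_t rows, std::size_t cols, std::vector<int> data)
        : rows_(rows), cols_(cols), data_(std::move(data)) {
        assert(data_.size() == rows_ * cols_);
    }

    // Bounds are checked only in debug builds (assert is a no-op with NDEBUG).
    int at(std::size_t row, std::size_t col) const {
        assert(row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};

// Largest product of n horizontally adjacent cells (0 if no window fits).
int highest_horizontal_product(const FlatGrid& g, std::size_t n) {
    int highest = 0;
    for (std::size_t y = 0; y < g.rows(); ++y) {
        for (std::size_t x = 0; x + n <= g.cols(); ++x) {
            int product = 1;
            for (std::size_t i = 0; i < n; ++i) {
                product *= g.at(y, x + i);
            }
            highest = std::max(highest, product);
        }
    }
    return highest;
}
```

Because at() is defined inside the class body it is implicitly inline, and with optimizations enabled a call like g.at(y, x + i) should normally compile down to a single indexed load.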

Related

Compile time efficient remove duplicates from a boost::hana tuple

I use the boost::hana to_map function to remove duplicates from a boost::hana tuple of types. See it at the compiler explorer. The code works very well but takes very long to compile (~10s). I wonder if there is a faster solution that is compatible with boost::hana tuples.
#include <boost/hana/map.hpp>
#include <boost/hana/pair.hpp>
#include <boost/hana/type.hpp>
#include <boost/hana/basic_tuple.hpp>
#include <boost/hana/size.hpp>
using namespace boost::hana;
constexpr auto to_type_pair = [](auto x) { return make_pair(typeid_(x), x); };
template <class Tuple>
constexpr auto remove_duplicate_types(Tuple tuple)
{
return values(to_map(transform(tuple, to_type_pair)));
}
int main(){
auto tuple = make_basic_tuple(
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60
, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70
);
auto noDuplicatesTuple = remove_duplicate_types(tuple);
// Should return 1 since there is only one distinct type in the tuple
return size(noDuplicatesTuple);
}
I haven't run any benchmarks, but your example does not appear to take 10 seconds on Compiler Explorer. However, I can explain why it is a relatively slow solution and suggest an alternative, which assumes you are only interested in getting a unique list of types and not in retaining any run-time information in your result.
Creating large tuples and/or instantiating function templates that have large tuples in their prototypes are expensive compile-time operations.
Just your call to transform instantiates a lambda for each element which in turn instantiates pair. The input/output of this call are both large tuples.
The call to to_map makes an empty map and recursively calls insert for each element each time making a new map, but in this simple case the intermediate result will always be hana::map<int>. I'm willing to bet that this is exploding your compile-times if your actual use case is non-trivial. (It was certainly an issue when we were implementing hana::map so we made hana::make_map avoid this since it has all of its inputs up front).
On top of all this, there is a significant penalty for these large function types being used in run-time code. You might notice a difference if you wrapped the operations in decltype and only used the resulting type.
Alternatively, raw template metaprogramming can sometimes compile faster than function-template-based metaprogramming. Here is an example for your use case:
#include <boost/hana/basic_tuple.hpp>
#include <boost/mp11/algorithm.hpp>
#include <type_traits> // std::decay_t
namespace hana = boost::hana;
using namespace boost::mp11;
int main() {
auto tuple = hana::make_basic_tuple(
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60
, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70
);
hana::basic_tuple<int> no_dups = mp_unique<std::decay_t<decltype(tuple)>>{};
}
https://godbolt.org/z/EnTWf6
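To make the "raw template metaprogramming" idea more concrete, here is a small standard-library-only sketch of removing duplicate types from a type list. The names append_unique, unique_impl and unique_types_t are hypothetical helpers written for this illustration; this is not how Boost.Mp11 implements mp_unique, and it uses std::tuple rather than hana::basic_tuple. It requires C++17 for the fold expression.

```cpp
#include <tuple>
#include <type_traits>

// Append T to the tuple unless an identical type is already present.
template <class Tuple, class T>
struct append_unique;

template <class... Ts, class T>
struct append_unique<std::tuple<Ts...>, T> {
    using type = std::conditional_t<(std::is_same_v<T, Ts> || ...),
                                    std::tuple<Ts...>,
                                    std::tuple<Ts..., T>>;
};

// Fold append_unique over the remaining types, left to right.
template <class Result, class... Rest>
struct unique_impl {
    using type = Result;  // no types left: Result is the answer
};

template <class Result, class T, class... Rest>
struct unique_impl<Result, T, Rest...>
    : unique_impl<typename append_unique<Result, T>::type, Rest...> {};

template <class Tuple>
struct unique_types;

template <class... Ts>
struct unique_types<std::tuple<Ts...>> : unique_impl<std::tuple<>, Ts...> {};

// Convenience alias: the input tuple type with duplicate types removed.
template <class Tuple>
using unique_types_t = typename unique_types<Tuple>::type;
```

Only types are manipulated here; no tuple object and no lambda is ever instantiated, which is the property the answer relies on. For example, unique_types_t<std::tuple<int, int, char, int>> is std::tuple<int, char>.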

Faster tetrahedron-tetrahedron intersection

For one project of mine I require reliable detection of intersection between two tetrahedrons in 3D space. I do not need the intersection points/lines/faces, just to know whether an intersection is present or not. Touching is considered an intersection too, but a common triangle face is not. After quite a struggle to achieve this as fast as possible, my solution boiled down to this horror:
let have tetrahedrons v0,v1
each tetrahedron has 4 triangles t[4] where each triangle has 3 points p0,p1,p2 and normal vector n.
compute planes of all 4 sides of both tetrahedrons
so any point p on plane is given by equation
dot(p,n) + d = 0
where n is normal of the plane. As that is known this boils to computing d
D0[i] = -dot(v0.t[i].p0,v0.t[i].n)
D1[i] = -dot(v1.t[i].p0,v1.t[i].n)
i = { 0,1,2,3 }
for each triangle of each tetrahedron
test any combination of triangle vs triangle intersection between v0,v1
so just loop between all 16 combinations and use triangle vs triangle intersection.
The triangle v0.t[i] vs triangle v1.t[j] intersection boils down to this:
compute intersection between planes
this is obviously a line (for non-parallel planes), handled here as a ray, so a simple cross product of the plane normals gives the ray direction
dir = cross(v0.t[i].n,v1.t[j].n)
now it is just matter of finding intersection point belonging to both planes. Exploiting determinant computation directly from the cross product of the normals the ray computation is like this:
// determinants
det=vector_len2(dir);
d0=D0[i]/det;
d1=D1[j]/det;
// position
pos = d0*cross(dir,v1.t[j].n) + d1*cross(v0.t[i].n,dir);
for more info see:
SO/SE: Line of intersection between two planes
Wolfram: Plane-Plane Intersection
compile signed distance intervals of triangle ray intersection for each triangle
so simply compute intersection between ray and each edge line of triangle remembering min and max distance from pos. We do not need the actual point just the scalar distance from pos which is the parameter returned by line/ray intersection.
check if ranges of both triangles overlaps or not
if they overlap, then v0,v1 intersect ... if no overlap occurs in any of the 16 tests, then v0,v1 do not intersect.
As you can see, it is a lot of stuff to compute. My linear algebra and vector math knowledge is limited to the things I use, so there is a high chance that a much better approach exists. I tried a lot of things to simplify this without any luck (like using a bbox or bsphere, or using a simpler test that exploits the fact that both the ray and the triangle edges lie in the same plane, etc.) but the result was either slower or even wrong (not handling edge cases correctly).
Here is my actual C++ implementation:
//---------------------------------------------------------------------------
bool tetrahedrons::intersect_lin_ray(double *a0,double *a1,double *b0,double *db,double &tb)
{
int i0,i1;
double da[3],ta,q;
vector_sub(da,a1,a0); ta=0.0; tb=0.0; i0=0; i1=1;
if (fabs(da[i0])+fabs(db[i0])<=_zero) i0=2;
else if (fabs(da[i1])+fabs(db[i1])<=_zero) i1=2;
q=(da[i0]*db[i1])-(db[i0]*da[i1]);
if (fabs(q)<=_zero) return 0; // no intersection
// intersection ta,tb parameters
ta=divide(db[i0]*(a0[i1]-b0[i1])+db[i1]*(b0[i0]-a0[i0]),q);
tb=divide(da[i0]*(a0[i1]-b0[i1])+da[i1]*(b0[i0]-a0[i0]),q);
if ((ta<0.0)||(ta>1.0)) return 0; // inside line check
return 1;
}
//---------------------------------------------------------------------------
bool tetrahedrons::intersect_vol_vol(_vol4 &v0,_vol4 &v1) // tetrahedron v0 intersect tetrahedron v1 ?
{
int i,j,_ti,_tj;
_fac3 *f0,*f1;
double pos[3],dir[3],p[3],det,D0[4],D1[4],d0,d1,t,ti0,ti1,tj0,tj1;
// planes offset: dot(p,v0.t[i].n)+D0[i] = 0 , dot(p,v1.t[j].n)+D1[j] = 0
for (i=0;i<4;i++)
{
D0[i]=-vector_mul(pnt.pnt.dat+fac.dat[v0.t[i]].p0,fac.dat[v0.t[i]].n);
D1[i]=-vector_mul(pnt.pnt.dat+fac.dat[v1.t[i]].p0,fac.dat[v1.t[i]].n);
}
// plane plane intersection -> ray
for (i=0;i<4;i++)
for (j=0;j<4;j++)
{
f0=fac.dat+v0.t[i];
f1=fac.dat+v1.t[j];
// no common vertex
if ((f0->p0==f1->p0)||(f0->p0==f1->p1)||(f0->p0==f1->p2)) continue;
if ((f0->p1==f1->p0)||(f0->p1==f1->p1)||(f0->p1==f1->p2)) continue;
if ((f0->p2==f1->p0)||(f0->p2==f1->p1)||(f0->p2==f1->p2)) continue;
// direction
vector_mul(dir,f0->n,f1->n);
det=vector_len2(dir);
if (fabs(det)<=_zero) continue; // parallel planes?
d0=D0[i]/det;
d1=D1[j]/det;
// position
vector_mul(p,dir,f1->n); vector_mul(pos,p,d0);
vector_mul(p,f0->n,dir); vector_mul(p,p,d1);
vector_add(pos,pos,p);
// compute intersection edge points
_ti=1; _tj=1;
if (intersect_lin_ray(pnt.pnt.dat+f0->p0,pnt.pnt.dat+f0->p1,pos,dir,t)){ if (_ti) { _ti=0; ti0=t; ti1=t; } if (ti0>t) ti0=t; if (ti1<t) ti1=t; }
if (intersect_lin_ray(pnt.pnt.dat+f0->p1,pnt.pnt.dat+f0->p2,pos,dir,t)){ if (_ti) { _ti=0; ti0=t; ti1=t; } if (ti0>t) ti0=t; if (ti1<t) ti1=t; }
if (intersect_lin_ray(pnt.pnt.dat+f0->p2,pnt.pnt.dat+f0->p0,pos,dir,t)){ if (_ti) { _ti=0; ti0=t; ti1=t; } if (ti0>t) ti0=t; if (ti1<t) ti1=t; }
if (intersect_lin_ray(pnt.pnt.dat+f1->p0,pnt.pnt.dat+f1->p1,pos,dir,t)){ if (_tj) { _tj=0; tj0=t; tj1=t; } if (tj0>t) tj0=t; if (tj1<t) tj1=t; }
if (intersect_lin_ray(pnt.pnt.dat+f1->p1,pnt.pnt.dat+f1->p2,pos,dir,t)){ if (_tj) { _tj=0; tj0=t; tj1=t; } if (tj0>t) tj0=t; if (tj1<t) tj1=t; }
if (intersect_lin_ray(pnt.pnt.dat+f1->p2,pnt.pnt.dat+f1->p0,pos,dir,t)){ if (_tj) { _tj=0; tj0=t; tj1=t; } if (tj0>t) tj0=t; if (tj1<t) tj1=t; }
if ((_ti)||(_tj)) continue;
if ((ti0>=tj0)&&(ti0<=tj1)) return 1;
if ((ti1>=tj0)&&(ti1<=tj1)) return 1;
if ((tj0>=ti0)&&(tj0<=ti1)) return 1;
if ((tj1>=ti0)&&(tj1<=ti1)) return 1;
}
return 0;
};
//---------------------------------------------------------------------------
It is a part of a much larger program. The _zero is just a threshold for zero based on the minimum detail size. _fac3 is a triangle and _vol4 is a tetrahedron. Both points and triangles are indexed from the pnt.pnt.dat[] and fac.dat[] dynamic lists. I know it is weird, but there is a lot going on behind it (like spatial subdivision into segments and more, to speed up the processes this is used for).
vector_mul is overloaded: as used in the code above, vector_mul(a,b,c) computes a=cross(b,c) when c is a vector and scales b by c when c is a scalar, while the two-argument form vector_mul(a,b) returns the dot product of a and b.
I would rather avoid any precomputed values for each triangle/tetrahedron, as even now the classes hold quite a lot of info already (like parentship, usage count etc). And as I am bound to Win32, the memory is limited to only around 1.2 GB, so any additional data will limit the maximum usable mesh size.
So what I am looking for is any of these:
some math or coding trick to speed current approach if possible
different faster approach for this
I am bound to BDS2006 Win32 C++ and would rather avoid using 3rd-party libs.
[Edit1] sample data
Here is tetrahedronized pointcloud as a sample data for testing:
double pnt[192]= // pnt.pnt.dat[pnt.n*3] = { x,y,z, ... }
{
-0.227,0.108,-0.386,
-0.227,0.153,-0.386,
0.227,0.108,-0.386,
0.227,0.153,-0.386,
0.227,0.108,-0.431,
0.227,0.153,-0.431,
-0.227,0.108,-0.431,
-0.227,0.153,-0.431,
-0.227,0.108,0.429,
-0.227,0.153,0.429,
0.227,0.108,0.429,
0.227,0.153,0.429,
0.227,0.108,0.384,
0.227,0.153,0.384,
-0.227,0.108,0.384,
-0.227,0.153,0.384,
-0.023,0.108,0.409,
-0.023,0.153,0.409,
0.023,0.108,0.409,
0.023,0.153,0.409,
0.023,0.108,-0.409,
0.023,0.153,-0.409,
-0.023,0.108,-0.409,
-0.023,0.153,-0.409,
-0.318,0.210,0.500,
-0.318,0.233,0.500,
0.318,0.210,0.500,
0.318,0.233,0.500,
0.318,0.210,-0.500,
0.318,0.233,-0.500,
-0.318,0.210,-0.500,
-0.318,0.233,-0.500,
-0.273,-0.233,0.432,
-0.273,0.222,0.432,
-0.227,-0.233,0.432,
-0.227,0.222,0.432,
-0.227,-0.233,0.386,
-0.227,0.222,0.386,
-0.273,-0.233,0.386,
-0.273,0.222,0.386,
0.227,-0.233,0.432,
0.227,0.222,0.432,
0.273,-0.233,0.432,
0.273,0.222,0.432,
0.273,-0.233,0.386,
0.273,0.222,0.386,
0.227,-0.233,0.386,
0.227,0.222,0.386,
-0.273,-0.233,-0.386,
-0.273,0.222,-0.386,
-0.227,-0.233,-0.386,
-0.227,0.222,-0.386,
-0.227,-0.233,-0.432,
-0.227,0.222,-0.432,
-0.273,-0.233,-0.432,
-0.273,0.222,-0.432,
0.227,-0.233,-0.386,
0.227,0.222,-0.386,
0.273,-0.233,-0.386,
0.273,0.222,-0.386,
0.273,-0.233,-0.432,
0.273,0.222,-0.432,
0.227,-0.233,-0.432,
0.227,0.222,-0.432,
};
struct _fac3 { int p0,p1,p2; double n[3]; };
_fac3 fac[140]= // fac.dat[fac.num] = { p0,p1,p2,n(x,y,z), ... }
{
78, 84, 96, 0.600,-0.800,-0.000,
72, 84, 96, -0.844,-0.003,-0.537,
72, 78, 84, -0.000,1.000,-0.000,
72, 78, 96, -0.000,-0.152,0.988,
6, 84, 96, -0.859,0.336,-0.385,
6, 78, 96, 0.597,-0.801,0.031,
6, 78, 84, 0.746,-0.666,0.000,
6, 72, 96, -0.852,-0.006,-0.523,
6, 72, 84, -0.834,0.151,-0.530,
78, 84,147, 0.020,1.000,-0.000,
72, 84,147, -0.023,-1.000,-0.015,
72, 78,147, -0.000,1.000,0.014,
78, 96,186, 0.546,-0.776,0.316,
6, 96,186, -0.864,0.067,-0.500,
6, 78,186, 0.995,0.014,-0.104,
78, 84,186, 0.980,-0.201,0.000,
6, 84,186, -0.812,0.078,-0.578,
72, 96,186, -0.865,-0.011,-0.501,
6, 72,186, -0.846,0.071,-0.529,
6, 84,147, -0.153,-0.672,-0.724,
6, 72,147, -0.222,-0.975,-0.024,
84,135,147, 0.018,1.000,-0.013,
78,135,147, -0.311,0.924,0.220,
78, 84,135, 0.258,0.966,-0.000,
72,135,147, -0.018,1.000,0.013,
72, 78,135, -0.000,0.995,0.105,
96,132,186, -0.000,-1.000,-0.000,
78,132,186, 0.995,-0.087,-0.056,
78, 96,132, 0.081,-0.256,0.963,
84,132,186, 0.976,-0.209,-0.055,
78, 84,132, 0.995,-0.101,0.000,
84,147,186, -0.190,-0.111,-0.975,
6,147,186, -0.030,-0.134,0.991,
0, 96,186, -0.587,-0.735,-0.339,
0, 72,186, 0.598,0.801,-0.031,
0, 72, 96, -0.992,-0.087,-0.092,
72,147,186, -0.675,-0.737,-0.044,
135,147,189, 0.000,1.000,-0.000,
84,147,189, -0.018,0.980,-0.197,
84,135,189, 0.126,0.992,-0.007,
81, 84,135, -0.183,0.983,-0.023,
78, 81,135, -0.930,-0.000,0.367,
78, 81, 84, 1.000,-0.000,0.000,
105,135,147, -0.000,1.000,0.000,
72,105,147, -0.126,0.992,0.007,
72,105,135, 0.018,0.980,0.197,
72, 81,135, -0.036,0.996,-0.082,
72, 78, 81, -0.000,-0.000,1.000,
96,120,132, -0.000,-1.000,-0.000,
78,120,132, 0.685,-0.246,0.685,
78, 96,120, -0.000,-0.152,0.988,
132,180,186, -0.000,-1.000,0.000,
84,180,186, 0.000,-0.152,-0.988,
84,132,180, 0.995,-0.101,-0.000,
147,150,186, 0.101,0.010,0.995,
84,150,186, -0.100,-0.131,-0.986,
84,147,150, -0.190,-0.019,-0.982,
96,114,186, 0.000,-1.000,0.000,
0,114,186, -0.584,-0.729,-0.357,
0, 96,114, -0.991,0.134,0.000,
0,147,186, -0.144,-0.058,-0.988,
0, 72,147, -0.926,-0.374,-0.052,
72, 96,114, -0.995,-0.101,0.000,
0, 72,114, -0.993,-0.077,-0.093,
75,147,189, -0.001,1.000,-0.012,
75,135,189, 0.018,1.000,-0.001,
75,135,147, -0.016,-1.000,0.012,
147,159,189, -0.000,1.000,-0.000,
84,159,189, -0.000,0.985,-0.174,
84,147,159, -0.025,-0.999,-0.025,
81,135,189, -0.274,0.962,0.015,
81, 84,189, 0.114,0.993,-0.023,
75,105,147, -0.115,-0.993,0.006,
75,105,135, 0.017,-0.983,0.181,
72, 75,147, -0.999,-0.000,-0.051,
72, 75,105, 0.599,-0.000,0.801,
81,105,135, -0.009,0.996,-0.093,
72, 81,105, -0.036,0.991,0.127,
120,126,132, -0.000,-1.000,-0.000,
78,126,132, 0.995,-0.101,-0.000,
78,120,126, -0.000,-0.152,0.988,
0,150,186, 0.101,-0.000,0.995,
0,147,150, -0.000,-0.000,1.000,
144,150,186, 0.000,-1.000,0.000,
84,144,186, -0.091,-0.133,-0.987,
84,144,150, -0.000,0.249,0.968,
147,150,159, -0.705,-0.071,-0.705,
84,150,159, -0.125,-0.100,-0.987,
114,150,186, 0.000,-1.000,0.000,
0,114,150, -0.998,-0.000,-0.059,
72,114,147, -0.995,-0.088,-0.052,
0,114,147, -0.906,-0.365,-0.215,
93,147,189, -0.009,-0.996,-0.093,
75, 93,189, 0.020,1.000,0.000,
75, 93,147, -0.237,-0.971,-0.000,
75, 81,189, -0.000,1.000,-0.012,
75, 81,135, -0.000,-0.995,0.096,
93,159,189, -0.000,-0.987,-0.160,
93,147,159, -0.069,-0.995,-0.069,
84, 93,189, 0.036,0.991,-0.127,
84, 93,159, -0.036,-0.993,-0.113,
84, 87,189, -0.599,-0.000,-0.801,
81, 87,189, -0.120,0.993,-0.000,
81, 84, 87, 1.000,0.000,0.000,
75, 81,105, -0.000,-0.987,0.160,
72, 93,147, -0.183,-0.983,-0.023,
72, 75, 93, -1.000,0.000,-0.000,
72, 75, 81, 0.000,-0.000,1.000,
114,147,150, -0.993,-0.100,-0.059,
144,162,186, 0.000,-1.000,0.000,
84,162,186, -0.000,-0.152,-0.988,
84,144,162, -0.600,0.800,0.000,
144,150,159, 0.000,0.101,0.995,
84,144,159, -0.125,-0.087,-0.988,
144,147,159, -0.707,0.000,-0.707,
144,147,150, -0.000,0.000,1.000,
93,114,147, 0.732,-0.587,-0.346,
72, 93,114, -0.995,-0.100,-0.002,
81, 93,189, 0.022,1.000,-0.014,
75, 81, 93, -0.000,1.000,0.000,
93,144,159, 0.582,-0.140,-0.801,
93,144,147, -0.930,0.000,0.367,
87, 93,189, -0.000,0.987,0.160,
84, 87, 93, -0.000,0.000,-1.000,
84, 93,144, -0.009,-0.238,-0.971,
81, 87, 93, -0.000,1.000,0.000,
114,144,150, -0.000,-1.000,-0.000,
114,144,147, -1.000,0.000,-0.000,
93,144,162, -0.995,-0.096,0.000,
84, 93,162, -0.005,-0.145,-0.989,
93,114,144, -0.995,-0.096,0.000,
72,114,144, -0.995,-0.101,-0.000,
72, 93,144, -0.995,-0.097,-0.002,
90,144,162, -0.995,-0.101,0.000,
90, 93,162, 0.834,0.000,-0.552,
90, 93,144, -0.930,0.000,0.367,
84, 90,162, 0.000,-0.152,-0.988,
84, 90, 93, 0.000,0.000,-1.000,
72, 90,144, -0.995,-0.101,-0.000,
72, 90, 93, -1.000,0.000,-0.000,
};
struct _vol4 { int p0,p1,p2,p3,t[4]; double s[4]; };
_vol4 vol[62]= // vol.dat[vol.num] = { p0,p1,p2,p3,t[0],t[1],t[2],t[3],s[0],s[1],s[2],s[3], ... }
{
72, 78, 96, 84, 0, 1, 2, 3, 1,1,1,1,
78, 84, 96, 6, 4, 5, 6, 0, 1,1,1,-1,
72, 84, 96, 6, 4, 7, 8, 1, -1,1,1,-1,
72, 78, 84,147, 9, 10, 11, 2, 1,1,1,-1,
6, 78, 96,186, 12, 13, 14, 5, 1,1,1,-1,
6, 78, 84,186, 15, 16, 14, 6, 1,1,-1,-1,
6, 72, 96,186, 17, 13, 18, 7, 1,-1,1,-1,
6, 72, 84,147, 10, 19, 20, 8, -1,1,1,-1,
78, 84,147,135, 21, 22, 23, 9, 1,1,1,-1,
72, 78,147,135, 22, 24, 25, 11, -1,1,1,-1,
78, 96,186,132, 26, 27, 28, 12, 1,1,1,-1,
78, 84,186,132, 29, 27, 30, 15, 1,-1,1,-1,
6, 84,186,147, 31, 32, 19, 16, 1,1,-1,-1,
72, 96,186, 0, 33, 34, 35, 17, 1,1,1,-1,
6, 72,186,147, 36, 32, 20, 18, 1,-1,-1,-1,
84,135,147,189, 37, 38, 39, 21, 1,1,1,-1,
78, 84,135, 81, 40, 41, 42, 23, 1,1,1,-1,
72,135,147,105, 43, 44, 45, 24, 1,1,1,-1,
72, 78,135, 81, 41, 46, 47, 25, -1,1,1,-1,
78, 96,132,120, 48, 49, 50, 28, 1,1,1,-1,
84,132,186,180, 51, 52, 53, 29, 1,1,1,-1,
84,147,186,150, 54, 55, 56, 31, 1,1,1,-1,
0, 96,186,114, 57, 58, 59, 33, 1,1,1,-1,
0, 72,186,147, 36, 60, 61, 34, -1,1,1,-1,
0, 72, 96,114, 62, 59, 63, 35, 1,-1,1,-1,
135,147,189, 75, 64, 65, 66, 37, 1,1,1,-1,
84,147,189,159, 67, 68, 69, 38, 1,1,1,-1,
84,135,189, 81, 70, 71, 40, 39, 1,1,-1,-1,
105,135,147, 75, 66, 72, 73, 43, -1,1,1,-1,
72,105,147, 75, 72, 74, 75, 44, -1,1,1,-1,
72,105,135, 81, 76, 46, 77, 45, 1,-1,1,-1,
78,120,132,126, 78, 79, 80, 49, 1,1,1,-1,
147,150,186, 0, 81, 60, 82, 54, 1,-1,1,-1,
84,150,186,144, 83, 84, 85, 55, 1,1,1,-1,
84,147,150,159, 86, 87, 69, 56, 1,1,-1,-1,
0,114,186,150, 88, 81, 89, 58, 1,-1,1,-1,
0, 72,147,114, 90, 91, 63, 61, 1,1,-1,-1,
75,147,189, 93, 92, 93, 94, 64, 1,1,1,-1,
75,135,189, 81, 70, 95, 96, 65, -1,1,1,-1,
147,159,189, 93, 97, 92, 98, 67, 1,-1,1,-1,
84,159,189, 93, 97, 99,100, 68, -1,1,1,-1,
81, 84,189, 87, 101,102,103, 71, 1,1,1,-1,
75,105,135, 81, 76, 96,104, 73, -1,-1,1,-1,
72, 75,147, 93, 94,105,106, 74, -1,1,1,-1,
72, 75,105, 81, 104, 77,107, 75, -1,-1,1,-1,
0,147,150,114, 108, 89, 91, 82, 1,-1,-1,-1,
84,144,186,162, 109,110,111, 84, 1,1,1,-1,
84,144,150,159, 112, 87,113, 85, 1,-1,1,-1,
147,150,159,144, 112,114,115, 86, -1,1,1,-1,
72,114,147, 93, 116,105,117, 90, 1,-1,1,-1,
75, 93,189, 81, 118, 95,119, 93, 1,-1,1,-1,
93,147,159,144, 114,120,121, 98, -1,1,1,-1,
84, 93,189, 87, 122,101,123, 99, 1,-1,1,-1,
84, 93,159,144, 120,113,124,100, -1,-1,1,-1,
81, 87,189, 93, 122,118,125,102, -1,-1,1,-1,
114,147,150,144, 115,126,127,108, -1,1,1,-1,
84,144,162, 93, 128,129,124,111, 1,1,-1,-1,
93,114,147,144, 127,121,130,116, -1,-1,1,-1,
72, 93,114,144, 130,131,132,117, -1,1,1,-1,
93,144,162, 90, 133,134,135,128, 1,1,1,-1,
84, 93,162, 90, 134,136,137,129, -1,1,1,-1,
72, 93,144, 90, 135,138,139,132, -1,1,1,-1,
};
The p? are point indexes 0,3,6,9,... into pnt, n is the normal, s is the sign of the normal (needed when a triangle is shared, so the stored normal points the same way for both tetrahedrons) and t[4] are the indexes of triangles 0,1,2,3,... in fac.
And here a sample test:
bool tetrahedrons::vols_intersect() // test if vol[] intersects each other
{
int i,j;
for (i=0;i<vol.num;i++)
for (j=i+1;j<vol.num;j++,dbg_cnt++)
if (intersect_vol_vol(vol.dat[i],vol.dat[j]))
{
linc=0x800000FF;
if (intersect_vol_vol(vol.dat[j],vol.dat[i])) linc=0x8000FFFF;
lin_add_vol(vol.dat[i]);
lin_add_vol(vol.dat[j]);
return 1;
}
return 0;
}
where dbg_cnt is a counter of intersection tests. For this mesh I got these results:
tests | time
------+-------------
18910 | 190-215 [ms]
I called the vols_intersect test 10 times to make the measurements long enough. Of course, none of the tetrahedrons placed in this dataset intersect (which leads to the highest time). In the real process (too big to post) that led to this mesh, the counts are as follows:
intersecting 5
non intersecting 1766
all tests 1771

How are tripled sequence in IBO working?

I'm analyzing an obfuscated OpenGL application. I want to generate a .obj file that describes the multi-polygon model which is displayed in the application.
So I froze the app and dug out the values set in the VBO and IBO. But the values set in the IBO were far more mysterious than I expected. The values were
0, 0, 1, 2, 3, 4, 5, 6, 7, 7, 5, 8, 3, 3, 9, 9, 10, 11, 12, 12, 10, 13, 14, 14, 10, 15, 16, 16, 17, 17, 7, 8, 8, 18, 18, 19, 20, 21, 21, 22, 22, 23, 24, 25, 25, 26, 26, 27, 28, 29, 29, 30, 30, 31, 32, 32, 33, 33, 34, 35, 36, 37, 38, 38, 36, 39, 34, 34, 40, 40, 40, 41, 42, 43, 44, 44, 45, 45, 46, 47, 48, 49, 49, 50, 50, 51, 52, 52, 53, 53, 54, 55, 55, 56, 56, 57, 58, 58, 59, 59, 60, 61, 62, 62, 63, 63, 63, 64, 65, 66, 67, 64, 68, 68, 69, 69, 70, 71, 72, 73, 74, 75, 76, 76, 77, 77, 78, 79, 80, 81, 82, 82, 80, 83, 83, 84, 84, 85, 86, 87, 88, 88, 89, 89, 90, 91, 91, 92, 92, 92, 93, 94, 95, 96, 96, 97, 97, 97, 98, 99, 100, 101, 102, 102, 100, 103, 103, 104, 104, 105, 106, 107, 107, 108, 108, 108, 109, 110, 111, 112, 112, 100, 100, 101, 113, 114, 114, ... (length=10495)
As you can see, indices like 40, 63, 92 and 108 are tripled, so none of GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN, GL_QUADS, GL_QUAD_STRIP or GL_POLYGON seems like it would work correctly with glDrawElements.
Is there some kind of advanced technique that uses tripled indices in an IBO? What does it mean? What is it used for?
Repeated indices like that are indicative of aggressive optimization of triangle strips. A repeated index creates degenerate triangles: triangles with zero area. Since they have no visible area, they are not rendered. They exist so that you can jump from one triangle strip to the next without having to issue another draw command.
So a double-index is often used to stitch two strips together. The two triangles it generates will not be rendered.
However, because of the way strips work with the winding order, the facing of the triangles can work out incorrectly. That is, if you stitched two strips together with a double-index, the second strip could start out with the opposite winding order from the one it needs.
That's where triple indices come in. The third index fixes the winding order for the triangles in the destination strip. The three extra triangles it generates will not be rendered.
The more modern way to handle multiple strips in the same draw call is to use primitive restart indices. But the index list as it currently stands is adequate for use with GL_TRIANGLE_STRIP.
You can read this strip list and process it into a series of separate triangles (as appropriate for GL_TRIANGLES) easily enough. Simply look at each sequence of 3 vertices, and output that to your triangle buffer, so long as it is not a degenerate triangle. And you'll have to reverse the order of two of the indices for every odd-numbered triangle. The code would look something like this:
const int num_faces = indices.size() - 2;
faces.reserve(num_faces);
for(auto i = 0; i < num_faces; ++i)
{
Face f(indices[i], indices[i + 1], indices[i + 2]);
//Don't add any degenerate faces.
if(!(f[0] == f[1] || f[0] == f[2] || f[1] == f[2]))
{
if(i % 2 == 1) //Every odd-numbered face.
std::swap(f[1], f[2]);
faces.push_back(f);
}
}
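For anyone who wants to try this on a small input, here is a self-contained variant of the same loop. It is only a sketch: Face is assumed to be a std::array of three indices and strip_to_triangles is a hypothetical function name, neither of which comes from the application being analyzed.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption for this sketch: a face is just three vertex indices.
using Face = std::array<unsigned, 3>;

// Expand a GL_TRIANGLE_STRIP-style index list into separate triangles,
// dropping the degenerate ones produced by repeated indices.
std::vector<Face> strip_to_triangles(const std::vector<unsigned>& indices) {
    std::vector<Face> faces;
    if (indices.size() < 3) return faces;
    faces.reserve(indices.size() - 2);

    for (std::size_t i = 0; i + 2 < indices.size(); ++i) {
        Face f{indices[i], indices[i + 1], indices[i + 2]};

        // Skip zero-area triangles (any two indices equal).
        if (f[0] == f[1] || f[0] == f[2] || f[1] == f[2]) continue;

        // Every odd-numbered triangle in a strip has reversed winding.
        if (i % 2 == 1) std::swap(f[1], f[2]);

        faces.push_back(f);
    }
    return faces;
}
```

With the index list 0, 1, 2, 3, 3, 4, 4, 5, 6 (a four-vertex strip stitched to a three-vertex strip by repeating 3 and 4), the seven candidate triangles reduce to three: (0,1,2), (1,3,2) and (4,5,6). The four in between are degenerate and are skipped.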

Vector to Matrix

I am new to the Eigen library and I am having problems transforming/reshaping a vector into a matrix.
I am trying to take a specific row of a matrix and convert it into a matrix, but each time I do that the result is not what I expect.
Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor> m(8, 9);
m << 11, 12, 13, 14, 15, 16, 17, 18, 19,
21, 22, 23, 24, 25, 26, 27, 28, 29,
31, 32, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 46, 47, 48, 49,
51, 52, 53, 54, 55, 56, 57, 58, 59,
61, 62, 63, 64, 65, 66, 67, 68, 69,
71, 72, 73, 74, 75, 76, 77, 78, 79,
81, 82, 83, 84, 85, 86, 87, 88, 89;
std::cout << m << std::endl << std::endl;
Matrix<double,1,Dynamic,RowMajor> B = m.row(0);
std::cout << B << std::endl << std::endl;
Map<Matrix3d,RowMajor> A(B.data(),3,3);
std::cout << A << std::endl << std::endl;
Result
11 14 17
12 15 18
13 16 19
I want:
11 12 13
14 15 16
17 18 19
You don't need to select a row first and then map. Just map directly from m and assign the transpose of the map to a matrix A as follows
Matrix3d A = Map<Matrix3d>(m.data()).transpose();
If you don't like transposing then forcing the map to use RowMajor for the destination type works too
Matrix3d A = Map<Matrix<double, 3, 3, RowMajor>>(m.data());
Although, at this small size it doesn't matter. Cheers
You need to take the transpose of the result matrix. Eigen stores matrices in column-major order by default, so mapping a 9-element vector onto a 3x3 matrix fills it column by column; that is why every third element of the vector ends up in the same row.
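The effect can be reproduced without Eigen. The sketch below is plain C++ (it is not Eigen API, and at_col_major / at_row_major are hypothetical helper names); it reads the same 9-element buffer as a 3x3 matrix under both storage orders.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kDim = 3;  // side length of the 3x3 view

// Element (r, c) when the buffer is read column by column (Eigen's default).
inline double at_col_major(const std::array<double, 9>& buf,
                           std::size_t r, std::size_t c) {
    return buf[c * kDim + r];
}

// Element (r, c) when the buffer is read row by row.
inline double at_row_major(const std::array<double, 9>& buf,
                           std::size_t r, std::size_t c) {
    return buf[r * kDim + c];
}
```

For the buffer 11, 12, ..., 19, the column-major reading puts 14 at position (0,1), matching the unexpected output in the question, while the row-major reading puts 12 there. One reading is exactly the transpose of the other, which is why both fixes above work.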

Automatically Deleting specific Elements in Mathematica Tables

I have a question which can be divided into two subquestions.
I have created a table the code of which is given below.
Problem 1.
xstep = 1;
xmaximum = 6;
numberofxnodes = 6;
numberofynodes = 3;
numberofzlayers = 3;
maximumgridnodes = numberofxnodes*numberofynodes
mnodes = numberofxnodes*numberofynodes*numberofzlayers;
orginaltable =
Table[{i,
node2 = i + xstep, node3 = node2 + xmaximum,
node4 = node3 - xstep,node5 = i + maximumgridnodes,
node6 = node5 + xstep,node7 = node6 + xmaximum,
node8 = node7 - xstep},
{i, 1, mnodes}]
If I run this I will get my original table. Basically I want to remove the sixth element and multiples of the sixth element from my original table. I am able to do this by using this code below.
modifiedtable = Drop[orginaltable, {6, mnodes, 6}]
Now I get the modified table where every sixth element and multiples of sixth element of my original table is removed. This solves my Problem 1.
Now my Problem 2:
** MAJOR EDITED VERSION**:(ALL THE CODES GIVEN ABOVE IS CORRECT)
Thanks a lot for the answers, but I wanted something else and I made a mistake
while explaining it initially so I'm making another try.
Below is my modified table: I want the elements in between
"/** and **/" deleted and the remaining elements kept.
{{1, 2, 8, 7, 19, 20, 26, 25}, {2, 3, 9, 8, 20, 21, 27, 26}, {3, 4,10, 9, 21, 22, 28, 27}, {4, 5, 11, 10, 22, 23, 29, 28}, {5, 6, 12, 11, 23, 24, 30, 29}, {7, 8, 14, 13, 25, 26, 32, 31}, {8, 9, 15, 14, 26, 27, 33, 32}, {9, 10, 16, 15, 27, 28, 34, 33}, {10, 11, 17, 16, 28, 29, 35, 34}, {11, 12, 18, 17, 29, 30, 36, 35}, /**{13, 14, 20, 19, 31, 32, 38, 37}, {14, 15, 21, 20, 32, 33, 39, 38}, {15, 16, 22, 21, 33, 34, 40, 39}, {16, 17, 23, 22, 34, 35, 41, 40}, {17, 18, 24, 23, 35, 36, 42, 41},**/ {19, 20, 26, 25, 37, 38, 44, 43}, {20, 21, 27, 26, 38, 39, 45, 44}, {21, 22, 28, 27, 39, 40, 46, 45}, {22, 23, 29, 28, 40, 41, 47, 46}, {23, 24, 30, 29, 41, 42, 48, 47}, {25, 26, 32, 31,43, 44, 50, 49}, {26, 27, 33, 32, 44, 45, 51, 50}, {27, 28, 34, 33, 45, 46, 52, 51}, {28, 29, 35, 34, 46, 47, 53, 52}, {29, 30, 36, 35, 47, 48, 54, 53}, /**{31, 32, 38, 37, 49, 50, 56, 55}, {32, 33, 39, 38,50, 51, 57, 56}, {33, 34, 40, 39, 51, 52, 58, 57}, {34, 35, 41, 40, 52, 53, 59, 58}, {35, 36, 42, 41, 53, 54, 60, 59},**/ {37, 38, 44, 43,55, 56, 62, 61}, {38, 39, 45, 44, 56, 57, 63, 62}, {39, 40, 46, 45, 57, 58, 64, 63}, {40, 41, 47, 46, 58, 59, 65, 64}, {41, 42, 48, 47,59, 60, 66, 65}, {43, 44, 50, 49, 61, 62, 68, 67}, {44, 45, 51, 50, 62, 63, 69, 68}, {45, 46, 52, 51, 63, 64, 70, 69}, {46, 47, 53, 52, 64, 65, 71, 70}, {47, 48, 54, 53, 65, 66, 72, 71}, /**{49, 50, 56, 55, 67, 68, 74, 73}, {50, 51, 57, 56, 68, 69, 75, 74},{51,52, 58, 57, 69, 70, 76, 75}, {52, 53, 59, 58, 70, 71, 77, 76}, {53, 54, 60, 59, 71, 72, 78, 77}}**/
Now, if you observe, I wanted the first ten elements
(1st to 10th element of modifiedtable) to be there in my final table
( DoubleModifiedTable ), then the next five (11th to 15th elements of modifiedtable) deleted.
Then the next ten elements ( 16th to 25th elements of modifiedtable)
to be present in my final table ( DoubleModifiedTable )
then the next five deleted (26th to 30th elements of modifiedtable) and so on for the whole table.
Let's say we solve this problem and name the final table DoubleModifiedTable.
I am basically interested in getting the DoubleModifiedTable. I decided to subdivide the problem as that makes it easier to explain.
I want this to happen automatically across the whole table, since this is just an example table and in reality I have a huge table. If I can understand how to solve this problem for this table, then I can solve it for my large table too.
Perhaps simpler:
DoubleModifiedTable =
Module[{copy = modifiedtable},
copy[[Flatten[# + Range[5] & /@ Range[10, Length[copy], 10]]]] = Sequence[];
copy]
EDIT
Even simpler:
DoubleModifiedTable =
Delete[modifiedtable,
Transpose[{Flatten[# + Range[5] & /@ Range[10, Length[modifiedtable], 10]]}]]
EDIT 2
Per OP's request: one only has to change a single number (10 to 15) in any of my solutions to get the answer to a modified problem:
DoubleModifiedTable =
Delete[modifiedtable,
Transpose[{Flatten[# + Range[5] & /@ Range[10, Length[modifiedtable], 15]]}]]
Another way is to do something like
DoubleModifiedTable = With[{n = 10, m = 5},
Flatten[{{modifiedtable[[;; m]]},
Partition[modifiedtable, n - m, n, {n - m + 1, 1}, {}]}, 2]]
Edit
The edited version of Problem 2 is actually slightly simpler to solve than the original version. You could for example do something like
DoubleModifiedTable =
With[{n = 10, m = 5}, Flatten[Partition[modifiedtable, n, n + m, 1, {}], 1]]
Edit 2
What my second version does is split the original list modifiedtable into sublists using Partition and then flatten these sublists to form the final list. If you look at the documentation for Partition you can see that I'm using the 6th form of Partition, which means that the length of the sublists is n and the offset (the distance between the starts of consecutive sublists) is n+m. The gap between the sublists is therefore n+m-n==m.
The next argument, 1, is actually equivalent to {1,1} which tells Mathematica that the first element of modifiedtable should appear at position 1 in the first sublist and the last element of modifiedtable should appear on or after position 1 of the last sublist.
The last argument, {}, indicates that no padding should be used for sublists with length <n.
In summary, if you want to delete the first 10 elements and keep the next 5 you want sublists of length n=5 with gap m=10. Since you want the first sublist to start with the (m+1)-th element of modifiedtable, you could replace the fourth argument in Partition with something of the form {k,1} for some value of k but it's probably easier to just drop the first m elements of modifiedtable beforehand, i.e.
DoubleModifiedTable =
With[{n = 5, m = 10},
Flatten[Partition[Drop[modifiedtable, m], n, n + m, 1, {}], 1]]
DoubleModifiedTable=
modifiedtable[[
Complement[
Range[Length[modifiedtable]],
Flatten@Table[10 i + j, {i, Floor[Length[modifiedtable]/10]}, {j, 5}]
]
]]
or, slightly shorter
DoubleModifiedTable=
#[[
Complement[
Range[Length[#]],
Flatten@Table[10 i + j, {i, Floor[Length[#]/10]}, {j, 5}]
]
]] & @ modifiedtable