3D to 1D mapping function confusion - c++

I just finished coding a C++ program to manage a 3D matrix with dynamically allocated memory.
In order to use a contiguous chunk of memory, I decided to use a mapping function to physically store the elements of my matrix in a 1D array.
For this purpose, I have a T *_3D_matrix pointer to the array, which is defined as
_3D_matrix = new T[height * width * depth];
where height, width and depth are input parameters for the constructor.
The program works just fine, I even tested it with Valgrind and no memory problems happen.
What I don't get is: my array has got height * width * depth = 12 elements, and the mapping function seems to map some elements out of the [0..11] range.
What am I missing here?
EDIT:
This is the output I get from recreating the same matrix and printing it in my program.

Let's say we have a "3D" array defined as
some_type m[1][3][2];
That would look something like this if we draw it:
+-----------------------------------------------------------------------------+
|                                    m[0]                                     |
+-------------------------+-------------------------+-------------------------+
|         m[0][0]         |         m[0][1]         |         m[0][2]         |
+------------+------------+------------+------------+------------+------------+
| m[0][0][0] | m[0][0][1] | m[0][1][0] | m[0][1][1] | m[0][2][0] | m[0][2][1] |
+------------+------------+------------+------------+------------+------------+
If x represents the first "dimension", y the second, and z the third, then an expression such as m[x][y][z] would, with a flat array, be m[x * 3 * 2 + y * 2 + z]. The number 3 is the number of elements in the second dimension, and 2 is the number of elements in the third dimension.
Generalized, an array like
some_type m[X][Y][Z];
would as a flat array have the formula x * Y * Z + y * Z + z for the index. Compared to your formula the x and the y have been switched.
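As a minimal sketch of the general formula (the function name flatIndex and the use of std::size_t are illustrative choices, not part of the question's code):

```cpp
#include <cassert>
#include <cstddef>

// Row-major index of m[x][y][z] in a flat array standing in for
// some_type m[X][Y][Z]. Y and Z are the sizes of the second and third
// dimensions; X is not needed to compute the index.
// flatIndex is a hypothetical helper name.
inline std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t z,
                             std::size_t Y, std::size_t Z) {
    assert(y < Y && z < Z);  // out-of-range y or z would alias another cell
    return x * Y * Z + y * Z + z;
}
```

For the m[1][3][2] example, flatIndex(0, 2, 1, 3, 2) evaluates to 5, the last slot of the six-element array.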

You computed the mapped index for out-of-bounds values of y.
You said height * width * depth= 12, and:
index = y * width * depth + x * depth + z
And we see in your table:
#.| Y | X | Z | index
--+---+---+---+------
1 | 0 | 1 | 0 | 2
2 | 1 | 0 | 0 | 6
This implies:
0 * width * depth + 1 * depth + 0 = 2 => depth = 2
1 * width * depth + 0 * depth + 0 = 6 => width * depth = 6 => width = 3
height * width * depth= 12 => height = 2
Thus:
y is in [0, 1]
x is in [0, 2]
z is in [0, 1]
The maximum index is at {x, y, z} = {2, 1, 1} and its value is 1 * 3 * 2 + 2 * 2 + 1 = 11, so every in-range triple maps inside [0..11].
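The claim can be checked by brute force. The sketch below assumes the mapping quoted above (index = y * width * depth + x * depth + z); the names mapIndex and coversExactly are made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Mapping quoted from the question.
std::size_t mapIndex(std::size_t x, std::size_t y, std::size_t z,
                     std::size_t width, std::size_t depth) {
    return y * width * depth + x * depth + z;
}

// Returns true if every in-range (x, y, z) lands on a distinct index
// inside [0, height * width * depth).
bool coversExactly(std::size_t height, std::size_t width, std::size_t depth) {
    std::vector<bool> seen(height * width * depth, false);
    for (std::size_t y = 0; y < height; ++y)
        for (std::size_t x = 0; x < width; ++x)
            for (std::size_t z = 0; z < depth; ++z) {
                const std::size_t idx = mapIndex(x, y, z, width, depth);
                if (idx >= seen.size() || seen[idx]) return false;
                seen[idx] = true;
            }
    return true;
}
```

With height = 2, width = 3 and depth = 2, coversExactly returns true, whereas feeding mapIndex a y of 2 or more produces indices of 12 and above.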

Assuming from your example:
width = 2, height = 3, and depth = 2
x is in [0, width), y is in [0, height), z is in [0, depth)
The mapping of the second element should be:
1*2*2 + 0*2 + 0 = 4, but you get 6. I think the reason is that some of the dimensions or indices are swapped somewhere else in your code. It seems that width or depth is 3 in your case.

Related

How do I optimize my code to find the product of all the contiguous subsequences of an array?

This is my attempt to count the contiguous subsequences of an array whose product mod 4 is not equal to 2:
#include <iostream>
using namespace std;
int main() {
    long long int n, i, j, s, t, count = 0;
    cin >> n;
    long long int arr[n];
    count = 0;
    for (i = 0; i < n; i++) {
        cin >> arr[i];
    }
    for (i = 0; i < n; i++) {
        s = 1;
        for (j = i; j < n; j++) {
            s = s * arr[j];
            if (s % 4 != 2) {
                count++;
            }
        }
    }
    cout << count;
    return 0;
}
However, I want to reduce the time taken by my code to execute. I am looking for a way to do it. Any help/hint would be appreciated.
Thank you.
What does this definition of contiguous subsequences mean?
Listing all the subsequences
Suppose we have the sequence:
A B C D E F
First of all, we should recognize that there is one substring for every unique start and end point. Let's use the notation C-F to mean all items from C through F: i.e.: C D E F.
We can list all subsequences in a triangular arrangement like this:
 A     B     C     D     E     F
   A-B   B-C   C-D   D-E   E-F
      A-C   B-D   C-E   D-F
         A-D   B-E   C-F
            A-E   B-F
               A-F
The first row lists all the subsequences of length 1.
The second row lists all the subsequences of length 2.
The third row lists all the subsequences of length 3. Etc.
The last row is the full sequence.
Modular arithmetic
Computing the product MOD 4 of a set of numbers
To figure out the product of a bunch of numbers MOD 4, we just need to look at each element of the set MOD 4. Intuitively, this is because when you multiply a bunch of numbers, the last digit of the result is determined entirely by the last digit of each factor. In this case "the last digit base 4" is the number mod 4.
The identity we are using is:
(A * B) MOD N == ((A MOD N) * (B MOD N)) MOD N
The table of products
Now we also have to look at the matrix of possible multiplications that might happen. It's a fairly small table and the interesting entries are given here:
2 * 2 = 4 4 MOD 4 = 0
2 * 3 = 6 6 MOD 4 = 2
3 * 3 = 9 9 MOD 4 = 1
So the results of multiplying any 2 numbers MOD 4 are given by this table:
+--------+---+---+---+---+
| Factor | 0 | 1 | 2 | 3 |
+--------+---+---+---+---+
| 0 | 0 | / | / | / |
| 1 | 0 | 1 | / | / |
| 2 | 0 | 2 | 0 | / |
| 3 | 0 | 3 | 2 | 1 |
+--------+---+---+---+---+
The /'s are omitted because of the symmetry of multiplication (A * B = B * A)
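As a small sketch, the full table (including the entries omitted for symmetry) can be generated rather than written by hand. The names productTableMod4 and residue4 are illustrative only.

```cpp
#include <array>

// Reduces any value, including a negative one, to its residue in 0..3.
int residue4(long long v) {
    return static_cast<int>(((v % 4) + 4) % 4);
}

// Builds the 4x4 table of (a * b) MOD 4 for residues a, b in 0..3.
std::array<std::array<int, 4>, 4> productTableMod4() {
    std::array<std::array<int, 4>, 4> table{};
    for (int a = 0; a < 4; ++a)
        for (int b = 0; b < 4; ++b)
            table[a][b] = (a * b) % 4;
    return table;
}
```

Looking up table[residue4(A)][residue4(B)] gives the same answer as residue4(A * B), which is the identity stated earlier.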
An example sequence
Now for each subsequence, let's compute the product MOD 4 of its elements.
Consider the following list of numbers
242 496 681 685 410 795
The first thing we do is take all these numbers MOD 4 and list them as the first row of our list of all subsequences triangle.
2 0 1 1 2 3
The second row is just the product, MOD 4, of the pairs above it.
2 0 1 1 2 3
 0 0 1 2 2
In general, each element below the first row is the product, MOD 4, of:
the element immediately up and to the left of it in the row above, and
the element in the first row that lies diagonally up and to the right of it
For example C = A * B
* * * * B *
 * * * / *
  * A / *
   * C *
    * *
     *
Again,
A is immediately up and left of C
B is diagonally right all the way to the top row from C
Now we can complete our triangle
2 0 1 1 2 3
 0 0 1 2 2
  0 0 2 2
   0 0 2
    0 0
     0
This can be computed easily in O(n^2) time.
Optimization
These optimizations do not improve the time complexity of the algorithm in its worst case, but they can cause an early exit in the computation, and should therefore be included if time is to be reduced and the input is unknown.
Contagious 0's
As a matter of optimization, notice how contagious the 0's are. Anything times 0 is 0, so you can skip computing products of cells below a 0. In your case a sequence can never equal 2 MOD 4 once the product of one of its subsequences is determined to be equal to 0 MOD 4.
* * * 0 * *    // <-- this zero infects all cells below it
 * * 0 0 *
  * 0 0 0
   0 0 0
    0 0
     0
Need a 2 to make a 2.
Look back at the table of factors and products. Notice that the only way to get a product that is equal to 2 MOD 4 is to have one of the factors be equal to 2 MOD 4. What that means is that there can only be a 2 below another 2. So we are only interested in computing entries in the table that are below a 2. Other entries in the rows below can never become a 2.
You don't have to store more than one whole row at a time.
You only need O(n) storage to implement this. Working line by line, you can compute the values in a row entirely from the values in the first row and values in the row above.
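A minimal sketch of this row-by-row scheme, assuming the input is already in a std::vector (the function name countProductNotTwoMod4 is invented here). It keeps only the first row and the current row, and leaves out the early-exit optimizations for brevity.

```cpp
#include <cstddef>
#include <vector>

// Counts contiguous subsequences whose product is not congruent to 2 MOD 4.
// Uses O(n) storage and O(n^2) time.
long long countProductNotTwoMod4(const std::vector<long long>& values) {
    const std::size_t n = values.size();
    std::vector<int> first(n);
    for (std::size_t i = 0; i < n; ++i)
        first[i] = static_cast<int>(((values[i] % 4) + 4) % 4);

    std::vector<int> row = first;  // row for subsequences of length 1
    long long count = 0;
    for (std::size_t len = 1; len <= n; ++len) {
        const std::size_t cells = n - len + 1;  // entries in this row
        for (std::size_t i = 0; i < cells; ++i)
            if (row[i] != 2) ++count;
        // Next row: the cell starting at i extends the current cell by the
        // first-row element at position i + len.
        for (std::size_t i = 0; i + 1 < cells; ++i)
            row[i] = (row[i] * first[i + len]) % 4;
    }
    return count;
}
```

For a first row of residues 2 0 1 1 2 3, this counts 14 of the 21 subsequences.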
Reading the answers from the table
Now you can look at the rows of the triangle list as you generate them and read off which subsequences are to be included.
Entries with a 2 are to be excluded. All others are to be included.
2 0 1 1 2 3
 0 0 1 2 2
  0 0 2 2
   0 0 2
    0 0
     0
The excluded subsequences for the example (which I will list only because there are fewer of them in my example) are:
A
E
D-E
E-F
C-E
D-F
C-F
Which remember, according to our convention refer to the elements:
A
E
D E
E F
C D E
D E F
C D E F
Which are:
242
410
685 410
410 795
681 685 410
685 410 795
681 685 410 795
Hopefully it's obvious how to display the "included" sequences, rather than the "excluded" sequences, as I have shown above.
Displaying all the elements makes it take much longer.
Sadly, actually displaying all of the elements of such subsequences is still an O(N^3) operation in the worst case. (Imagine a sequence of all zeros.)
Summary
For me, I feel like an average developer could take the magic bullet observation made in the diagram below and write an implementation that has optimal time complexity.
C = A * B
* * * * B *
 * * * / *
  * A / *
   * C *
    * *
     *

A many-to-one mapping in the natural domain using discrete input variables?

I would like to find a mapping f: X --> N, with multiple discrete natural variables X of varying dimension, where f produces a unique number between 0 and the product of all dimensions minus 1. For example, assume X = {a,b,c}, with dimensions |a| = 2, |b| = 3, |c| = 2. f should produce 0 to 11 (2*3*2 = 12 distinct values).
a b c | f(X)
0 0 0 | 0
0 0 1 | 1
0 1 0 | 2
0 1 1 | 3
0 2 0 | 4
0 2 1 | 5
1 0 0 | 6
1 0 1 | 7
1 1 0 | 8
1 1 1 | 9
1 2 0 | 10
1 2 1 | 11
This is easy when all dimensions are equal. Assume binary for example:
f(a=1,b=0,c=1) = 1*2^2 + 0*2^1 + 1*2^0 = 5
Using this naively with varying dimensions we would get overlapping values:
f(a=0,b=1,c=1) = 0*2^2 + 1*3^1 + 1*2^0 = 4
f(a=1,b=0,c=0) = 1*2^2 + 0*3^1 + 0*2^0 = 4
A computationally fast function is preferred as I intend to use/implement it in C++. Any help is appreciated!
OK, the most important part here is the math and the algorithm. You have dimensions of sizes d0, d1, ..., dn, ordered from least significant to most significant. A tuple (x0, x1, ..., xn) with xi < di represents the following number: x0 + d0 * x1 + d0 * d1 * x2 + ... + d0 * d1 * ... * d(n-1) * xn
In pseudo-code, I would write:
result = 0
loop for i=n to 0 step -1
result = result * d[i] + x[i]
To implement it in C++, my advice would be to create a class whose constructor takes the dimensions (for example as a vector<int>), with a method that accepts an array or a vector of the same size containing the values. Optionally, you could check that no input value is greater than or equal to its dimension.
A possible C++ implementation could be:
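#include <stdexcept>
#include <vector>

class F {
    std::vector<int> dims;  // sizes, least significant dimension first
public:
    F(const std::vector<int>& d) : dims(d) {}
    // x[i] is the value for dimension i, in the same order as dims
    int to_int(const std::vector<int>& x) const {
        if (x.size() != dims.size()) {
            throw std::invalid_argument("Wrong size");
        }
        int result = 0;
        // Horner-style evaluation, most significant dimension first
        for (int i = static_cast<int>(dims.size()) - 1; i >= 0; i--) {
            if (x[i] < 0 || x[i] >= dims[i]) {
                throw std::invalid_argument("Value out of range for its dimension");
            }
            result = result * dims[i] + x[i];
        }
        return result;
    }
};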
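To check the scheme against the table in the question, here is a standalone sketch with the same logic as a free function, plus an inverse that recovers the tuple (which is one way to see that the mapping is unique). The names mixedRadixIndex and mixedRadixDigits are hypothetical, and the ordering {c, b, a} assumes c is the least significant variable, as the table suggests.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// dims and x are ordered from least significant to most significant.
int mixedRadixIndex(const std::vector<int>& dims, const std::vector<int>& x) {
    if (x.size() != dims.size()) throw std::invalid_argument("Wrong size");
    int result = 0;
    for (std::size_t i = dims.size(); i-- > 0;) {
        if (x[i] < 0 || x[i] >= dims[i])
            throw std::invalid_argument("Value out of range");
        result = result * dims[i] + x[i];
    }
    return result;
}

// Inverse mapping: peel off one digit per dimension, least significant first.
std::vector<int> mixedRadixDigits(const std::vector<int>& dims, int index) {
    std::vector<int> x(dims.size());
    for (std::size_t i = 0; i < dims.size(); ++i) {
        x[i] = index % dims[i];
        index /= dims[i];
    }
    return x;
}
```

With dims = {2, 3, 2} (for c, b, a), the tuple a=1, b=2, c=1 is passed as {1, 2, 1} and maps to 11, while a=0, b=1, c=1 is passed as {1, 1, 0} and maps to 3.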

Prove that the height of a heap with n nodes is floor(log2(n))

How would I prove that the height of a heap with N nodes is floor(log2(N))?
Any explanation would be great...
Each level of the heap tree holds at most 2^(height-1) elements, counting heights from 1:
2^0 = 1 node at height 1
2^1 = 2 nodes at height 2
2^2 = 4 nodes at height 3
Therefore, a heap whose last level is at height x has (2^0 + 2^1 + ... + 2^(x-2)) + (1 to 2^(x-1)) = (2^(x-1) - 1) + (1 to 2^(x-1)) = 2^(x-1) + (0 to 2^(x-1) - 1) = 2^(x-1) to 2^x - 1 nodes.
So 2^(x-1) <= N < 2^x, and if you apply floor(log2(N)) to it, you will get (x-1), which is the height counted in edges from the root.
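The counting argument can be sanity-checked numerically. This is only a sketch, not a proof; heightByLevels and floorLog2 are names chosen for illustration, and both assume n >= 1.

```cpp
#include <cstddef>

// Height (in edges) of a heap with n >= 1 nodes, found by filling levels:
// the level at height k (counting from 1) holds up to 2^(k-1) nodes.
int heightByLevels(std::size_t n) {
    int levels = 0;
    std::size_t capacity = 1;  // nodes that fit on the current level
    while (n > 0) {
        ++levels;
        n -= (n < capacity) ? n : capacity;
        capacity *= 2;
    }
    return levels - 1;
}

// floor(log2(n)) for n >= 1, computed with integer shifts.
int floorLog2(std::size_t n) {
    int result = 0;
    while (n >>= 1) ++result;
    return result;
}
```

The two functions agree for every n from 1 to 1000, for example heightByLevels(7) and floorLog2(7) are both 2, and both become 3 at n = 8.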

Interpolating Skinning Weights

I am subdividing the triangles of a mesh and, as you can guess, I need weight values for the new vertices. Currently I am using linear interpolation (Vnew.weight[i] = (V1.weight[i] + V2.weight[i]) * 0.5), but it seems I cannot get correct values.
Do you know a better way to interpolate the weights?
Edit:
Right now I am using LBS (linear blend skinning) and dividing each triangle into two triangles by taking a halfway point. This division is done as soon as the triangle information is read from the file (I am using SMD files).
I think the problem is the weights because in the rest pose (without any skinning) everything is fine. But when poses are applied and skinning is done, some crazy triangles and holes appear. When I looked closely at these "crazy triangles", their vertices move with the mesh, but not as fast as the other vertices.
Here is the code for the division process, which interpolates positions, normals, UVs, and weights:
int pi = 0;
while (1)
{
    // Some control, and declaration
    for (int w = 0; w < 3; w++)
    {
        // Some declarations
        Vert v;
        // Values are read from file into Vert v
        // Using boneIndex2[] and boneWeight2[] because GLSL 1.30 does
        // not support shader storage buffer objects, and I need
        // at most 8 indices for now.
        v.boneIndex2[0] = 0;
        v.boneIndex2[1] = 0;
        v.boneIndex2[2] = 0;
        v.boneIndex2[3] = 0;
        v.boneWeight2[0] = 0;
        v.boneWeight2[1] = 0;
        v.boneWeight2[2] = 0;
        v.boneWeight2[3] = 0;
        m.vert.push_back(v);
        pi++;
    }
    // Dividing the triangle
    Vert a = m.vert[pi - 2];
    Vert b = m.vert[pi - 1];
    Vert v;
    // Interpolate position
    v.pos[0] = (a.pos[0] + b.pos[0]) / 2;
    v.pos[1] = (a.pos[1] + b.pos[1]) / 2;
    v.pos[2] = (a.pos[2] + b.pos[2]) / 2;
    // Interpolate normal
    v.norm[0] = (a.norm[0] + b.norm[0]) / 2;
    v.norm[1] = (a.norm[1] + b.norm[1]) / 2;
    v.norm[2] = (a.norm[2] + b.norm[2]) / 2;
    // Interpolate UV
    v.uv[0] = (a.uv[0] + b.uv[0]) / 2;
    v.uv[1] = (a.uv[1] + b.uv[1]) / 2;
    // Assign bone indices
    // The new vertex should be affected by each bone of Vert a and b
    v.boneIndex[0] = a.boneIndex[0];
    v.boneIndex[1] = a.boneIndex[1];
    v.boneIndex[2] = a.boneIndex[2];
    v.boneIndex[3] = a.boneIndex[3];
    v.boneIndex2[0] = b.boneIndex[0];
    v.boneIndex2[1] = b.boneIndex[1];
    v.boneIndex2[2] = b.boneIndex[2];
    v.boneIndex2[3] = b.boneIndex[3];
    // Interpolate weights
    float we[4];
    we[0] = (a.boneWeight[0] + b.boneWeight[0]) / 2;
    we[1] = (a.boneWeight[1] + b.boneWeight[1]) / 2;
    we[2] = (a.boneWeight[2] + b.boneWeight[2]) / 2;
    we[3] = (a.boneWeight[3] + b.boneWeight[3]) / 2;
    // Assign weights
    v.boneWeight[0] = we[0];
    v.boneWeight[1] = we[1];
    v.boneWeight[2] = we[2];
    v.boneWeight[3] = we[3];
    v.boneWeight2[0] = we[0];
    v.boneWeight2[1] = we[1];
    v.boneWeight2[2] = we[2];
    v.boneWeight2[3] = we[3];
    // Push new vertex
    m.vert.push_back(v);
    pi++;
    // Push new faces
    m.face.push_back(Face(pi - 4, pi - 1, pi - 2));
    m.face.push_back(Face(pi - 4, pi - 3, pi - 1));
} // End of while(1)
You are blending weights that just might belong to different bones (i.e. if the bone indices are not equal).
Instead, gather all influencing bone indices from the two vertices. If only one of the vertices refers to a given bone, use half of its weight. If both vertices refer to the bone, use the interpolation as you already did. Then, pick the four bones with the highest weights and re-normalize to a sum of 1.
Here is an example. Consider you have two vertices with these bone indices and weights:
      v1                    v2
index | weight        index | weight
------+--------       ------+--------
    0 | 0.2               2 | 0.1
    1 | 0.5               3 | 0.6
    2 | 0.1               4 | 0.2
    3 | 0.2               5 | 0.1
You would start by building the table of joint weights:
index | weight
------+--------
0 | 0.2 / 2 = 0.1
1 | 0.5 / 2 = 0.25
2 | (0.1 + 0.1) / 2 = 0.1
3 | (0.2 + 0.6) / 2 = 0.4
4 | 0.2 / 2 = 0.1
5 | 0.1 / 2 = 0.05
Sort wrt weight and pick the four greatest:
index | weight
------+--------
3 | 0.4 *
1 | 0.25 *
0 | 0.1 *
2 | 0.1 *
4 | 0.1
5 | 0.05
These weights sum to 0.85. So divide the weights by 0.85 to get the final weights and indices:
index | weight
------+--------
3 | 0.47
1 | 0.29
0 | 0.12
2 | 0.12
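The steps above could be sketched as follows. The Influences struct is a hypothetical stand-in for the boneIndex/boneWeight part of the question's Vert type, and blendMidpoint is an invented name; ties between equal weights are broken by the lower bone index, which is an arbitrary choice.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical container for up to four bone influences of one vertex.
struct Influences {
    std::array<int, 4> index;
    std::array<double, 4> weight;
};

// Midpoint blend of two influence sets, keeping the four strongest bones
// and re-normalizing their weights to sum to 1.
Influences blendMidpoint(const Influences& a, const Influences& b) {
    std::map<int, double> joint;  // bone index -> averaged weight
    for (std::size_t i = 0; i < 4; ++i) {
        joint[a.index[i]] += 0.5 * a.weight[i];
        joint[b.index[i]] += 0.5 * b.weight[i];
    }
    std::vector<std::pair<int, double>> sorted(joint.begin(), joint.end());
    std::sort(sorted.begin(), sorted.end(),
              [](const std::pair<int, double>& l, const std::pair<int, double>& r) {
                  if (l.second != r.second) return l.second > r.second;
                  return l.first < r.first;  // tie-break on bone index
              });
    const std::size_t kept = std::min<std::size_t>(4, sorted.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < kept; ++i) sum += sorted[i].second;

    Influences out{};  // unused slots stay at index 0, weight 0
    for (std::size_t i = 0; i < kept; ++i) {
        out.index[i] = sorted[i].first;
        out.weight[i] = (sum > 0.0) ? sorted[i].second / sum : 0.0;
    }
    return out;
}
```

Applied to the two example vertices, this keeps bones 3, 1, 0 and 2 with weights of roughly 0.47, 0.29, 0.12 and 0.12.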
The other option would be to extend your structure to use more bones (either a fixed eight or a dynamic number). But it's probably not worth the effort.

Index arithmetic - Fast converting index to 3D coordinates

What is the fastest way in C++ to convert an index with such a formation to X, Y and Z coordinates and back?
EDIT:
I want for example get for the index 15 the numbers X=0, Y=1, Z=2, for the index 17 the numbers X=2, Y=1, Z=2, and for the index 22 the numbers X=1, Y=2, Z=1.
I need this to emulate a multidimensional array.
To (using the ordering in the question's examples, where X varies fastest, then Z, then Y):
x = index % 3;
z = index / 3 % 3;
y = index / 9;
Back:
index = (y * 3 + z) * 3 + x;
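A slightly more general sketch, for a cube of side n, is shown below. The names Digits3, splitIndex and joinIndex are invented for this example. Which of the two upper digits is called Y and which Z depends on the layout; for the examples in the question (index 15 giving X=0, Y=1, Z=2), d0 is X, d1 is Z and d2 is Y.

```cpp
#include <cassert>

// Three base-n digits of an index, least significant first.
struct Digits3 {
    int d0, d1, d2;
};

// Splits an index in [0, n*n*n) into its three base-n digits.
Digits3 splitIndex(int index, int n) {
    assert(n > 0 && index >= 0 && index < n * n * n);
    return Digits3{index % n, (index / n) % n, index / (n * n)};
}

// Recombines the digits into the original index.
int joinIndex(const Digits3& d, int n) {
    return (d.d2 * n + d.d1) * n + d.d0;
}
```

With n = 3, splitIndex(15, 3) yields d0 = 0, d1 = 2, d2 = 1, and joinIndex turns those digits back into 15.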