I have a homework assignment where, given an array of points, we must find the number of point pairs that are less than or equal to a given distance epsilon apart. The first algorithm I wrote gives me the correct answer, which is here:
#pragma omp parallel for schedule(dynamic, CHUNK) num_threads(NB_THREADS) reduction(+:counter) private(dx, dy, distance)
for (int i = 0; i < N-1; ++i)
{
for (int j = i+1; j < N; ++j)
{
dx = (data[j].x - data[i].x);
dy = (data[j].y - data[i].y);
distance = (dx*dx) + (dy*dy);
if (distance <= epsilon_squared)
{
++counter;
}
}
}
The second algorithm uses the distance from the origin and trigonometry (the law of cosines) to perform the same comparison. The problem is that the final count is off by a very small margin, typically 2-4 pairs. The point array is sorted beforehand by distance from the origin.
#pragma omp parallel for schedule(dynamic, CHUNK) num_threads(NB_THREADS) reduction(+:counter) private(a, b, theta, distance)
for (int i = 0; i < N-1; ++i)
{
//dfo = distance from origin
a = data[i].dfo;
for (int j = i+1; j < N; ++j)
{
b = data[j].dfo;
//find angle between point a, origin, point b
theta = acos(((data[i].x*data[j].x)+(data[i].y*data[j].y))/((sqrt(((data[i].x*data[i].x)+(data[i].y*data[i].y)))*(sqrt((data[j].x*data[j].x)+(data[j].y*data[j].y))))));
distance = (a*a) + (b*b) - (2*a*b*(cos(theta)));
if (distance <= epsilon_squared)
{
++counter;
} else {
if (abs(a-b)>epsilon)
{
break;
}
}
}
}
My question: can the operations in the second algorithm lead to a slightly different distance result than the first algorithm? I have compared the distances produced by the first and the second, and they seem to be completely identical. If a difference is being introduced somewhere, what can I do to fix it? Thank you in advance.
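To pin down where the counts diverge, one option is a small helper like the sketch below (hypothetical debugging code, not part of the assignment; it assumes data[i].dfo already holds sqrt(x*x + y*y)). It evaluates both formulas for a single pair and reports when they land on different sides of epsilon_squared:
#include <cmath>
#include <cstdio>

struct Point { double x, y, dfo; };

void compare_pair(const Point& p, const Point& q, double epsilon_squared)
{
    // Algorithm 1: direct squared Euclidean distance.
    double dx = q.x - p.x, dy = q.y - p.y;
    double d1 = dx * dx + dy * dy;

    // Algorithm 2: law of cosines via the distances from the origin.
    double a = p.dfo, b = q.dfo;
    double theta = std::acos((p.x * q.x + p.y * q.y) / (a * b));
    double d2 = a * a + b * b - 2.0 * a * b * std::cos(theta);

    if ((d1 <= epsilon_squared) != (d2 <= epsilon_squared))
        std::printf("classification differs: d1=%.17g d2=%.17g\n", d1, d2);
}
Pairs whose squared distance lies within a few rounding errors of epsilon_squared are the ones that can flip between the two versions.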
I would like to construct a distance matrix in parallel in C++11 using OpenMP. I have read various documentation, introductions, examples, etc. Yet, I still have a few questions. To facilitate answering this post, I state my questions as assumptions numbered 1 through 7. This way, you can quickly browse through them and point out which ones are correct and which ones are not.
Let us begin with a simple serially executed function computing a dense Armadillo matrix:
// [[Rcpp::export]]
arma::mat compute_dist_mat(arma::mat &coordinates, unsigned int n_points) {
arma::mat dist_mat(n_points, n_points, arma::fill::zeros);
double dist {};
for(unsigned int i {0}; i < n_points; i++) {
for(unsigned int j = i + 1; j < n_points; j++) {
dist = compute_dist(coordinates(i, 1), coordinates(j, 1), coordinates(i, 0), coordinates(j, 0));
dist_mat.at(i, j) = dist;
dist_mat.at(j, i) = dist;
}
}
return dist_mat;
}
As a side note: this function is supposed to be called from R through the Rcpp interface, as indicated by the // [[Rcpp::export]]. Accordingly, the top of the file includes
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
#include <omp.h>
// [[Rcpp::plugins(openmp)]]
using namespace Rcpp;
using namespace arma;
However, the function should also work fine without the R interface.
In an attempt to parallelize the code, I replace the loops with
unsigned int i {};
unsigned int j {};
# pragma omp parallel for private(dist, i, j) num_threads(n_threads) if(n_threads > 1)
for(i = 0; i < n_points; i++) {
for(j = i + 1; j < n_points; j++) {
dist = compute_dist(coordinates(i, 1), coordinates(j, 1), coordinates(i, 0), coordinates(j, 0));
dist_mat.at(i, j) = dist;
dist_mat.at(j, i) = dist;
}
}
and add n_threads as an argument to the compute_dist_mat function.
1) This distributes the iterations of the outer loop across threads, with the iterations of the inner loop executed by the thread that handles the respective outer-loop iteration.
2) The two loop levels cannot be combined because the inner loop depends on the outer one.
3) dist, i, and j are all to be initialized above the # pragma line and then declared private, rather than being initialized in the loops.
4) The # pragma line does not have any effect when n_threads = 1, inducing serial execution.
Extending the dense matrix application, the following code block illustrates the serial sparse matrix case with batch insertion. To motivate the use of sparse matrices here, I set distances below a certain threshold to zero.
// [[Rcpp::export]]
arma::sp_mat compute_dist_spmat(arma::mat &coordinates, unsigned int n_points, double dist_threshold) {
std::vector<double> dists;
std::vector<unsigned int> dist_i;
std::vector<unsigned int> dist_j;
double dist {};
for(unsigned long int i {0}; i < n_points; i++) {
for(unsigned long int j = i + 1; j < n_points; j++) {
dist = compute_dist(coordinates(i, 1), coordinates(j, 1), coordinates(i, 0), coordinates(j, 0));
if(dist >= dist_threshold) {
dists.push_back(dist);
dist_i.push_back(i);
dist_j.push_back(j);
}
}
}
unsigned int mat_size = dist_i.size();
arma::umat index_mat(2, mat_size * 2);
arma::vec dists_vec(mat_size * 2);
unsigned int j {};
for(unsigned int i {0}; i < mat_size; i++) {
j = i * 2;
index_mat.at(0, j) = dist_i[i];
index_mat.at(1, j) = dist_j[i];
index_mat.at(0, j + 1) = dist_j[i];
index_mat.at(1, j + 1) = dist_i[i];
dists_vec.at(j) = dists[i];
dists_vec.at(j + 1) = dists[i];
}
arma::sp_mat dist_mat(index_mat, dists_vec, n_points, n_points);
return dist_mat;
}
Because the function does not know ex ante how many distances are above the threshold, it first stores the non-zero values in standard vectors and then constructs the Armadillo objects from them.
I parallelize the function as follows:
// [[Rcpp::export]]
arma::sp_mat compute_dist_spmat(arma::mat &coordinates, unsigned int n_points, double dist_threshold, unsigned short int n_threads) {
std::vector<std::vector<double>> dists(n_points);
std::vector<std::vector<unsigned int>> dist_j(n_points);
double dist {};
unsigned int i {};
unsigned int j {};
# pragma omp parallel for private(dist, i, j) num_threads(n_threads) if(n_threads > 1)
for(i = 0; i < n_points; i++) {
for(j = i + 1; j < n_points; j++) {
dist = compute_dist(coordinates(i, 1), coordinates(j, 1), coordinates(i, 0), coordinates(j, 0));
if(dist >= dist_threshold) {
dists[i].push_back(dist);
dist_j[i].push_back(j);
}
}
}
std::vector<unsigned int> vec_intervals(n_points + 1);
vec_intervals[0] = 0;
for (i = 0; i < n_points; i++) {
vec_intervals[i + 1] = vec_intervals[i] + dist_j[i].size();
}
unsigned int mat_size {vec_intervals[n_points]};
arma::umat index_mat(2, mat_size * 2);
arma::vec dists_vec(mat_size * 2);
unsigned int vec_begins_i {};
unsigned int vec_length_i {};
unsigned int k {};
# pragma omp parallel for private(i, j, k, vec_begins_i, vec_length_i) num_threads(n_threads) if(n_threads > 1)
for(i = 0; i < n_points; i++) {
vec_begins_i = vec_intervals[i];
vec_length_i = vec_intervals[i + 1] - vec_begins_i;
for(j = 0; j < vec_length_i; j++) {
k = (vec_begins_i + j) * 2;
index_mat.at(0, k) = i;
index_mat.at(1, k) = dist_j[i][j];
index_mat.at(0, k + 1) = dist_j[i][j];
index_mat.at(1, k + 1) = i;
dists_vec.at(k) = dists[i][j];
dists_vec.at(k + 1) = dists[i][j];
}
}
arma::sp_mat dist_mat(index_mat, dists_vec, n_points, n_points);
return dist_mat;
}
5) Using dynamic vectors in the loop is thread-safe.
6) dist, i, j, k, vec_begins_i, and vec_length_i are all to be initialized above the # pragma line and then declared private, rather than being initialized in the loops.
7) Nothing has to be marked as a section.
Are any of the seven statements incorrect?
The following does not directly answer your question (it's just some dev code I copied from a personal GitHub repo), but it makes several points clear that may be of use in your application:
OpenMP automatically determines private variables so long as you are not doing any dynamic memory allocation within the parallel loop.
For sparse matrix distance calculations, it becomes important to move beyond a simple calculation of distance at each non-zero index and instead consider the structure of sparsity that is expected, and optimize for that. In the example below, I assume both matrices are very sparse and their intersection is less than their union. Thus, I "precondition" each distance calculation with squared column sums (for calculating Euclidean distance), and then adjust the calculation for the intersection only. This avoids complicated iterator structures and is very fast.
Using as few temporaries as possible is much to your benefit, and sparse matrix iterators do as good of a job of this as any alternative code anyone may ever write.
Eigen provides better vectorization than Armadillo (across the board, I might add) which means you want Eigen instead of Armadillo if those last 20% of performance gains are important to you.
This function calculates the Euclidean distance between all unique pairs of columns in an Eigen::SparseMatrix<double> object:
// sparse column-wise Euclidean distance between all columns
Eigen::MatrixXd distance(Eigen::SparseMatrix<double>& A) {
Eigen::MatrixXd dists(A.cols(), A.cols());
Eigen::VectorXd sq_colsums = Eigen::VectorXd::Zero(A.cols());   // zero-initialize before accumulating
for (int col = 0; col < A.cols(); ++col)
for (Eigen::SparseMatrix<double>::InnerIterator it(A, col); it; ++it)
sq_colsums(col) += it.value() * it.value();
#pragma omp parallel for
for (unsigned int i = 0; i < (A.cols() - 1); ++i) {
for (unsigned int j = (i + 1); j < A.cols(); ++j) {
double dist = sq_colsums(i) + sq_colsums(j);
Eigen::SparseMatrix<double>::InnerIterator it1(A, i), it2(A, j);
while (it1 && it2) {
if (it1.row() < it2.row()) ++it1;
else if (it1.row() > it2.row()) ++it2;
else {
dist -= it1.value() * it1.value();
dist -= it2.value() * it2.value();
dist += std::pow(it1.value() - it2.value(), 2);
++it1; ++it2;
}
}
dists(i, j) = std::sqrt(dist);
dists(j, i) = dists(i, j);
}
}
dists.diagonal().array() = 1;
return dists;
}
As Dirk and others have said, there are packages out there (e.g. ParallelDist) that seem to do everything you're after (for dense matrices). Look at wordspace for fast cosine distance calculations. See here for some comparisons. Cosine distance is easy to calculate efficiently in R without using Rcpp, via crossprod operations (see the qlcMatrix::cosSparse source code for algorithmic inspiration).
I'm trying to make my code that calculates the cumulative probability function more efficient by parallelizing it. I have a vector<double> of radii called r, and I need to count how many elements have a radius bigger than a given one, > R. In addition, I need to calculate the cumulative probability function for the volume.
The code I have is the following one:
int i, j;
double aux, contar, contar1;
vector<double> r, contador, contador1, vol;
for (i = 0; i != r.size() - 1; i++)
{
aux = r[i];
contador[i] = 0;
contador1[i] = 0;
contar = 0;
contar1 = 0;
vol[i] = 0.0;
for (j = 0; j != r.size() - 1; j++)
{
if(aux <= r[j])
{
contar++;
#pragma omp atomic write
vol[i] = vol[i] + 4.0 * 3.141592653589793 * r[j] * r[j] * r[j] / 3.0;
}
if(aux==r[j])
{
contar1++;
}
}
#pragma omp atomic write
contador[i]=contar;
#pragma omp atomic write
contador1[i]=contar1;
}
but it's not efficient at all. Any help in making it more efficient with OpenMP?
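A minimal sketch of one possible parallel version (assuming contador, contador1, and vol are already sized to r.size(); each outer iteration writes only its own slots, so a plain parallel for with per-iteration locals is enough and no atomics are needed; the sketch also loops over every element, so adjust the bounds if the last element really is meant to be skipped):
#include <omp.h>
#include <vector>

void accumulate(const std::vector<double>& r,
                std::vector<double>& contador,
                std::vector<double>& contador1,
                std::vector<double>& vol)
{
    const double pi = 3.141592653589793;
    #pragma omp parallel for
    for (long i = 0; i < (long)r.size(); i++)
    {
        // Per-iteration locals: nothing here is shared between threads.
        double aux = r[i], contar = 0, contar1 = 0, v = 0.0;
        for (std::size_t j = 0; j < r.size(); j++)
        {
            if (aux <= r[j])
            {
                contar++;
                v += 4.0 * pi * r[j] * r[j] * r[j] / 3.0;
            }
            if (aux == r[j])
                contar1++;
        }
        // Each thread writes only its own index i, so no synchronization is needed.
        contador[i] = contar;
        contador1[i] = contar1;
        vol[i] = v;
    }
}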
I've come up with this code for applying a 3x3 kernel to my image:
double sum;
for(int i = 1; i < src.rows - 1; i++){
for(int j = 1; j < src.cols - 1; j++)
for (int k = 0; k < 3; k ++) {
sum = 0.0;
dst.at<cv::Vec3b>(i,j)[k] = 0.0;
for(int x = -1; x <= 1; x++){
for(int y = -1; y <=1; y++){
sum += (Kernel_Matrix[y+1][x+1]*src.at<cv::Vec3b>(i - x, j - y)[k]);
}
}
dst.at<cv::Vec3b>(i,j)[k] = cv::saturate_cast<uchar>(sum);
}
}
Now I have two questions:
Reading https://en.wikipedia.org/wiki/Kernel_(image_processing), there are various matrices for various filters. Let's say I want my blur filter to increase in intensity via a GUI slider that gives a value from x to whatever; what kind of operation should I apply to my blur matrix (a sum, a multiplication, ...)?
(I would like to do the same with sharpness.)
Is there a specific matrix for noise reduction?
If you also have any modifications to suggest for my algorithm, please let me know!
Thanks!
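One common approach to the intensity question (a sketch only; make_blur_kernel is a hypothetical helper, not something from OpenCV or the code above) is to interpolate between the identity kernel and a normalized box blur using the slider value, then apply the blended kernel with the same loop as above:
// Fill a 3x3 kernel that blends the identity kernel (t = 0 leaves the image
// unchanged) with a normalized box blur (t = 1 is full blur); intermediate
// slider values t in [0, 1] give intermediate blur intensity.
void make_blur_kernel(double kernel[3][3], double t)
{
    const double identity[3][3] = { {0, 0, 0}, {0, 1, 0}, {0, 0, 0} };
    const double box = 1.0 / 9.0;   // normalized so overall brightness is preserved
    for (int x = 0; x < 3; ++x)
        for (int y = 0; y < 3; ++y)
            kernel[x][y] = (1.0 - t) * identity[x][y] + t * box;
}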
Everyone, I am trying to implement pattern matching with FFT, but I am not sure what the result should be. (I think I am missing something, even though I have read a lot about the problem and tried a lot of different implementations; this one is the best so far.) Here is my FFT correlation function.
void fft2d(fftw_complex**& a, int rows, int cols, bool forward = true)
{
fftw_plan p;
for (int i = 0; i < rows; ++i)
{
p = fftw_plan_dft_1d(cols, a[i], a[i], forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
fftw_execute(p);
}
fftw_complex* t = (fftw_complex*)fftw_malloc(rows * sizeof(fftw_complex));
for (int j = 0; j < cols; ++j)
{
for (int i = 0; i < rows; ++i)
{
t[i][0] = a[i][j][0];
t[i][1] = a[i][j][1];
}
p = fftw_plan_dft_1d(rows, t, t, forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
fftw_execute(p);
for (int i = 0; i < rows; ++i)
{
a[i][j][0] = t[i][0];
a[i][j][1] = t[i][1];
}
}
fftw_free(t);
}
int findCorrelation(int argc, char* argv[])
{
BMP bigImage;
BMP keyImage;
BMP result;
RGBApixel blackPixel = { 0, 0, 0, 1 };
const bool swapQuadrants = (argc == 4);
if (argc < 3 || argc > 4) {
cout << "correlation img1.bmp img2.bmp" << endl;
return 1;
}
if (!keyImage.ReadFromFile(argv[1])) {
return 1;
}
if (!bigImage.ReadFromFile(argv[2])) {
return 1;
}
//Preparations
const int maxWidth = std::max(bigImage.TellWidth(), keyImage.TellWidth());
const int maxHeight = std::max(bigImage.TellHeight(), keyImage.TellHeight());
const int rowsCount = maxHeight;
const int colsCount = maxWidth;
BMP bigTemp = bigImage;
BMP keyTemp = keyImage;
keyImage.SetSize(maxWidth, maxHeight);
bigImage.SetSize(maxWidth, maxHeight);
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
RGBApixel p1;
if (i < bigTemp.TellHeight() && j < bigTemp.TellWidth()) {
p1 = bigTemp.GetPixel(j, i);
} else {
p1 = blackPixel;
}
bigImage.SetPixel(j, i, p1);
RGBApixel p2;
if (i < keyTemp.TellHeight() && j < keyTemp.TellWidth()) {
p2 = keyTemp.GetPixel(j, i);
} else {
p2 = blackPixel;
}
keyImage.SetPixel(j, i, p2);
}
//Here is where the transforms begin
fftw_complex **a = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
fftw_complex **b = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
fftw_complex **c = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
for (int i = 0; i < rowsCount; ++i) {
a[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
b[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
c[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
for (int j = 0; j < colsCount; ++j) {
RGBApixel p1;
p1 = bigImage.GetPixel(j, i);
a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
a[i][j][1] = 0.0;
RGBApixel p2;
p2 = keyImage.GetPixel(j, i);
b[i][j][0] = (0.299*p2.Red + 0.587*p2.Green + 0.114*p2.Blue);
b[i][j][1] = 0.0;
}
}
fft2d(a, rowsCount, colsCount);
fft2d(b, rowsCount, colsCount);
result.SetSize(maxWidth, maxHeight);
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
fftw_complex& y = a[i][j];
fftw_complex& x = b[i][j];
double u = x[0], v = x[1];
double m = y[0], n = y[1];
c[i][j][0] = u*m + n*v;
c[i][j][1] = v*m - u*n;
int fx = j;
if (fx>(colsCount / 2)) fx -= colsCount;
int fy = i;
if (fy>(rowsCount / 2)) fy -= rowsCount;
float r2 = (fx*fx + fy*fy);
const double cuttoffCoef = (maxWidth * maxHeight) / 37992.;
if (r2<128 * 128 * cuttoffCoef)
c[i][j][0] = c[i][j][1] = 0;
}
fft2d(c, rowsCount, colsCount, false);
const int halfCols = colsCount / 2;
const int halfRows = rowsCount / 2;
if (swapQuadrants) {
for (int i = 0; i < halfRows; ++i)
for (int j = 0; j < halfCols; ++j) {
std::swap(c[i][j][0], c[i + halfRows][j + halfCols][0]);
std::swap(c[i][j][1], c[i + halfRows][j + halfCols][1]);
}
for (int i = halfRows; i < rowsCount; ++i)
for (int j = 0; j < halfCols; ++j) {
std::swap(c[i][j][0], c[i - halfRows][j + halfCols][0]);
std::swap(c[i][j][1], c[i - halfRows][j + halfCols][1]);
}
}
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
const double& g = c[i][j][0];
RGBApixel pixel;
pixel.Alpha = 0;
int gInt = 255 - static_cast<int>(std::floor(g + 0.5));
pixel.Red = gInt;
pixel.Green = gInt;
pixel.Blue = gInt;
result.SetPixel(j, i, pixel);
}
BMP res;
res.SetSize(maxWidth, maxHeight);
result.WriteToFile("result.bmp");
return 0;
}
Sample output
This question would probably be more appropriately posted on another site like Cross Validated (metaoptimize.com used to also be a good one, but it appears to be gone).
That said:
There are two similar operations you can perform with an FFT: convolution and correlation. Convolution is used for determining how two signals interact with each other, whereas correlation can be used to express how similar two signals are to each other. Make sure you're doing the right operation, as they're both commonly implemented through a DFT.
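To make the distinction concrete, here is a minimal illustration with std::complex (x stands for a key-image spectrum value and y for the big-image one, mirroring the u, v, m, n variables in the posted loop): correlation conjugates one spectrum, convolution does not.
#include <complex>

// Correlation term: multiply by the complex conjugate of the other spectrum
// (this matches the element-wise product computed in the question's code).
std::complex<double> corr_term(std::complex<double> x, std::complex<double> y)
{
    return x * std::conj(y);   // real part: u*m + v*n, imaginary part: v*m - u*n
}

// Convolution term: multiply the spectra directly, with no conjugate.
std::complex<double> conv_term(std::complex<double> x, std::complex<double> y)
{
    return x * y;
}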
For this type of application of DFTs, you usually wouldn't extract any useful information from the Fourier spectrum itself unless you were looking for frequencies common to both data sources (e.g., if you were comparing two bridges to see if their supports are spaced similarly).
Your 3rd image looks a lot like the power domain; normally I see the correlation output entirely grey except where overlap occurred. Your code definitely appears to be computing the inverse DFT, so unless I'm missing something the only other explanation I've come up with for the fuzzy look could be some of the "fudge factor" code in there like:
if (r2<128 * 128 * cuttoffCoef)
c[i][j][0] = c[i][j][1] = 0;
As for what you should expect: wherever there are common elements between the two images you'll see a peak. The larger the peak, the more similar the two images are near that region.
Some comments and/or recommended changes:
1) Convolution & correlation are not scale invariant operations. In other words, the size of your pattern image can make a significant difference in your output.
2) Normalize your images before correlation.
When you get the image data ready for the forward DFT pass:
a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
a[i][j][1] = 0.0;
/* ... */
How you grayscale the image is your business (though I would've picked something like sqrt( r*r + b*b + g*g )). However, I don't see you doing anything to normalize the image.
The word "normalize" can take on a few different meanings in this context. Two common types:
normalize the range of values between 0.0 and 1.0
normalize the "whiteness" of the images
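To illustrate the first meaning, here is a minimal sketch (scale_to_unit is a hypothetical helper, not part of the posted code) that rescales the grayscale plane to [0.0, 1.0] in place before the forward DFT:
#include <algorithm>
#include <fftw3.h>

// Rescale the real part of every sample to [0.0, 1.0] so both images feed
// comparable magnitudes into the forward DFT.
void scale_to_unit(fftw_complex** a, int rows, int cols)
{
    double lo = a[0][0][0], hi = a[0][0][0];
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j) {
            lo = std::min(lo, a[i][j][0]);
            hi = std::max(hi, a[i][j][0]);
        }
    if (hi > lo)
        for (int i = 0; i < rows; ++i)
            for (int j = 0; j < cols; ++j)
                a[i][j][0] = (a[i][j][0] - lo) / (hi - lo);
}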
3) Run your pattern image through an edge enhancement filter. I've personally made use of Canny, Sobel, and I think I messed with a few others. As I recall, Canny was "quick 'n' dirty", Sobel was more expensive, but I got comparable results when it came time to do correlation. See chapter 24 of the "DSP Guide" book that's freely available online. The whole book is worth your time, but if you're low on time, then at a minimum chapter 24 will help a lot.
4) Re-scale the output image between [0, 255]; if you want to implement thresholds, do it after this step because the thresholding step is lossy.
My memory on this one is hazy, but as I recall (edited for clarity):
You can scale the final image pixels (before rescaling) between [-1.0, 1.0] by dividing off the largest power spectrum value from the entire power spectrum
The largest power spectrum value is, conveniently enough, the center-most value in the power spectrum (corresponding to the lowest frequency)
If you divide it off the power spectrum, you'll end up doing twice the work; since FFTs are linear, you can delay the division until after the inverse DFT pass to when you're re-scaling the pixels between [0..255].
If after rescaling most of your values end up so black you can't see them, you can use a solution to the ODE y' = y(1 - y) (one example is the sigmoid f(x) = 1 / (1 + exp(-c*x) ), for some scaling factor c that gives better gradations). This has more to do with improving your ability to interpret the results visually than anything you might use to programmatically find peaks.
Edit: I said [0, 255] above. I suggest you rescale to [128, 255] or some other lower bound that is gray rather than black.
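As a rough illustration of the sigmoid idea above (to_pixel is a hypothetical helper, and c is a tuning constant you would pick by eye; this is not code from the question):
#include <cmath>

// Map a correlation value g, already divided down so it sits roughly in
// [-1.0, 1.0], to an 8-bit pixel while spreading out values near zero.
int to_pixel(double g, double c)
{
    double s = 1.0 / (1.0 + std::exp(-c * g));   // sigmoid output in (0, 1)
    return static_cast<int>(s * 255.0 + 0.5);
}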
I have to design a functional Rubik's cube as part of my homework. I am not using OpenGL directly, but a framework that was provided. (All functions that do not belong to OpenGL and do not have their bodies listed here can be presumed correct.)
Functionalities: all faces need to be rotated if selected by pressing a key.
The whole cube must rotate.
The rotation of the whole cube works correctly and is not the subject of this question.
In order to do this, I created the Rubik's cube from 27 smaller cubes (the cube size is 3) and, at the same time, a three-dimensional array: a replica of the cube that holds the small cubes' indexes.
To better understand this:
if initially one face was:
0 1 2
3 4 5
6 7 8
after a rotation it should be:
6 3 0
7 4 1
8 5 2
I can rotate the cubes relative to axis X or Y an indefinite number of times and it works perfectly.
However, if I combine the rotations (alternating X rotations with Y rotations in a random way), there are cases where the cube deforms.
As this happens inconsistently, it is difficult for me to find the cause.
This is how I am creating the cube:
int count = 0;
for (int i = -1; i < 2; i++)
for(int j = -1; j < 2; j++)
for(int k = -1; k < 2; k++) {
RubikCube.push_back(drawCube());
RubikCube.at(count)->translate(4*i,4*j,4*k);
CubIndici[j+1][k+1][i+1] = count;
count++;
}
The function drawCube() effectively draws a cube of size 4 with its center positioned at the origin.
CubIndici is the 3D array that I use to store the positions of the cubes.
This is the function that I am using to rotate a matrix in the 3D array. (I have double checked it so it should be correct, but perhaps I am missing something).
void rotateMatrix(int face_index, int axis) {
if (axis == 0 )
{
for ( int i = 0; i < 3; i++)
for( int j = i; j < 3; j++)
{
swap(&CubIndici[i][j][face_index],&CubIndici[j][i][face_index]);
}
for (int i = 0; i < 3; i++)
for(int j = i; j < 3; j++)
{
swap(&CubIndici[i][j][face_index],&CubIndici[2-i][j][face_index]);
}
}
if (axis == 1 )
{
for ( int i = 0; i < 3; i++)
for( int j = i; j < 3; j++)
{
swap(&CubIndici[face_index][i][j],&CubIndici[face_index][j][i]);
}
for (int i = 0; i < 3; i++)
for(int j = i; j < 3; j++)
{
swap(&CubIndici[face_index][i][j],&CubIndici[face_index][2-i][j]);
}
}
}
The CubIndici 3D array is global, so I need the axis parameter to determine what kind of rotation to perform (relative to X, Y, or Z).
On pressing the w key, I rotate a (hardcoded, for now) face around the X axis:
for (int i = 0; i < 3; i++)
for(int j = 0; j < 3; j++)
RubikCube.at(CubIndici[i][j][1])->rotateXRelativeToPoint(
RubikCube.at(CubIndici[1][1][1])->axiscenter, 1.57079633);
rotateMatrix(1,0);
CubIndici[1][1][1] should always contain the cube that is in the center of the face CubIndici[*][*][1].
Similarly,
for (int i = 0; i < 3; i++)
for(int j = 0; j < 3; j++)
RubikCube.at(CubIndici[2][i][j])->rotateYRelativeToPoint(
RubikCube.at(CubIndici[2][1][1])->axiscenter, 1.57079633);
rotateMatrix(2,1);
for rotating on axis Y.
1.57079633 is the radian equivalent of 90 degrees
For a better understanding, I add the detailed output of rotating the left face on the X axis and the top-down one on the Y axis.
The first block of coordinates is the initial cube face. (The CubIndici index matrix is unmodified.)
"pre rotatie" (pre-rotation) - coordinates and indexes for each of the cubes of the face.
"post rotatie" (post-rotation) - coordinates and indexes after rotating the objects. (The matrix was not touched.)
"post rotatie matrice" (post matrix rotation) - after rotating the matrix as well. If you compare the indexes of "pre rotatie" with "post rotatie matrice" you will notice a 90-degree turn.
This is the first rotation (rotating the left face around X) and it is entirely correct.
On the next rotation, however, the cubes contained in the top-down face (as well as in the left one) should be 2, 5, 8. Instead, they appear as 2, 5, 6.
If you look at the first "post rotatie matrice" block, 2, 5, 8 are indeed the top row.
This is the issue that deforms the cube and I don't know what causes it.
If anything is unclear please let me know and I will edit the post or reply to the comment!
The formula of a PI/2 rotation clockwise for a cell at position <x;y> in a slice of the cube is:
x' = 2 - y
y' = x
Similarly, for a counter-clockwise rotation:
x' = y
y' = 2 - x
but these are rotations, and you want to do in-place modification of your arrays.
We can replace a rotation by a combination of 2 mirror symmetries.
clockwise:
<x;y> -> <y;x>
<x;y> -> <2-x;y>
and counter-clockwise is the composition of these functions in the opposite order.
Given these formulas, you can write your rotation function like this:
void rotateMatrix_X_CW(int face_index) {
int i,j;
for ( i = 0; i < 2; i++)
for( j = i; j < 3; j++)
{
swap(CubIndici[i][j][face_index],CubIndici[2-i][j][face_index]);
}
for ( i = 0; i < 2; i++)
for( j = i; j < 3; j++)
{
swap(CubIndici[i][j][face_index],CubIndici[j][i][face_index]);
}
}
void rotateMatrix_X_CCW(int face_index) {
int i,j;
for ( i = 0; i < 2; i++)
for( j = i; j < 3; j++)
{
swap(CubIndici[i][j][face_index],CubIndici[j][i][face_index]);
}
for ( i = 0; i < 2; i++)
for( j = i; j < 3; j++)
{
swap(CubIndici[i][j][face_index],CubIndici[2-i][j][face_index]);
}
}
You should be able to implement the other axes from there.
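For example, a clockwise Y-axis version could look like this (a sketch only, following the same mirror-then-transpose pattern and the same swap convention as the functions above; adapt the slice indexing if your convention differs):
void rotateMatrix_Y_CW(int face_index) {
    int i, j;
    // Mirror the rows of the CubIndici[face_index][*][*] slice...
    for (i = 0; i < 2; i++)
        for (j = i; j < 3; j++)
        {
            swap(CubIndici[face_index][i][j], CubIndici[face_index][2-i][j]);
        }
    // ...then transpose it; together the two swaps give a 90-degree clockwise turn.
    for (i = 0; i < 2; i++)
        for (j = i; j < 3; j++)
        {
            swap(CubIndici[face_index][i][j], CubIndici[face_index][j][i]);
        }
}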