How to optimize a Gauss-Seidel routine in C++ for sparse matrices?

I've written a routine in C++ that solves the system of equations Ax = b using the Gauss-Seidel method. However, I want to use this code for specific "A" matrices that are sparse (most of the elements are zero). As it stands, most of the time the solver takes is spent multiplying elements by zero.
For example, for the following system of equations:
| 4 -1 0 0 0 | | x1 | | b1 |
|-1 4 -1 0 0 | | x2 | | b2 |
| 0 -1 4 -1 0 | | x3 | = | b3 |
| 0 0 -1 4 -1 | | x4 | | b4 |
| 0 0 0 -1 4 | | x5 | | b5 |
Using Gauss-Seidel method, we will have the following iteration formula for x1:
x1 = [b1 - (-1 * x2 + 0 * x3 + 0 * x4 + 0 * x5)] / 4
As you can see, the solver is wasting time multiplying by zero elements. Since I work with big matrices (for example, 10^5 by 10^5), this hurts the total CPU time badly. I wonder if there is a way to optimize the solver so that it skips the parts of the calculation that involve multiplications by zero elements.
Note that the form of the "A" matrix in the example above is arbitrary and the solver must be able to work with any "A" matrix.
Here is the code:
void GaussSeidel(double **A, double *b, double *x, int arraySize)
{
    const double tol = 0.001 * arraySize;
    double error = tol + 1;
    for (int i = 0; i < arraySize; ++i)
        x[i] = 0;
    double *xOld = new double[arraySize];
    for (int i = 0; i < arraySize; ++i)
        xOld[i] = 101;
    while (fabs(error) > tol)
    {
        for (int i = 0; i < arraySize; ++i)
        {
            double sum = 0;
            for (int j = 0; j < arraySize; ++j)
            {
                if (j == i)
                    continue;
                sum += A[i][j] * x[j];
            }
            x[i] = (b[i] - sum) / A[i][i];
        }
        //cout << endl << "Answers:" << endl << endl;
        error = errorCalc(xOld, x, arraySize);
        for (int i = 0; i < arraySize; ++i)
            xOld[i] = x[i];
    }
    delete[] xOld;
    cout << "Solution converged!" << endl << endl;
}

Writing a sparse linear system solver is hard. VERY HARD.
I would just pick one of the existing implementations. Any reasonable LP solver has a sparse linear system solver inside; see for example lp_solve, GLPK, etc.
If the license is acceptable to you, I recommend the Harwell Subroutine Library. Interfacing C++ and Fortran is not fun, though...
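For illustration, here is roughly what the "use an existing implementation" route can look like with Eigen's sparse module (Eigen is my example, not one of the libraries named above), as a sketch only:
#include <vector>
#include <Eigen/Sparse>

// Sketch: solving Ax = b with Eigen's sparse storage and an iterative solver
// instead of a hand-rolled dense Gauss-Seidel loop. Only the nonzeros are stored.
Eigen::VectorXd solveSparse(int n,
                            const std::vector<Eigen::Triplet<double>>& nonzeros,
                            const Eigen::VectorXd& b)
{
    Eigen::SparseMatrix<double> A(n, n);
    A.setFromTriplets(nonzeros.begin(), nonzeros.end()); // build from (row, col, value) triplets
    Eigen::BiCGSTAB<Eigen::SparseMatrix<double>> solver;  // iterative Krylov solver, suited to sparse A
    solver.compute(A);
    return solver.solve(b);
}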

How sparse do you mean?
Here's a crappy sparse implementation that should work well for solving systems of linear equations. It's probably a naive implementation; I know very little about the data structures typically used in industrial-strength sparse matrices.
The code, and an example, is here.
Here's the class that does most of the work:
template <typename T>
class SparseMatrix
{
private:
SparseMatrix();
public:
SparseMatrix(int row, int col);
T Get(int row, int col) const;
void Put(int row, int col, T value);
int GetRowCount() const;
int GetColCount() const;
static void GaussSeidelLinearSolve(const SparseMatrix<T>& A, const SparseMatrix<T>& B, SparseMatrix<T>& X);
private:
int dim_row;
int dim_col;
vector<map<int, T> > values_by_row;
vector<map<int, T> > values_by_col;
};
The other method definitions are included in the ideone. I don't test for convergence, but rather simply loop an arbitrary number of times.
The sparse representation stores, by row and column, the positions of all of the values, using STL maps. I'm able to solve a system of 10000 equations in about a quarter of a second for a very sparse matrix like the one you provided (density < .001).
My implementation should be generic enough to support any integral or user defined type that supports comparison, the 4 arithmetic operators (+, -, *, /), and that can be explicitly cast from 0 (empty nodes are given the value (T) 0).
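The solver body itself is only in the ideone link, but the point of the values_by_row maps is that a Gauss-Seidel sweep only ever touches stored entries. Here is a sketch of that inner loop (my own reconstruction, not the linked code), written against plain std::map rows:
#include <map>
#include <vector>

// One Gauss-Seidel sweep where row i is a std::map from column index to
// nonzero value, as in values_by_row above; zero elements are never visited.
void gaussSeidelSweep(const std::vector<std::map<int, double> >& rows,
                      const std::vector<double>& b,
                      std::vector<double>& x)
{
    for (int i = 0; i < static_cast<int>(rows.size()); ++i) {
        double sum = 0.0, diag = 0.0;
        for (const auto& entry : rows[i]) {           // (column, value) pairs
            if (entry.first == i)
                diag = entry.second;                  // remember a(i,i)
            else
                sum += entry.second * x[entry.first]; // stored nonzeros only
        }
        x[i] = (b[i] - sum) / diag;                   // assumes a nonzero diagonal
    }
}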

I recently faced the same problem.
My solution is to store the sparse matrix as an array of vectors, one vector of (column index, value) pairs per row.
Here is my code:
#include <cmath>
#include <cstdio>
#include <cstring>
#include <utility>
#include <vector>

#define PRECISION 0.01

inline bool checkPrecision(float x[], float pre[], int n) {
    for (int i = 0; i < n; i++) {
        if (fabs(x[i] - pre[i]) > PRECISION) return false;
    }
    return true;
}

/* mx = b, where row i of m holds only its nonzero entries
   as (column index, value) pairs */
void gaussIteration(std::vector< std::pair<int, float> >* m, float x[], float b[], int n) {
    float* pre = new float[n];
    int cnt = 0;
    while (true) {
        cnt++;
        memcpy(pre, x, sizeof(float) * n);
        for (int i = 0; i < n; i++) {
            x[i] = b[i];
            float mii = 0;
            bool haveDiagonal = false;
            for (std::size_t j = 0; j < m[i].size(); j++) {
                if (m[i][j].first != i) {
                    // off-diagonal nonzero: subtract its contribution
                    x[i] -= m[i][j].second * x[m[i][j].first];
                }
                else {
                    mii = m[i][j].second;
                    haveDiagonal = true;
                }
            }
            if (!haveDiagonal || mii == 0) {
                puts("Error: zero or missing diagonal element");
                delete[] pre;
                return;
            }
            x[i] /= mii;
        }
        if (checkPrecision(x, pre, n)) {
            break;
        }
    }
    delete[] pre;
}

Try PETSc. You need the CRS (compressed row storage, also known as CSR) format for this.
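For reference, one Gauss-Seidel sweep over a CRS/CSR-stored matrix looks roughly like the sketch below (this illustrates the storage idea only; it is not PETSc's actual API):
#include <vector>

// rowPtr[i]..rowPtr[i+1] delimit the nonzeros of row i, whose column indices
// and values are stored in colIdx and values.
void gaussSeidelSweepCSR(const std::vector<int>& rowPtr,
                         const std::vector<int>& colIdx,
                         const std::vector<double>& values,
                         const std::vector<double>& b,
                         std::vector<double>& x)
{
    const int n = static_cast<int>(b.size());
    for (int i = 0; i < n; ++i) {
        double sum = 0.0, diag = 0.0;
        for (int k = rowPtr[i]; k < rowPtr[i + 1]; ++k) {
            if (colIdx[k] == i)
                diag = values[k];                 // diagonal entry a(i,i)
            else
                sum += values[k] * x[colIdx[k]];  // only stored nonzeros
        }
        x[i] = (b[i] - sum) / diag;               // assumes a nonzero diagonal
    }
}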

Related

Sum of independent diagonal in a matrix

I'm currently studying for an exam and I'm trying to deal with dynamically allocated matrices. I've come across a problem regarding calculating the sum of every diagonal of a matrix whose values and size are chosen by the user.
The intent of my program is to print, thanks to a function, whose parameters are the matrix and its size, the value of every diagonal sum. I'll show you the code and describe it in depth.
----------------
| 52 | 35 |  5 |
----------------
|  2 | 71 |  1 |
----------------
|  3 | 60 | 25 |
----------------
| 79 | 55 | 98 |
----------------
| 47 | 15 | 66 |
----------------
Example matrix. Imagine the first diagonal to be the one which goes right to left and consists only of the number "47". The second diagonal would be the one which goes right to left and consists of the numbers "15" and "79". So the sum of the second diagonal would be:
sum = m[n_rows - 1][diag - 2] + m[n_rows - 2][diag - 1]
When diag > n_columns, in order to avoid errors regarding the matrix size, I should lower the quantity "n_rows - 1" by the quantity "diag - n_columns".
This is what I thought to do, according to my description:
void diag_matrix(int** m, int righe, int colonne){//righe = rows, colonne = columns.
//M is the matrix.
// diag is the number of the diagonal I'm considering.
for(int diag = 1; diag < (righe + colonne); diag++){
int sum = 0;// the sum
int i = 0;// the loop counter
int l = 0;// this is the value to realign the row in case diag > columns
int temp = diag;//I use this variable not to modify the value of diag.
// What I want is: when the column-index/row-index of the matrix reaches 0, the loop will stop (after the final iteration);
while(righe - i - l - 1 > 0 || diag - 1 - i > 0){
if (diag > colonne){//this condition changes l-value only if diag value is greater than column. Explanation outside the code
l = diag - colonne;//this is the value to subtract to row-index
temp = colonne;//this position is necessary to set the column-index to its maximum.
}
sum = sum + m[righe - 1 - l - i][temp -1 - i];//pretty clear I think.
i++;//the i is incremented by one.
}// end of while-statement
cout << "Somma Diagonale " << diag << " = " << sum << ".\n";
}// end of for-statement
}//end of function declaration
Obviously it does not work, but I can't figure out the problem.
(There used to be a paragraph here, but on a second look, you didn’t make the mistake it was talking about.)
Since you didn't post to Code Review, here's a solution instead of a detailed code review. (If you want to make the original approach work, I'd suggest single-stepping through it in a debugger and checking where your variables first get the wrong value.) It's got a lot of boilerplate to make it compile and run, but the part you'll be most interested in is diag_sums() and its comments.
One idea here is to use OOP to automatically check the bounds of your array accesses. Bounds checking is very important for catching off-by-one errors and the like. You can turn it off in production if you want, but you really don't want to silence warnings when your program has a buffer overrun. Other optimizations here include locality of access for the data, and strength reduction on the operations: rather than checking on each iteration whether we've hit the right edge and the bottom edge, we can simply calculate the length of each diagonal in advance.
Since diagonal number k of a matrix a with M rows consists of all elements a[i][j] such that i - j = M - k, the algorithm ensures correctness by maintaining that invariant: it adds 1 to both i and j on every step, starts when either i or j is 0, and stops when i = M or j = N. That is, it traverses each diagonal from the left or top edge to the right or bottom edge, whichever comes first.
#include <assert.h>
#include <iostream>
#include <stddef.h>
#include <stdlib.h>
#include <utility>
#include <vector>
using std::cin;
using std::cout;
template <typename T>
class matrix {
public:
matrix( const ptrdiff_t rows,
const ptrdiff_t cols,
std::vector<T>&& elems )
: rows_(rows), cols_(cols), elems_(elems)
{
assert( rows_ > 0 );
assert( cols_ > 0 );
assert( elems_.size() == static_cast<size_t>(rows_*cols_) );
}
matrix( const ptrdiff_t rows,
const ptrdiff_t cols,
const std::vector<T>& elems )
: matrix( rows, cols, std::move(std::vector<T>(elems)) )
{}
matrix( const matrix<T>& ) = default;
matrix( matrix<T>&& ) = default;
matrix& operator= ( const matrix<T>& ) = default;
matrix& operator= ( matrix<T>&& ) = default;
T& operator() ( const ptrdiff_t m, const ptrdiff_t n )
{
assert( m >= 0 && m < rows_ );
assert( n >= 0 && n < cols_ );
return elems_[static_cast<size_t>(m*cols_ + n)];
}
const T& operator() ( const ptrdiff_t m, const ptrdiff_t n ) const
{
/* Because this call does not modify any data, and the only reason the
* member function above cannot be const is that it returns a non-const
* reference to an element of elems, casting away the const qualifier
* internally and then returning a const reference is a safe way to
* re-use the code.
*/
matrix<T>& nonconst = *const_cast<matrix<T>*>(this);
return nonconst(m,n);
}
ptrdiff_t rows() const { return rows_; }
ptrdiff_t cols() const { return cols_; }
private:
ptrdiff_t rows_;
ptrdiff_t cols_;
std::vector<T> elems_;
};
template<typename T>
std::ostream& operator<< ( std::ostream& out, const matrix<T>& x )
/* Boilerplate to print a matrix. */
{
const ptrdiff_t m = x.rows(), n = x.cols();
for ( ptrdiff_t i = 0; i < m; ++i ) {
out << x(i,0);
for ( ptrdiff_t j = 1; j < n; ++j )
out << ' ' << x(i,j);
out << '\n';
} // end for
return out;
}
using elem_t = int;
std::vector<elem_t> diag_sums( const matrix<elem_t>& a )
/* Return a vector of all the diagonal sums of a.
*
* The first diagonal sum is a(rows-1,0)
* The second is a(rows-2,0) + a(rows-1,1)
* The third is a(rows-3,0) + a(rows-2,1) + a(rows-1,2)
* And so on. I.e., the kth diagonal is the sum of all elements a(i,j) such
* that i - j == rows - k.
*
* If a is a M×N matrix, there are M diagonals starting in column zero, and
* N-1 diagonals (excluding the one containing a(0,0) so we don't count it
* twice) starting in row 0. We process them bottom to top, then left to
* right.
*
* The number of elements in a diagonal starting at a(i,0) is min{M-i, N}. The
* number of elements in a diagonal starting at a(0,j) is min{M, N-j}. This is
* because a diagonal stops at either the bottom edge or the right edge of a.
*/
{
const ptrdiff_t m = a.rows(), n = a.cols();
std::vector<elem_t> result;
result.reserve( static_cast<size_t>(m + n - 1) );
for ( ptrdiff_t i = m-1; i > 0; --i ) {
elem_t sum = 0;
const ptrdiff_t nk = (m-i) < n ? (m-i) : n;
for ( ptrdiff_t k = 0; k < nk; ++k )
sum += a(i+k, k);
result.emplace_back(sum);
} // end for i
for ( ptrdiff_t j = 0; j < n; ++j ) {
elem_t sum = 0;
const ptrdiff_t nk = m < (n-j) ? m : (n-j);
for ( ptrdiff_t k = 0; k < nk; ++k )
sum += a(k, j+k);
result.emplace_back(sum);
} // end for j
return result;
}
matrix<elem_t> read_input_matrix( const int row, const int column )
/* Reads in row*column consecutive elements from cin and packs them into a
* matrix<elem_t>.
*/
{
assert(row > 0);
assert(column > 0);
const ptrdiff_t nelements = row*column;
assert(nelements > 0); // Check for overflow.
std::vector<elem_t> result;
result.reserve(static_cast<size_t>(nelements));
for ( ptrdiff_t i = nelements; i > 0; --i ) {
int x;
cin >> x;
assert(cin.good());
result.push_back(x);
}
return matrix<elem_t>( row,
column,
std::move(result) );
}
template<typename T>
bool print_sequence( const T& container )
/* Prints the contents of a container in the format
* "{47, 94, 124, 160, 148, 36, 5}".
*/
{
cout << "{";
for ( auto it = container.begin(); it != container.end(); ++it ) {
if ( it != container.begin() )
cout << ", ";
cout << *it;
}
cout << "}\n";
return cout.good();
}
/* A simple test driver that reads in the number of rows, the number of
* columns, and then row*columns int values, from standard input. It
* then passes the result to diag_matrix(), E.g.:
*
* 5 3
* 52 35 5
* 2 71 1
* 3 60 25
* 79 55 98
* 47 15 66
*/
int main()
{
int rows, columns;
cin >> rows;
cin >> columns;
assert(cin.good());
const matrix<elem_t> input_matrix = read_input_matrix( rows, columns );
// cout << input_matrix; // Instrumentation.
const std::vector<elem_t> sums = diag_sums(input_matrix);
print_sequence(sums);
return EXIT_SUCCESS;
}
You could also just do print_sequence(diag_sums(read_input_matrix( rows, columns ))).
You can simplify your code by finding the starting position of each diagonal and then stepping through the matrix as long as the coordinates stay inside the matrix.
Something like this:
#include <iostream>
using namespace std;
void diag_matrix(int** m, int rows, int cols)
{
for (int diag = 1; diag < rows + cols; diag++)
{
int x, y;
if (diag < rows)
{
y = rows - diag;
x = 0;
}
else
{
y = 0;
x = diag - rows;
}
int sum = 0;
cout << "Summing diagonal #" << diag << ":";
while ((x < cols) && (y < rows))
{
sum += m[y][x];
cout << " " << m[y][x];
x++;
y++;
}
cout << " result: " << sum << "." << endl;
}
}
int main(int argc, char* argv[])
{
int rows = 5, cols = 3;
int **m = new int*[rows];
for (int i = 0; i < rows; i++)
m[i] = new int[cols];
m[0][0] = 52; m[0][1] = 35; m[0][2] = 5;
m[1][0] = 2; m[1][1] = 71; m[1][2] = 1;
m[2][0] = 3; m[2][1] = 60; m[2][2] = 25;
m[3][0] = 79; m[3][1] = 55; m[3][2] = 98;
m[4][0] = 47; m[4][1] = 15; m[4][2] = 66;
diag_matrix(m, rows, cols);
for (int i = 0; i < rows; i++)
delete[] m[i];
delete[] m;
return 0;
}

Function to differentiate a polynomial in C++

I've been trying to get this solved but without luck.
All I want to do is to differentiate a polynomial like P(x) = 3x^3 + 2x^2 + 4x + 5
At the end of the code, the program should evaluate this function and give me just the answer.
The derivative of P(x) is P'(x) = 3*3x^2 + 2*2x + 4*1. If x = 1, the answer is 17.
I just don't get that answer no matter how I alter my loop.
/*
x: value of x in the polynomial
c: array of coefficients
n: number of coefficients
*/
double derivePolynomial(double x, double c[], int n) {
double result = 0;
double p = 1;
int counter = 1;
for(int i=n-1; i>=0; i--) //backward loop
{
result = result + c[i]*p*counter;
counter++; // number of power
p = p*x;
}
return result;
}
//Output in main() looks like this
double x=1.5;
double coeffs[4]={3,2.2,-1,0.5};
int numCoeffs=4;
cout << " = " << derivePolynomial(x,coeffs,numCoeffs) << endl;
The derivative of x ^ n is n * x ^ (n - 1), but you are calculating something completely different.
double der(double x, double c[], int n)
{
double d = 0;
// the term c[i]*x^i differentiates to i*c[i]*x^(i-1); the constant term drops out
for (int i = 1; i < n; i++)
d += i * pow(x, i - 1) * c[i];
return d;
}
This would work, assuming that your polynomial is in the form c0 + c1x + c2x^2 + ...
Demonstration, with another function that does the differentiation as well.
Edit: alternative solution avoiding the use of the pow() function, with simple summation and repeated multiplication:
double der2(double x, double c[], int n)
{
double d = 0;
for (int i = 0; i < n - 1; i++) {
d *= x;
d += (n - i - 1) * c[i];
}
return d;
}
This works too. Note that the functions that take the iterative approach (those which don't use pow()) expect their arguments (the coefficients) in reverse order.
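For example (a quick check I added, not from the original answer), for the question's P(x) = 3x^3 + 2x^2 + 4x + 5 you would call der2 with the coefficients highest power first:
double c[] = { 3, 2, 4, 5 };   // 3x^3 + 2x^2 + 4x + 5, highest power first
double d = der2(1.0, c, 4);    // P'(1) = 9 + 4 + 4 = 17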
You need to reverse the direction of the loop: start at 0 and go up to n.
At the moment, when you compute the partial term for the highest power, p is 1; and by the time you reach the constant term x^0, your p contains the (n-1)-th power of x.
double derivePolynomial(double x, double c[], int n) {
double result = 0;
double p = 1;
int counter = 1;
for(int i=1; i<n; i++) //start with 1 because the first element is constant.
{
result = result + c[i]*p*counter;
counter++; // number of power
p = p*x;
}
return result;
}
double x=1;
double coeffs[4]={5,4,2,3};
int numCoeffs=4;
cout << " = " << derivePolynomial(x,coeffs,numCoeffs) << endl;

numerical analysis equation

I have this equation and then need to find the polynomial from it (the formula images from the original post are missing; judging by the answer below, it is the Newton forward-difference interpolation formula).
I am trying to implement it like this:
for (int n=0;n<order;n++){
df[n][0]=y[n];
for (int i=0;i<N;i++){ //N number of points
df[n][i]+=factorial(n,i)*y[i+n-1];
}
}
for (int i=0;i<N;i++){
term=factorial(s,i);
result*=df[0][i]*term;
sum+=result;
}
return sum;
1) I am not sure how to implement the sign of every term in the function. As you can see, it alternates: positive, negative, positive, ...
2) I am not sure whether there are any other mistakes...
Thanks!
----------------------factorial-----------------------------
int fact(int n){
//3!=1*2*3
if (n==0) return 1;
else
return n*fact(n-1);
}
double factorial(double s,int n){
//(s 3)=s*(s-1)*(s-2)/6
if ((n==0) &&(s==0)) return 1;
else
return fact(s)/fact(n);
}
The simplest solution is probably to just keep the sign in
a variable, and multiply it in each time through the loop.
Something like:
sign = 1.0;
for ( int i = 0; i < N; ++ i ) {
term = factorial( s, i );
result *= df[0][i] * term;
sum += sign * result;
sign = - sign;
}
You cannot do pow( -1, m ).
You can write your own:
inline int minusOnePower( unsigned int m )
{
return (m & 1) ? -1 : 1;
}
You may want to build up some tables of calculated values.
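As a sketch of the "tables of calculated values" idea (my own illustration; the names are made up): compute the factorials once and just look them up afterwards.
#include <vector>

// Build the factorials 0!..maxN! once, then index into the table instead of recursing.
std::vector<double> buildFactorialTable(int maxN)
{
    std::vector<double> table(maxN + 1, 1.0);
    for (int n = 1; n <= maxN; ++n)
        table[n] = table[n - 1] * n;   // n! = (n-1)! * n
    return table;
}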
Well, I understand you want to approximately calculate the value f(X) for a given x = X, using a Newton interpolation polynomial with equidistant points (more specifically, the Newton-Gregory forward difference interpolation polynomial).
Assuming s = (X - x0)/h, where x0 is the first x value and h is the step between the x values for which you know the exact value of f:
Consider:
#include <iostream>
double coef (double s, int k)
{
double c(1);
for (int i=1; i<=k ; ++i)
c *= (s-i+1)/i ;
return c;
}
double P_interp_value(double s, int Num_of_intervals , double f[] /* values of f in these points */) // P_n_s
{
int N=Num_of_intervals ;
double *df0= new double[N+1]; // calculating df only for point 0
for (int n=0 ; n<=N ; ++n) // n here is the order
{
df0[n]=0;
for (int k=0, sig=-1; k<=n; ++k, sig=-sig) // k here is the "x point"
{
df0[n] += sig * coef(n,k) * f[n-k];
}
}
double P_n_s = 0;
for (int k=0; k<=N ; ++k ) // here k is the order
{
P_n_s += coef(s,k)* df0[k];
}
delete []df0;
return P_n_s;
}
int main()
{
double s=0.415, f[]={0.0 , 1.0986 , 1.6094 , 1.9459 , 2.1972 };
int n=1; // Number of intervals to use during the approximation. Max = 4 in this example
while (true)
{
std::cin >> n;
std::cout << std::endl << "P(n=" << n <<", s=" << s << ")= " << P_interp_value(s, n, f) << std::endl ;
}
}
it print:
1
P(n=1, s=0.415)= 0.455919
2
P(n=2, s=0.415)= 0.527271
3
P(n=3, s=0.415)= 0.55379
4
P(n=4, s=0.415)= 0.567235
compare with:
http://ecourses.vtu.ac.in/nptel/courses/Webcourse-contents/IIT-KANPUR/Numerical%20Analysis/numerical-analysis/Rathish-kumar/rathish-oct31/fratnode8.html
It works. Now we can start to optimize this code.
just for the sign ;-)
inline signed int minusOnePower( unsigned int m )
{
return 1-( (m & 1)<<1 );
}

fftshift/ifftshift C/C++ source code [closed]

Does anyone know if there is any free and open source library that has implemented these two functions the way they are defined in matlab?
Thanks
FFTSHIFT / IFFTSHIFT is a fancy way of doing CIRCSHIFT.
You can verify that FFTSHIFT can be rewritten as CIRCSHIFT as follows.
You can define macros in C/C++ to punt FFTSHIFT to CIRCSHIFT.
A = rand(m, n);
mm = floor(m / 2);
nn = floor(n / 2);
% All three of the following should provide zeros.
circshift(A,[mm, nn]) - fftshift(A)
circshift(A,[mm, 0]) - fftshift(A, 1)
circshift(A,[ 0, nn]) - fftshift(A, 2)
Similar equivalents can be found for IFFTSHIFT.
Circular shift can be implemented very simply with the following code (it can be improved with parallel versions, of course).
template<class ty>
void circshift(ty *out, const ty *in, int xdim, int ydim, int xshift, int yshift)
{
for (int i = 0; i < xdim; i++) {
int ii = (i + xshift) % xdim;
for (int j = 0; j < ydim; j++) {
int jj = (j + yshift) % ydim;
out[ii * ydim + jj] = in[i * ydim + j];
}
}
}
And then
#define fftshift(out, in, x, y) circshift(out, in, x, y, (x/2), (y/2))
#define ifftshift(out, in, x, y) circshift(out, in, x, y, ((x+1)/2), ((y+1)/2))
This was done a bit impromptu. Bear with me if there are any formatting / syntactical problems.
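A quick usage sketch I added for the template and macros above (row-major buffers of doubles, as the circshift code assumes):
#include <vector>

int main()
{
    const int x = 4, y = 5;
    std::vector<double> in(x * y), out(x * y);
    for (int i = 0; i < x * y; ++i)
        in[i] = i;                              // dummy data
    fftshift(out.data(), in.data(), x, y);      // expands to circshift(..., x/2, y/2)
    ifftshift(in.data(), out.data(), x, y);     // shifts back to the original layout
    return 0;
}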
Possibly this code may help. It performs fftshift/ifftshift only for a 1D array, in place within one buffer. The forward and backward fftshift algorithms are identical for an even number of elements.
void swap(complex *v1, complex *v2)
{
complex tmp = *v1;
*v1 = *v2;
*v2 = tmp;
}
void fftshift(complex *data, int count)
{
int k = 0;
int c = (int) floor((float)count/2);
// For odd and for even numbers of element use different algorithm
if (count % 2 == 0)
{
for (k = 0; k < c; k++)
swap(&data[k], &data[k+c]);
}
else
{
complex tmp = data[0];
for (k = 0; k < c; k++)
{
data[k] = data[c + k + 1];
data[c + k + 1] = data[k + 1];
}
data[c] = tmp;
}
}
void ifftshift(complex *data, int count)
{
int k = 0;
int c = (int) floor((float)count/2);
if (count % 2 == 0)
{
for (k = 0; k < c; k++)
swap(&data[k], &data[k+c]);
}
else
{
complex tmp = data[count - 1];
for (k = c-1; k >= 0; k--)
{
data[c + k + 1] = data[k];
data[k] = data[c + k];
}
data[c] = tmp;
}
}
UPDATED:
Also, an FFT library (including the fftshift operations) for an arbitrary number of points can be found in Optolithium (under OptolithiumC/libs/fourier).
Normally, centering the FFT is done with v(k)=v(k)*(-1)**k in
the time domain. Shifting in the frequency domain is a poor substitute, for
mathematical reasons and for computational efficiency.
See pp 27 of:
http://show.docjava.com/pub/document/jot/v8n6.pdf
I am not sure why Matlab documentation does it the way they do,
they give no technical reference.
Or you can do it yourself by typing type fftshift and recoding that in C++. It's not that complicated of Matlab code.
Edit: I've noticed that this answer has been down-voted a few times recently and commented on in a negative way. I recall a time when type fftshift was more revealing than the current implementation, but I could be wrong. If I could delete the answer, I would as it seems no longer relevant.
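In code, that time-domain trick is just a sign flip on every other input sample before calling the FFT (a small sketch I added, assuming std::complex<double> samples):
#include <complex>
#include <cstddef>
#include <vector>

// Multiply sample k by (-1)^k before the FFT so the spectrum comes out
// centered, instead of reshuffling it afterwards.
void centerByModulation(std::vector<std::complex<double> >& v)
{
    for (std::size_t k = 1; k < v.size(); k += 2)
        v[k] = -v[k];   // (-1)^k is -1 for odd k, +1 for even k
}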
Here is a version (courtesy of Octave) that implements it without
circshift.
I tested the code provided here and made an example project to test it. For 1D data one can simply use std::rotate:
template <typename _Real>
static inline
void rotshift(complex<_Real> * complexVector, const size_t count)
{
int center = (int) floor((float)count/2);
if (count % 2 != 0) {
center++;
}
// odd: 012 34 changes to 34 012
std::rotate(complexVector,complexVector + center,complexVector + count);
}
template <typename _Real>
static inline
void irotshift(complex<_Real> * complexVector, const size_t count)
{
int center = (int) floor((float)count/2);
// odd: 01 234 changes to 234 01
std::rotate(complexVector,complexVector +center,complexVector + count);
}
I prefer using std::rotate over the code from Alexei due to its simplicity.
For 2D it gets more complicated. For even dimensions it is basically swapping the quadrants of the matrix in place. For odd dimensions it is the circshift algorithm:
// A =
// 1 2 3
// 4 5 6
// 7 8 9
// fftshift2D(A)
// 9 | 7 8
// --------------
// 3 | 1 2
// 6 | 4 5
// ifftshift2D(A)
// 5 6 | 4
// 8 9 | 7
// --------------
// 2 3 | 1
Here I implemented the circshift code with an interface that uses only one array for input and output. For even dimensions only a single array is required; for odd dimensions a second array is temporarily created and copied back to the input array. This causes a performance decrease because of the additional time spent copying the array.
template<class _Real>
static inline
void fftshift2D(complex<_Real> *data, size_t xdim, size_t ydim)
{
size_t xshift = xdim / 2;
size_t yshift = ydim / 2;
if (xdim % 2 != 0 || ydim % 2 != 0) {
// temp output array
std::vector<complex<_Real> > out;
out.resize(xdim * ydim);
for (size_t x = 0; x < xdim; x++) {
size_t outX = (x + xshift) % xdim;
for (size_t y = 0; y < ydim; y++) {
size_t outY = (y + yshift) % ydim;
// row-major order
out[outX + xdim * outY] = data[x + xdim * y];
}
}
// copy out back to data
copy(out.begin(), out.end(), &data[0]);
}
else {
// in and output array are the same,
// values are exchanged using swap
for (size_t x = 0; x < xdim; x++) {
size_t outX = (x + xshift) % xdim;
for (size_t y = 0; y < yshift; y++) {
size_t outY = (y + yshift) % ydim;
// row-major order
swap(data[outX + xdim * outY], data[x + xdim * y]);
}
}
}
}
template<class _Real>
static inline
void ifftshift2D(complex<_Real> *data, size_t xdim, size_t ydim)
{
size_t xshift = xdim / 2;
if (xdim % 2 != 0) {
xshift++;
}
size_t yshift = ydim / 2;
if (ydim % 2 != 0) {
yshift++;
}
if (xdim % 2 != 0 || ydim % 2 != 0) {
// temp output array
std::vector<complex<_Real> > out;
out.resize(xdim * ydim);
for (size_t x = 0; x < xdim; x++) {
size_t outX = (x + xshift) % xdim;
for (size_t y = 0; y < ydim; y++) {
size_t outY = (y + yshift) % ydim;
// row-major order
out[outX + xdim * outY] = data[x + xdim * y];
}
}
// copy out back to data
copy(out.begin(), out.end(), &data[0]);
}
else {
// in and output array are the same,
// values are exchanged using swap
for (size_t x = 0; x < xdim; x++) {
size_t outX = (x + xshift) % xdim;
for (size_t y = 0; y < yshift; y++) {
size_t outY = (y + yshift) % ydim;
// row-major order
swap(data[outX + xdim * outY], data[x + xdim * y]);
}
}
}
}
Notice: There are better answers provided, I just keep this here for a while for... I do not know what.
Try this:
template<class T> void ifftShift(T *out, const T* in, size_t nx, size_t ny)
{
const size_t hlen1 = (ny+1)/2;
const size_t hlen2 = ny/2;
const size_t shft1 = ((nx+1)/2)*ny + hlen1;
const size_t shft2 = (nx/2)*ny + hlen2;
const T* src = in;
for(T* tgt = out; tgt < out + shft1 - hlen1; tgt += ny, src += ny) { // (nx+1)/2 times
copy(src, src+hlen1, tgt + shft2); //1->4
copy(src+hlen1, src+ny, tgt+shft2-hlen2); } //2->3
src = in;
for(T* tgt = out; tgt < out + shft2 - hlen2; tgt += ny, src += ny ){ // nx/2 times
copy(src+shft1, src+shft1+hlen2, tgt); //4->1
copy(src+shft1-hlen1, src+shft1, tgt+hlen2); } //3->2
};
For matrices with even dimensions you can do it in-place, just passing the same pointer into in and out parameters.
Also note that for 1D arrays fftshift is just std::rotate.
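Concretely (an example I added), with <algorithm> included and a std::vector v these are the two shifts:
std::rotate(v.begin(), v.begin() + (v.size() + 1) / 2, v.end()); // fftshift: element at index ceil(n/2) becomes first
std::rotate(v.begin(), v.begin() + v.size() / 2, v.end());       // ifftshift: element at index floor(n/2) becomes first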
You could also use ArrayFire's shift function as a replacement for Matlab's circshift and re-implement the rest of the code. This could be useful if you are interested in any of the other features of AF anyway (such as portability to GPUs by simply changing a linker flag).
However, if all your code is meant to run on the CPU and is quite sophisticated, or you don't want to use another data format (AF requires af::arrays), stick with one of the other options.
I ended up changing to AF because I would otherwise have had to re-implement fftshift as an OpenCL kernel back then.
This will give a result equivalent to ifftshift in Matlab:
vector< vector<double> > ifftshift(vector< vector<double> > Hlow, int RowLineSpace, int ColumnLineSpace)
{
vector<double> first, second, temp;
vector< vector<double> > ifftShiftRow, ifftShiftLow;
int pivotRow=RowLineSpace/2;
int pivotCol=ColumnLineSpace/2;
for(int i=pivotRow;i<RowLineSpace;i++){
for(int j=0;j<ColumnLineSpace;j++){
double temp=Hlow.at(i).at(j);
second.push_back(temp);
}
ifftShiftRow.push_back(second);
second.clear();
}
for(int i=0;i<pivotRow;i++){
for(int j=0;j<ColumnLineSpace;j++){
double temp=Hlow.at(i).at(j);
first.push_back(temp);
}
ifftShiftRow.push_back(first);
first.clear();
}
double** arr = new double*[RowLineSpace];
for(int i = 0; i < RowLineSpace; ++i)
arr[i] = new double[ColumnLineSpace];
int i1=0,j1=0;
for(int j=pivotCol;j<ColumnLineSpace;j++){
for(int i=0;i<RowLineSpace;i++){
double temp2=ifftShiftRow.at(i).at(j);
arr[i1][j1]=temp2;
i1++;
}
j1++;
i1=0;
}
for(int j=0;j<pivotCol;j++){
for(int i=0;i<RowLineSpace;i++){
double temp1=ifftShiftRow.at(i).at(j);
arr[i1][j1]=temp1;
i1++;
}
j1++;
i1=0;
}
for(int i=0;i<RowLineSpace;i++){
for(int j=0;j<ColumnLineSpace;j++){
double value=arr[i][j];
temp.push_back(value);
}
ifftShiftLow.push_back(temp);
temp.clear();
}
for(int i=0;i<RowLineSpace;i++)
delete[] arr[i];
delete[] arr;
return ifftShiftLow;
}
Octave uses fftw to implement (i)fftshift.
You can use kissfft. It's reasonably fast, extremely simple to use, and free. Arranging the output the way you want requires only the following steps (a 1D sketch follows below):
a) shift by (-dim_x/2, -dim_y/2, ...), with periodic boundary conditions
b) FFT or IFFT
c) shift back by (dim_x/2, dim_y/2, ...), with periodic boundary conditions
d) scale, if needed (by default IFFT(FFT(f)) will scale f by dim_x*dim_y*...)
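Here is a 1D sketch of that sequence which I put together; kiss_fft_alloc/kiss_fft are the library's standard entry points, but double-check against the headers of your kissfft version:
#include <algorithm>
#include <cstddef>
#include <vector>
#include "kiss_fft.h"

// Centered FFT of a 1D signal via shift -> FFT -> shift back (steps a-c above),
// using std::rotate for the periodic shifts.
std::vector<kiss_fft_cpx> centeredFFT(std::vector<kiss_fft_cpx> in)
{
    const std::size_t n = in.size();
    // a) shift by -n/2 with periodic boundary conditions (1D ifftshift)
    std::rotate(in.begin(), in.begin() + n / 2, in.end());
    // b) forward FFT
    std::vector<kiss_fft_cpx> out(n);
    kiss_fft_cfg cfg = kiss_fft_alloc(static_cast<int>(n), 0 /* forward */, 0, 0);
    kiss_fft(cfg, in.data(), out.data());
    kiss_fft_free(cfg);
    // c) shift back by +n/2 (1D fftshift)
    std::rotate(out.begin(), out.begin() + (n + 1) / 2, out.end());
    return out;
}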

Logic error for Gauss elimination

Logic error problem with the Gaussian elimination code... This code is from my numerical methods text from the 1990s. The code is typed in from the book, but it is not producing correct output...
Sample Run:
SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS
USING GAUSSIAN ELIMINATION
This program uses Gaussian Elimination to solve the
system Ax = B, where A is the matrix of known
coefficients, B is the vector of known constants
and x is the column matrix of the unknowns.
Number of equations: 3
Enter elements of matrix [A]
A(1,1) = 0
A(1,2) = -6
A(1,3) = 9
A(2,1) = 7
A(2,2) = 0
A(2,3) = -5
A(3,1) = 5
A(3,2) = -8
A(3,3) = 6
Enter elements of [b] vector
B(1) = -3
B(2) = 3
B(3) = -4
SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS
The solution is
x(1) = 0.000000
x(2) = -1.#IND00
x(3) = -1.#IND00
Determinant = -1.#IND00
Press any key to continue . . .
The code as copied from the text...
//Modified Code from C Numerical Methods Text- June 2009
#include <stdio.h>
#include <math.h>
#define MAXSIZE 20
//function prototype
int gauss (double a[][MAXSIZE], double b[], int n, double *det);
int main(void)
{
double a[MAXSIZE][MAXSIZE], b[MAXSIZE], det;
int i, j, n, retval;
printf("\n \t SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS");
printf("\n \t USING GAUSSIAN ELIMINATION \n");
printf("\n This program uses Gaussian Elimination to solve the");
printf("\n system Ax = B, where A is the matrix of known");
printf("\n coefficients, B is the vector of known constants");
printf("\n and x is the column matrix of the unknowns.");
//get number of equations
n = 0;
while(n <= 0 || n > MAXSIZE)
{
printf("\n Number of equations: ");
scanf ("%d", &n);
}
//read matrix A
printf("\n Enter elements of matrix [A]\n");
for (i = 0; i < n; i++)
for (j = 0; j < n; j++)
{
printf(" A(%d,%d) = ", i + 1, j + 1);
scanf("%lf", &a[i][j]);
}
//read {B} vector
printf("\n Enter elements of [b] vector\n");
for (i = 0; i < n; i++)
{
printf(" B(%d) = ", i + 1);
scanf("%lf", &b[i]);
}
//call Gauss elimination function
retval = gauss(a, b, n, &det);
//print results
if (retval == 0)
{
printf("\n\t SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS\n");
printf("\n\t The solution is");
for (i = 0; i < n; i++)
printf("\n \t x(%d) = %lf", i + 1, b[i]);
printf("\n \t Determinant = %lf \n", det);
}
else
printf("\n \t SINGULAR MATRIX \n");
return 0;
}
/* Solves the system of equations [A]{x} = {B} using */
/* the Gaussian elimination method with partial pivoting. */
/* Parameters: */
/* n - number of equations */
/* a[n][n] - coefficient matrix */
/* b[n] - right-hand side vector */
/* *det - determinant of [A] */
int gauss (double a[][MAXSIZE], double b[], int n, double *det)
{
double tol, temp, mult;
int npivot, i, j, l, k, flag;
//initialization
*det = 1.0;
tol = 1e-30; //initial tolerance value
npivot = 0;
//mult = 0;
//forward elimination
for (k = 0; k < n; k++)
{
//search for max coefficient in pivot row- a[k][k] pivot element
for (i = k + 1; i < n; i++)
{
if (fabs(a[i][k]) > fabs(a[k][k]))
{
//interchange row with maxium element with pivot row
npivot++;
for (l = 0; l < n; l++)
{
temp = a[i][l];
a[i][l] = a[k][l];
a[k][l] = temp;
}
temp = b[i];
b[i] = b[k];
b[k] = temp;
}
}
//test for singularity
if (fabs(a[k][k]) < tol)
{
//matrix is singular- terminate
flag = 1;
return flag;
}
//compute determinant- the product of the pivot elements
*det = *det * a[k][k];
//eliminate the coefficients of X(I)
for (i = k; i < n; i++)
{
mult = a[i][k] / a[k][k];
b[i] = b[i] - b[k] * mult; //compute constants
for (j = k; j < n; j++) //compute coefficients
a[i][j] = a[i][j] - a[k][j] * mult;
}
}
//adjust the sign of the determinant
if(npivot % 2 == 1)
*det = *det * (-1.0);
//backsubstitution
b[n] = b[n] / a[n][n];
for(i = n - 1; i > 1; i--)
{
for(j = n; j > i + 1; j--)
b[i] = b[i] - a[i][j] * b[j];
b[i] = b[i] / a[i - 1][i];
}
flag = 0;
return flag;
}
The solution should be: 1.058824, 1.823529, 0.882353 with det as -102.000000
Any insight is appreciated...
//eliminate the coefficients of X(I)
for (i = k; i < n; i++)
Should this maybe be
for (i = k + 1; i < n; i++)
The way it is now, I believe this will cause you to divide the pivot row by itself, zeroing it out.
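In code, that is just this one change to the elimination loop in gauss(), with everything else left as posted:
/* Eliminate the coefficients of x(k) in the rows below the pivot row only;
 * starting at i = k would subtract the pivot row from itself and zero it out. */
for (i = k + 1; i < n; i++)
{
    mult = a[i][k] / a[k][k];
    b[i] = b[i] - b[k] * mult;          /* update the right-hand side */
    for (j = k; j < n; j++)             /* update the coefficients    */
        a[i][j] = a[i][j] - a[k][j] * mult;
}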
This probably doesn't answer your question in the way you intended, but programming your own numerically-stable matrix algorithms is about as well-advised as do-it-yourself surgery.
There's a very nice library called TNT/JAMA from a reputable source (NIST) which does elementary matrix math in C++. To solve Ax=B, first factor A (the QR decomposition is a good general method, you can use LU but it's less numerically stable), then call solve(B). This works both for square matrices, where it's exact (subject to numerical computation issues), and overdetermined systems, where you get a least-squares answer.