C++: replicating matlab's interp1 spline interpolation function


Can anyone give me some direction on replicating MATLAB's interp1 function with spline interpolation? I tried to closely replicate the algorithm on the Wikipedia page, but the results don't match MATLAB's output.
#include <stdio.h>
#include <stdint.h>
#include <iostream>
#include <vector>
//MATLAB: interp1(x,test_array,query_points,'spline')
int main(){
int size = 10;
std::vector<float> test_array(10);
test_array[0] = test_array[4] = test_array[8] = 1;
test_array[1] = test_array[3] = test_array[5] = test_array[7] = test_array[9] = 4;
test_array[2] = test_array[6] = 7;
std::vector<float> query_points;
for (int i = 0; i < 10; i++)
query_points.push_back(i +.05);
int n = (size - 1);
std::vector<float> a(n+1);
std::vector<float> x(n+1); //sample_points vector
for (int i = 0; i < (n+1); i++){
x[i] = i + 1.0;
a[i] = test_array[i];
}
std::vector<float> b(n);
std::vector<float> d(n);
std::vector<float> h(n);
for (int i = 0; i < (n); ++i)
h[i] = x[i+1] - x[i];
std::vector<float> alpha(n);
for (int i = 1; i < n; ++i)
alpha[i] = ((3 / h[i]) * (a[i+1] - a[i])) - ((3 / h[i-1]) * (a[i] - a[i-1]));
std::vector<float> c(n+1);
std::vector<float> l(n+1);
std::vector<float> u(n+1);
std::vector<float> z(n+1);
l[0] = 1.0;
u[0] = z[0] = 0.0;
for (int i = 1; i < n; ++i){
l[i] = (2 * (x[i+1] - x[i-1])) - (h[i-1] * u[i-1]);
u[i] = h[i] / l[i];
z[i] = (alpha[i] - (h[i-1] * z[i-1])) / l[i];
}
l[n] = 1.0;
z[n] = c[n] = 0.0;
for (int j = (n - 1); j >= 0; j--){
c[j] = z[j] - (u[j] * c[j+1]);
b[j] = ((a[j+1] - a[j]) / h[j]) - ((h[j] / 3) * (c[j+1] + (2 * c[j])));
d[j] = (c[j+1] - c[j]) / (3 * h[j]);
}
std::vector<float> output_array(10);
for (int i = 0; i < n-1; i++){
float eval_point = (query_points[i] - x[i]);
output_array[i] = a[i] + (eval_point * b[i]) + ( eval_point * eval_point * c[i]) + (eval_point * eval_point * eval_point * d[i]);
std::cout << output_array[i] << std::endl;
}
system("pause");
return 0;
}

Your code appears to implement the algorithm from the Wikipedia article correctly. However, there is something about interp1 that I don't think you have taken into account when using it to check your answers.
When you specify the spline flag, MATLAB's interp1 uses not-a-knot end conditions. The algorithm given on Wikipedia is for a natural spline.
This is probably why your points do not match up. FWIW, consult: http://www.cs.tau.ac.il/~turkel/notes/numeng/spline_note.pdf and look at the diagram on the last page. You'll see that not-a-knot splines and natural splines bear the same shape, but have different y-values near the end points of the spline. In the interior, where there are data points on both sides, all of the different kinds of splines have (more or less) the same y-values.
For the sake of completeness, the figure in question is the one on the last page of the PDF notes referenced above (the image is not reproduced here).
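To get closer to what interp1 reports for 'spline', the natural end conditions of the Wikipedia algorithm would have to be swapped for not-a-knot ones. The following is a minimal sketch under stated assumptions, not MATLAB's implementation: notAKnotSecondDerivs and evalSpline are names made up for illustration, x must be strictly increasing with at least four points, and a dense Gaussian elimination is used for clarity rather than speed. It has not been compared against MATLAB output.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Second derivatives m[i] at the knots of a not-a-knot cubic spline.
// Assumes x is strictly increasing and holds at least 4 points.
std::vector<double> notAKnotSecondDerivs(const std::vector<double>& x,
                                         const std::vector<double>& y)
{
    const std::size_t N = x.size();
    assert(N >= 4 && y.size() == N);
    std::vector<double> h(N - 1);
    for (std::size_t i = 0; i + 1 < N; ++i) h[i] = x[i + 1] - x[i];

    // Augmented dense system [A | rhs]; the last column is the right-hand side.
    std::vector<std::vector<double> > A(N, std::vector<double>(N + 1, 0.0));
    // Not-a-knot: the third derivative is continuous at x[1] and x[N-2].
    A[0][0] = h[1];
    A[0][1] = -(h[0] + h[1]);
    A[0][2] = h[0];
    A[N - 1][N - 3] = h[N - 2];
    A[N - 1][N - 2] = -(h[N - 3] + h[N - 2]);
    A[N - 1][N - 1] = h[N - 3];
    // Interior rows: continuity of the first derivative at each inner knot.
    for (std::size_t i = 1; i + 1 < N; ++i) {
        A[i][i - 1] = h[i - 1];
        A[i][i] = 2.0 * (h[i - 1] + h[i]);
        A[i][i + 1] = h[i];
        A[i][N] = 6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1]);
    }
    // Gaussian elimination with partial pivoting.
    for (std::size_t col = 0; col < N; ++col) {
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < N; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
        std::swap(A[col], A[piv]);
        for (std::size_t r = col + 1; r < N; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c <= N; ++c) A[r][c] -= f * A[col][c];
        }
    }
    // Back substitution.
    std::vector<double> m(N);
    for (std::size_t i = N; i-- > 0;) {
        double s = A[i][N];
        for (std::size_t c = i + 1; c < N; ++c) s -= A[i][c] * m[c];
        m[i] = s / A[i][i];
    }
    return m;
}

// Evaluate the spline at xq; outside the data range the end piece is extended.
double evalSpline(const std::vector<double>& x, const std::vector<double>& y,
                  const std::vector<double>& m, double xq)
{
    std::size_t i = 0;
    while (i + 2 < x.size() && xq > x[i + 1]) ++i;
    const double h = x[i + 1] - x[i];
    const double a = (x[i + 1] - xq) / h;
    const double b = (xq - x[i]) / h;
    return a * y[i] + b * y[i + 1]
         + ((a * a * a - a) * m[i] + (b * b * b - b) * m[i + 1]) * h * h / 6.0;
}
```

A not-a-knot spline through four or more points reproduces any cubic polynomial exactly, so sampling a known cubic makes a quick sanity check. A natural spline would not pass it unless the cubic happens to have zero second derivative at both ends.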
If you want to use natural splines, use csape instead of interp1. This provides a cubic spline with a choice of end conditions. You call csape like this:
pp = csape(x,y,'variational');
x and y are the control points defined for your spline. The 'variational' option sets the second derivative to zero at both ends, which gives the natural spline you're after (called without that argument, csape applies its default end conditions instead). The result is a struct in ppform. You can then figure out what the spline evaluates to by using fnval:
yval = fnval(pp, xval);
xval is the input x co-ordinate and yval is the output of the spline evaluated at that particular x.
Use this, then check to see if your code matches up with the values provided by csape.
Minor Note
You need the Curve Fitting Toolbox in MATLAB to use csape. If you don't have this, then unfortunately this method will not work.

I think interp1 is supported by MATLAB Coder.
If so, you can use MATLAB Coder to generate the C code and you have what you need.

Related

How to multiply a sparse matrix and a dense vector?

I am trying the following:
Eigen::SparseMatrix<double> bijection(2 * face_count, 2 * vert_count);
/* initialization */
Eigen::VectorXd toggles(2 * vert_count);
toggles.setOnes();
Eigen::SparseMatrix<double> deformed;
deformed = bijection * toggles;
Eigen is returning an error claiming:
error: static assertion failed: THE_EVAL_EVALTO_FUNCTION_SHOULD_NEVER_BE_CALLED_FOR_DENSE_OBJECTS
586 | EIGEN_STATIC_ASSERT((internal::is_same<Dest,void>::value),THE_EVAL_EVALTO_FUNCTION_SHOULD_NEVER_BE_CALLED_FOR_DENSE_OBJECTS);
According to the Eigen documentation, sparse matrix and vector products are allowed. What am I doing wrong?

The problem is you have the wrong output type for the product.
The Eigen documentation states that the following type of multiplication is defined:
dv2 = sm1 * dv1;
Sparse matrix times dense vector equals dense vector.
If you actually do need a sparse representation, I think there is no better way of getting one than performing the multiplication as above and then converting the product to a sparse matrix with the sparseView member function, e.g.:
Eigen::SparseMatrix<double> bijection(2 * face_count, 2 * vert_count);
/* initialization */
Eigen::VectorXd toggles(2 * vert_count);
toggles.setOnes();
Eigen::VectorXd deformedDense = bijection * toggles;
Eigen::SparseMatrix<double> deformedSparse = deformedDense.sparseView();
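To see why the product naturally comes out dense, here is a library-independent sketch of a sparse-matrix-times-dense-vector product. CsrMatrix and multiply are hypothetical names for illustration and are not Eigen types. Every row writes one entry of the result, so the obvious output container is a dense vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal compressed-sparse-row container (not an Eigen type).
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> rowStart; // size rows + 1
    std::vector<std::size_t> colIndex; // one entry per stored value
    std::vector<double> value;         // the stored (nonzero) values
};

// y = A * x, with a dense result of length A.rows.
std::vector<double> multiply(const CsrMatrix& A, const std::vector<double>& x)
{
    assert(x.size() == A.cols);
    assert(A.rowStart.size() == A.rows + 1);
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t r = 0; r < A.rows; ++r) {
        // Only the stored entries of row r are visited.
        for (std::size_t k = A.rowStart[r]; k < A.rowStart[r + 1]; ++k) {
            y[r] += A.value[k] * x[A.colIndex[k]];
        }
    }
    return y;
}
```

Rows without stored entries simply leave a zero in y, which is what a later conversion to a sparse representation would drop.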

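If the result is very sparse, writing it straight into a sparse vector can be faster than outputting to a dense vector. Otherwise, 99 times out of 100 the conventional product is faster. Note that the routine below writes one result entry per outer vector of A, so it computes A * x only if A is stored row-major (or is symmetric), and Ax.resize(x.size()) assumes A is square.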
void sparsem_densev_sparsev(const SparseMatrix<double>& A, const VectorX<double>& x, SparseVector<double>& Ax)
{
Ax.resize(x.size());
for (int j = 0; j < A.outerSize(); ++j)
{
if (A.outerIndexPtr()[j + 1] - A.outerIndexPtr()[j] > 0)
{
Ax.insertBack(j) = 0;
}
}
for (int j_idx = 0; j_idx < Ax.nonZeros(); j_idx++)
{
int j = Ax.innerIndexPtr()[j_idx];
for (int k = A.outerIndexPtr()[j]; k < A.outerIndexPtr()[j + 1]; ++k)
{
int i = A.innerIndexPtr()[k];
Ax.valuePtr()[j_idx] += A.valuePtr()[k] * x.coeff(i);
}
}
}
For a (probably not optimal) self-adjoint version (lower triangle), change the j_idx loop to:
for (int j_idx = 0; j_idx < Ax.nonZeros(); j_idx++)
{
int j = Ax.innerIndexPtr()[j_idx];
int i_idx = j_idx;//i>= j, trick to improve binary search
for (int k = A.outerIndexPtr()[j]; k < A.outerIndexPtr()[j + 1]; ++k)
{
int i = A.innerIndexPtr()[k];
Ax.valuePtr()[j_idx] += A.valuePtr()[k] * x.coeff(i);
if (i != j)
{
i_idx = std::distance(Ax.innerIndexPtr(), std::lower_bound(Ax.innerIndexPtr() + i_idx, Ax.innerIndexPtr() + Ax.nonZeros(), i));
Ax.valuePtr()[i_idx] += A.valuePtr()[k] * x.coeff(j);
}
}
}

Implementing the Lanczos algorithm into C++ for a quantum anharmonic oscillator

Firstly, I would like to mention that I am a complete beginner when it comes to coding, let alone C++, so bear with me, as I need complete guidance. My task is to implement the Lanczos algorithm for the case of a 1-D anharmonic oscillator in C++, with reference to the linked paper (Analytical Lanczos method).
The paper offers a step-by-step guide for the implementation of the algorithm, with the initial trial function being: Psi_1 = (1 + x^2) * exp(-x^2 - 1/4 * x^4).
The paper also contains Mathematica code for this particular case.
Here is my attempt, which is far from finished; I wanted to make sure I was on the right path with regard to the programming logic. There are still plenty of errors. (Also, excuse the lack of fundamentals here; I am only a beginner. Thank you very much.)
int main() {
//Grid parameters.
const int Rmin = 1, Rmax = 31, nx = 300;//Grid length and stepsize.
double dx = (Rmax- Rmin) / nx; //Delta x.
double a, b;
std::vector<double> x, psi_1;
for (int j = 1; j < 64; ++j) { //Corresponds to each succesive Lanczos Vector.
for (int i = Rmin; i < nx + 1; i++) { //Defining the Hamiltonian on the grid.
x[i] = (nx / 2) + i;
psi_1[i] = (1 + pow(x[i] * dx, 2)) * exp(pow(-x[i] * dx, 2) - (1 / 4 * pow(x[i] * dx, 4 )) //Trial wavefunction.
H[i] = ((PSI[j][i + 1] - 2 * PSI[j][i] + PSI[j][i - 1]) / pow(dx, 2)) + PSI[j][i] * 1/2 * pow(x[i] * dx, 2) + PSI[j][i] * 2 * pow(x[i] * dx, 4) + PSI[j][i] * 1/2 * pow(x[i], 6); //Hamiltonian. ****
//First Lanczos step.
PSI[1][i] = psi_1[i]
}
//Normalisation of the wavefunction (b).
double b[j] = 0.0;
for (int i = Rmin; i < nx + 1; i++) {
PSI[1][i] = psi_1[i];
b[j] += abs(pow(PSI[j][i], 2));
}
b[j] = b[j] * dx;
for (int i = Rmin; i < nx + 1; i++) {
PSI[j] = PSI[j] / sqrt(b[j]);
}
//Expectation values (a). Main diagonal of the Hamiltonian matrix.
double a[j] = 0.0;
for (int i = Rmin; i < nx + 1; i++) {
a[j] += PSI[j] * H[i] * PSI[j] * dx
}
//Recursive expression.
PSI[j] = H[i] * PSI[j-1] - PSI[j-1] * a[j-1] - PSI[j-2] * b[j-1]
//Lanczos Matrix.
LanczosMatrix[R][C] =
for (int R = 1; R < 64; R++) {
row[R] =
}
}
I have yet to finish the code, but some experienced guidance would be greatly appreciated! (also, the code has to be cleaned up greatly, but this was an attempt to get the general idea down first.)
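For orientation, the core Lanczos recursion can be sketched independently of the grid details. This is a generic sketch and not the paper's procedure: lanczos, LanczosResult and applyH are hypothetical names, applyH is assumed to apply a real symmetric operator (for example a finite-difference Hamiltonian), and plain dot products are used with no grid weight dx. It returns the diagonal (a) and off-diagonal (b) entries of the tridiagonal Lanczos matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<double> Vec;

struct LanczosResult {
    std::vector<double> a; // diagonal of the tridiagonal matrix
    std::vector<double> b; // off-diagonal: b[j] couples vectors j and j+1
};

static double dot(const Vec& u, const Vec& v)
{
    double s = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i) s += u[i] * v[i];
    return s;
}

// Runs at most `steps` Lanczos steps starting from the (unnormalized) vector v.
LanczosResult lanczos(const std::function<Vec(const Vec&)>& applyH,
                      Vec v, std::size_t steps)
{
    assert(!v.empty() && steps > 0);
    const double norm0 = std::sqrt(dot(v, v));
    assert(norm0 > 0.0);
    for (std::size_t i = 0; i < v.size(); ++i) v[i] /= norm0;

    LanczosResult out;
    Vec vPrev(v.size(), 0.0);
    double bPrev = 0.0;
    for (std::size_t j = 0; j < steps; ++j) {
        Vec w = applyH(v);
        const double a = dot(v, w); // expectation value <v|H|v>
        out.a.push_back(a);
        // Three-term recurrence: remove the components along v and vPrev.
        for (std::size_t i = 0; i < w.size(); ++i) w[i] -= a * v[i] + bPrev * vPrev[i];
        const double b = std::sqrt(dot(w, w));
        if (j + 1 == steps || b < 1e-12) break; // finished, or invariant subspace reached
        out.b.push_back(b);
        for (std::size_t i = 0; i < w.size(); ++i) {
            vPrev[i] = v[i];
            v[i] = w[i] / b;
        }
        bPrev = b;
    }
    return out;
}
```

The eigenvalues of the small tridiagonal matrix built from a and b then approximate those of the operator. No reorthogonalization is done here, which can matter for long runs in floating point.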

Time dependent 1D Schrodinger equation C++

I wrote code in C++ that solves the time-dependent 1D Schrodinger equation for the anharmonic potential V = x^2/2 + lambda*x^4, using the Thomas algorithm. My code runs, and I animate the results in Mathematica to check what is going on. I tested the code against the known solution for the harmonic potential (lambda = 0), but the animation shows that abs(Psi) changes with time, and I know that is not correct for the harmonic potential. Actually, I see that at one point in time it becomes constant, but before that it oscillates.
So I understand that the magnitude of the wave function needs to stay constant over the time interval, but I don't know how to achieve that or where I am making a mistake.
Here is my code and the animation for 100 time steps and 100 points on the grid.
#include <iostream>
#include <iomanip>
#include <cmath>
#include <vector>
#include <cstdlib>
#include <complex>
#include <fstream>
using namespace std;
// Mandatory parameters
const int L = 1; //length of domain in x direction
const int tmax = 10; //end time
const int nx = 100, nt = 100; //number of the grid points and time steps respectively
double lambda; //dictates the shape of the potential (we can use lambda = 0.0
// to test the code against the known solution for the harmonic
// oscillator)
complex<double> I(0.0, 1.0); //imaginary unit
// Derived parameters
double delta_x = 1. / (nx - 1);
//spacing between the grid points
double delta_t = 1. / (nt - 1);
//the time step
double r = delta_t / (delta_x * delta_x); //used to simplify expressions for
// the coefficients of the lhs and
// rhs of the matrix eqn
// Algorithm for solving the tridiagonal matrix system
vector<complex<double> > thomas_algorithm(vector<double>& a,
vector<complex<double> >& b,
vector<double>& c,
vector<complex<double> >& d)
{
// Temporary wave function
vector<complex<double> > y(nx + 1, 0.0);
// Modified matrix coefficients
vector<complex<double> > c_prime(nx + 1, 0.0);
vector<complex<double> > d_prime(nx + 1, 0.0);
// This updates the coefficients in the first row
c_prime[0] = c[0] / b[0];
d_prime[0] = d[0] / b[0];
// Create the c_prime and d_prime coefficients in the forward sweep
for (int i = 1; i < nx + 1; i++)
{
complex<double> m = 1.0 / (b[i] - a[i] * c_prime[i - 1]);
c_prime[i] = c[i] * m;
d_prime[i] = (d[i] - a[i] * d_prime[i - 1]) * m;
}
// This gives the value of the last equation in the system
y[nx] = d_prime[nx];
// This is the reverse sweep, used to update the solution vector
for (int i = nx - 1; i > 0; i--)
{
y[i] = d_prime[i] - c_prime[i] * y[i + 1];
}
return y;
}
void calc()
{
// First create the vectors to store the coefficients
vector<double> a(nx + 1, 1.0);
vector<complex<double> > b(nx + 1, 0.0);
vector<double> c(nx + 1, 1.0);
vector<complex<double> > d(nx + 1, 0.0);
vector<complex<double> > psi(nx + 1, 0.0);
vector<complex<double> > phi(nx + 1, 0.0);
vector<double> V(nx + 1, 0.0);
vector<double> x(nx + 1, 0);
vector<vector<complex<double> > > PSI(nt + 1,
vector<complex<double> >(nx + 1,
0.0));
vector<double> prob(nx + 1, 0);
// We don't have the first member of the left diagonal and the last member
// of the right diagonal
a[0] = 0.0;
c[nx] = 0.0;
for (int i = 0; i < nx + 1; i++)
{
x[i] = (-nx / 2) + i; // Values on the x axis
// Eigenfunction of the harmonic oscillator in the ground state
phi[i] = exp(-pow(x[i] * delta_x, 2) / 2) / (pow(M_PI, 0.25));
// Anharmonic potential
V[i] = pow(x[i] * delta_x, 2) / 2 + lambda * pow(x[i] * delta_x, 4);
// The main diagonal coefficients
b[i] = 2.0 * I / r - 2.0 + V[i] * delta_x * delta_x;
}
double sum0 = 0.0;
for (int i = 0; i < nx + 1; i++)
{
PSI[0][i] = phi[i]; // Initial condition for the wave function
sum0 += abs(pow(PSI[0][i], 2)); // Needed for the normalization
}
sum0 = sum0 * delta_x;
for (int i = 0; i < nx + 1; i++)
{
PSI[0][i] = PSI[0][i] / sqrt(sum0); // Normalization of the initial
// wave function
}
for (int j = 0; j < nt; j++)
{
PSI[j][0] = 0.0;
PSI[j][nx] = 0.0; // Boundary conditions for the wave function
d[0] = 0.0;
d[nx] = 0.0; // Boundary conditions for the rhs
// Fill in the current time step vector d representing the rhs
for (int i = 1; i < nx + 1; i++)
{
d[i] = PSI[j][i + 1]
+ (2.0 - 2.0 * I / r - V[i] * delta_x * delta_x) * PSI[j][i]
+ PSI[j][i - 1];
}
// Now solve the tridiagonal system
psi = thomas_algorithm(a, b, c, d);
for (int i = 1; i < nx; i++)
{
PSI[j + 1][i] = psi[i]; // Assign values to the wave function
}
for (int i = 0; i < nx + 1; i++)
{
// Probability density of the wave function in the next time step
prob[i] = abs(PSI[j + 1][i] * conj(PSI[j + 1][i]));
}
double sum = 0.0;
for (int i = 0; i < nx + 1; i++)
{
sum += prob[i] * delta_x;
}
for (int i = 0; i < nx + 1; i++)
{
// Normalization of the wave function in the next time step
PSI[j + 1][i] /= sqrt(sum);
}
}
// Opening files for writing the results
ofstream file_psi_re, file_psi_imag, file_psi_abs, file_potential,
file_phi0;
file_psi_re.open("psi_re.dat");
file_psi_imag.open("psi_imag.dat");
file_psi_abs.open("psi_abs.dat");
for (int i = 0; i < nx + 1; i++)
{
file_psi_re << fixed << x[i] << " ";
file_psi_imag << fixed << x[i] << " ";
file_psi_abs << fixed << x[i] << " ";
for (int j = 0; j < nt + 1; j++)
{
file_psi_re << fixed << setprecision(6) << PSI[j][i].real() << " ";
file_psi_imag << fixed << setprecision(6) << PSI[j][i].imag()
<< " ";
file_psi_abs << fixed << setprecision(6) << abs(PSI[j][i]) << " ";
}
file_psi_re << endl;
file_psi_imag << endl;
file_psi_abs << endl;
}
}
int main(int argc, char **argv)
{
calc();
return 0;
}
The black line is abs(psi), the red one is Im(psi) and the blue one is Re(psi).

(Bear in mind that my computational physics course was ten years ago now)
You say you are solving a time-dependent system, but I don't see any time-dependence (even if lambda != 0). In the Schrodinger Equation, if the potential function does not depend on time, then the differential equation is called separable, because you can solve the time component and the spatial component of the differential equation separately.
The general solution in that case is just the solution to the time-independent Schrodinger Equation multiplied by exp(-iEt/h_bar). When you compute the probability density, that factor has magnitude 1, so the probability doesn't change over time. In these cases people quite typically just ignore the time component altogether.
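The claim about the phase factor is easy to check numerically. In this small sketch, stationaryState is a hypothetical helper and hbar defaults to 1 as an assumed choice of units: multiplying a real amplitude phi by exp(-iEt/hbar) rotates it in the complex plane but leaves its modulus unchanged.

```cpp
#include <cassert>
#include <complex>

// psi(x, t) = phi(x) * exp(-i * E * t / hbar) for a stationary state.
// phi is the (real) spatial amplitude at one grid point.
std::complex<double> stationaryState(double phi, double energy, double t,
                                     double hbar = 1.0)
{
    assert(hbar > 0.0);
    // std::polar(1, angle) is the unit-modulus phase factor exp(i * angle).
    return phi * std::polar(1.0, -energy * t / hbar);
}
```

Re(psi) and Im(psi) oscillate in time while abs(psi) stays fixed, which is the behaviour an animation of an eigenstate should show.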
All this is to say that since your potential function doesn't depend on time then you aren't solving a time-dependent Schrodinger Equation. The Tridiagonal Matrix Algorithm can only be used to solve ordinary differential equations, whereas if your potential depended on time you would have a partial differential equation and would need a different method to solve it. Also as a result of that plotting the probability density over time is rarely interesting.
As for why your probability density is not constant: numerical methods for finding eigenvalues and eigenvectors rarely produce normalised eigenvectors naturally, so are you manually normalising your eigenvector before computing your probabilities?

C++ Pattern Matching with FFT cross-correlation (Images)

Hi everyone, I am trying to implement pattern matching with FFT, but I am not sure what the result should be. (I think I am missing something, even though I have read a lot about the problem and tried a lot of different implementations; this one is the best so far.) Here is my FFT correlation function.
void fft2d(fftw_complex**& a, int rows, int cols, bool forward = true)
{
fftw_plan p;
for (int i = 0; i < rows; ++i)
{
p = fftw_plan_dft_1d(cols, a[i], a[i], forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
fftw_execute(p);
fftw_destroy_plan(p); // release the plan so it does not leak on every row
}
fftw_complex* t = (fftw_complex*)fftw_malloc(rows * sizeof(fftw_complex));
for (int j = 0; j < cols; ++j)
{
for (int i = 0; i < rows; ++i)
{
t[i][0] = a[i][j][0];
t[i][1] = a[i][j][1];
}
p = fftw_plan_dft_1d(rows, t, t, forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
fftw_execute(p);
fftw_destroy_plan(p); // release the plan so it does not leak on every column
for (int i = 0; i < rows; ++i)
{
a[i][j][0] = t[i][0];
a[i][j][1] = t[i][1];
}
}
fftw_free(t);
}
int findCorrelation(int argc, char* argv[])
{
BMP bigImage;
BMP keyImage;
BMP result;
RGBApixel blackPixel = { 0, 0, 0, 1 };
const bool swapQuadrants = (argc == 4);
if (argc < 3 || argc > 4) {
cout << "correlation img1.bmp img2.bmp" << endl;
return 1;
}
if (!keyImage.ReadFromFile(argv[1])) {
return 1;
}
if (!bigImage.ReadFromFile(argv[2])) {
return 1;
}
//Preparations
const int maxWidth = std::max(bigImage.TellWidth(), keyImage.TellWidth());
const int maxHeight = std::max(bigImage.TellHeight(), keyImage.TellHeight());
const int rowsCount = maxHeight;
const int colsCount = maxWidth;
BMP bigTemp = bigImage;
BMP keyTemp = keyImage;
keyImage.SetSize(maxWidth, maxHeight);
bigImage.SetSize(maxWidth, maxHeight);
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
RGBApixel p1;
if (i < bigTemp.TellHeight() && j < bigTemp.TellWidth()) {
p1 = bigTemp.GetPixel(j, i);
} else {
p1 = blackPixel;
}
bigImage.SetPixel(j, i, p1);
RGBApixel p2;
if (i < keyTemp.TellHeight() && j < keyTemp.TellWidth()) {
p2 = keyTemp.GetPixel(j, i);
} else {
p2 = blackPixel;
}
keyImage.SetPixel(j, i, p2);
}
//Here is where the transforms begin
fftw_complex **a = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
fftw_complex **b = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
fftw_complex **c = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
for (int i = 0; i < rowsCount; ++i) {
a[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
b[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
c[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
for (int j = 0; j < colsCount; ++j) {
RGBApixel p1;
p1 = bigImage.GetPixel(j, i);
a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
a[i][j][1] = 0.0;
RGBApixel p2;
p2 = keyImage.GetPixel(j, i);
b[i][j][0] = (0.299*p2.Red + 0.587*p2.Green + 0.114*p2.Blue);
b[i][j][1] = 0.0;
}
}
fft2d(a, rowsCount, colsCount);
fft2d(b, rowsCount, colsCount);
result.SetSize(maxWidth, maxHeight);
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
fftw_complex& y = a[i][j];
fftw_complex& x = b[i][j];
double u = x[0], v = x[1];
double m = y[0], n = y[1];
c[i][j][0] = u*m + n*v;
c[i][j][1] = v*m - u*n;
int fx = j;
if (fx>(colsCount / 2)) fx -= colsCount;
int fy = i;
if (fy>(rowsCount / 2)) fy -= rowsCount;
float r2 = (fx*fx + fy*fy);
const double cuttoffCoef = (maxWidth * maxHeight) / 37992.;
if (r2<128 * 128 * cuttoffCoef)
c[i][j][0] = c[i][j][1] = 0;
}
fft2d(c, rowsCount, colsCount, false);
const int halfCols = colsCount / 2;
const int halfRows = rowsCount / 2;
if (swapQuadrants) {
for (int i = 0; i < halfRows; ++i)
for (int j = 0; j < halfCols; ++j) {
std::swap(c[i][j][0], c[i + halfRows][j + halfCols][0]);
std::swap(c[i][j][1], c[i + halfRows][j + halfCols][1]);
}
for (int i = halfRows; i < rowsCount; ++i)
for (int j = 0; j < halfCols; ++j) {
std::swap(c[i][j][0], c[i - halfRows][j + halfCols][0]);
std::swap(c[i][j][1], c[i - halfRows][j + halfCols][1]);
}
}
for (int i = 0; i < rowsCount; ++i)
for (int j = 0; j < colsCount; ++j) {
const double& g = c[i][j][0];
RGBApixel pixel;
pixel.Alpha = 0;
int gInt = 255 - static_cast<int>(std::floor(g + 0.5));
pixel.Red = gInt;
pixel.Green = gInt;
pixel.Blue = gInt;
result.SetPixel(j, i, pixel);
}
BMP res;
res.SetSize(maxWidth, maxHeight);
result.WriteToFile("result.bmp");
return 0;
}
Sample output

This question would probably be more appropriately posted on another site like Cross Validated (metaoptimize.com used to also be a good one, but it appears to be gone).
That said:
There are two similar operations you can perform with FFT: convolution and correlation. Convolution is used for determining how two signals interact with each other, whereas correlation can be used to express how similar two signals are to each other. Make sure you're doing the right operation, as they're both commonly implemented through a DFT.
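To make the correlation side concrete, here is a direct (non-FFT) 1-D circular cross-correlation, the kind of sum that the FFT route evaluates faster. The names circularCrossCorrelation and argMax are illustrative only. The largest output value sits at the shift where the pattern best lines up with the signal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// c[lag] = sum over k of signal[(k + lag) mod n] * pattern[k]
std::vector<double> circularCrossCorrelation(const std::vector<double>& signal,
                                             const std::vector<double>& pattern)
{
    const std::size_t n = signal.size();
    // Both inputs must have the same length; zero-pad the shorter one first.
    assert(n > 0 && pattern.size() == n);
    std::vector<double> c(n, 0.0);
    for (std::size_t lag = 0; lag < n; ++lag) {
        for (std::size_t k = 0; k < n; ++k) {
            c[lag] += signal[(k + lag) % n] * pattern[k];
        }
    }
    return c;
}

// Index of the largest element (the first one if there are ties).
std::size_t argMax(const std::vector<double>& v)
{
    assert(!v.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] > v[best]) best = i;
    }
    return best;
}
```

This direct form costs O(n^2) per signal, which is why the transform-based version is preferred for images, but it is handy for checking a small case by hand.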
For this type of application of DFTs you usually wouldn't extract any useful information from the Fourier spectrum itself unless you were looking for frequencies common to both data sources (e.g., if you were comparing two bridges to see if their supports are spaced similarly).
Your 3rd image looks a lot like the power domain; normally I see the correlation output entirely grey except where overlap occurred. Your code definitely appears to be computing the inverse DFT, so unless I'm missing something the only other explanation I've come up with for the fuzzy look could be some of the "fudge factor" code in there like:
if (r2<128 * 128 * cuttoffCoef)
c[i][j][0] = c[i][j][1] = 0;
As for what you should expect: wherever there are common elements between the two images you'll see a peak. The larger the peak, the more similar the two images are near that region.
Some comments and/or recommended changes:
1) Convolution & correlation are not scale invariant operations. In other words, the size of your pattern image can make a significant difference in your output.
2) Normalize your images before correlation.
When you get the image data ready for the forward DFT pass:
a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
a[i][j][1] = 0.0;
/* ... */
How you grayscale the image is your business (though I would've picked something like sqrt( r*r + b*b + g*g )). However, I don't see you doing anything to normalize the image.
The word "normalize" can take on a few different meanings in this context. Two common types:
normalize the range of values between 0.0 and 1.0
normalize the "whiteness" of the images
3) Run your pattern image through an edge enhancement filter. I've personally made use of Canny, Sobel, and I think I experimented with a few others. As I recall, Canny was "quick 'n' dirty" and Sobel was more expensive, but I got comparable results when it came time to do correlation. See chapter 24 of the "dsp guide" book that's freely available online. The whole book is worth your time, but if you're short on time then at a minimum chapter 24 will help a lot.
4) Re-scale the output image between [0, 255]; if you want to implement thresholds, do it after this step because the thresholding step is lossy.
My memory on this one is hazy, but as I recall (edited for clarity):
You can scale the final image pixels (before rescaling) between [-1.0, 1.0] by dividing off the largest power spectrum value from the entire power spectrum
The largest power spectrum value is, conveniently enough, the center-most value in the power spectrum (corresponding to the lowest frequency)
If you divide it off the power spectrum, you'll end up doing twice the work; since FFTs are linear, you can delay the division until after the inverse DFT pass to when you're re-scaling the pixels between [0..255].
If after rescaling most of your values end up so black you can't see them, you can use a solution to the ODE y' = y(1 - y) (one example is the sigmoid f(x) = 1 / (1 + exp(-c*x) ), for some scaling factor c that gives better gradations). This has more to do with improving your ability to interpret the results visually than anything you might use to programmatically find peaks.
Edit: I said [0, 255] above. I suggest you rescale to [128, 255], or use some other lower bound that is gray rather than black.
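Points 2 and 4 can be sketched with one helper. This is an illustrative function and not code from the question: rescaleRange is a made-up name, and it maps the smallest value to lo and the largest to hi, so (0, 1) gives a range-normalized image and (128, 255) gives the gray-floored output range suggested in the edit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly map values so that the minimum becomes lo and the maximum becomes hi.
// A constant input has no range to stretch, so every element becomes lo.
std::vector<double> rescaleRange(const std::vector<double>& values,
                                 double lo, double hi)
{
    assert(!values.empty() && lo <= hi);
    const double minV = *std::min_element(values.begin(), values.end());
    const double maxV = *std::max_element(values.begin(), values.end());
    std::vector<double> out(values.size(), lo);
    if (maxV > minV) {
        const double scale = (hi - lo) / (maxV - minV);
        for (std::size_t i = 0; i < values.size(); ++i) {
            out[i] = lo + (values[i] - minV) * scale;
        }
    }
    return out;
}
```

Applying any threshold only after this step keeps the lossy operation last, as recommended above.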

Optimize log entropy calculation in sparse matrix

I have a 3007 x 1644 dimensional matrix of terms and documents. I am trying to assign weights to frequency of terms in each document so I'm using this log entropy formula http://en.wikipedia.org/wiki/Latent_semantic_indexing#Term_Document_Matrix (See entropy formula in the last row).
I'm successfully doing this but my code is running for >7 minutes.
Here's the code:
int N = mat.cols();
for(int i=1;i<=mat.rows();i++){
double gfi = sum(mat(i,colon()))(1,1); //sum of occurrence of terms
double g =0;
if(gfi != 0){// to avoid divide by zero error
for(int j = 1;j<=N;j++){
double tfij = mat(i,j);
double pij = gfi==0?0.0:tfij/gfi;
pij = pij + 1; //avoid log0
double G = (pij * log(pij))/log(N);
g = g + G;
}
}
double gi = 1 - g;
for(int j=1;j<=N;j++){
double tfij = mat(i,j) + 1;//avoid log0
double aij = gi * log(tfij);
mat(i,j) = aij;
}
}
Anyone have ideas how I can optimize this to make it faster? Oh and mat is a RealSparseMatrix from amlpp matrix library.
UPDATE
Code runs on Linux mint with 4gb RAM and AMD Athlon II dual core
Running time before change: > 7mins
After @Kereks answer: 4.1 sec

Here's a very naive rewrite that removes some redundancies:
int const N = mat.cols();
double const logN = log(N);
for (int i = 1; i <= mat.rows(); ++i)
{
double const gfi = sum(mat(i, colon()))(1, 1); // sum of occurrence of terms
double g = 0;
if (gfi != 0)
{
for (int j = 1; j <= N; ++j)
{
double const pij = mat(i, j) / gfi + 1;
g += pij * log(pij);
}
g /= logN;
}
for (int j = 1; j <= N; ++j)
{
mat(i,j) = (1 - g) * log(mat(i, j) + 1);
}
}
Also make sure that the matrix data structure is sane (e.g. a flat array accessed in strides; not a bunch of dynamically allocated rows).
Also, I think the first + 1 is a bit silly. You know that x -> x * log(x) is continuous at zero with limit zero, so you should write:
double const pij = mat(i, j) / gfi;
if (pij != 0) { g += pij * log(pij); }
In fact, you might even write the first inner for loop like this, avoiding a division when it isn't needed:
for (int j = 1; j <= N; ++j)
{
if (double pij = mat(i, j))
{
pij /= gfi;
g += pij * log(pij);
}
}
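Since the matrix is sparse, the same idea extends to visiting only the stored entries of each row. This is a sketch under assumptions: the row is represented as a std::vector of (column, count) pairs rather than amlpp's RealSparseMatrix, logEntropyRow is a made-up name, counts are assumed non-negative, and the weighting mirrors the (1 - g) form of the rewrite above. Zero counts contribute nothing to g and map to log(0 + 1) = 0, so they never need to be touched.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// One matrix row stored sparsely as (column index, term count) pairs.
typedef std::vector<std::pair<int, double> > SparseRow;

// Apply the log-entropy weighting to one row in place; numCols is N.
void logEntropyRow(SparseRow& row, int numCols)
{
    assert(numCols > 1); // log(1) == 0 would make the division below invalid
    double gfi = 0.0;    // total count of this term over all documents
    for (std::size_t i = 0; i < row.size(); ++i) gfi += row[i].second;

    double g = 0.0;
    if (gfi != 0.0) {
        for (std::size_t i = 0; i < row.size(); ++i) {
            const double pij = row[i].second / gfi;
            if (pij != 0.0) g += pij * std::log(pij); // p*log(p) -> 0 as p -> 0
        }
        g /= std::log(static_cast<double>(numCols));
    }
    // Only stored entries change; implicit zeros stay zero.
    for (std::size_t i = 0; i < row.size(); ++i) {
        row[i].second = (1.0 - g) * std::log(row[i].second + 1.0);
    }
}
```

The work per row is proportional to the number of stored entries rather than to N, which is where most of the saving on a sparse term-document matrix would be expected to come from.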
