Implementing Gaussian Blur - How to calculate convolution matrix (kernel) - c++

My question is very close to this question: How do I gaussian blur an image without using any in-built gaussian functions?
The answer to this question is very good, but it doesn't give an example of actually calculating a real Gaussian filter kernel. The answer gives an arbitrary kernel and shows how to apply the filter using that kernel but not how to calculate a real kernel itself. I am trying to implement a Gaussian blur in C++ or Matlab from scratch, so I need to know how to calculate the kernel from scratch.
I'd appreciate it if someone could calculate a real Gaussian filter kernel using any small example image matrix.

You can create a Gaussian kernel from scratch as described in the MATLAB documentation for fspecial. Read the Gaussian kernel formula in the Algorithms section of that page and follow the code below, which creates an m-by-n kernel with sigma = 1.
m = 5; n = 5;
sigma = 1;
[h1, h2] = meshgrid(-(m-1)/2:(m-1)/2, -(n-1)/2:(n-1)/2);
hg = exp(- (h1.^2+h2.^2) / (2*sigma^2));
h = hg ./ sum(hg(:));
h =
0.0030 0.0133 0.0219 0.0133 0.0030
0.0133 0.0596 0.0983 0.0596 0.0133
0.0219 0.0983 0.1621 0.0983 0.0219
0.0133 0.0596 0.0983 0.0596 0.0133
0.0030 0.0133 0.0219 0.0133 0.0030
Observe that this can be done by the built-in fspecial as follows:
fspecial('gaussian', [m n], sigma)
ans =
0.0030 0.0133 0.0219 0.0133 0.0030
0.0133 0.0596 0.0983 0.0596 0.0133
0.0219 0.0983 0.1621 0.0983 0.0219
0.0133 0.0596 0.0983 0.0596 0.0133
0.0030 0.0133 0.0219 0.0133 0.0030
I think it is straightforward to implement this in any language you like.
EDIT: Let me also add the values of h1 and h2 for the given case, since you may be unfamiliar with meshgrid if you code in C++.
h1 =
-2 -1 0 1 2
-2 -1 0 1 2
-2 -1 0 1 2
-2 -1 0 1 2
-2 -1 0 1 2
h2 =
-2 -2 -2 -2 -2
-1 -1 -1 -1 -1
0 0 0 0 0
1 1 1 1 1
2 2 2 2 2

It's as simple as it sounds:
#include <cmath>  // exp, pow, M_PI (on MSVC, define _USE_MATH_DEFINES before this include)
const double sigma = 1;
const int W = 5;  // must be a compile-time constant to size the array
double kernel[W][W];
const double mean = W / 2;  // index of the central element (W is odd)
double sum = 0.0;  // For accumulating the kernel values
for (int x = 0; x < W; ++x)
    for (int y = 0; y < W; ++y) {
        kernel[x][y] = exp( -0.5 * (pow((x-mean)/sigma, 2.0) + pow((y-mean)/sigma, 2.0)) )
                       / (2 * M_PI * sigma * sigma);
        // Accumulate the kernel values
        sum += kernel[x][y];
    }
// Normalize the kernel
for (int x = 0; x < W; ++x)
    for (int y = 0; y < W; ++y)
        kernel[x][y] /= sum;

To implement the Gaussian blur, take the Gaussian function and compute one value for each element of your kernel.
Usually you want the central element of the kernel to get the maximum weight and the elements at the kernel borders to get values close to zero.
This implies that the kernel should have an odd height (and width) so that there actually is a central element.
To compute the kernel elements, scale the Gaussian bell to the kernel grid (pick some sigma, e.g. sigma = 1, and a range, e.g. -2*sigma ... 2*sigma) and normalize it so that the elements sum to one.
To support arbitrary kernel sizes, you might want to adapt sigma to the required kernel size.
Here's a C++ example:
#include <cmath>
#include <vector>
#include <iostream>
#include <iomanip>
double gaussian( double x, double mu, double sigma ) {
const double a = ( x - mu ) / sigma;
return std::exp( -0.5 * a * a );
}
typedef std::vector<double> kernel_row;
typedef std::vector<kernel_row> kernel_type;
kernel_type produce2dGaussianKernel (int kernelRadius) {
double sigma = kernelRadius/2.;
kernel_type kernel2d(2*kernelRadius+1, kernel_row(2*kernelRadius+1));
double sum = 0;
// compute values
for (int row = 0; row < kernel2d.size(); row++)
for (int col = 0; col < kernel2d[row].size(); col++) {
double x = gaussian(row, kernelRadius, sigma)
* gaussian(col, kernelRadius, sigma);
kernel2d[row][col] = x;
sum += x;
}
// normalize
for (int row = 0; row < kernel2d.size(); row++)
for (int col = 0; col < kernel2d[row].size(); col++)
kernel2d[row][col] /= sum;
return kernel2d;
}
int main() {
kernel_type kernel2d = produce2dGaussianKernel(3);
std::cout << std::setprecision(5) << std::fixed;
for (int row = 0; row < kernel2d.size(); row++) {
for (int col = 0; col < kernel2d[row].size(); col++)
std::cout << kernel2d[row][col] << ' ';
std::cout << '\n';
}
}
The output is:
$ g++ test.cc && ./a.out
0.00134 0.00408 0.00794 0.00992 0.00794 0.00408 0.00134
0.00408 0.01238 0.02412 0.03012 0.02412 0.01238 0.00408
0.00794 0.02412 0.04698 0.05867 0.04698 0.02412 0.00794
0.00992 0.03012 0.05867 0.07327 0.05867 0.03012 0.00992
0.00794 0.02412 0.04698 0.05867 0.04698 0.02412 0.00794
0.00408 0.01238 0.02412 0.03012 0.02412 0.01238 0.00408
0.00134 0.00408 0.00794 0.00992 0.00794 0.00408 0.00134
As a simplification, you don't need a 2D kernel at all. Two orthogonal 1D kernels are easier to implement and cheaper to compute. This works because the 2D Gaussian is separable: it is the outer product of two 1D Gaussians, so convolving with the 2D kernel gives the same result as convolving with the 1D kernel along the rows and then along the columns.
You may also want to see this section of the corresponding wikipedia article.
Here's the same in Python (in the hope someone might find it useful):
from math import exp
def gaussian(x, mu, sigma):
    return exp(-(((x - mu) / sigma) ** 2) / 2.0)
# kernel_height, kernel_width = 7, 7
kernel_radius = 3  # for a 7x7 filter
sigma = kernel_radius / 2.  # for [-2*sigma, 2*sigma]
# compute the actual kernel elements
hkernel = [gaussian(x, kernel_radius, sigma) for x in range(2 * kernel_radius + 1)]
vkernel = [x for x in hkernel]
kernel2d = [[xh * xv for xh in hkernel] for xv in vkernel]
# normalize the kernel elements
kernelsum = sum([sum(row) for row in kernel2d])
kernel2d = [[x / kernelsum for x in row] for row in kernel2d]
for line in kernel2d:
    print(["%.3f" % x for x in line])
produces the kernel:
['0.001', '0.004', '0.008', '0.010', '0.008', '0.004', '0.001']
['0.004', '0.012', '0.024', '0.030', '0.024', '0.012', '0.004']
['0.008', '0.024', '0.047', '0.059', '0.047', '0.024', '0.008']
['0.010', '0.030', '0.059', '0.073', '0.059', '0.030', '0.010']
['0.008', '0.024', '0.047', '0.059', '0.047', '0.024', '0.008']
['0.004', '0.012', '0.024', '0.030', '0.024', '0.012', '0.004']
['0.001', '0.004', '0.008', '0.010', '0.008', '0.004', '0.001']
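The separability mentioned above can be checked on a small example image. The following is only a sketch, not part of the original answer: the helper names (gaussian_kernel_1d, blur_rows, blur_separable, blur_2d) and the choice to replicate edge pixels at the border are assumptions made for illustration.

```python
from math import exp

def gaussian_kernel_1d(radius, sigma):
    """Return a normalized 1D Gaussian kernel with 2*radius+1 taps."""
    weights = [exp(-0.5 * ((i - radius) / sigma) ** 2) for i in range(2 * radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def blur_rows(image, kernel):
    """Convolve every row with the 1D kernel, replicating edge pixels (assumed border rule)."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(kernel[k] * row[clamp(x + k - radius, 0, width - 1)]
                 for k in range(len(kernel)))
             for x in range(width)]
            for row in image]

def transpose(image):
    return [list(col) for col in zip(*image)]

def blur_separable(image, kernel):
    """Horizontal pass, then vertical pass (done as a row pass on the transpose)."""
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def blur_2d(image, kernel):
    """Reference: direct convolution with the outer-product 2D kernel."""
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ky, wy in enumerate(kernel):
                for kx, wx in enumerate(kernel):
                    yy = clamp(y + ky - radius, 0, h - 1)
                    xx = clamp(x + kx - radius, 0, w - 1)
                    out[y][x] += wy * wx * image[yy][xx]
    return out

# A 5x5 test image with a single bright pixel in the middle.
image = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 100, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
kernel = gaussian_kernel_1d(radius=2, sigma=1.0)
separable = blur_separable(image, kernel)
direct = blur_2d(image, kernel)
for line in separable:
    print(["%.2f" % v for v in line])
```

Both routes should agree up to floating-point rounding, but the separable one needs 2*(2r+1) multiplications per pixel instead of (2r+1)^2. With a single bright pixel as input, the blurred image is just the 2D kernel scaled by that pixel's value.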

OK, a late answer but in case of...
Using @moooeeeep's answer, but with NumPy:
import numpy as np
radius = 3
sigma = radius/2.
k = np.arange(2*radius +1)
row = np.exp( -(((k - radius)/(sigma))**2)/2.)
col = row.transpose()
out = np.outer(row, col)
out = out/np.sum(out)
for line in out:
    print(["%.3f" % x for x in line])
It's just a few lines shorter.

Gaussian blur in Python using the PIL image library. For more info read this: http://blog.ivank.net/fastest-gaussian-blur.html
from PIL import Image
import math
# img = Image.open('input.jpg').convert('L')
# r = radius
def gauss_blur(img, r):
    imgData = list(img.getdata())
    bluredImg = Image.new(img.mode, img.size)
    bluredImgData = list(bluredImg.getdata())
    rs = int(math.ceil(r * 2.57))  # significant radius of the kernel
    for i in range(0, img.height):
        for j in range(0, img.width):
            val = 0
            wsum = 0
            for iy in range(i - rs, i + rs + 1):
                for ix in range(j - rs, j + rs + 1):
                    # clamp the sample position to the image borders
                    x = min(img.width - 1, max(0, ix))
                    y = min(img.height - 1, max(0, iy))
                    dsq = (ix - j) * (ix - j) + (iy - i) * (iy - i)
                    weight = math.exp(-dsq / (2 * r * r)) / (math.pi * 2 * r * r)
                    val += imgData[y * img.width + x] * weight
                    wsum += weight
            bluredImgData[i * img.width + j] = round(val / wsum)
    bluredImg.putdata(bluredImgData)
    return bluredImg

// my_test.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <cmath>
#include <vector>
#include <iostream>
#include <iomanip>
#include <string>
//https://stackoverflow.com/questions/8204645/implementing-gaussian-blur-how-to-calculate-convolution-matrix-kernel
//https://docs.opencv.org/2.4/modules/imgproc/doc/filtering.html#getgaussiankernel
//http://dev.theomader.com/gaussian-kernel-calculator/
double gaussian(double x, double mu, double sigma) {
const double a = (x - mu) / sigma;
return std::exp(-0.5 * a * a);
}
typedef std::vector<double> kernel_row;
typedef std::vector<kernel_row> kernel_type;
kernel_type produce2dGaussianKernel(int kernelRadius, double sigma) {
kernel_type kernel2d(2 * kernelRadius + 1, kernel_row(2 * kernelRadius + 1));
double sum = 0;
// compute values
for (int row = 0; row < kernel2d.size(); row++)
for (int col = 0; col < kernel2d[row].size(); col++) {
double x = gaussian(row, kernelRadius, sigma)
* gaussian(col, kernelRadius, sigma);
kernel2d[row][col] = x;
sum += x;
}
// normalize
for (int row = 0; row < kernel2d.size(); row++)
for (int col = 0; col < kernel2d[row].size(); col++)
kernel2d[row][col] /= sum;
return kernel2d;
}
char* gMatChar[10] = {
" ",
" ",
" ",
" ",
" ",
" ",
" ",
" ",
" ",
" "
};
static int countSpace(float aValue)
{
int count = 0;
int value = (int)aValue;
while (value > 9)
{
count++;
value /= 10;
}
return count;
}
int main() {
while (1)
{
char str1[80]; // window size
char str2[80]; // sigma
char str3[80]; // coefficient
int space;
int i, ch;
printf("\n-----------------------------------------------------------------------------\n");
printf("Start generate Gaussian matrix\n");
printf("-----------------------------------------------------------------------------\n");
// input window size
printf("\nPlease enter window size (from 3 to 10) It should be odd (ksize/mod 2 = 1 ) and positive: Exit enter q \n");
for (i = 0; (i < 80) && ((ch = getchar()) != EOF)
&& (ch != '\n'); i++)
{
str1[i] = (char)ch;
}
// Terminate string with a null character
str1[i] = '\0';
if (str1[0] == 'q')
{
break;
}
int input1 = atoi(str1);
int window_size = input1 / 2;
printf("Input window_size was: %d\n", input1);
// input sigma
printf("Please enter sigma. Use default press Enter . Exit enter q \n");
str2[0] = '0';
for (i = 0; (i < 80) && ((ch = getchar()) != EOF)
&& (ch != '\n'); i++)
{
str2[i] = (char)ch;
}
// Terminate string with a null character
str2[i] = '\0';
if (str2[0] == 'q')
{
break;
}
float input2 = atof(str2);
float sigma;
if (input2 == 0)
{
// OpenCV: sigma - Gaussian standard deviation. If it is non-positive, it is computed from ksize as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8.
sigma = 0.3*((input1 - 1)*0.5 - 1) + 0.8;
}
else
{
sigma = input2;
}
printf("Input sigma was: %f\n", sigma);
// input Coefficient K
printf("Please enter Coefficient K. Use default press Enter . Exit enter q \n");
str3[0] = '0';
for (i = 0; (i < 80) && ((ch = getchar()) != EOF)
&& (ch != '\n'); i++)
{
str3[i] = (char)ch;
}
// Terminate string with a null character
str3[i] = '\0';
if (str3[0] == 'q')
{
break;
}
int input3 = atoi(str3);
int cK;
if (input3 == 0)
{
cK = 1;
}
else
{
cK = input3;
}
float sum_f = 0;
float temp_f;
int sum = 0;
int temp;
printf("Input Coefficient K was: %d\n", cK);
printf("\nwindow size=%d | Sigma = %f Coefficient K = %d\n\n\n", input1, sigma, cK);
kernel_type kernel2d = produce2dGaussianKernel(window_size, sigma);
std::cout << std::setprecision(input1) << std::fixed;
for (int row = 0; row < kernel2d.size(); row++) {
for (int col = 0; col < kernel2d[row].size(); col++)
{
temp_f = cK* kernel2d[row][col];
sum_f += temp_f;
space = countSpace(temp_f);
std::cout << gMatChar[space] << temp_f << ' ';
}
std::cout << '\n';
}
printf("\n Sum array = %f | delta = %f", sum_f, sum_f - cK);
// rounding
printf("\nRecommend use round(): window size=%d | Sigma = %f Coefficient K = %d\n\n\n", input1, sigma, cK);
sum = 0;
std::cout << std::setprecision(0) << std::fixed;
for (int row = 0; row < kernel2d.size(); row++) {
for (int col = 0; col < kernel2d[row].size(); col++)
{
temp = round(cK* kernel2d[row][col]);
sum += temp;
space = countSpace((float)temp);
std::cout << gMatChar[space] << temp << ' ';
}
std::cout << '\n';
}
printf("\n Sum array = %d | delta = %d", sum, sum - cK);
// recommended
sum_f = 0;
int cK_d = 1 / kernel2d[0][0];
cK_d = cK_d / 2 * 2;
printf("\nRecommend: window size=%d | Sigma = %f Coefficient K = %d\n\n\n", input1, sigma, cK_d);
std::cout << std::setprecision(input1) << std::fixed;
for (int row = 0; row < kernel2d.size(); row++) {
for (int col = 0; col < kernel2d[row].size(); col++)
{
temp_f = cK_d* kernel2d[row][col];
sum_f += temp_f;
space = countSpace(temp_f);
std::cout << gMatChar[space] << temp_f << ' ';
}
std::cout << '\n';
}
printf("\n Sum array = %f | delta = %f", sum_f, sum_f - cK_d);
// rounding
printf("\nRecommend use round(): window size=%d | Sigma = %f Coefficient K = %d\n\n\n", input1, sigma, cK_d);
sum = 0;
std::cout << std::setprecision(0) << std::fixed;
for (int row = 0; row < kernel2d.size(); row++) {
for (int col = 0; col < kernel2d[row].size(); col++)
{
temp = round(cK_d* kernel2d[row][col]);
sum += temp;
space = countSpace((float)temp);
std::cout << gMatChar[space] << temp << ' ';
}
std::cout << '\n';
}
printf("\n Sum array = %d | delta = %d", sum, sum - cK_d);
}
}
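For comparison, the two ways of choosing sigma that appear in this thread (the OpenCV default quoted in the comment above, and sigma = radius/2 from the earlier answer) can be put side by side. This is a small sketch only; the function names are made up for illustration and are not part of OpenCV or of the program above.

```python
from math import exp

def opencv_style_sigma(ksize):
    """Default sigma from the formula quoted in the comment above (hypothetical helper)."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

def radius_based_sigma(ksize):
    """sigma = radius / 2, so the kernel spans roughly [-2*sigma, 2*sigma]."""
    return (ksize // 2) / 2.0

def kernel_1d(ksize, sigma):
    """Normalized 1D Gaussian kernel with ksize taps (ksize should be odd)."""
    radius = ksize // 2
    weights = [exp(-0.5 * ((i - radius) / sigma) ** 2) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]

for ksize in (3, 5, 7):
    for name, rule in (("opencv", opencv_style_sigma), ("radius/2", radius_based_sigma)):
        sigma = rule(ksize)
        print(ksize, name, round(sigma, 3), ["%.4f" % w for w in kernel_1d(ksize, sigma)])
```

The OpenCV-style rule gives a smaller sigma for the same window, so more of the weight stays on the central taps.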

function kernel = gauss_kernel(m, n, sigma)
% Generating Gauss Kernel
x = -(m-1)/2 : (m-1)/2;
y = -(n-1)/2 : (n-1)/2;
for i = 1:m
for j = 1:n
xx(i,j) = x(i);
yy(i,j) = y(j);
end
end
kernel = exp(-(xx.*xx + yy.*yy)/(2*sigma*sigma));
% Normalize the kernel
kernel = kernel/sum(kernel(:));
% Corresponding function in MATLAB
% fspecial('gaussian', [m n], sigma)

Here's a calculation in C#, which does not take single samples from the gaussian (or another kernel) function, but it calculates a large number of samples in a small grid and integrates the samples in the desired number of sections.
The calculation is for 1D, but it may easily be extended to 2D.
This calculation uses some other functions, which I did not add here, but I have added the function signatures so that you will know what they do.
This calculation produces the following discrete values for the limits +/- 3 (sum areaSum of integral is 0.997300):
kernel size: normalized kernel values, rounded to 6 decimals
3: 0.157731, 0.684538, 0.157731
5: 0.034674, 0.238968, 0.452716, 0.238968, 0.034674
7: 0.014752, 0.083434, 0.235482, 0.332663, 0.235482, 0.083434, 0.014752
This calculation produces the following discrete values for the limits +/- 2 (sum areaSum of integral is 0.954500):
kernel size: normalized kernel values, rounded to 6 decimals
3: 0.240694, 0.518612, 0.240694
5: 0.096720, 0.240449, 0.325661, 0.240449, 0.096720
7: 0.056379, 0.124798, 0.201012, 0.235624, 0.201012, 0.124798, 0.056379
Code:
using System.Linq;
private static void Main ()
{
int positionCount = 1024; // Number of samples in the range 0..1.
double positionStepSize = 1.0 / positionCount;
double limit = 3; // The calculation range of the kernel. +/- 3 is sufficient for gaussian.
int sectionCount = 3; // The number of elements in the kernel. May be 1, 3, 5, 7, ... (n*2+1)
// calculate the x positions for each kernel value to calculate.
double[] positions = CreateSeries (-limit, positionStepSize, (int)(limit * 2 * positionCount + 1));
// calculate the gaussian function value for each position
double[] values = positions.Select (pos => Gaussian (pos)).ToArray ();
// split the values into equal-sized sections and calculate the integral of each section.
double[] areas = IntegrateInSections (values, positionStepSize, sectionCount);
double areaSum = areas.Sum ();
// normalize to 1
double[] areas1 = areas.Select (a => a / areaSum).ToArray ();
double area1Sum = areas1.Sum (); // just to check it's 1 now
}
///-------------------------------------------------------------------
/// <summary>
/// Create a series of <paramref name="i_count"/> numbers, starting at <paramref name="i_start"/> and increasing by <paramref name="i_stepSize"/>.
/// </summary>
/// <param name="i_start">The start value of the series.</param>
/// <param name="i_stepSize">The step size between values in the series.</param>
/// <param name="i_count">The number of elements in the series.</param>
///-------------------------------------------------------------------
public static double[] CreateSeries (double i_start,
double i_stepSize,
int i_count)
{ ... }
private static readonly double s_gaussian_Divisor = Math.Sqrt (Math.PI * 2.0);
/// ------------------------------------------------------------------
/// <summary>
/// Calculate the value for the given position in a Gaussian kernel.
/// </summary>
/// <param name="i_position"> The position in the kernel for which the value will be calculated. </param>
/// <param name="i_bandwidth"> The width factor of the kernel. </param>
/// <returns> The value for the given position in a Gaussian kernel. </returns>
/// ------------------------------------------------------------------
public static double Gaussian (double i_position,
double i_bandwidth = 1)
{
double position = i_position / i_bandwidth;
return Math.Pow (Math.E, -0.5 * position * position) / s_gaussian_Divisor / i_bandwidth;
}
/// ------------------------------------------------------------------
/// <summary>
/// Calculate the integrals in the given number of sections of all given values with the given distance between the values.
/// </summary>
/// <param name="i_values"> The values for which the integral will be calculated. </param>
/// <param name="i_distance"> The distance between the values. </param>
/// <param name="i_sectionCount"> The number of sections in the integration. </param>
/// ------------------------------------------------------------------
public static double[] IntegrateInSections (IReadOnlyCollection<double> i_values,
double i_distance,
int i_sectionCount)
{ ... }
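Since the bodies of CreateSeries and IntegrateInSections are not shown, here is a sketch of the same idea in Python that swaps in a different technique: instead of summing many samples, it integrates the unit Gaussian over each section in closed form using the error function. The function names are assumptions for illustration, not a translation of the C# code.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Cumulative distribution of the standard normal (sigma = 1)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def integrated_kernel(section_count, limit=3.0):
    """Split [-limit, +limit] into equal sections and return each section's
    share of the Gaussian area, normalized so the shares sum to 1."""
    width = 2.0 * limit / section_count
    edges = [-limit + i * width for i in range(section_count + 1)]
    areas = [normal_cdf(b) - normal_cdf(a) for a, b in zip(edges, edges[1:])]
    total = sum(areas)
    return [a / total for a in areas]

for size in (3, 5, 7):
    print(size, [round(v, 6) for v in integrated_kernel(size, limit=3.0)])
```

The results should come out very close to the tables listed above, since both approaches approximate the same integrals. Passing limit=2.0 corresponds to the second table.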

Related

Problem with getting calculations of an array inside of an array done right

Using the formula in the pic, I need to write a program that allows the user to calculate sin(x), cos(x), tan(x). The user should enter the angle in degrees, and then the program should transform it into radians before performing the three requested calculations. For each requested calculation (i.e., sin(x), cos(x), tan(x)), I only need to calculate the first 15 terms of the series.
The problem seems to be in the arrays of the last block in the code, it keeps returning wrong results of the tan(x) series; how can I fix it?
#include <iostream>
using namespace std;
//create a function to convert angles from degrees to radian
double convertToRadian(double deg)
{ //formula : radian = (degree * pi)/180
const double pi = 3.14159265359; //declaring pi's value as a constant
return (deg * (pi / 180)); //returning the radian value
}
//create a function to calculate the exponent/power
double power(double base, unsigned int exp)
{
double result = 1;
for(int i = 0; i < exp; i++){
result = result * base;
}
return result;
}
//create a function to get the factorial of a value
double factorial(int fac)
{
if(fac > 1)
return fac * factorial(fac - 1);
else
return 1;
}
//create a function to print out arrays as we will use it to print the terms in the series
void printTerms(double terms[15])
{ for (int i = 0; i < 15; i++)
{
cout<<terms[i]<<endl;
}
}
int main()
{
double degree; //declare the variables used in the program
double valueOfCos, valueOfSin, valueOfTan; //declare variables for terms of each function
cout << "Enter angle (x) in degrees: " << endl; //prompt for user to enter angle in deg
cin >> degree;
double radian = convertToRadian(degree); //first, converting from degrees to radian
//make an array for the first 15 terms of cos(x):
double cos[15];
//make a loop to insert values in the array
for (int n = 0; n < 15; n++)
{ //type the maclaurin series formula for cos(x):
valueOfCos = (( power(-1 , n)) / (factorial(2*n))) * (power(radian, (2*n)));
cos[n] = valueOfCos;
}
//print out the first 15 terms of cos(x) in the maclaurin series:
cout << "cos(x)= ";
printTerms (cos);
//make an array for the first 15 terms of sin(x):
double sin[15];
for (int n = 0; n < 15; n++)
{
valueOfSin = ((power(-1 , n)) / (factorial((2*n + 1)))) * (power(radian, (2*n + 1)));
sin[n] = valueOfSin;
}
cout << "sin(x)= ";
printTerms (sin);
double tan[15];
for (int n = 0; n < 15; n++)
{ double bernoulli[15] = {(1/6), (-1/30),(1/42), (-1/30), (5/66), (-691/2730),
(7/6), (-3617/510), (43867/798), (-174611/330), (854513/138), (-236364091/2730),
(8553103/6),(-23749461029/870),(8615841276005/14322) };
for (int i = 0; i < 15; i++)
{
double firstNum = 0, secondNum = 0 , thirdNum = 0 , denominator = 0;
firstNum = power(-1 , n);
secondNum = power(2 , 2*n + 2);
thirdNum = ((secondNum) - 1);
denominator = factorial(2*n + 2);
valueOfTan = ((firstNum * secondNum * thirdNum * (bernoulli[i])) / denominator) *
(power(radian, 2*n + 1));
tan [n] = valueOfTan;
}
}
cout << "tan(x)= ";
printTerms (tan);
return 0;
}
The inner loop for (int i = 0; i < 15; i++) recomputes tan[n] once for every Bernoulli number and keeps only the last result, so the Bernoulli index has to follow n instead. Also, (1/6) and the other fractions are integer divisions in C++ and truncate (1/6 is 0), so write them with floating-point literals. You'll need to correct it to something like this:
double bernoulli[15] = {1.0/6, -1.0/30, 1.0/42, -1.0/30, 5.0/66, -691.0/2730, 7.0/6, -3617.0/510, 43867.0/798, -174611.0/330, 854513.0/138, -236364091.0/2730, 8553103.0/6, -23749461029.0/870, 8615841276005.0/14322};
for (int n = 0; n < 15; n++) {
    double firstNum = 0, secondNum = 0, thirdNum = 0, denominator = 0;
    firstNum = power(-1, n);
    secondNum = power(2, 2*n + 2);
    thirdNum = ((secondNum) - 1);
    denominator = factorial(2*n + 2);
    valueOfTan = ((firstNum * secondNum * thirdNum * (bernoulli[n])) / denominator) * (power(radian, 2*n + 1));
    tan[n] = valueOfTan;
}
You are calculating the tan value incorrectly.
In valueOfTan = ((firstNum * secondNum * thirdNum * (bernoulli[i])) / denominator) * (power(radian, 2 * n + 1));
the formula needs the Bernoulli number B(2n+2) for term n. Because the array stores only the even-indexed numbers B2, B4, ..., B30, that is bernoulli[n] rather than bernoulli[i].
One more suggestion: please move the double bernoulli[15] = {(1/6), (-1/30),(1/42), (-1/30), (5/66), (-691/2730), (7/6), (-3617/510), (43867/798), (-174611/330), (854513/138), (-236364091/2730), (8553103/6),(-23749461029/870),(8615841276005/14322)} array initialization out of the for loop. It is constant, so there is no need to initialize it on every iteration; doing so only adds to the runtime.
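As a cross-check of the series itself, the first 15 terms can be computed with exact fractions and compared with the library tangent. The sketch below is in Python for brevity and is only an illustration: the names BERNOULLI_EVEN and tan_terms are made up, and the list is assumed to hold B2, B4, ..., B30 in the same order as the array in the question.

```python
from fractions import Fraction
from math import factorial, radians, tan

# Even-indexed Bernoulli numbers B2, B4, ..., B30 (same values as the question's array).
BERNOULLI_EVEN = [
    Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42), Fraction(-1, 30),
    Fraction(5, 66), Fraction(-691, 2730), Fraction(7, 6), Fraction(-3617, 510),
    Fraction(43867, 798), Fraction(-174611, 330), Fraction(854513, 138),
    Fraction(-236364091, 2730), Fraction(8553103, 6),
    Fraction(-23749461029, 870), Fraction(8615841276005, 14322),
]

def tan_terms(x, count=15):
    """Return the first `count` terms of the Maclaurin series of tan(x)."""
    terms = []
    for n in range(count):
        k = 2 * n + 2  # index of the Bernoulli number used by term n
        coeff = Fraction((-1) ** n * 2 ** k * (2 ** k - 1)) * BERNOULLI_EVEN[n] / factorial(k)
        terms.append(float(coeff) * x ** (2 * n + 1))
    return terms

x = radians(30)
approx = sum(tan_terms(x))
print(approx, tan(x))
```

The series only converges for |x| < pi/2, and it converges slowly near that limit, so 15 terms are accurate for moderate angles but not for angles close to 90 degrees.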

MPI and Segmentation Faults

Alright so this program is meant to simulate a solar system by semi-randomly generating a star, semi-randomly generating planets around the star, simulating the passing of time (using MPI to spread out the computational load), and determining habitability of resulting planets. I should have it commented for readability.
I am, however, having a problem getting MPI to work. As far as I can tell, I'm doing something wrong that prevents it from initializing properly. Here are the errors I get.
OrbitPlus.cpp:323:50: error: invalid conversion from ‘char’ to ‘char**’ [-fpermissive]
system1 = Time( system, n , dt , argc, **argv);
^
OrbitPlus.cpp:191:33: error: initializing argument 5 of ‘std::vector<std::vector<float> > Time(std::vector<std::vector<float> >, int, float, int, char**)’ [-fpermissive]
std::vector<std::vector<float>> Time( std::vector<std::vector<float>> system , int n, float dt, int argc, char **argv){
^
I do find it interesting that both errors are flagged as -fpermissive errors when I compile with:
mpic++ -std=c++11 -o OrbitPlus OrbitPlus.cpp
So it seems if I was feeling adventurous I could just run the code with -fpermissive option and roll the dice, but I don't feel like being so brave. Clearly the errors are related to each other.
Here's my code.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <tuple>
#include <vector>
#include <stdio.h>
#include <math.h>
#include <complex>
#include <stdint.h>
#include <time.h>
#include <string.h>
#include <algorithm>
#include "mpi.h"
double MyRandom(){
//////////////////////////
//Random Number Generator
//Returns number between 0-99
//////////////////////////
double y = 0;
unsigned seed = time(0);
std::srand(seed);
uint64_t x = std::rand();
x ^= x << 13;
x ^= x >> 7;
x ^= x << 17;
x = (1070739 * x) % 2199023255530;
y = x / 21990232555.31 ;
return y;
}
////////////////////////
///////////////////////
std::tuple< char , float , float , float , int > Star(){
////////////////////////////
//Star will generate a Star
//Randomly or User Selected
//Class, Luminosity, Probability, Radius, Mass, Temperature
//Stars always take up 99% of the mass of the system.
///////////////////////////
char Class;
int choice = 8;
float L, R, M, T;
double y = 4;
std::tuple< char , float , float , float , float > star( Class , L , R , M , T) ;
std::cout << "Select Star Class (OBAFGKM) or Select 8 for Random" << std::endl;
std::cout << "1 = O, 2 = B, 3 = A, 4 = F, 5 = G, 6 = K, 7 = M : ";
std::cin >> choice;
if ( choice == 8 ) {
y = MyRandom();
if (y <= 0.003) choice = 1;
if ((y > 0.003) && (y <= 0.133)) choice = 2;
if ((y > 0.133) && (y <= 0.733)) choice = 3;
if ((y > 0.733) && (y <= 3.733)) choice = 4;
if ((y > 3.733) && (y <= 11.333)) choice = 5;
if ((y > 11.333) && (y <= 23.433)) choice = 6;
else choice = 7;
}
if (choice == 1) {
Class = 'O';
L = 30000;
R = 0.0307;
M = 16;
T = 30000;
}
if (choice == 2) {
Class = 'B';
L = 15000;
R = 0.0195;
M = 9;
T = 20000;
}
if (choice == 3) {
Class = 'A';
L = 15;
R = 0.00744;
M = 1.7;
T = 8700;
}
if (choice == 4) {
Class = 'F';
L = 3.25;
R = 0.00488;
M = 1.2;
T = 6750;
}
if (choice == 5) {
Class = 'G';
L = 1;
R = 0.00465;
M = 1;
T = 5700;
}
if (choice == 6) {
Class = 'K';
L = 0.34;
R = 0.00356;
M = 0.62;
T = 4450;
}
if (choice == 7) {
Class = 'M';
L = 0.08;
R = 0.00326;
M = 0.26;
T = 3000;
}
return star;
}
////////////
///////////
std::vector< std::vector<float> > Planet( float L, float R, float M, int T, int n){
///////////////////////////
//Planet generates the Planets
//Random 1 - 10, Random distribution 0.06 - 6 JAU unless specified by User
//Frost line Calculated, First Planet after Frost line is the Jupiter
//The Jupiter will have the most mass of all Jovian worlds
//Otherwise divided into Jovian and Terrestrial Worlds, Random Masses within groups
//Also calculates if a planet is in the Habitable Zone
////////////////////////////
float frostline, innerCHZ, outerCHZ;
float a = 0.06; // a - albedo
float m = M / 100; //Mass of the Jupiter always 1/100th mass of the Star.
std::vector<float> sys;
std::vector<std::vector <float>> system;
for (int i = 0 ; i < n ; i++){
sys.push_back( MyRandom()/10 * 3 ) ; //Distances in terms of Sol AU
}
sort(sys.begin(), sys.end() );
for (int i = 0 ; i < n ; i++){
system[i].push_back(sys[i]);
system[i].push_back(0); //system[i][0] is x, system[i][1] is y
}
frostline = (0.6 * T / 150) * (0.6 * T/150) * R / sqrt(1 - a);
innerCHZ = sqrt(L / 1.1);
outerCHZ = sqrt(L / 0.53);
for (int i = 0 ; i < n ; i++){
if (system[i][0] <= frostline) {
float tmass = m * 0.0003 * MyRandom();
system[i].push_back(tmass) ; //system[i][2] is mass, [3] is marker for the Jupiter
system[i].push_back(0) ;
}
if ((system[i][0] >= frostline) && (system[i-1][0] < frostline)){
system[i].push_back(m) ;
float J = 1;
system[i].push_back(J) ;
}
if ((system[i][0] >= frostline) && (system[i-1][0] >= frostline)) {
float jmass = m * 0.01 * MyRandom();
system[i].push_back(jmass) ;
system[i].push_back(0) ;
}
if ((system[i][0] >= innerCHZ) && (system[i][0] <= outerCHZ)){
float H = 1;
system[i].push_back(H);
}
else system[i].push_back(0); //[4] is habitable marker
}
return system;
}
////////////
////////////
std::vector<std::vector<float>> Time( std::vector<std::vector<float>> system , int n, float dt, int argc, char **argv){
#define ASIZE 3 //Setup
int MPI_Init(int *argc, char ***argv);
int rank, numtasks = n, namelen, rc;
char processor_name[MPI_MAX_PROCESSOR_NAME];
MPI_Status status;
MPI_Init( &argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
rc = MPI_Bcast(&system, ASIZE, MPI_DOUBLE, 0, MPI_COMM_WORLD); //Master
// Broadcast computed initial values to all other processes
if (rc != MPI_SUCCESS) {
fprintf(stderr, "Oops! An error occurred in MPI_Bcast()\n");
MPI_Abort(MPI_COMM_WORLD, rc);
}
//Slaves
const float pi = 4 * atan(1.0);
const float G = 6.67 * pow(10,-11);
float a_x, a_y;
for (int i = 0 ; i < n; i++) {
if (rank != i){
a_x = G * system[i][2] * (system[i][0]-system[rank][0]) / ((system[i][0]-system[rank][0]) * (system[i][0]-system[rank][0]));
a_y = G * system[i][2] * (system[i][1]-system[rank][1]) / ((system[i][1]-system[rank][1]) * (system[i][1]-system[rank][1]));
}
if (rank == i){
a_x = G * system[i][2] * 100 * system[i][0] / (system[i][0] * system[i][0]);
a_y = G * system[i][2] * 100 * system[i][1] / (system[i][1] * system[i][1]);
}
a_x += a_x;
a_y += a_y;
}
for (int i=0; i < n; i++){
system[i][0] += system[i][5] * dt + 0.5 * a_x * dt * dt;
system[i][1] += system[i][6] * dt + 0.5 * a_y * dt * dt;
system[i][5] += a_x * dt;
system[i][6] += a_y * dt;
}
for(int i=0 ; i<n ; i++){
for(int j=0 ; j<i ; j++){
if (system[j][0] == 0 && system[j][1] == 0){
system.erase(system.begin() + j);
} // crash into star
if (system[j][0] == system[i][0] && system[j][1] == system[i][1]){
system[i][2] += system[j][2];
system.erase(system.begin() + j);
} // planet crash
} //check co-ordinates
} // planet destroy loop
for(int i = 0 ; i < n ; i++){
if (sqrt(system[i][0]*system[i][0] + system[i][1]*system[i][1]) >= 60) system.erase(system.begin() + i);
}
//Send results back to the first process
if (rank != 0){// All processes except the one of rank 0
MPI_Send(&system, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
}
else {
for (int j = 1; j < numtasks; j++) {
MPI_Recv(&system, 1, MPI_DOUBLE, MPI_ANY_SOURCE, 1,
MPI_COMM_WORLD, &status);
}
}
MPI_Finalize();
///////////////////////////
//Time advances the solar system.
//Plots the Orbits
//Uses MPI to spread it's calculations.
///////////////////////////
return system;
}
////////////
////////////
std::vector<bool> FinalCheck( std::vector<std::vector<float>> system, std::vector<bool> Water, int n){
///////////////////////////
//Final Checks
//Reports if a Planet spent the whole Time in the Habitable Zone
///////////////////////////
for (int i = 0 ; i < n ; i++){
if (system[i][4] == 1.0) Water.push_back(true);
else Water.push_back(false);
}
return Water;
}
////////////
////////////
int main(int argc, char** argv){
char Class;
float L, R, M, T;
std::tuple< char , float , float , float , float > star( Class , L , R , M , T );
star = Star();
int n = MyRandom()/10 + 1;
std::vector<std::vector <float>> system ;
std::vector<std::vector <float>> system1 ;
system = Planet( L , R , M, T, n);
float G = 6.67 * pow(10,-11), pi = 4 * atan(1.0), dt;
for (int i = 0; i < n; i++){
if (system[i][3] == 1){
dt = 2 * pi * .01 * pow(system[i][0] * 1.5 * pow(10,8), 1.5) / sqrt(G * M * 2 * pow(10,30));
}
system[i].push_back(0.0); //system[i][5] is speed in x-axis
system[i].push_back( sqrt(6.67 * pow(10,-11) * 2 * pow(10,30) * M / system[i][0])); //system[i][6] is speed in y-axis
}
std::ofstream Finder;
std::ofstream Report;
Finder.open("plotdata.dat");
Report.open("report.txt");
Finder << "# Plot Co-ordinates" << std::endl;
for (int i = 0 ; i < 1000 ; i++) {
system1 = Time( system, n , dt , argc, argv);
for (int j=0 ; j<n ; j++){
Finder << "[color " << j << "] " << system[j][0] << " " << system[j][1] << std::endl;
if((system[j][4] == 1.0) && ( (sqrt(system[j][0] * system[j][0] + system[j][1] * system[j][1]) < sqrt(L / 1.1) ) || ((sqrt(system[j][0] * system[j][0] + system[j][1] * system[j][1]) > sqrt(L / 0.53)) ))) system[j][4] = 0.0;
}
system = system1;
}
Finder.close();
int m;
m = system.size()/system[0].size();
std::vector<bool> Water;
Water = FinalCheck( system, Water, n);
//Report
for (int i = 0 ; i < n ; i++){
Report << "Planet " << i << "ends up at" << system[i][0] << " and " << system[i][1] << "has mass " << system[i][2] ;
if (system[i][3] == 1) Report << ", which is the 'Jupiter' of the system." ;
if (system[i][4] == 1) Report << ", which can have liquid water on the surface." ;
}
Report.close();
///////////////////////////
//Report cleans everything up and gives the results
//Shows the plot, lists the Planets
//Reports the Positions and Masses of all Planets
//Reports which was the Jupiter and which if any were Habitable
//////////////////////////
return 0;
}
Any thoughts the gurus here have would be appreciated, especially with getting rid of those -fpermissive errors.
EDIT 1 - Code as presented will now completely compile - but will return a Segmentation fault during the Star routine. After the user inputs the star type but before it actually makes a star as far as I can tell.

Time dependent 1D Schrodinger equation C++

I wrote code in C++ that solves the time-dependent 1D Schrodinger equation for the anharmonic potential V = x^2/2 + lambda*x^4, using the Thomas algorithm. My code runs, and I animate the results in Mathematica to check what is going on. I test the code against the known solution for the harmonic potential (I put lambda = 0), but the animation shows that abs(Psi) changes with time, and I know that is not correct for the harmonic potential. Actually, I see that at one point in time it becomes constant, but before that it oscillates.
So I understand that the magnitude of the wave function needs to stay constant over the time interval, but I don't know how to achieve that, or where I am making a mistake.
Here is my code and the animation for 100 time steps and 100 points on the grid.
#include <iostream>
#include <iomanip>
#include <cmath>
#include <vector>
#include <cstdlib>
#include <complex>
#include <fstream>
using namespace std;
// Mandatory parameters
const int L = 1; //length of domain in x direction
const int tmax = 10; //end time
const int nx = 100, nt = 100; //number of the grid points and time steps respectively
double lambda; //dictates the shape of the potential (we can use lambda = 0.0
// to test the code against the known solution for the harmonic
// oscillator)
complex<double> I(0.0, 1.0); //imaginary unit
// Derived parameters
double delta_x = 1. / (nx - 1);
//spacing between the grid points
double delta_t = 1. / (nt - 1);
//the time step
double r = delta_t / (delta_x * delta_x); //used to simplify expressions for
// the coefficients of the lhs and
// rhs of the matrix eqn
// Algorithm for solving the tridiagonal matrix system
vector<complex<double> > thomas_algorithm(vector<double>& a,
vector<complex<double> >& b,
vector<double>& c,
vector<complex<double> >& d)
{
// Temporary wave function
vector<complex<double> > y(nx + 1, 0.0);
// Modified matrix coefficients
vector<complex<double> > c_prime(nx + 1, 0.0);
vector<complex<double> > d_prime(nx + 1, 0.0);
// This updates the coefficients in the first row
c_prime[0] = c[0] / b[0];
d_prime[0] = d[0] / b[0];
// Create the c_prime and d_prime coefficients in the forward sweep
for (int i = 1; i < nx + 1; i++)
{
complex<double> m = 1.0 / (b[i] - a[i] * c_prime[i - 1]);
c_prime[i] = c[i] * m;
d_prime[i] = (d[i] - a[i] * d_prime[i - 1]) * m;
}
// This gives the value of the last equation in the system
y[nx] = d_prime[nx];
// This is the reverse sweep, used to update the solution vector
for (int i = nx - 1; i > 0; i--)
{
y[i] = d_prime[i] - c_prime[i] * y[i + 1];
}
return y;
}
void calc()
{
// First create the vectors to store the coefficients
vector<double> a(nx + 1, 1.0);
vector<complex<double> > b(nx + 1, 0.0);
vector<double> c(nx + 1, 1.0);
vector<complex<double> > d(nx + 1, 0.0);
vector<complex<double> > psi(nx + 1, 0.0);
vector<complex<double> > phi(nx + 1, 0.0);
vector<double> V(nx + 1, 0.0);
vector<double> x(nx + 1, 0);
vector<vector<complex<double> > > PSI(nt + 1,
vector<complex<double> >(nx + 1,
0.0));
vector<double> prob(nx + 1, 0);
// We don't have the first member of the left diagonal and the last member
// of the right diagonal
a[0] = 0.0;
c[nx] = 0.0;
for (int i = 0; i < nx + 1; i++)
{
x[i] = (-nx / 2) + i; // Values on the x axis
// Eigenfunction of the harmonic oscillator in the ground state
phi[i] = exp(-pow(x[i] * delta_x, 2) / 2) / (pow(M_PI, 0.25));
// Anharmonic potential
V[i] = pow(x[i] * delta_x, 2) / 2 + lambda * pow(x[i] * delta_x, 4);
// The main diagonal coefficients
b[i] = 2.0 * I / r - 2.0 + V[i] * delta_x * delta_x;
}
double sum0 = 0.0;
for (int i = 0; i < nx + 1; i++)
{
PSI[0][i] = phi[i]; // Initial condition for the wave function
sum0 += abs(pow(PSI[0][i], 2)); // Needed for the normalization
}
sum0 = sum0 * delta_x;
for (int i = 0; i < nx + 1; i++)
{
PSI[0][i] = PSI[0][i] / sqrt(sum0); // Normalization of the initial
// wave function
}
for (int j = 0; j < nt; j++)
{
PSI[j][0] = 0.0;
PSI[j][nx] = 0.0; // Boundary conditions for the wave function
d[0] = 0.0;
d[nx] = 0.0; // Boundary conditions for the rhs
// Fill in the current time step vector d representing the rhs
for (int i = 1; i < nx; i++) // interior points only, so PSI[j][i + 1] stays in range
{
d[i] = PSI[j][i + 1]
+ (2.0 - 2.0 * I / r - V[i] * delta_x * delta_x) * PSI[j][i]
+ PSI[j][i - 1];
}
// Now solve the tridiagonal system
psi = thomas_algorithm(a, b, c, d);
for (int i = 1; i < nx; i++)
{
PSI[j + 1][i] = psi[i]; // Assign values to the wave function
}
for (int i = 0; i < nx + 1; i++)
{
// Probability density of the wave function in the next time step
prob[i] = abs(PSI[j + 1][i] * conj(PSI[j + 1][i]));
}
double sum = 0.0;
for (int i = 0; i < nx + 1; i++)
{
sum += prob[i] * delta_x;
}
for (int i = 0; i < nx + 1; i++)
{
// Normalization of the wave function in the next time step
PSI[j + 1][i] /= sqrt(sum);
}
}
// Opening files for writing the results
ofstream file_psi_re, file_psi_imag, file_psi_abs, file_potential,
file_phi0;
file_psi_re.open("psi_re.dat");
file_psi_imag.open("psi_imag.dat");
file_psi_abs.open("psi_abs.dat");
for (int i = 0; i < nx + 1; i++)
{
file_psi_re << fixed << x[i] << " ";
file_psi_imag << fixed << x[i] << " ";
file_psi_abs << fixed << x[i] << " ";
for (int j = 0; j < nt + 1; j++)
{
file_psi_re << fixed << setprecision(6) << PSI[j][i].real() << " ";
file_psi_imag << fixed << setprecision(6) << PSI[j][i].imag()
<< " ";
file_psi_abs << fixed << setprecision(6) << abs(PSI[j][i]) << " ";
}
file_psi_re << endl;
file_psi_imag << endl;
file_psi_abs << endl;
}
}
int main(int argc, char **argv)
{
calc();
return 0;
}
The black line is abs(psi), the red one is Im(psi) and the blue one is Re(psi).
(Bear in mind that my computational physics course was ten years ago now)
You say you are solving a time-dependent system, but I don't see any time-dependence (even if lambda != 0). In the Schrodinger Equation, if the potential function does not depend on time then the differential equation is called separable because you can solve the time component and spatial component of the differential equation separately.
The general solution in that case is just the solution to the time-independent Schrodinger Equation multiplied by exp(-iEt/h_bar). When you plot the probability density, the magnitude of that term is just 1, so the probability doesn't change over time. In these cases people quite typically just ignore the time component altogether.
All this is to say that since your potential function doesn't depend on time then you aren't solving a time-dependent Schrodinger Equation. The Tridiagonal Matrix Algorithm can only be used to solve ordinary differential equations, whereas if your potential depended on time you would have a partial differential equation and would need a different method to solve it. Also as a result of that plotting the probability density over time is rarely interesting.
As for why your probability density is not constant, numerical methods for finding eigenvalues and eigenvectors rarely produce normalised eigenvectors naturally, so are you manually normalising your eigenvector before computing your probabilities?

Confusion testing fftw3 - poisson equation 2d test

I am having trouble explaining/understanding the following phenomenon:
To test fftw3 I am using the 2D Poisson test case:
laplacian(f(x,y)) = - g(x,y) with periodic boundary conditions.
After applying the Fourier transform to the equation we obtain: F(kx,ky) = G(kx,ky) / (kx² + ky²) (1)
If I take g(x,y) = sin(x) + sin(y), (x,y) \in [0, 2\pi], I immediately have f(x,y) = g(x,y),
which is what I am trying to obtain with the FFT:
I compute G from g with a forward Fourier transform.
From this I can compute the Fourier transform of f with (1).
Finally, I compute f with the backward Fourier transform (without forgetting to normalize by 1/(nx*ny)).
In practice, the results are pretty bad.
(For instance, the amplitude for N = 256 is twice the amplitude obtained with N = 512.)
Even worse, if I try g(x,y) = sin(x)*sin(y), the curve does not even have the same shape as the solution.
(Note that I must change the equation; I divide the Laplacian by two in this case: (1) becomes F(kx,ky) = 2*G(kx,ky)/(kx²+ky²).)
Here is the code:
/*
* fftw test -- double precision
*/
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <fftw3.h>
using namespace std;
int main()
{
int N = 128;
int i, j ;
double pi = 3.14159265359;
double *X, *Y ;
X = (double*) malloc(N*sizeof(double));
Y = (double*) malloc(N*sizeof(double));
fftw_complex *out1, *in2, *out2, *in1;
fftw_plan p1, p2;
double L = 2.*pi;
double dx = L/(N - 1);
in1 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
out2 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
out1 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
in2 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
p1 = fftw_plan_dft_2d(N, N, in1, out1, FFTW_FORWARD,FFTW_MEASURE );
p2 = fftw_plan_dft_2d(N, N, in2, out2, FFTW_BACKWARD,FFTW_MEASURE);
for(i = 0; i < N; i++){
X[i] = -pi + i*dx ;
for(j = 0; j < N; j++){
Y[j] = -pi + j*dx ;
in1[i*N + j][0] = sin(X[i]) + sin(Y[j]) ; // row major ordering
//in1[i*N + j][0] = sin(X[i]) * sin(Y[j]) ; // 2nd test case
in1[i*N + j][1] = 0 ;
}
}
fftw_execute(p1); // FFT forward
for ( i = 0; i < N; i++){ // f = g / ( kx² + ky² )
for( j = 0; j < N; j++){
in2[i*N + j][0] = out1[i*N + j][0]/ (i*i+j*j+1e-16);
in2[i*N + j][1] = out1[i*N + j][1]/ (i*i+j*j+1e-16);
//in2[i*N + j][0] = 2*out1[i*N + j][0]/ (i*i+j*j+1e-16); // 2nd test case
//in2[i*N + j][1] = 2*out1[i*N + j][1]/ (i*i+j*j+1e-16);
}
}
fftw_execute(p2); //FFT backward
// checking the results computed
double erl1 = 0.;
for ( i = 0; i < N; i++) {
for( j = 0; j < N; j++){
erl1 += fabs( in1[i*N + j][0] - out2[i*N + j][0]/N/N )*dx*dx;
cout<< i <<" "<< j<<" "<< sin(X[i])+sin(Y[j])<<" "<< out2[i*N+j][0]/N/N <<" "<< endl; // > output
}
}
cout<< erl1 << endl ; // L1 error
fftw_destroy_plan(p1);
fftw_destroy_plan(p2);
fftw_free(out1);
fftw_free(out2);
fftw_free(in1);
fftw_free(in2);
return 0;
}
I can't find any (more) mistakes in my code (I installed the fftw3 library last week) and I don't see a problem with the maths either, but I don't think it's the FFT's fault. Hence my predicament. I am all out of ideas and all out of Google as well.
Any help solving this puzzle would be greatly appreciated.
Note:
compiling: g++ test.cpp -lfftw3 -lm
executing: ./a.out > output
I use gnuplot to plot the curves:
(in gnuplot) splot "output" u 1:2:4 (for the computed solution)
Here are a few small points that need to be modified:
You need to account for all small frequencies, including the negative ones! Index i corresponds to the frequency 2PI i/N but also to the frequency 2PI (i-N)/N. In Fourier space, the end of the array matters as much as the beginning! In our case, we keep the frequency of smallest magnitude: 2PI i/N for the first half of the array, and 2PI (i-N)/N for the second half.
Of course, as Paul said, N-1 should be N in double dx = L/(N - 1); => double dx = L/(N); a spacing based on N-1 does not correspond to a continuous periodic signal. It would be hard to use it as a test case...
Scaling... I did it empirically.
The result I obtain is closer to the expected one, for both cases. Here is the code:
/*
* fftw test -- double precision
*/
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <fftw3.h>
using namespace std;
int main()
{
int N = 128;
int i, j ;
double pi = 3.14159265359;
double *X, *Y ;
X = (double*) malloc(N*sizeof(double));
Y = (double*) malloc(N*sizeof(double));
fftw_complex *out1, *in2, *out2, *in1;
fftw_plan p1, p2;
double L = 2.*pi;
double dx = L/(N );
in1 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
out2 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
out1 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
in2 = (fftw_complex*) fftw_malloc(sizeof(fftw_complex)*(N*N) );
p1 = fftw_plan_dft_2d(N, N, in1, out1, FFTW_FORWARD,FFTW_MEASURE );
p2 = fftw_plan_dft_2d(N, N, in2, out2, FFTW_BACKWARD,FFTW_MEASURE);
for(i = 0; i < N; i++){
X[i] = -pi + i*dx ;
for(j = 0; j < N; j++){
Y[j] = -pi + j*dx ;
in1[i*N + j][0] = sin(X[i]) + sin(Y[j]) ; // row major ordering
// in1[i*N + j][0] = sin(X[i]) * sin(Y[j]) ; // 2nd test case
in1[i*N + j][1] = 0 ;
}
}
fftw_execute(p1); // FFT forward
for ( i = 0; i < N; i++){ // f = g / ( kx² + ky² )
for( j = 0; j < N; j++){
double fact=0;
in2[i*N + j][0]=0;
in2[i*N + j][1]=0;
if(2*i<N){
fact=((double)i*i);
}else{
fact=((double)(N-i)*(N-i));
}
if(2*j<N){
fact+=((double)j*j);
}else{
fact+=((double)(N-j)*(N-j));
}
if(fact!=0){
in2[i*N + j][0] = out1[i*N + j][0]/fact;
in2[i*N + j][1] = out1[i*N + j][1]/fact;
}else{
in2[i*N + j][0] = 0;
in2[i*N + j][1] = 0;
}
//in2[i*N + j][0] = out1[i*N + j][0];
//in2[i*N + j][1] = out1[i*N + j][1];
// in2[i*N + j][0] = out1[i*N + j][0]*(1.0/(i*i+1e-16)+1.0/(j*j+1e-16)+1.0/((N-i)*(N-i)+1e-16)+1.0/((N-j)*(N-j)+1e-16))*N*N;
// in2[i*N + j][1] = out1[i*N + j][1]*(1.0/(i*i+1e-16)+1.0/(j*j+1e-16)+1.0/((N-i)*(N-i)+1e-16)+1.0/((N-j)*(N-j)+1e-16))*N*N;
//in2[i*N + j][0] = 2*out1[i*N + j][0]/ (i*i+j*j+1e-16); // 2nd test case
//in2[i*N + j][1] = 2*out1[i*N + j][1]/ (i*i+j*j+1e-16);
}
}
fftw_execute(p2); //FFT backward
// checking the results computed
double erl1 = 0.;
for ( i = 0; i < N; i++) {
for( j = 0; j < N; j++){
erl1 += fabs( in1[i*N + j][0] - out2[i*N + j][0]/(N*N))*dx*dx;
cout<< i <<" "<< j<<" "<< sin(X[i])+sin(Y[j])<<" "<< out2[i*N+j][0]/(N*N) <<" "<< endl; // > output
// cout<< i <<" "<< j<<" "<< sin(X[i])*sin(Y[j])<<" "<< out2[i*N+j][0]/(N*N) <<" "<< endl; // > output
}
}
cout<< erl1 << endl ; // L1 error
fftw_destroy_plan(p1);
fftw_destroy_plan(p2);
fftw_free(out1);
fftw_free(out2);
fftw_free(in1);
fftw_free(in2);
return 0;
}
This code is far from being perfect, it is neither optimized nor beautiful. But it gives almost what is expected.
Bye,

Laguerre interpolation algorithm, something's wrong with my implementation

This is a problem I have been struggling with for a week, coming back to it only to give up after wasted hours...
I am supposed to find coefficients for the following Laguerre polynomial:
P0(x) = 1
P1(x) = 1 - x
Pn(x) = ((2n - 1 - x) / n) * P(n-1) - ((n - 1) / n) * P(n-2)
I believe there is an error in my implementation, because for some reason the coefficients I get seem way too big. This is the output the program generates:
a1 = -190.234
a2 = -295.833
a3 = 378.283
a4 = -939.537
a5 = 774.861
a6 = -400.612
Description of code (given below):
If you scroll down the code a little to the part where I declare array, you'll find the given x's and y's.
The function polynomial just fills an array with the values of said polynomial for a given x. It's a recursive function. I believe it works well, because I have checked the output values.
The gauss function finds the coefficients by performing Gaussian elimination on the output array. I think this is where the problems begin. I am wondering whether there's a mistake in this code, or whether my method of verifying the results is bad. I am trying to verify them like this:
-190.234 * 1.5 ^ 5 - 295.833 * 1.5 ^ 4 ... - 400.612 = -3017.817625 =/= 2
Code:
#include "stdafx.h"
#include <conio.h>
#include <iostream>
#include <iomanip>
#include <math.h>
using namespace std;
double polynomial(int i, int j, double **tab)
{
double n = i;
double **array = tab;
double x = array[j][0];
if (i == 0) {
return 1;
} else if (i == 1) {
return 1 - x;
} else {
double minusone = polynomial(i - 1, j, array);
double minustwo = polynomial(i - 2, j, array);
double result = (((2.0 * n) - 1 - x) / n) * minusone - ((n - 1.0) / n) * minustwo;
return result;
}
}
int gauss(int n, double tab[6][7], double results[7])
{
double multiplier, divider;
for (int m = 0; m <= n; m++)
{
for (int i = m + 1; i <= n; i++)
{
multiplier = tab[i][m];
divider = tab[m][m];
if (divider == 0) {
return 1;
}
for (int j = m; j <= n; j++)
{
if (i == n) {
break;
}
tab[i][j] = (tab[m][j] * multiplier / divider) - tab[i][j];
}
for (int j = m; j <= n; j++) {
tab[i - 1][j] = tab[i - 1][j] / divider;
}
}
}
double s = 0;
results[n - 1] = tab[n - 1][n];
int y = 0;
for (int i = n-2; i >= 0; i--)
{
s = 0;
y++;
for (int x = 0; x < n; x++)
{
s = s + (tab[i][n - 1 - x] * results[n-(x + 1)]);
if (y == x + 1) {
break;
}
}
results[i] = tab[i][n] - s;
}
return 0; // success; the early return above signals a zero pivot
}
int _tmain(int argc, _TCHAR* argv[])
{
int num;
double **array;
array = new double*[6]; // six data points, indices 0..5
for (int i = 0; i <= 5; i++)
{
array[i] = new double[2];
}
//i 0 1 2 3 4 5
array[0][0] = 1.5; //xi 1.5 2 2.5 3.5 3.8 4.1
array[0][1] = 2; //yi 2 5 -1 0.5 3 7
array[1][0] = 2;
array[1][1] = 5;
array[2][0] = 2.5;
array[2][1] = -1;
array[3][0] = 3.5;
array[3][1] = 0.5;
array[4][0] = 3.8;
array[4][1] = 3;
array[5][0] = 4.1;
array[5][1] = 7;
double W[6][7]; //n + 1
for (int i = 0; i <= 5; i++)
{
for (int j = 0; j <= 5; j++)
{
W[i][j] = polynomial(j, i, array);
}
W[i][6] = array[i][1];
}
for (int i = 0; i <= 5; i++)
{
for (int j = 0; j <= 6; j++)
{
cout << W[i][j] << "\t";
}
cout << endl;
}
double results[6];
gauss(6, W, results);
for (int i = 0; i < 6; i++) {
cout << "a" << i + 1 << " = " << results[i] << endl;
}
_getch();
return 0;
}
I believe your interpretation of the recursive polynomial generation either needs revising or is a bit too clever for me.
given P[0][5] = {1,0,0,0,0,...}; P[1][5]={1,-1,0,0,0,...};
then P[2] is a*P[0] + convolution(P[1], { c, d });
where a = -((n - 1) / n)
c = (2n - 1)/n and d= - 1/n
This can be generalized: P[n] == a*P[n-2] + conv(P[n-1], { c,d });
Every step involves a polynomial multiplication of P[n-1] by (c + d*x), which increases the degree by one (just by one...), and then the addition of P[n-2] multiplied by the scalar a.
Then most likely the interpolation factor x is in range [0..1].
(convolution means, that you should implement polynomial multiplication, which luckily is easy...)
        [a, b, c, d]
      * [e, f]
--------------------------
        af, bf, cf, df   +
    ae, be, ce, de,  0   +
--------------------------
(= coefficients of the final polynomial)
The definition of P1(x) = x - 1 is not implemented as stated. You have 1 - x in the computation.
I did not look any further.