C++ Image 2D Fast Fourier Transform - c++

I have to implement a 2D FFT on an image (I cannot use a library to do it for me; it is part of a course). I use CImg to load and save images. I have written the following code:
CImg<Complex> FastFourier(CImg<unsigned char> &originalImage)
{
    //check size in the main.cpp
    CImg<Complex> resultantImage = TransformToComplex(originalImage);
    vector< vector< vector< Complex > > > vectorImage = imageToVector(resultantImage);
    //cout << "Transform to complex" << endl;
    int size = originalImage.width();
    for(int i = 0; i < size; i++)
        FastFourier1D(vectorImage[i], false);
    vectorImage = rotateVector(vectorImage);
    for(int i = 0; i < size; i++)
        FastFourier1D(vectorImage[i], false);
    vectorImage = rotateVector(vectorImage);
    resultantImage = vectorToImage(vectorImage);
    return resultantImage;
}
And:
void FastFourier1D(vector< vector< Complex > > &input, bool inverse)
{
    int size = input.size();
    double angle;
    if(size <= 1)
        return;
    int channels = input[0].size();
    vector< vector< Complex > > even;
    vector< vector< Complex > > odd;
    for(int i = 0; i < size; i+=2)
    {
        vector< Complex > tempEven;
        vector< Complex > tempOdd;
        for(int channelIterator = 0; channelIterator < channels; channelIterator++)
        {
            tempEven.push_back(input[i][channelIterator]);
            tempOdd.push_back(input[i + 1][channelIterator]);
        }
        even.push_back(tempEven);
        odd.push_back(tempOdd);
    }
    FastFourier1D(even, inverse);
    FastFourier1D(odd, inverse);
    for(int channelIterator = 0; channelIterator < channels; channelIterator++)
    {
        for(int i = 0; i < size / 2; i++)
        {
            if(inverse == false)
                angle = -2.0 * (double)PI * (double)i / (double)size;
            else
                angle = 2.0 * (double)PI * (double)i / (double)size;
            double real = cos(angle);
            double imaginary = sin(angle);
            Complex W;
            W.setRP(real);
            W.setIP(imaginary);
            W = W * odd[i][channelIterator];
            input[i][channelIterator] = even[i][channelIterator] + W;
            input[(size / 2) + i][channelIterator] = even[i][channelIterator] - W;
        }
    }
}
However the results are not good. Input image:
FFT (without any transform):
Inverse FFT:
As you can see, the result has the colors of Lena but does not look like Lena. Could you help me? Is there a mistake somewhere?

I found out that the cause was an incorrect implementation of the multiplication operator in my Complex class.
Complex Complex::operator*(const Complex& a)
{
    Complex number;
    double RP = realPart * a.getRP() - imaginaryPart * a.getIP(); // this line was wrong
    double IP = realPart * a.getIP() + imaginaryPart * a.getRP();
    number.setRP(RP);
    number.setIP(IP);
    return number;
}
In the real part, I had forgotten the minus sign. Now the whole implementation works: the Fourier transform successfully converts an image into the frequency domain, and the inverse brings it back into the spatial domain.
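For reference, the rule the fix restores is (a + bi)(c + di) = (ac - bd) + (ad + bc)i. Below is a minimal sketch that cross-checks this rule against std::complex. The struct `Cplx` and the functions `multiply` and `checkAgainstStd` are hypothetical stand-ins, since the full `Complex` class is not shown in the question.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical stand-in for the Complex class above (its definition is not shown).
struct Cplx {
    double re;
    double im;
};

// (a + bi)(c + di) = (ac - bd) + (ad + bc)i. The minus in the real part
// comes from i * i = -1.
Cplx multiply(const Cplx& x, const Cplx& y) {
    return Cplx{x.re * y.re - x.im * y.im,
                x.re * y.im + x.im * y.re};
}

// Cross-check one product against std::complex, used here as the reference.
void checkAgainstStd(const Cplx& x, const Cplx& y) {
    const std::complex<double> expected =
        std::complex<double>(x.re, x.im) * std::complex<double>(y.re, y.im);
    const Cplx got = multiply(x, y);
    assert(std::abs(got.re - expected.real()) < 1e-12);
    assert(std::abs(got.im - expected.imag()) < 1e-12);
}
```

A quick check such as `checkAgainstStd(Cplx{1.0, 2.0}, Cplx{3.0, -4.0});` would likely have caught the missing minus before it showed up as a scrambled image.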

Related

Is there a faster way to calculate the inverse of a given nxn matrix?

I'm working on a program that requires calculating the inverse of an 8x8 matrix as fast as possible. Here's the code I wrote:
class matrix
{
public:
int w, h;
std::vector<std::vector<float>> cell;
matrix(int width, int height)
{
w = width;
h = height;
cell.resize(width);
for (int i = 0; i < cell.size(); i++)
{
cell[i].resize(height);
}
}
};
matrix transponseMatrix(matrix M)
{
matrix A(M.h, M.w);
for (int i = 0; i < M.h; i++)
{
for (int j = 0; j < M.w; j++)
{
A.cell[i][j] = M.cell[j][i];
}
}
return A;
}
float getMatrixDeterminant(matrix M)
{
if (M.w != M.h)
{
std::cout << "ERROR! Matrix isn't of nXn type.\n";
return NULL;
}
float determinante = 0;
if (M.w == 1)
{
determinante = M.cell[0][0];
}
if (M.w == 2)
{
determinante = M.cell[0][0] * M.cell[1][1] - M.cell[1][0] * M.cell[0][1];
}
else
{
for (int i = 0; i < M.w; i++)
{
matrix A(M.w - 1, M.h - 1);
int cy = 0;
for (int y = 1; y < M.h; y++)
{
int cx = 0;
for (int x = 0; x < M.w; x++)
{
if (x != i)
{
A.cell[cx][cy] = M.cell[x][y];
cx++;
}
}
cy++;
}
determinante += M.cell[i][0] * pow(-1, i + 0) * getMatrixDeterminant(A);
}
}
return determinante;
}
float getComplementOf(matrix M, int X, int Y)
{
float det;
if (M.w != M.h)
{
std::cout << "ERROR! Matrix isn't of nXn type.\n";
return NULL;
}
if (M.w == 2)
{
det = M.cell[1 - X][1 - Y];
}
else
{
matrix A(M.w - 1, M.h - 1);
int cy = 0;
for (int y = 0; y < M.h; y++)
{
if (y != Y)
{
int cx = 0;
for (int x = 0; x < M.w; x++)
{
if (x != X)
{
A.cell[cx][cy] = M.cell[x][y];
cx++;
}
}
cy++;
}
}
det = getMatrixDeterminant(A);
}
return (pow(-1, X + Y) * det);
}
matrix invertMatrix(matrix M)
{
matrix A(M.w, M.h);
float det = getMatrixDeterminant(M);
if (det == 0)
{
std::cout << "ERROR! Matrix inversion impossible (determinant is equal to 0).\n";
return A;
}
for (int i = 0; i < M.h; i++)
{
for (int j = 0; j < M.w; j++)
{
A.cell[j][i] = getComplementOf(M, j, i) / det;
}
}
A = transponseMatrix(A);
return A;
}
While it does work, it is far too slow for my purposes: it manages to invert an 8x8 matrix only about 6 times per second.
I've tried searching for more efficient ways to invert a matrix but was unsuccessful in finding solutions for matrices of these dimensions.
However, I did find conversations in which people claimed that for matrices below 50x50, or even 1000x1000, time shouldn't be a problem, so I was wondering whether I have missed something: either a faster method or some unnecessary calculations in my code.
Does anyone have experience with this and/or advice?
Sorry for my broken English.
Your implementation has problems, as others commented on the question. The largest bottleneck is the algorithm itself, which calculates an enormous number of determinants (cofactor expansion is O(n!)).
If you want a simple implementation, just implement Gaussian elimination. See "finding the inverse of a matrix" and the pseudocode on Wikipedia. It will perform fast enough for small sizes such as 8x8.
If you want a more complex but more efficient implementation, use a library that is optimized for LU decomposition (Gaussian elimination), QR decomposition, etc., such as LAPACK or OpenCV.
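The Gaussian-elimination suggestion can be sketched roughly as below. This is a Gauss-Jordan variant with partial pivoting, not the asker's code: the `Matrix` alias, the function name `invertGaussJordan`, and the singularity tolerance `1e-12` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical row-major layout

// Gauss-Jordan elimination with partial pivoting: O(n^3) work instead of the
// O(n!) cofactor expansion. Writes the inverse to `inverse` and returns false
// if the matrix is (numerically) singular.
bool invertGaussJordan(const Matrix& input, Matrix& inverse) {
    const std::size_t n = input.size();
    for (const auto& row : input) assert(row.size() == n);  // must be n x n

    Matrix a = input;  // working copy that is reduced to the identity
    inverse.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) inverse[i][i] = 1.0;

    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: use the row with the largest entry in this column.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        if (std::fabs(a[pivot][col]) < 1e-12) return false;  // assumed tolerance
        std::swap(a[col], a[pivot]);
        std::swap(inverse[col], inverse[pivot]);

        // Scale the pivot row so that the pivot becomes 1.
        const double p = a[col][col];
        for (std::size_t j = 0; j < n; ++j) {
            a[col][j] /= p;
            inverse[col][j] /= p;
        }
        // Eliminate this column from every other row.
        for (std::size_t r = 0; r < n; ++r) {
            if (r == col) continue;
            const double f = a[r][col];
            if (f == 0.0) continue;
            for (std::size_t j = 0; j < n; ++j) {
                a[r][j] -= f * a[col][j];
                inverse[r][j] -= f * inverse[col][j];
            }
        }
    }
    return true;
}
```

For an 8x8 input this is a few hundred multiply-adds per column, so it should be far faster than the recursive determinant approach. Passing matrices by const reference rather than by value, as the question's functions do, also avoids repeated copying.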

FFTW Complex to Real Segmentation Fault

I am attempting to write a naive implementation of the Short-Time Fourier Transform using consecutive FFT frames in time, calculated using the FFTW library, but I am getting a Segmentation fault and cannot work out why.
My code is as below:
// load in audio
AudioFile<double> audioFile;
audioFile.load ("assets/example-audio/file_example_WAV_1MG.wav");
int N = audioFile.getNumSamplesPerChannel();
// make stereo audio mono
double fileDataMono[N];
if (audioFile.isStereo())
for (int i = 0; i < N; i++)
fileDataMono[i] = ( audioFile.samples[0][i] + audioFile.samples[1][i] ) / 2;
// setup stft
// (test transform, presently unoptimized)
int stepSize = 512;
int M = 2048; // fft size
int noOfFrames = (N-(M-stepSize))/stepSize;
// create Hamming window vector
double w[M];
for (int m = 0; m < M; m++) {
w[m] = 0.53836 - 0.46164 * cos( 2*M_PI*m / M );
}
double* input;
// (pads input array if necessary)
if ( (N-(M-stepSize))%stepSize != 0) {
noOfFrames += 1;
int amountOfZeroPadding = stepSize - (N-(M-stepSize))%stepSize;
double ipt[N + amountOfZeroPadding];
for (int i = 0; i < N; i++) // copy values from fileDataMono into input
ipt[i] = fileDataMono[i];
for (int i = 0; i < amountOfZeroPadding; i++)
ipt[N + i] = 0;
input = ipt;
} else {
input = fileDataMono;
}
// compute stft
fftw_complex* stft[noOfFrames];
double frames[noOfFrames][M];
fftw_plan fftPlan;
for (int i = 0; i < noOfFrames; i++) {
stft[i] = (fftw_complex*)fftw_malloc(sizeof(fftw_complex) * M);
for (int m = 0; m < M; m++)
frames[i][m] = input[i*stepSize + m] * w[m];
fftPlan = fftw_plan_dft_r2c_1d(M, frames[i], stft[i], FFTW_ESTIMATE);
fftw_execute(fftPlan);
}
// compute istft
double* outputFrames[noOfFrames];
double output[N];
for (int i = 0; i < noOfFrames; i++) {
outputFrames[i] = (double*)fftw_malloc(sizeof(double) * M);
fftPlan = fftw_plan_dft_c2r_1d(M, stft[i], outputFrames[i], FFTW_ESTIMATE);
fftw_execute(fftPlan);
for (int m = 0; i < M; m++) {
output[i*stepSize + m] += outputFrames[i][m];
}
}
fftw_destroy_plan(fftPlan);
for (int i = 0; i < noOfFrames; i++) {
fftw_free(stft[i]);
fftw_free(outputFrames[i]);
}
// output audio
AudioFile<double>::AudioBuffer outputBuffer;
outputBuffer.resize (1);
outputBuffer[0].resize(N);
outputBuffer[0].assign(output, output+N);
bool ok = audioFile.setAudioBuffer(outputBuffer);
audioFile.setAudioBufferSize (1, N);
audioFile.setBitDepth (16);
audioFile.setSampleRate (8000);
audioFile.save ("out/audioOutput.wav");
The segfault seems to be raised by the first fftw_malloc when computing the forward STFT.
Thanks in advance!
The relevant bit of code is:
double* input;
if ( (N-(M-stepSize))%stepSize != 0) {
    double ipt[N + amountOfZeroPadding];
    //...
    input = ipt;
}
//...
input[i*stepSize + m];
Your input pointer points at memory that exists only inside the if statement. The closing brace marks the end of the lifetime of the ipt array, so when you dereference the pointer later, you are accessing memory that no longer exists.
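One possible way around the dangling pointer is to let a std::vector in the enclosing scope own the padded samples. The sketch below assumes this approach. The function name `padToFrames` is hypothetical, `M` and `stepSize` mirror the question's variables, and the handling of signals shorter than one frame is an added assumption that the question does not cover.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a copy of `samples`, zero-padded so that frames of length M taken
// every stepSize samples end exactly at the end of the buffer. The vector owns
// its storage, so the data stays valid after any inner scope has closed.
std::vector<double> padToFrames(const std::vector<double>& samples,
                                std::size_t M, std::size_t stepSize) {
    assert(stepSize > 0 && M >= stepSize);
    std::vector<double> padded = samples;
    const std::size_t N = samples.size();
    if (N < M) {                 // assumption: pad short input to one full frame
        padded.resize(M, 0.0);
        return padded;
    }
    // Same arithmetic as the question: pad up to the next multiple of stepSize.
    const std::size_t remainder = (N - (M - stepSize)) % stepSize;
    if (remainder != 0)
        padded.resize(N + (stepSize - remainder), 0.0);
    return padded;
}
```

With this, something like `const double* input = padded.data();` stays valid for as long as `padded` itself is in scope. Heap-backed vectors also avoid placing very large variable-length arrays on the stack.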

gaussian smoothing output misaligned

I am trying to perform Gaussian smoothing on this image without using any OpenCV function (except for displaying the image).
However, the output I get after convolving the image with the Gaussian kernel is as follows:
The output image seems to be misaligned and looks very weird. Any idea what is happening?
Generate Gaussian kernel:
double gaussian(int x, int y,double sigma){
return (1/(2*M_PI*pow(sigma,2)))*exp(-1*(pow(x,2)+pow(y,2))/(2*pow(sigma,2)));
}
double generateFilter(vector<vector<double>> & kernel,int width,double sigma){
int value = 0;
double total =0;
if(width%2 == 1){
value = (width-1)/2;
}else{
value = width/2;
}
double smallest = gaussian(-1*value,-1*value,sigma);
for(int i = -1*value; i<=value; i++){
vector<double> temp;
for(int k = -1*value; k<=value; k++){
int gVal = round(gaussian(i,k,sigma)/smallest);
temp.push_back(gVal);
total += gVal;
}
kernel.push_back(temp);
}
cout<<total<<endl;
return total;
}
Convolution:
vector<vector<unsigned int>> convolution(vector<vector<unsigned int>> src, vector<vector<double>> kernel,double total){
int kCenterX = floor(kernel.size() / 2); //center of kernel
int kCenterY = kCenterX; //center of kernel
int kRows = kernel.size(); //height of kernel
int kCols = kRows; //width of kernel
int imgRows = src.size(); //height of input image
int imgCols = src[0].size(); //width of input image
vector<vector<unsigned int>> dst = vector<vector<unsigned int>> (imgRows, vector<unsigned int>(imgCols ,0));
for ( size_t row = 0; row < imgRows; row++ ) {
for ( size_t col = 0; col < imgCols; col++ ) {
float accumulation = 0;
float weightsum = 0;
for ( int i = -1*kCenterX; i <= 1*kCenterX; i++ ) {
for ( int j = -1*kCenterY; j <= 1*kCenterY; j++ ) {
int k = 0;
if((row+i)>=0 && (row+i)<imgRows && (col+j)>=0 && (col+j)<imgCols){
k = src[row+i][col+j];
weightsum += kernel[kCenterX+i][kCenterY+j];
}
accumulation += k * kernel[kCenterX +i][kCenterY+j];
}
}
dst[row][col] = round(accumulation/weightsum);
}
}
return dst;
}
Thank you.
The convolution function is basically correct, so the issue is with the input and output format.
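The claim that the convolution logic itself is sound can be checked without OpenCV. The sketch below follows the same scheme as the question: taps outside the image are skipped and the result is divided by the weight actually used. The aliases `Image` and `Kernel` and the function name `smooth` are hypothetical. It uses signed indices, so the bounds check does not rely on unsigned wrap-around as the `size_t` loop counters in the question do.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<unsigned int>>;  // hypothetical aliases
using Kernel = std::vector<std::vector<double>>;

// Border-renormalized smoothing: out-of-image taps are skipped, and the sum is
// divided by the total weight of the taps that were actually used.
Image smooth(const Image& src, const Kernel& kernel) {
    assert(!src.empty() && !kernel.empty() && kernel.size() % 2 == 1);
    const int half = static_cast<int>(kernel.size() / 2);
    const int rows = static_cast<int>(src.size());
    const int cols = static_cast<int>(src[0].size());
    Image dst(static_cast<std::size_t>(rows),
              std::vector<unsigned int>(static_cast<std::size_t>(cols), 0u));
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            double acc = 0.0;
            double weight = 0.0;
            for (int i = -half; i <= half; ++i) {
                for (int j = -half; j <= half; ++j) {
                    const int rr = r + i;
                    const int cc = c + j;
                    if (rr < 0 || rr >= rows || cc < 0 || cc >= cols) continue;
                    acc += src[rr][cc] * kernel[half + i][half + j];
                    weight += kernel[half + i][half + j];
                }
            }
            dst[r][c] = static_cast<unsigned int>(std::lround(acc / weight));
        }
    }
    return dst;
}
```

A constant image should come out unchanged, including at the borders. If that holds for the real kernel as well, a misaligned result points at how the pixels were loaded into or written out of the 2D vector, not at the convolution.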
Make sure you are reading the image as Grayscale (and not RGB):
cv::Mat I = cv::imread("img.png", cv::IMREAD_GRAYSCALE);
You are passing a vector<vector<unsigned int>> argument to convolution.
I can't say whether that is part of the problem, but it is recommended to pass an argument of type cv::Mat (and return cv::Mat):
cv::Mat convolution(cv::Mat src, vector<vector<double>> kernel, double total)
I assume you can convert the input to and from vector<vector<unsigned int>>, but it's not necessary.
Here is a working code sample:
#include <cmath>
#include <vector>
#include <iostream>
#include "opencv2/opencv.hpp"
#include "opencv2/highgui.hpp"
using namespace std;
double gaussian(int x, int y, double sigma) {
    return (1 / (2 * 3.141592653589793*pow(sigma, 2)))*exp(-1 * (pow(x, 2) + pow(y, 2)) / (2 * pow(sigma, 2)));
}
double generateFilter(vector<vector<double>> & kernel, int width, double sigma)
{
    int value = 0;
    double total = 0;
    if (width % 2 == 1) {
        value = (width - 1) / 2;
    }
    else {
        value = width / 2;
    }
    double smallest = gaussian(-1 * value, -1 * value, sigma);
    for (int i = -1 * value; i <= value; i++) {
        vector<double> temp;
        for (int k = -1 * value; k <= value; k++) {
            int gVal = round(gaussian(i, k, sigma) / smallest);
            temp.push_back(gVal);
            total += gVal;
        }
        kernel.push_back(temp);
    }
    cout << total << endl;
    return total;
}
//vector<vector<unsigned int>> convolution(vector<vector<unsigned int>> src, vector<vector<double>> kernel, double total) {
cv::Mat convolution(cv::Mat src, vector<vector<double>> kernel, double total) {
    int kCenterX = floor(kernel.size() / 2); //center of kernel
    int kCenterY = kCenterX; //center of kernel
    int kRows = kernel.size(); //height of kernel
    int kCols = kRows; //width of kernel
    int imgRows = src.rows;//src.size(); //height of input image
    int imgCols = src.cols;//src[0].size(); //width of input image
    //vector<vector<unsigned int>> dst = vector<vector<unsigned int>> (imgRows, vector<unsigned int>(imgCols ,0));
    cv::Mat dst = cv::Mat::zeros(src.size(), CV_8UC1); //Create destination matrix, and fill with zeros (dst is a grayscale image with one byte per pixel).
    for (size_t row = 0; row < imgRows; row++) {
        for (size_t col = 0; col < imgCols; col++) {
            double accumulation = 0;
            double weightsum = 0;
            for (int i = -1 * kCenterX; i <= 1 * kCenterX; i++) {
                for (int j = -1 * kCenterY; j <= 1 * kCenterY; j++) {
                    int k = 0;
                    if ((row + i) >= 0 && (row + i) < imgRows && (col + j) >= 0 && (col + j) < imgCols) {
                        //k = src[row+i][col+j];
                        k = (int)src.at<uchar>(row + i, col + j); //Read pixel from row [row + i] and column [col + j]
                        weightsum += kernel[kCenterX + i][kCenterY + j];
                    }
                    accumulation += (double)k * kernel[kCenterX + i][kCenterY + j];
                }
            }
            //dst[row][col] = round(accumulation/weightsum);
            dst.at<uchar>(row, col) = (uchar)round(accumulation / weightsum); //Write pixel to row [row] and column [col]
            //dst.at<uchar>(row, col) = src.at<uchar>(row, col);
        }
    }
    return dst;
}
int main()
{
    vector<vector<double>> kernel;
    double total = generateFilter(kernel, 11, 3.0);
    //Read input image as grayscale (one byte per pixel).
    cv::Mat I = cv::imread("img.png", cv::IMREAD_GRAYSCALE);
    cv::Mat J = convolution(I, kernel, total);
    //Display input and output
    cv::imshow("I", I);
    cv::imshow("J", J);
    cv::waitKey(0);
    cv::destroyAllWindows();
    return 0;
}
Result:

What to do with negative rho values in hough transform?

Here is my code for creating the Hough accumulator for lines in an image:
void hough_lines_acc(cv::Mat img_a_edges, std::vector<std::vector<int> > &hough_acc) {
for (size_t r = 0; r < img_a_edges.rows; r++) {
for (size_t c = 0; c < img_a_edges.cols; c++) {
int theta = static_cast<int> (std::atan2(r, c) * 180 / M_PI);
int rho = static_cast<int> ((c * cos(theta)) + (r * sin(theta)));
if (theta < -90) theta = -90;
if (theta > 89) theta = 89;
++hough_acc[abs(rho)][theta];
}
}
cv::Mat img_mat(hough_acc.size(), hough_acc[0].size(), CV_8U);
std::cout << hough_acc.size() << " " << hough_acc[0].size() << std::endl;
for (size_t i = 0; i < hough_acc.size(); i++) {
for (size_t j = 0; j < hough_acc[0].size(); j++) {
img_mat.at<int> (i,j) = hough_acc[i][j];
}
}
imwrite("../output/ps1-2-b-1.png", img_mat);
}
theta varies from -90 to 89. I am getting negative rho values. Right now I am just replacing the negative rho with a positive one, but I am not getting a correct answer. What should I do with the negative rho? Please explain the answer.
theta = arctan (y / x)
rho = x * cos(theta) + y * sin(theta)
Edited code:
bool hough_lines_acc(cv::Mat img_a_edges, std::vector<std::vector<int> > &hough_acc,\
std::vector<double> thetas, std::vector<double> rhos, int rho_resolution, int theta_resolution) {
int img_w = img_a_edges.cols;
int img_h = img_a_edges.rows;
int max_votes = 0;
int min_votes = INT_MAX;
for (size_t r = 0; r < img_h; r++) {
for (size_t c = 0; c < img_w; c++) {
if(img_a_edges.at<int>(r, c) == 255) {
for (size_t i = 0; i < thetas.size(); i++) {
thetas[i] = (thetas[i] * M_PI / 180);
double rho = ( (c * cos(thetas[i])) + (r * sin(thetas[i])) );
int buff = ++hough_acc[static_cast<int>(abs(rho))][static_cast<int>(i)];
if (buff > max_votes) {
max_votes = buff;
}
if (buff < min_votes) {
min_votes = buff;
}
}
}
}
}
double div = static_cast<double>(max_votes) / 255;
int threshold = 10;
int possible_edge = round(static_cast<double>(max_votes) / div) - threshold;
props({
{"max votes", max_votes},
{"min votes", min_votes},
{"scale", div}
});
// needed for scaling intensity for contrast
// not sure if I am doing it correctly
for (size_t r = 0; r < hough_acc.size(); r++) {
for (size_t c = 0; c < hough_acc[0].size(); c++) {
double val = hough_acc[r][c] / div;
if (val < 0) {
val = 0;
}
hough_acc[r][c] = static_cast<int>(val);
}
}
cv::Mat img_mat = cv::Mat(hough_acc.size(), hough_acc[0].size(), CV_8UC1, cv::Scalar(0));
for (size_t i = 0; i < hough_acc.size(); i++) {
for (size_t j = 0; j < hough_acc[0].size(); j++) {
img_mat.at<uint8_t> (i,j) = static_cast<uint8_t>(hough_acc[i][j]);
}
}
imwrite("../output/ps1-2-b-1.png", img_mat);
return true;
}
The output is still not correct. What is the error here?
atan2 of two non-negative numbers should not give you negative angles; it should only give values in the range 0 to 90 degrees.
Also, for the Hough transform, I think you want everything relative to one point (i.e. (0,0) in this case). For that, I think you would actually want theta = 90 - atan2(r, c).
Admittedly, though, I am a bit confused, as I thought you had to encode the line direction rather than just the edge point. That is, I thought that at each edge point you had to take a discrete array of candidate line directions, calculate rho and theta for each one, and add all of those to your accumulator. As it stands, I am not sure what you are calculating.
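The voting scheme described in the last paragraph, together with one common way to handle negative rho, can be sketched as follows. With theta limited to a 180-degree range, rho and -rho describe different lines, so taking abs(rho) merges distinct lines into one bin. Instead, the sketch keeps the sign and shifts the row index by the largest possible |rho| (the image diagonal). The `EdgePoint` struct, the function name `houghAccumulate`, and the 1-degree and 1-pixel resolutions are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct EdgePoint { int x; int y; };  // hypothetical: column and row of an edge pixel

// Votes in a (rho, theta) accumulator with theta in [-90, 89] degrees.
// rho may be negative, so the row index is round(rho) + maxRho, which maps
// [-maxRho, +maxRho] onto [0, 2 * maxRho].
std::vector<std::vector<int>> houghAccumulate(const std::vector<EdgePoint>& edges,
                                              int width, int height) {
    assert(width > 0 && height > 0);
    const double kPi = std::acos(-1.0);
    const int maxRho = static_cast<int>(std::ceil(
        std::hypot(static_cast<double>(width), static_cast<double>(height))));
    const int numThetas = 180;  // column t corresponds to theta = t - 90 degrees
    std::vector<std::vector<int>> acc(static_cast<std::size_t>(2 * maxRho + 1),
                                      std::vector<int>(numThetas, 0));
    for (const EdgePoint& p : edges) {
        // Every edge point votes once for each candidate direction.
        for (int t = 0; t < numThetas; ++t) {
            const double theta = (t - 90) * kPi / 180.0;  // degrees to radians
            const double rho = p.x * std::cos(theta) + p.y * std::sin(theta);
            const int row = static_cast<int>(std::lround(rho)) + maxRho;
            ++acc[row][t];
        }
    }
    return acc;
}
```

To read a peak back, subtract the offsets again: `rho = row - maxRho` and `theta = t - 90` degrees. Note also that cos and sin take radians, and that the edited code in the question converts `thetas[i]` to radians in place on every pass through the pixel loop, so the values are rescaled repeatedly.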

Quad precision values storing to file

I am trying to run an oscillator and store its Fourier spectrum values with high precision, using quadmath in C++. I can compute the high-precision values, but I cannot save them to a file as quad-precision values.
It gives me the following error:
error: cannot convert 'std::complex<__complex__ __float128>' to '__complex128 {aka __complex__ __float128}' for argument '1' to '__float128 cabsq(__complex128)'
My code is:
//Fourier transform
int size_dft=size_org;
int size_dfty=2e5;
int increment=0;
int initial_size_dft=0;
double pi2 = -2.0 * M_PI;
double angleTerm,cosineA,sineA;
int N_dft= 1e3;
double y_dft_deeper=0;
double invs = 1.0 / N_dft;
std::vector< std::complex< __complex128 > > output_seq(size_dft);
//std::complex<double> output_seq[size_dft];
for( int y = initial_size_dft;y < size_dfty;y++)
{
output_seq[y] = 0;
y_dft_deeper =2.4316321+(0.0000001*y);
if(y_dft_deeper<2.4318321)
{
int first_1 = 0;
increment = first_1;
for(unsigned int x =5786;x <end_dft;x++)
{
angleTerm = pi2 * y_dft_deeper * x * invs;
cosineA = cosq(angleTerm);
sineA = sinq(angleTerm);
std::real(output_seq[y]) += V2[x] * cosineA ;
std::imag(output_seq[y]) += V2[x] * sineA;
}
output_seq[y] *= invs;
cout<<"iteration = "<<y;//<<" DFT = "<< output_seq[y]<<"\n";
y=y+increment;
}
//Writing data to file
ofstream myfile_dft;
myfile_dft.open ("aug_colpits_deep_1e8_first20000_quadmath.txt");
for (int i = initial_size_dft; i < size_dfty; i++)
{
if (i<2000)
{
increment=0;
myfile_dft << cabsq(output_seq[i]) <<"\n";
i=i+increment;
}
}
myfile_dft.close();