FFT of an image - c++

I have an assignment about fftw and I was trying to write a small program to create an fft of an image. I am using CImg to read and write images. But all I get is a dark image with a single white dot :(
I'm most likely doing this the wrong way and I would appreciate if someone could explain how this should be done. I don't need the code, I just need to know what is the right way to do this.
Here is my code:
CImg<double> input("test3.bmp");
CImg<double> image_fft(input, false);
unsigned int nx = input.dimx(), ny = input.dimy();
size_t align = sizeof(Complex);
array2<Complex> in (nx, ny, align);
fft2d Forward(-1, in);
for (int i = 0; i < input.dimx(); ++i) {
for (int j = 0; j < input.dimy(); ++j) {
in(i,j) = input(i,j);
}
}
Forward.fft(in);
for (int i = 0; i < input.dimx(); ++i) {
for (int j = 0; j < input.dimy(); ++j) {
image_fft(i,j,0) = image_fft(i,j,1) = image_fft(i,j,2) = std::abs(in(i,j));
}
}
image_fft.normalize(0, 255);
image_fft.save("test.bmp");

You need to take the log of the magnitude. The single white dot is the base value (0 Hz, DC, whatever you want to call it), and it will almost always be by far the largest component of any image you transform: since pixel values cannot be negative, the DC value is always positive and large.
What you need to do is calculate the log (ln, log10, any logarithm will do) of the magnitude at each point, after converting from complex to magnitude/phase form and before you normalize.
Please note that the other values are there, they are just really small compared to the DC value. Taking the log compresses the dynamic range, so the gap between the DC value and everything else shrinks and the other frequencies become visible.
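As a rough illustration of the log scaling described above, here is a minimal sketch in plain C++. The helper name logMagnitudeToBytes is hypothetical and is not part of CImg or fftw++; it assumes the spectrum is already available as a flat vector of std::complex<double>, and it uses log(1 + |X|) so that a zero magnitude maps to 0 rather than minus infinity.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a CImg or fftw++ function): map complex FFT
// coefficients to 8-bit grey levels using log(1 + |X|), then stretch to 0..255.
std::vector<std::uint8_t> logMagnitudeToBytes(const std::vector<std::complex<double>>& spectrum)
{
    std::vector<double> logMag(spectrum.size());
    for (std::size_t k = 0; k < spectrum.size(); ++k) {
        // log1p(x) == log(1 + x), so a zero magnitude stays at 0.
        logMag[k] = std::log1p(std::abs(spectrum[k]));
    }

    std::vector<std::uint8_t> pixels(spectrum.size(), 0);
    if (logMag.empty()) return pixels;

    const auto mm = std::minmax_element(logMag.begin(), logMag.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    if (hi == lo) return pixels;  // flat spectrum: nothing to stretch

    for (std::size_t k = 0; k < logMag.size(); ++k) {
        const double scaled = 255.0 * (logMag[k] - lo) / (hi - lo);
        pixels[k] = static_cast<std::uint8_t>(std::lround(scaled));
    }
    return pixels;
}
```

With a linear stretch, a coefficient of magnitude 10 next to a DC value of 1e6 would round to grey level 0; with the log stretch it lands well above zero, which is why the rest of the spectrum becomes visible.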

OpenCV: obtain certain pixel RGB values based on a mask

My title may not be clear enough, so please look carefully at the following description. Thanks in advance.
I have a RGB image and a binary mask image:
Mat img = imread("test.jpg");
Mat mask = Mat::zeros(img.rows, img.cols, CV_8U);
Set some elements of the mask to one, and assume the number of ones is N. Now that the nonzero coordinates are known, we can obtain the corresponding pixel RGB values of the original image. I know this can be accomplished by the following code:
Mat colors = Mat::zeros(N, 3, CV_8U);
int counter = 0;
for (int i = 0; i < mask.rows; i++)
{
for (int j = 0; j < mask.cols; j++)
{
if (mask.at<uchar>(i, j) == 1)
{
colors.at<uchar>(counter, 0) = img.at<Vec3b>(i, j)[0];
colors.at<uchar>(counter, 1) = img.at<Vec3b>(i, j)[1];
colors.at<uchar>(counter, 2) = img.at<Vec3b>(i, j)[2];
counter++;
}
}
}
And the coords will be as follows (image not reproduced here):
However, this two-layer for loop costs too much time. I was wondering if there is a faster method to obtain colors; I hope you can understand what I am trying to convey.
PS: If I can use Python, this can be done in a single statement:
colors = img[mask == 1]
The .at() method is the slowest way to access Mat values in C++. Fastest is to use pointers, but best practice is an iterator. See the OpenCV tutorial on scanning images.
Just a note, even though Python's syntax is nice for something like this, it still has to loop through all of the elements at the end of the day---and since it has some overhead before this, it's de-facto slower than C++ loops with pointers. You necessarily need to loop through all the elements regardless of your library, you're doing comparisons with the mask for every element.
If you are flexible with using any other open source library using C++, try Armadillo. You can do all linear algebra operations with it and also, you can reduce above code to one line(similar to your Python code snippet).
Or
Try the findNonZero() function to find all coordinates in the image that contain non-zero values. Check this: https://stackoverflow.com/a/19244484/7514664
Compile with optimization enabled, try profiling this version and tell us if it is faster:
vector<Vec3b> colors;
if (img.isContinuous() && mask.isContinuous()) {
auto pimg = img.ptr<Vec3b>();
for (auto pmask = mask.datastart; pmask < mask.dataend; ++pmask, ++pimg) {
if (*pmask)
colors.emplace_back(*pimg);
}
}
else {
for (int r = 0; r < img.rows; ++r) {
auto prowimg = img.ptr<Vec3b>(r);
auto prowmask = mask.ptr(r);
for (int c = 0; c < img.cols; ++c) {
if (prowmask[c])
colors.emplace_back(prowimg[c]);
}
}
}
If you know the size of colors, reserve the space for it beforehand.
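A minimal sketch of the reserve idea, written against plain standard containers rather than cv::Mat so that it stands alone. Pixel is a stand-in for cv::Vec3b and gatherMasked is a hypothetical name; both are assumptions for illustration, and the extra counting pass only pays off if reallocation turns out to be a measurable cost in your profile.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Pixel = std::array<std::uint8_t, 3>;  // stand-in for cv::Vec3b (assumption)

// Hypothetical helper: collect the pixels whose mask entry is non-zero.
// Both buffers are assumed to be contiguous and in the same element order.
std::vector<Pixel> gatherMasked(const std::vector<Pixel>& img,
                                const std::vector<std::uint8_t>& mask)
{
    const std::size_t n = std::min(img.size(), mask.size());

    // First pass: count selected pixels so the output is allocated exactly once.
    const auto selected = std::count_if(
        mask.begin(), mask.begin() + static_cast<std::ptrdiff_t>(n),
        [](std::uint8_t m) { return m != 0; });

    std::vector<Pixel> colors;
    colors.reserve(static_cast<std::size_t>(selected));

    // Second pass: copy the selected pixels in order.
    for (std::size_t i = 0; i < n; ++i) {
        if (mask[i]) colors.push_back(img[i]);
    }
    return colors;
}
```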

OpenCV not recognizing Mat size

I'm trying to print an image using OpenCV defining a 400x400 Mat:
plot2 = cv::Mat(400,400, CV_8U, 255);
But when I try print the points, something strange happens. The y coordinate only prints to the first 100 values. That is, if I print the point (50,100), it does not print it in the 100/400th part of the columns, but at the end. Somehow, 400 columns have turned into 100.
For example, when running this:
for (int j = 0; j < 95; ++j){
plot2.at<int>(20, j) = 0;
}
cv::imshow("segunda pared", plot2);
Shows this (the underlined part is the part corresponding to the code above):
A line that goes to 95 almost occupies all of the 400 points when it should only occupy 95/400th of the screen.
What am I doing wrong?
When you defined your cv::Mat, you clearly stated that it is of type CV_8U:
plot2 = cv::Mat(400,400, CV_8U, 255);
But when you access it, you say that its element type is int, which is usually signed 32-bit, not unsigned 8-bit. Each at<int> write therefore covers four of the 8-bit pixels, so 95 writes span 380 of the 400 columns. So the solution is:
for (int j = 0; j < 95; ++j){
plot2.at<uchar>(20, j) = 0;
}
Important note: OpenCV's element types are defined in terms of the standard C++ types (uchar, short, int, ...) rather than the fixed-width ones, so there is no need to use fixed-size types like uint16_t. When you compile OpenCV and your code on another platform, both change together.
BTW, one good way to iterate through your cv::Mat is:
for (int row = 0; row < my_mat.rows; ++row){
auto row_ptr=my_mat.ptr<uchar>(row);
for(int col=0;col<my_mat.cols;++col){
//do whatever you want with row_ptr[col] (read/write)
}
}
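To see why the at<int> call in the question stretched the line, the following sketch imitates the address arithmetic on a plain byte buffer, without OpenCV. It assumes that at<T>(row, j) resolves to the row start plus j * sizeof(T), and the helper name bytesClearedInRow is made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: write `count` zero elements of type T at the start of an
// 8-bit row, using the offset j * sizeof(T), then report how many bytes are zero.
template <typename T>
std::size_t bytesClearedInRow(std::vector<std::uint8_t>& row, std::size_t count)
{
    const T zero = 0;
    for (std::size_t j = 0; j < count; ++j) {
        const std::size_t offset = j * sizeof(T);
        if (offset + sizeof(T) > row.size()) break;  // stay inside the row
        std::memcpy(row.data() + offset, &zero, sizeof(T));
    }

    std::size_t cleared = 0;
    for (std::uint8_t v : row) {
        if (v == 0) ++cleared;
    }
    return cleared;
}
```

On a 400-byte row filled with 255, 95 writes of a 4-byte integer clear 380 bytes, while 95 writes of a single byte clear 95, which matches the line lengths seen before and after the fix.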

Weird but close fft and ifft of image in c++

I wrote a program that loads, saves, and performs the fft and ifft on black and white png images. After much debugging headache, I finally got some coherent output only to find that it distorted the original image.
(The input, fft, and ifft images are not reproduced here.)
As far as I have tested, the pixel data in each array is stored and converted correctly. Pixels are stored in two arrays, 'data' which contains the b/w value of each pixel and 'complex_data' which is twice as long as 'data' and stores real b/w value and imaginary parts of each pixel in alternating indices. My fft algorithm operates on an array structured like 'complex_data'. After code to read commands from the user, here's the code in question:
if (cmd == "fft")
{
if (height > width) size = height;
else size = width;
N = (int)pow(2.0, ceil(log((double)size)/log(2.0)));
temp_data = (double*) malloc(sizeof(double) * width * 2); //array to hold each row of the image for processing in FFT()
for (i = 0; i < (int) height; i++)
{
for (j = 0; j < (int) width; j++)
{
temp_data[j*2] = complex_data[(i*width*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*width*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) width; j++)
{
complex_data[(i*width*2)+(j*2)] = temp_data[j*2];
complex_data[(i*width*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, width, height); //tested
free(temp_data);
temp_data = (double*) malloc(sizeof(double) * height * 2);
for (i = 0; i < (int) width; i++)
{
for (j = 0; j < (int) height; j++)
{
temp_data[j*2] = complex_data[(i*height*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*height*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) height; j++)
{
complex_data[(i*height*2)+(j*2)] = temp_data[j*2];
complex_data[(i*height*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, height, width);
free(temp_data);
free(data);
data = complex_to_real(complex_data, image.size()/4); //tested
image = bw_data_to_vector(data, image.size()/4); //tested
cout << "*** fft success ***" << endl << endl;
void FFT(double* data, unsigned long nn, int f_or_b){ // f_or_b is 1 for fft, -1 for ifft
unsigned long n, mmax, m, j, istep, i;
double wtemp, w_real, wp_real, wp_imaginary, w_imaginary, theta;
double temp_real, temp_imaginary;
// reverse-binary reindexing to separate even and odd indices
// and to allow us to compute the FFT in place
n = nn<<1;
j = 1;
for (i = 1; i < n; i += 2) {
if (j > i) {
swap(data[j-1], data[i-1]);
swap(data[j], data[i]);
}
m = nn;
while (m >= 2 && j > m) {
j -= m;
m >>= 1;
}
j += m;
};
// here begins the Danielson-Lanczos section
mmax = 2;
while (n > mmax) {
istep = mmax<<1;
theta = f_or_b * (2 * M_PI/mmax);
wtemp = sin(0.5 * theta);
wp_real = -2.0 * wtemp * wtemp;
wp_imaginary = sin(theta);
w_real = 1.0;
w_imaginary = 0.0;
for (m = 1; m < mmax; m += 2) {
for (i = m; i <= n; i += istep) {
j = i + mmax;
temp_real = w_real * data[j-1] - w_imaginary * data[j];
temp_imaginary = w_real * data[j] + w_imaginary * data[j-1];
data[j-1] = data[i-1] - temp_real;
data[j] = data[i] - temp_imaginary;
data[i-1] += temp_real;
data[i] += temp_imaginary;
}
wtemp = w_real;
w_real += w_real * wp_real - w_imaginary * wp_imaginary;
w_imaginary += w_imaginary * wp_real + wtemp * wp_imaginary;
}
mmax=istep;
}}
My ifft is the same only with the f_or_b set to -1 instead of 1. My program calls FFT() on each row, transposes the image, calls FFT() on each row again, then transposes back. Is there maybe an error with my indexing?
Not an actual answer, as this question is debug-only, so here are some hints instead:
your results are really bad
it should look like this:
the first line is the actual DFFT result
Re, Im, and Power are amplified by a constant, otherwise you would see a black image
the last image is the IDFFT of the original, non-amplified Re, Im result
the second line is the same, but the DFFT result is wrapped by half the image size in both x and y to match the common results in most DIP/CV texts
As you can see, if you IDFFT the wrapped results back, the result is not correct (checkerboard mask)
You have just a single image as the DFFT result
is it the power spectrum?
or did you forget to include the imaginary part? For viewing only, or perhaps somewhere in the computation as well?
is your 1D DFFT working?
for real data the result should be symmetric
check the links from my comment and compare the results for some sample 1D array
debug/repair your 1D FFT first and only then move to the next level
do not forget to test real and complex data ...
your IDFFT looks BW (no gray) saturated
so did you amplify the DFFT results to see the image and then use that for the IDFFT instead of the original DFFT result?
also check whether you round to integers somewhere along the computation
beware of (I)DFFT overflows/underflows
If your image pixel intensities are big and the image resolution is too, your computation could lose precision. I never saw this in images, but if your image is HDR then it is possible. This is a common problem with convolution computed by DFFT for big polynomials.
Thank you everyone for your opinions. All that stuff about memory corruption, while it makes a point, is not the root of the problem. The sizes of the data I'm mallocing are not overly large, and I am freeing them in the right places. I had a lot of practice with this while learning C. The problem was not the fft algorithm either, nor even my 2D implementation of it.
All I missed was the scaling by 1/(M*N) at the very end of my ifft code. Because the image is 512x512, I needed to scale my ifft output by 1/(512*512). Also, my fft looks like white noise because the pixel data was not rescaled to fit between 0 and 255.
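The missing scale factor can be seen on a tiny 1D example. The sketch below uses a naive O(N^2) DFT rather than the FFT() routine from the question, and the function names and the sign convention (-1 forward, +1 inverse) are assumptions chosen for illustration. For a 2D transform of an M x N image the same idea applies with a factor of 1/(M*N).

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT. sign = -1 is used as the forward transform here,
// sign = +1 as the unscaled inverse (this convention is an assumption).
std::vector<Complex> dft(const std::vector<Complex>& x, int sign)
{
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Inverse transform with the 1/N factor applied at the very end.
std::vector<Complex> inverseDftScaled(const std::vector<Complex>& X)
{
    std::vector<Complex> x = dft(X, +1);
    for (Complex& v : x) {
        v /= static_cast<double>(x.size());
    }
    return x;
}
```

Without the division, a forward transform followed by an inverse returns every sample multiplied by N, which for 8-bit image data saturates almost everything to white.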
Suggest you look at the article http://www.yolinux.com/TUTORIALS/C++MemoryCorruptionAndMemoryLeaks.html
Christophe has a good point but he is wrong about it not being related to the problem because it seems that in modern times using malloc instead of new()/free() does not initialise memory or select best data type which would result in all problems listed below:-
Possible causes are:
Sign of a number changing somewhere, I have seen similar issues when a platform invoke has been used on a dll and a value is passed by value instead of reference. It is caused by memory not necessarily being empty so when your image data enters it will have boolean maths performed on its values. I would suggest that you make sure memory is empty before you put your image data there.
Memory rotating right (ROR in assembly language) or left (ROL). This will occur if data types are being used which do not necessarily match, e.g. a signed value entering an unsigned data type, or if the number of bits in one variable differs from that in another.
Data being lost due to an unsigned value entering a signed variable. Outcomes are 1 bit being lost because it will be used to determine negative or positive, or at extremes if twos complement takes place the number will become inverted in meaning, look for twos complement on wikipedia.
Also see how memory should be cleared/assigned before use. http://www.cprogramming.com/tutorial/memory_debugging_parallel_inspector.html

Add 1 to vector<unsigned char> value - Histogram in C++

I guess it's such an easy question (I'm coming from Java), but I can't figure out how it works.
I simply want to increment a vector element by one. The reason for this is that I want to compute a histogram out of image values. But whatever I try, I can only manage to assign a value to the vector, not to increment it by one!
This is my histogram function:
void histogram(unsigned char** image, int height,
int width, vector<unsigned char>& histogramArray) {
for (int i = 0; i < width; i++) {
for (int j = 0; j < height; j++) {
// histogramArray[1] = (int)histogramArray[1] + (int)1;
// add histogram position by one if greylevel occured
histogramArray[(int)image[i][j]]++;
}
}
// display output
for (int i = 0; i < 256; i++) {
cout << "Position: " << i << endl;
cout << "Histogram Value: " << (int)histogramArray[i] << endl;
}
}
But whatever I try to add one to the histogramArray position, it leads to just 0 in the output. I'm only allowed to assign concrete values like:
histogramArray[1] = 2;
Is there any simple and easy way? I thought iterators would hopefully not be necessary at this point, because I know the exact index position where I want to increment something.
EDIT:
I'm so sorry, I should have been more precise with my question, thank you for your help so far! The code above is working, but it yields a different mean value from the histogram (a difference of around 90) than it should. Also, the histogram values are very different from those in a graphics program, even though the image values are exactly the same! That's why I investigated the function and found out that if I set the histogram to zeros and then just try to increase one element, nothing happens! This is the commented code above:
for (int i = 0; i < width; i++) {
for (int j = 0; j < height; j++) {
histogramArray[1]++;
// add histogram position by one if greylevel occured
// histogramArray[(int)image[i][j]]++;
}
}
So the position 1 remains 0, instead of having the value height*width. Because of this, I think the correct calculation histogramArray[image[i][j]]++; is also not working properly.
Do you have any explanation for this? This was my main question, I'm sorry.
Just for completeness, this is my mean function for the histogram:
unsigned char meanHistogram(vector<unsigned char>& histogram) {
int allOccurences = 0;
int allValues = 0;
for (int i = 0; i < 256; i++) {
allOccurences += histogram[i] * i;
allValues += histogram[i];
}
return (allOccurences / (float) allValues) + 0.5f;
}
And I initialize the image like this:
unsigned char** image= new unsigned char*[width];
for (int i = 0; i < width; i++) {
image[i] = new unsigned char[height];
}
But there shouldn't be any problem with the initialization code, since all other computations work perfectly and I am able to manipulate and save the original image. But it's true that I should swap width and height; since I had only square images, it didn't matter so far.
The Histogram is created like this and then the function is called like that:
vector<unsigned char> histogramArray(256);
histogram(array, adaptedHeight, adaptedWidth, histogramArray);
So do you have any clue why this part, histogramArray[1]++;, doesn't increase my histogram? histogramArray[1] remains 0 all the time! histogramArray[1] = 2; works perfectly. Also, histogramArray[(int)image[i][j]]++; seems to calculate something, but as I said, I think it's calculating wrongly.
I appreciate any help very much! The reason why I used a 2D Array is simply because it is asked for. I like the 1D version also much more, because it's way simpler!
You see, the current problem in your code is not incrementing a value versus assigning to it; it's the way you index your image. The way you've written your histogram function and the image access part puts very fine restrictions on how you need to allocate your images for this code to work.
For example, assuming your histogram function is as you've written it above, none of these image allocation strategies will work: (I've used char instead of unsigned char for brevity.)
char image [width * height]; // Obvious; "char[]" != "char **"
char * image = new char [width * height]; // "char*" != "char **"
char image [height][width]; // Most surprisingly, this won't work either.
The reason why the third case won't work is tough to explain simply. Suffice it to say that a 2D array like this will not implicitly decay into a pointer to pointer, and if it did, it would be meaningless. Contrary to what you might read in some books or hear from some people, in C/C++, arrays and pointers are not the same thing!
Anyway, for your histogram function to work correctly, you have to allocate your image like this:
char** image = new char* [height];
for (int i = 0; i < height; ++i)
image[i] = new char [width];
Now you can fill the image, for example:
for (int i = 0; i < height; ++i)
for (int j = 0; j < width; ++j)
image[i][j] = rand() % 256; // Or whatever...
On an image allocated like this, you can call your histogram function and it will work. After you're done with this image, you have to free it like this:
for (int i = 0; i < height; ++i)
delete[] image[i];
delete[] image;
For now, that's enough about allocation. I'll come back to it later.
In addition to the above, it is vital to note the order of iteration over your image. The way you've written it, you iterate over your columns on the outside, and your inner loop walks over the rows. Most (all?) image file formats and many (most?) image processing applications I've seen do it the other way around. The memory allocations I've shown above also assume that the first index is for the row, and the second is for the column. I suggest you do this too, unless you've very good reasons not to.
No matter which layout you choose for your images (the recommended row-major, or your current column-major), it is an issue that you should always keep in mind and take notice of.
Now, on to my recommended way of allocating and accessing images and calculating histograms.
I suggest that you allocate and free images like this:
// Allocate:
char * image = new char [height * width];
// Free:
delete[] image;
That's it; no nasty (de)allocation loops, and every image is one contiguous block of memory. When you want to access row i and column j (note which is which) you do it like this:
image[i * width + j] = 42;
char x = image[i * width + j];
And you'd calculate the histogram like this:
void histogram (
unsigned char * image, int height, int width,
// Note that the elements here are pixel-counts, not colors!
vector<unsigned> & histogram
) {
// Make sure histogram has enough room; you can do this outside as well.
if (histogram.size() < 256)
histogram.resize (256, 0);
int pixels = height * width;
for (int i = 0; i < pixels; ++i)
histogram[image[i]]++;
}
I've eliminated the printing code, which should not be there anyway. Note that I've used a single loop to go through the whole image; this is another advantage of allocating a 1D array. Also, for this particular function, it doesn't matter whether your images are row-major or column major, since it doesn't matter in what order we go through the pixels; it only matters that we go through all the pixels and nothing more.
UPDATE: After the question update, I think all of the above discussion is moot! I believe the problem could be in the declaration of the histogram vector. It should be a vector of unsigned ints, not single bytes. Your problem seems to be that the vector elements stay at zero when you simplify the code and increment just one element, and are off from the values they should have when you run the actual code. Well, this could be a symptom of numeric wrap-around. If the number of pixels in your image is a multiple of 256 (e.g. a 32x32 or 1024x1024 image), then it is natural that a counter incremented once per pixel ends up at 0 mod 256.
I've already alluded to this point in my original answer. If you read my implementation of the histogram function, you see in the signature that I've declared my vector as vector<unsigned> and have put a comment above it that says this vector counts pixels, so its data type should be suitable.
I guess I should have made it bolder and clearer! I hope this solves your problem.
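The wrap-around is easy to reproduce in isolation. The sketch below is only an illustration with a made-up helper name, countLevels; it runs the same counting loop with the bin type as a template parameter so that a one-byte counter and an unsigned counter can be compared on the same data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count grey levels into 256 bins of type Counter.
// With Counter = unsigned char every bin silently wraps modulo 256.
template <typename Counter>
std::vector<Counter> countLevels(const std::vector<unsigned char>& pixels)
{
    std::vector<Counter> bins(256, 0);
    for (unsigned char p : pixels) {
        ++bins[p];
    }
    return bins;
}
```

For a 32x32 image in which every pixel has grey level 1, countLevels<unsigned char> leaves bin 1 at 0 (1024 mod 256), whereas countLevels<unsigned> reports 1024.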

generating correct spectrogram using fftw and window function

For a project I need to be able to generate a spectrogram from a .WAV file. I've read the following should be done:
Get N (transform size) samples
Apply a window function
Do a Fast Fourier Transform using the samples
Normalise the output
Generate spectrogram
On the image below you see two spectrograms of a 10000 Hz sine wave, both using the Hanning window function. On the left you see a spectrogram generated by Audacity and on the right my version. As you can see, my version has a lot more lines/noise. Is this leakage into different bins? How would I get a clear image like the one Audacity generates? Should I do some post-processing? I have not yet done any normalisation because I do not fully understand how to do so.
update
I found this tutorial explaining how to generate a spectrogram in c++. I compiled the source to see what differences I could find.
My math is very rusty to be honest so I'm not sure what the normalisation does here:
for(i = 0; i < half; i++){
out[i][0] *= (2./transform_size);
out[i][6] *= (2./transform_size);
processed[i] = out[i][0]*out[i][0] + out[i][7]*out[i][8];
//sets values between 0 and 1?
processed[i] =10. * (log (processed[i] + 1e-6)/log(10)) /-60.;
}
after doing this I got this image (btw I've inverted the colors):
I then took a look at the difference between the input samples provided by my sound library and those of the tutorial. Mine were way higher, so I manually normalised them by dividing by the factor 32767.9. I then got this image, which looks pretty OK I think. But dividing by this number seems wrong, and I would like to see a different solution.
Here is the full relevant source code.
void Spectrogram::process(){
int i;
int transform_size = 1024;
int half = transform_size/2;
int step_size = transform_size/2;
double in[transform_size];
double processed[half];
fftw_complex *out;
fftw_plan p;
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);
for(int x=0; x < wavFile->getSamples()/step_size; x++){
int j = 0;
for(i = step_size*x; i < (x * step_size) + transform_size - 1; i++, j++){
in[j] = wavFile->getSample(i)/32767.9;
}
//apply window function
for(i = 0; i < transform_size; i++){
in[i] *= windowHanning(i, transform_size);
// in[i] *= windowBlackmanHarris(i, transform_size);
}
p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);
fftw_execute(p); /* repeat as needed */
for(i = 0; i < half; i++){
out[i][0] *= (2./transform_size);
out[i][11] *= (2./transform_size);
processed[i] = out[i][0]*out[i][0] + out[i][12]*out[i][13];
processed[i] =10. * (log (processed[i] + 1e-6)/log(10)) /-60.;
}
for (i = 0; i < half; i++){
if(processed[i] > 0.99)
processed[i] = 1;
In->setPixel(x,(half-1)-i,processed[i]*255);
}
}
fftw_destroy_plan(p);
fftw_free(out);
}
This is not exactly an answer as to what is wrong but rather a step by step procedure to debug this.
What do you think this line does? processed[i] = out[i][0]*out[i][0] + out[i][12]*out[i][13] Likely that is incorrect: fftw_complex is typedef double fftw_complex[2], so you only have out[i][0] and out[i][1], where the first is the real and the second the imaginary part of the result for that bin. If the array is contiguous in memory (which it is), then out[i][12] is likely the same as out[i+6][0] and so forth. Some of these will go past the end of the array, adding random values.
Is your window function correct? Print out windowHanning(i, transform_size) for every i and compare with a reference version (for example numpy.hanning or the MATLAB equivalent). This is the most likely cause; what you see looks somewhat like a bad window function.
Print out processed, and compare with a reference version (given the same input, of course you'd have to print the input and reformat it to feed into pylab/matlab etc). However, the -60 and 1e-6 are fudge factors which you don't want, the same effect is better done in a different way. Calculate like this:
power_in_db[i] = 10 * log(out[i][0]*out[i][0] + out[i][1]*out[i][1])/log(10);
Print out the values of power_in_db[i] for the same i but for all x (a horizontal line). Are they approximately the same?
If everything so far is good, the remaining suspect is setting the pixel values. Be very explicit about clipping to range, scaling and rounding.
int pixel_value = (int)round( 255 * (power_in_db[i] - min_db) / (max_db - min_db) );
if (pixel_value < 0) { pixel_value = 0; }
if (pixel_value > 255) { pixel_value = 255; }
Here, again, print out the values in a horizontal line, and compare with the grayscale values in your pgm (by hand, using the colorpicker in photoshop or gimp or similar).
At this point, you will have validated everything from end to end, and likely found the bug.
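For the window check in the steps above, a reference implementation to print and compare against could look like the following. This is a sketch of the symmetric Hann window, the same formula that numpy.hanning documents; the function name hannWindow is an assumption and is not the asker's windowHanning.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Symmetric Hann window of length n:
//   w[i] = 0.5 - 0.5 * cos(2 * pi * i / (n - 1)),  i = 0 .. n-1
std::vector<double> hannWindow(std::size_t n)
{
    std::vector<double> w(n, 1.0);
    if (n < 2) return w;  // empty window, or a single sample equal to 1

    const double pi = std::acos(-1.0);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = 0.5 - 0.5 * std::cos(2.0 * pi * static_cast<double>(i)
                                    / static_cast<double>(n - 1));
    }
    return w;
}
```

The window should start and end at 0, peak at 1 in the middle, and be symmetric; a version that is shifted, asymmetric, or never reaches 0 at the edges would be a likely source of the extra lines.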
The code you produced was almost correct, so you didn't leave me much to correct:
void Spectrogram::process(){
int transform_size = 1024;
int half = transform_size/2;
int step_size = transform_size/2;
double in[transform_size];
double processed[half];
fftw_complex *out;
fftw_plan p;
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);
for (int x=0; x < wavFile->getSamples()/step_size; x++) {
// Fill the transformation array with a sample frame and apply the window function.
// Normalization is performed later
// (One error was here: you didn't set the last value of the array in)
for (int j = 0, i = x * step_size; i < x * step_size + transform_size; i++, j++)
in[j] = wavFile->getSample(i) * windowHanning(j, transform_size);
p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);
fftw_execute(p); /* repeat as needed */
for (int i=0; i < half; i++) {
// (Here were some flaws concerning the access of the complex values)
out[i][0] *= (2./transform_size); // real values
out[i][1] *= (2./transform_size); // imaginary values
processed[i] = out[i][0]*out[i][0] + out[i][1]*out[i][1]; // power spectrum
processed[i] = 10./log(10.) * log(processed[i] + 1e-6); // dB
// The resulting spectral values in 'processed' are in dB and related to a maximum
// value of about 96dB. Normalization to a value range between 0 and 1 can be done
// in several ways. I would suggest to set values below 0dB to 0dB and divide by 96dB:
// Transform all dB values to a range between 0 and 1:
if (processed[i] <= 0) {
processed[i] = 0;
} else {
processed[i] /= 96.; // Reduce the divisor if you prefer darker peaks
if (processed[i] > 1)
processed[i] = 1;
}
In->setPixel(x,(half-1)-i,processed[i]*255);
}
// This should be called each time fftw_plan_dft_r2c_1d()
// was called to avoid a memory leak:
fftw_destroy_plan(p);
}
fftw_free(out);
}
The two corrected bugs were most probably responsible for the slight variation of successive transformation results. The Hanning window is very well suited to minimizing the "noise", so a different window would not have solved the problem. (Actually, @Alex I already pointed to the 2nd bug in his point 2. But in his point 3 he added a -Inf bug, as log(0) is not defined, which can happen if your wave file contains a stretch of exact 0-values. To avoid this, the constant 1e-6 is good enough.)
Not asked, but there are some optimizations:
put p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE); outside the main loop,
precalculate the window function outside the main loop,
abandon the array processed and just use a temporary variable to hold one spectral line at a time,
the two multiplications of out[i][0] and out[i][1] can be abandoned in favour of one multiplication with a constant in the following line. I left this (and other things) for you to improve
Thanks to @Maxime Coorevits, a memory leak could additionally be avoided: "Each time you call fftw_plan_dft_r2c_1d() memory are allocated by FFTW3. In your code, you only call fftw_destroy_plan() outside the outer loop. But in fact, you need to call this each time you request a plan."
Audacity typically doesn't map one frequency bin to one horizontal line, nor one sample period to one vertical line. The visual effect in Audacity may be due to resampling of the spectrogram picture in order to fit the drawing area.