Predict smearing of FFT - c++

After applying a Fourier transform to a signal, the energy of a single sine wave is often spread out across multiple bins (aka smearing). Have a look at the right side of the image below for an illustration:
I want to extract a list of peak frequencies. Just finding the highest bin is easy. But after that smearing becomes a problem.
I would like to have a heuristic which tells me if the magnitude of a specific bin is possibly the result of smearing or if there has to be another peak frequency in order to explain the signal. (It is better if I miss some than have false positives)
My naive approach would be to just calculate a few thousand examples and take the maximum of these to get an envelope curve so that any smearing is likely below that envelope.
But is there a better way to do this?

The FFT result of any rectangularly windowed pure unmodulated sinusoid is a Sinc function. This Sinc (sin(pix)/(pix)) function is only zero for all bins except one (at the peak magnitude) when the frequency of the input sinusoid has exactly an integer number of periods in the FFT width.
For all other frequencies that are not at the exact center of an FFT result bin, if you know that exact frequency and magnitude (which won't be in any single FFT bin), you can calculate all the other bins by sampling the Sinc function.
And, of course, if the input sinusoid isn't perfectly pure, but modulated in any way (amplitude, frequency or phase), this modulation will produce various sidebands in the FFT result as well.

Related

How to calculate efficiently and accurately the Fourier transform of a radial function in Fortran

As my question states, I want to calculate the Fourier transform F(q) of a radial function f(r) (defined on [0,infinity[ and which decays like an exponential exp(-Ar +b) at large r) as accurately as possible in Fortran. The function values come from a data file (which I can easily interpolate through cubic interpolation for example and extrapolate since the behaviour at large r is known).
I'm using the "physics" definition of the Fourier transform in 3D, which gives (because f is radial) :
I first tried to calculate this integral for some chosen values of q by using Gauss-Legendre quadrature, by generating some 60 or 100 abscissas and weights via the NAG routine D01BCF (D01BCF link). In the case of Gauss Legendre quadrature, the problem is to choose the interval [0,B] on which to integrate. While the function f loses 4 to 5 orders of magnitude from r=10 to r=20 (example), the choice of B as a strong influence on the result of the calculation... When I compared the result I get to a "nearly exact" calculation (made with matlab but with a veeeery long computation time), I saw that in fact this was only valid for small values of q (of the order of 5, when I have to deal with values as large as 150). A Gauss-Laguerre quadrature does not give any better result, probably because of the oscillatory part of the integrand.
I then tried to compute this Fourier transform for some given values of q with the routine D01ASF (D01ASF link). It is a "one-dimensional quadrature, adaptive, semi-infinite interval, weight function cos(ωx) or sin(ωx) ", which is exactly what I need. The results are quite convincing for q up to 80 or 100 if I input absolute error tolerances of 10E-5. Problems are : I would need to go at larger q, and the Fourier transform F(q) oscillates with a magnitude of ~ 10E-6 at such q's. Lowering the tolerance to 10E-5 already takes some time and even makes the whole thing to output some error message from the subroutine so I don't know if 10E-6 would be feasible.
I'm thus currently wondering if trying to calculate this Fourier transform with FFT wouldn't be a good idea ? The problems I face are that I don't know how to calculate radial wave functions with FFT (and also that I don't even know how to use FFT properly either since the definition of the transform is not even the same (exponent sign and argument) and that I never used it before).
Would you have ideas ? :)
EDIT 2 : I tried by FFT (using the routine C06FAF from NAG library). It works quite well up to some large values of q. The problem I face is that there is always some constant normalising factor to account for. I don't get why. This normalising factor evolves with the number N of points used in the mesh. It has the for of a power law : Normalising Factor F = N^(-0.5) x exp(9.9) approximately (see figure where the black line is the "exact" Fourier Transform and the green, magenta, blue, red and yellow lines are the FFT calculated for different values of N)
EDIT3 : I found the factor to be A*N^(-0.5) where A is the length of the integration mesh

Estimate gaussian height from its area

We (I and my colleague) were given a device, which sends to us each second a large amount of discrete integer data (intensities) that tend to have gaussian distribution. These pseudo gaussians flows one by one and we are supposed to pick the largest intensity from center of each gaussian as fast as possible. Moreover, these data contain a noise, so we cannot say that each gaussian can be separated to two monotone parts => we cannot rely on simple fact that if data start to decline, we will find the maximum.
My colleauge came up with an idea:
introduce an intensity threshold to separate gaussians from each other
sum intensities of each gaussian to estimate its area and then estimate its height
But the question is, how can I fast estimate height of this pseudo gaussian from its area?
UPDATE:
To be more clear, the intensities that I get represents "function values" of a gaussian, or batter they represent heights of histogram bins.
You could use a moving average filter, and when that starts to decline, take the maximum value in that window as your height. As long as the noise in the signal is fairly low amplitude and high frequency, that should work reasonably well. You can always combine it with thresholding if required. The people on the DSP site will probably have much better ideas though, so I'd ask there.

Compute frequency of sinusoidal signal, c++

i have a sinusoidal-like shaped signal,and i would like to compute the frequency.
I tried to implement something but looks very difficult, any idea?
So far i have a vector with timestep and value, how can i get the frequency from this?
thank you
If the input signal is a perfect sinusoid, you can calculate the frequency using the time between positive 0 crossings. Find 2 consecutive instances where the signal goes from negative to positive and measure the time between, then invert this number to convert from period to frequency. Note this is only as accurate as your sample interval and it does not account for any potential aliasing.
You could try auto correlating the signal. An auto correlation can be rapidly calculated by following these steps:
Perform FFT of the audio.
Multiply each complex value with its complex conjugate.
Perform the inverse FFT of the audio.
The left most peak will always be the highest (as the signal always correlates best with itself). The second highest peak, however, can be used to calculate the sinusoid's frequency.
For example if the second peak occurs at an offset (lag) of 50 points and the sample rate is 16kHz and the window is 1 second then the end frequency is 16000 / 50 or 320Hz. You can even use interpolation to get a more accurate estimation of the peak position and thus a more accurate sinusoid frequency. This method is quite intense but is very good for estimating the frequency after significant amounts of noise have been added!

Parameters to improve a music frequency analyzer

I'm using a FFT on audio data to output an analyzer, like you'd see in Winamp or Windows Media Player. However the output doesn't look that great. I'm plotting using a logarithmic scale and I average the linear results from the FFT into the corresponding logarithmic bins. As an example, I'm using bins like:
16k,8k,4k,2k,1k,500,250,125,62,31,15 [hz]
Then I plot the magnitude (dB) against frequency [hz]. The graph definitely 'reacts' to the music, and I can see the response of a drum sample or a high pitched voice. But the graph is very 'saturated' close to the lower frequencies, and overall doesn't look much like what you see in applications, which tend to be more evenly distributed. I feel that apps that display visual output tend to do different things to the data to make it look better.
What things could I do to the data to make it look more like the typical music player app?
Some useful information:
I downsample to single channel, 32kHz, and specify a time window of 35ms. That means the FFT gets ~1100 points. I vary these values to experiment (ie tried 16kHz, and increasing/decreasing interval length) but I get similar results.
With an FFT of 1100 points, you probably aren't able to capture the low frequencies with a lot of frequency resolution.
Think about it, 30 Hz corresponds to a period of 33ms, which at 32kHz is roughly 1000 samples. So you'll only be able to capture about 1 period in this time.
Thus, you'll need a longer FFT window to capture those low frequencies with sharp frequency resolution.
You'll likely need a time window of 4000 samples or more to start getting noticeably more frequency resolution at the low frequencies. This will be fine too, since you'll still get about 8-10 spectrum updates per second.
One option too, if you want very fast updates for the high frequency bins but good frequency resolution at the low frequencies, is to update the high frequency bins more quickly (such as with the windows you're currently using) but compute the low frequency bins less often (and with larger windows necessary for the good freq. resolution.)
I think a lot of these applications have variable FFT bins.
What you could do is start with very wide evenly spaced FFT bins like you have and then keep track of the number of elements that are placed in each FFT bin. If some of the bins are not used significantly at all (usually the higher frequencies) then widen those bins so that they are larger (and thus have more frequency entries) and shring the low frequency bins.
I have worked on projects were we just spend a lot of time tuning bins for specific input sources but it is much nicer to have the software adjust in real time.
A typical visualizer would use constant-Q bandpass filters, not a single FFT.
You could emulate a set of constant-Q bandpass filters by multiplying the FFT results by a set of constant-Q filter responses in the frequency domain, then sum. For low frequencies, you should use an FFT longer than the significant impulse response of the lowest frequency filter. For high frequencies, you can use shorter FFTs for better responsiveness. You can slide any length FFTs along at any desired update rate by overlapping (re-using) data, or you might consider interpolation. You might also want to pre-window each FFT to reduce "spectral leakage" between frequency bands.

Gaussian Blur with FFT Questions

I have a current implementation of Gaussian Blur using regular convolution. It is efficient enough for small kernels, but once the kernels size gets a little bigger, the performance takes a hit. So, I am thinking to implement the convolution using FFT. I've never had any experience with FFT related image processing so I have a few questions.
Is a 2D FFT based convolution also separable into two 1D convolutions ?
If true, does it go like this - 1D FFT on every row, and then 1D FFT on every column, then multiply with the 2D kernel and then inverse transform of every column and the inverse transform of every row? Or do I have to multiply with a 1D kernel after each 1D FFT Transform?
Now I understand that the kernel size should be the same size as the image (row in case of 1D). But how will it affect the edges? Do I have to pad the image edges with zeros? If so the kernel size should be equal to the image size before or after padding?
Also, this is a C++ project, and I plan on using kissFFT, since this is a commercial project. You are welcome to suggest any better alternatives. Thank you.
EDIT: Thanks for the responses, but I have a few more questions.
I see that the imaginary part of the input image will be all zeros. But will the output imaginary part will also be zeros? Do I have to multiply the Gaussian kernel to both real and imaginary parts?
I have instances of the same image to be blurred at different scales, i.e. the same image is scaled to different sizes and blurred at different kernel sizes. Do I have to perform a FFT every time I scale the image or can I use the same FFT?
Lastly, If I wanted to visualize the FFT, I understand that a log filter has to be applied to the FFT. But I am really lost on which part should be used to visualize FFT? The real part or the imaginary part.
Also for an image of size 512x512, what will be the size of real and imaginary parts. Will they be the same length?
Thank you again for your detailed replies.
The 2-D FFT is seperable and you are correct in how to perform it except that you must multiply by the 2-D FFT of the 2D kernel. If you are using kissfft, an easier way to perform the 2-D FFT is to just use kiss_fftnd in the tools directory of the kissfft package. This will do multi-dimensional FFTs.
The kernel size does not have to be any particular size. If the kernel is smaller than the image, you just need to zero-pad up to the image size before performing the 2-D FFT. You should also zero pad the image edges since the convoulution being performed by multiplication in the frequency domain is actually circular convolution and results wrap around at the edges.
So to summarize (given that the image size is M x N):
come up with a 2-D kernel of any size (U x V)
zero-pad the kernel up to (M+U-1) x (N+V-1)
take the 2-D fft of the kernel
zero-pad the image up to (M+U-1) x (N+V-1)
take the 2-D FFT of the image
multiply FFT of kernel by FFT of image
take inverse 2-D FFT of result
trim off garbage at edges
If you are performing the same filter multiple times on different images, you don't have to perform 1-3 every time.
Note: The kernel size will have to be rather large for this to be faster than direct computation of convolution. Also, did you implement your direct convolution taking advantage of the fact that a 2-D gaussian filter is separable (see this a few paragraphs into the "Mechanics" section)? That is, you can perform the 2-D convolution as 1-D convolutions on the rows and then the columns. I have found this to be faster than most FFT-based approaches unless the kernels are quite large.
Response to Edit
If the input is real, the output will still be complex except for rare circumstances. The FFT of your gaussian kernel will also be complex, so the multiply must be a complex multiplication. When you perform the inverse FFT, the output should be real since your input image and kernel are real. The output will be returned in a complex array, but the imaginary components should be zero or very small (floating point error) and can be discarded.
If you are using the same image, you can reuse the image FFT, but you will need to zero pad based on your biggest kernel size. You will have to compute the FFTs of all of the different kernels.
For visualization, the magnitude of the complex output should be used. The log scale just helps to visualize smaller components of the output when larger components would drown them out in a linear scale. The Decibel scale is often used and is given by either 20*log10(abs(x)) or 10*log10(x*x') which are equivalent. (x is the complex fft output and x' is the complex conjugate of x).
The input and output of the FFT will be the same size. Also the real and imaginary parts will be the same size since one real and one imaginary value form a single sample.
Remember that convolution in space is equivalent to multiplication in frequency domain. This means that once you perform FFT of both image and mask (kernel), you only have to do point-by-point multiplication, and then IFFT of the result. Having said that, here are a few words of caution.
You probably know that in digital signal processing, we often use circular convolution, not linear convolution. This happens because of curious periodicity. What this means in simple terms is that DFT (and FFT which is its computationally efficient variant) assumes that you signal is periodic, and when you filter your signal in such manner -- suppose your image is N x M pixels -- that it takes pixel at (1,m) to the the neighbor or pixel at (N, m) for some m<M. You signal virtually wraps around onto itself. This means that your Gaussian mask will be averaging pixels on the far right with pixels on the far left, and same goes for top and bottom. This might or might not be desired, but in general one has to deal with edging artifacts anyway. It is however much easier to forget about this issue when dealing with FFT multiplication because the problem stops being apparent. There are many ways to take care of this problem. The best way is to simply pad your image with zeros and remove the extra pixels later.
A very neat thing about using a Gaussian filter in frequency domain is that you never really have to take its FFT. It si a well-know fact that Fourier transform of a Gaussian is a Gaussian (some technical details here). All you would have to do then is pad you image with zeros (both top and bottom), generate a Gaussian in the frequency domain, multiply them together and take IFFT. Then you're done.
Hope this helps.