How can I get frequency in the time domain from a WAV file? - c++

In other words, I am trying to play a .wav file, and to do this I need to know each frequency and how long it lasts; the API I am using has a method that takes as its parameter a vector with two fields (frequency and time).
I tried using a fast Fourier transform, but it gives me frequency and magnitude:
mag
/\
|
|
-|------> freq
But I need something like this:
freq
/\
|
|
-|------>time
I want to know whether it is possible to get this information from a WAV file.

A digital audio signal is a series of (amplitude, time) pairs.
In other words, it is a function of time.
If you take a sequence of audio samples and perform a Fourier transform (DFT/FFT) on it, you get a new sequence of (amplitude, frequency) pairs.
In other words, a function of frequency.
This sequence describes the properties of the signal in the frequency domain.
It does not contain any time information at all.
I guess what you want is a function that describes how the frequency components of an audio signal change over time. This cannot be done with a single FFT.
What you can do is:
Take N samples of the audio data stream, samples (0, ..., N-1)
Perform a FFT
Take another N samples of the audio data stream, (m, ..., m+N-1) with m < N (m is the hop between blocks, so consecutive blocks overlap)
Perform a FFT
Take another N samples of the audio data stream, (2m, ..., 2m+N-1)
Perform a FFT
and so on
If your sampling period is ts, you get a new frequency analysis every T = m*ts seconds. Taking, for example, the strongest frequency of each analysis gives you (frequency, time) pairs.
Maybe that is what you want.
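The block-by-block procedure above can be sketched in C++ roughly as follows. This is only a minimal illustration under stated assumptions: the names FreqPoint, binMagnitude and dominantFrequencyTrack are made up for this example, the samples are assumed to be already decoded to mono doubles, and a naive O(N^2) DFT stands in for a real FFT library. It reports the strongest bin of each block, which is one possible way to reduce a spectrum to a single frequency.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// One (time, frequency) pair per analysis block (hypothetical type).
struct FreqPoint {
    double timeSec;  // start time of the block
    double freqHz;   // centre frequency of the strongest bin in the block
};

// Magnitude of DFT bin k over samples[start .. start+N-1].
// Naive evaluation; a real program would call an FFT library instead.
static double binMagnitude(const std::vector<double>& samples,
                           std::size_t start, std::size_t N, std::size_t k) {
    const double pi = std::acos(-1.0);
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t n = 0; n < N; ++n) {
        // (k*n) % N keeps the angle small without changing the result
        double angle = -2.0 * pi * double((k * n) % N) / double(N);
        sum += samples[start + n] * std::polar(1.0, angle);
    }
    return std::abs(sum);
}

// Analyses blocks of N samples, advancing by m samples each time.
std::vector<FreqPoint> dominantFrequencyTrack(const std::vector<double>& samples,
                                              double sampleRate,
                                              std::size_t N, std::size_t m) {
    assert(N >= 2 && m >= 1 && sampleRate > 0.0);
    std::vector<FreqPoint> track;
    for (std::size_t start = 0; start + N <= samples.size(); start += m) {
        std::size_t bestBin = 1;
        double bestMag = -1.0;
        for (std::size_t k = 1; k <= N / 2; ++k) {  // skip the DC bin
            double mag = binMagnitude(samples, start, N, k);
            if (mag > bestMag) { bestMag = mag; bestBin = k; }
        }
        track.push_back({start / sampleRate, bestBin * sampleRate / N});
    }
    return track;
}
```

Each returned point covers m/sampleRate seconds, so a (frequency, duration) vector for a playback API could be built from consecutive points. The frequency resolution is sampleRate/N, so a larger N gives finer frequencies but coarser timing.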

Related

How can I make 16000 Hz samples from 44100 Hz samples in a real-time stream?

I use PortAudio in a C++ project.
My signal model only handles 16000 Hz audio input.
When I first released my work, I didn't need to support a 44100 Hz sample rate; it only had to work with a 48000 Hz microphone.
So I resampled my signal as 48000 -> 16000 -> 48000 with a simple decimation algorithm and linear interpolation.
But now I want to use a 44100 Hz microphone. In real-time processing, my buffer is 256 points at 16000 Hz, so it is hard to find the matching input buffer size at 44100 Hz and to downsample from 44100 to 16000.
When I used plain decimation or an averaging filter (https://github.com/mattdiamond/Recorderjs/issues/186), the output speech was higher than the input, and windowed-sinc interpolation produced distortion.
Is there any method for 44100 -> 16000 downsampling in real-time processing? Please let me know.
Thank you.
I had to solve a similar problem in the past, not for audio, but to simulate an asynchronism between a transmitted signal's sampling frequency and a receiver's sampling frequency.
This is how I would proceed:
Let us call T1 the sampling period of the incoming signal x: T1 = 1/44100, and
let us call T2 the sampling period of the signal y to be generated: T2 = 1/16000.
To calculate the value of the signal y[n*T2], select the two input values x[k*T1] and x[(k+1)*T1]
that surround the value to be calculated:
k*T1 <= n*T2 < (k+1)*T1
Then perform a linear interpolation between these two values. The interpolation factor must be recalculated for each sample.
If t = n*T2, a = k*T1 and b = (k+1)*T1, then
p = (x[b] - x[a])/T1
y[t] = p*(t-a) + x[a]
At a 44.1 kHz sampling frequency, x[a] and x[a+T1] should be rather well correlated, so linear interpolation could be good enough.
If the obtained quality is not good enough, you can first interpolate the incoming signal by a fixed ratio,
for example 2, with a classical, well-designed interpolation filter.
Then you can apply the previous procedure to the newly calculated signal,
whose sampling period is T1/2.
If the incoming signal contains high frequencies, then, in order to avoid aliasing, you need to apply a low-pass filter to the incoming signal prior to downsampling (with a cutoff below 8 kHz, half the output rate). Note that this was necessary even in your previous 48 kHz -> 16 kHz case.
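A minimal sketch of the interpolation step described above, working in units of input samples so that T1 never has to appear explicitly. The function name resampleLinear is hypothetical, the whole input is assumed to be available in one vector, and no anti-aliasing filter is applied, so the input should already be low-pass filtered.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Resamples x (sampled at inRate) to outRate by linear interpolation.
// No low-pass filtering is done here; filter x first when downsampling.
std::vector<double> resampleLinear(const std::vector<double>& x,
                                   double inRate, double outRate) {
    assert(inRate > 0.0 && outRate > 0.0);
    std::vector<double> y;
    if (x.size() < 2) return y;
    const double step = inRate / outRate;  // T2/T1: input samples per output sample
    for (std::size_t n = 0;; ++n) {
        double pos = n * step;                          // n*T2 in units of T1
        std::size_t k = static_cast<std::size_t>(pos);  // k*T1 <= n*T2 < (k+1)*T1
        if (k + 1 >= x.size()) break;                   // both x[k] and x[k+1] are needed
        double frac = pos - static_cast<double>(k);     // (t - a)/T1, recomputed per sample
        y.push_back(x[k] + (x[k + 1] - x[k]) * frac);
    }
    return y;
}
```

For 44100 -> 16000 the step is 2.75625 input samples per output sample, so 256 output points consume about 705.6 input points. In a streaming setting one would presumably keep the last input sample and the fractional position between callbacks instead of restarting at n = 0 for every buffer.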

Compute FFT in frequency axis when signal is in rawData in Matlab

I have a 10 MHz signal sampled at 100 MS/s. How do I compute its FFT in MATLAB in terms of frequency when my signal is in rawData (the length of rawData is 100000)? Also:
1. What is the optimum length of NFFT (i.e., on what factors does NFFT depend)?
2. Why does my amplitude (Y axis) change with NFFT?
3. What is the difference between NFFT, N and L? How do I compute the length of a signal?
4. How do I separate noise and signal from a single signal (which is in rawData)?
Here is my code:
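t=(1:40);
f0=10e6;                           % signal frequency in Hz
fs=100e6;                          % sampling frequency in Hz
NFFT=1024;
y=abs(rawData(1:1000,2));          % first 1000 samples of column 2
X=fftshift(abs(fft(y,NFFT)));      % shift so that X lines up with the -fs/2..fs/2 axis
f=[-fs/2:fs/NFFT:(fs/2-fs/NFFT)];  % frequency axis in Hz
subplot(1,1,1);
semilogy(f(513:1024),X(513:1024)); % non-negative frequencies only
axis([0 10e6 0 10]);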
As you can find the corresponding frequencies in another post, I will just answer your other questions:
1. Including all your data is usually the best option. When NFFT is smaller than the length of the input, fft simply truncates the input to NFFT samples, which is probably not what you want. If you know the period of your input signal, you can truncate it to a whole number of periods. If you don't know it, a window (e.g. Hann) may help.
2. If you change NFFT, you use more (or less) data in your FFT calculation, which may change the amplitude at a given frequency. You also calculate the amplitude at more frequencies between 0 and Fs/2 (half the sampling frequency).
3. The question is not clear; please provide the definitions of N and L.
4. It depends on your application. If the noise is at the same frequency as your signal, you cannot separate them. Otherwise, you can use a filter (e.g. a bandpass) to extract the frequencies of interest.
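To make the dependence of the amplitude on NFFT concrete, here is a small C++ sketch (C++ rather than MATLAB, to match the other examples on this page). The names binFrequency and amplitudeAtBin are invented for illustration, and a directly evaluated DFT sum stands in for an FFT call. It assumes the usual convention that the input is truncated when longer than NFFT and zero-padded when shorter, and it normalises by the number of samples actually used, so the peak of a sinusoid no longer scales with NFFT.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Frequency in Hz at the centre of bin k of an nfft-point transform.
double binFrequency(std::size_t k, double fs, std::size_t nfft) {
    assert(nfft > 0);
    return static_cast<double>(k) * fs / static_cast<double>(nfft);
}

// Single-sided amplitude at bin k (0 < k < nfft/2) of an nfft-point DFT of y.
// y is truncated when longer than nfft and treated as zero-padded when shorter.
double amplitudeAtBin(const std::vector<double>& y, std::size_t nfft, std::size_t k) {
    assert(nfft > 0 && k > 0 && k < nfft / 2 && !y.empty());
    const double pi = std::acos(-1.0);
    const std::size_t used = std::min(y.size(), nfft);  // samples that contribute
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t n = 0; n < used; ++n) {
        double angle = -2.0 * pi * double(k) * double(n) / double(nfft);
        sum += y[n] * std::polar(1.0, angle);
    }
    // Dividing by the number of samples used (not by nfft) removes the
    // dependence on nfft; the factor 2 accounts for the negative-frequency half.
    return 2.0 * std::abs(sum) / static_cast<double>(used);
}
```

With a unit-amplitude 10 MHz cosine sampled at 100 MHz, the bin at 10 MHz should come out close to 1 whether NFFT is 500, 1000 or 2000, whereas the unnormalised magnitude grows with the number of samples used.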

Length of FFT and IFFT

I have some signals which I add up to a larger signal, where each signal is located in a different frequency region.
Now, I perform the FFT operation on the big signal with FFTW and cut the concrete FFT bins (where the signals are located) out.
For example: The big signal is FFT transformed with 1024 points,
the sample rate of the signal is fs=200000.
I calculate the concrete bin positions for given start and stop frequencies in the following way:
tIndex.iStartPos = (int64_t) ((tFreqs.i64fstart) / (mSampleRate / uFFTLen));
For example, for the first signal to be cut out I get 16 bins.
Now I do the IFFT with FFTW and get 16 complex values back (because I allocated the vector for 16 bins).
But when I compare the extracted signal with the original small signal in MATLAB, I can see that the original signal (a WAV file) has xxxxx data and my signal (which I saved as a raw binary file) has only 16 complex values.
So how do I choose the length of the IFFT so that the signal is transformed correctly? What is wrong here?
EDIT
The logic itself is split over 3 programs, each line is in a multithreaded environment. For that reason I post here some pseudo-code:
ReadWavFile(); //returns the signal data and the RIFF/FMT header information
CalculateFFT_using_CUFFTW(); //calculates FFT with user given parameters, like FFT length, polyphase factor, and applies polyphased window to reduce leakage effect
GetFFTData(); //copy/get FFT data from CUDA device
SendDataToSignalDetector(); //detects signals and returns center frequency and bandwidth for each signal
Freq2Index(); // calculates positions with the returned data from the signal detector
CutConcreteBins(position);
AddPaddingZeroToConcreteBins(); // adds zeros till next power of 2
ApplyPolyphaseAndWindow(); //appends the signal itself polyphase-factor times and applies polyphased window
PerformIFFT_using_FFTW();
NormalizeFFTData();
Save2BinaryFile();
-->Then analyse data in MATLAB (currently in progress).
If you have a real signal consisting of 1024 samples, the contribution from the 16 frequency bins of interest could be obtained by multiplying the frequency spectrum by a rectangular window then taking the IFFT. This essentially amounts to:
filling a buffer with zeros before and after the frequency bins of interest
copying the frequency bins of interest at the same locations in that buffer
if using a full-spectrum representation (that is, if you are using fftw_plan_dft_1d(..., FFTW_BACKWARD, ...) for the inverse transform), filling the upper half of the spectrum according to Hermitian symmetry (or simply using a half-spectrum representation and performing the inverse transform through fftw_plan_dft_c2r_1d).
That said, you would get a better frequency decomposition by using specially designed filters instead of just using a rectangular window in the frequency domain.
The output length of the FT is equal to the input length. I don't know how you got to 16 bins; the FT of 1024 inputs is 1024 bins. For a real (not complex) input, the upper half of the 1024 bins is the complex-conjugate mirror image of the lower half, so your FFT library may return only the lower half for a real input (FFTW's real-to-complex transforms return N/2+1 = 513 bins). Still, that's more than 16 bins.
You'll probably need to fill all 1024 bins when doing the IFFT, as it generally doesn't assume that its output will be a real signal. But that's just a matter of mirroring the lower bins, with complex conjugation, into the upper half.
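Both answers come down to the same recipe: keep the inverse transform at the original length, zero everything outside the band, and restore the conjugate-symmetric upper half. A rough sketch follows, with a naive DFT standing in for FFTW; the names dft and extractBand are hypothetical and not part of any library.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT: sign = -1 is the forward transform, sign = +1 the
// unscaled inverse. It stands in for an FFT library call.
std::vector<Complex> dft(const std::vector<Complex>& in, int sign) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(N);
    for (std::size_t k = 0; k < N; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            double angle = sign * 2.0 * pi * double((k * n) % N) / double(N);
            sum += in[n] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Keeps bins lo..hi of the N-point spectrum X of a real signal, zeroes the
// rest, restores the conjugate-symmetric upper half, and returns N real samples.
std::vector<double> extractBand(const std::vector<Complex>& X,
                                std::size_t lo, std::size_t hi) {
    const std::size_t N = X.size();
    assert(lo >= 1 && lo <= hi && hi < N / 2);
    std::vector<Complex> Y(N, Complex(0.0, 0.0));
    for (std::size_t k = lo; k <= hi; ++k) {
        Y[k] = X[k];                 // bins of interest stay at their positions
        Y[N - k] = std::conj(X[k]);  // mirrored bins make the output real
    }
    std::vector<Complex> y = dft(Y, +1);
    std::vector<double> out(N);
    for (std::size_t n = 0; n < N; ++n) out[n] = y[n].real() / double(N);
    return out;
}
```

The output has as many samples as the original block (1024 in the question) rather than 16, which is what makes it comparable with the original signal at the same sample rate.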

Drawing audio spectrum with Bass library

How can I draw a spectrum for a given audio file with the BASS library?
I mean the chart similar to what Audacity generates:
I know that I can get the FFT data for given time t (when I play the audio) with:
float fft[1024];
BASS_ChannelGetData(chan, fft, BASS_DATA_FFT2048); // get the FFT data
That way I get 1024 values in the array for each time t. Am I right that the values in that array are signal amplitudes (dB)? If so, how is the frequency (Hz) associated with those values? By the index?
I am a programmer, but I am not experienced with audio processing at all, so I don't know what to do with the data I have to plot the spectrum I need.
I am working with C++ version, but examples in other languages are just fine (I can convert them).
From the documentation, that flag will cause the FFT magnitude to be computed, and from the sounds of it, it is the linear magnitude.
dB = 10 * log10(intensity);
dB = 20 * log10(pressure);
(I'm not sure whether audio file samples are a measurement of intensity or pressure. What's a microphone output linearly related to?)
Also, it indicates the length of the input and the length of the FFT match, but half the FFT (corresponding to negative frequencies) is discarded. Therefore the highest FFT frequency will be one-half the sampling frequency. This occurs at N/2. The docs actually say
For example, with a 2048 sample FFT, there will be 1024 floating-point values returned. If the BASS_DATA_FIXED flag is used, then the FFT values will be in 8.24 fixed-point form rather than floating-point. Each value, or "bin", ranges from 0 to 1 (can actually go higher if the sample data is floating-point and not clipped). The 1st bin contains the DC component, the 2nd contains the amplitude at 1/2048 of the channel's sample rate, followed by the amplitude at 2/2048, 3/2048, etc.
That seems pretty clear.
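Going by the documentation quoted above, the mapping from array index to frequency and from linear value to decibels could look like the sketch below. The helper names binToHz and magnitudeToDb are invented here and no BASS function is called. The code assumes the returned values are linear amplitudes, which is why it uses 20*log10, and it clamps at an arbitrary -120 dB floor so that silent bins do not produce minus infinity.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Centre frequency in Hz of FFT bin `bin`: bin 0 is DC, bin 1 is
// sampleRate/fftSize, and so on, as in the quoted documentation.
double binToHz(std::size_t bin, double sampleRate, std::size_t fftSize) {
    assert(fftSize > 0);
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// Converts a linear amplitude to decibels, clamped at floorDb.
// Assumption: the value is an amplitude (not a power), hence 20*log10.
double magnitudeToDb(double magnitude, double floorDb = -120.0) {
    if (magnitude <= 0.0) return floorDb;
    return std::max(floorDb, 20.0 * std::log10(magnitude));
}
```

For a 2048-sample FFT of a 44100 Hz channel, index i of the 1024 returned values would sit at about i * 21.5 Hz, and plotting magnitudeToDb(fft[i]) against binToHz(i, 44100.0, 2048) gives a spectrum for one instant.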

FFT of large data (16 GB) using Matlab

I am trying to compute a fast Fourier transform of a large chunk of data imported from a text file, which is around 16 GB in size. I was trying to think of a way to compute its FFT in MATLAB, but my computer's memory (8 GB) is too small and I get an out-of-memory error. I tried using memmap and textscan, but was not able to apply them to get the FFT of the combined data.
Can anyone kindly guide me on how to approach computing the Fourier transform? I am also trying to compute the Fourier transform (using the definition) with C++ code on a remote server, but it is taking a long time to execute. Can anyone give me some insight into how I should handle data this large?
It depends on the resolution of the FFT that you require. If you only need an FFT of, say, 1024 points, then you can reshape your data to, or sequentially read it as N x 1024 blocks. Once you have it in this format, you can then add the output of each FFT result to a 1024 point complex accumulator.
If you need the same resolution after the FFT, then you need more memory, or a special fft routine that is not included in Matlab (but I'm not sure if it is even mathematically possible to do a partial FFT by buffering small chunks through for full resolution).
It may be better to implement the FFT with your own code.
The FFT algorithm is built from "butterfly" operations, so you can split the whole computation into smaller blocks.
The file is too large for a typical PC to hold in memory. But the FFT doesn't need all the data at once. It can always start with 2-point (maybe 8-point is better) FFTs, and you can build up by cascading the stages. This means you can read only a few points at a time, do some calculation, and save the results to disk. On the next iteration, you read the saved data back from disk.
Depending on how you build the data structure, you can either store all the data in one single file and read/save it with pointers (in Matlab a pointer is merely a number), or you can store every single point in its own file, generating billions of files and distinguishing them by file name.
The idea is that you dump intermediate results to disk instead of keeping them in memory. Of course this requires a comparable amount of disk space, but that is more feasible.
I can show you a piece of pseudo-code. Depending on the structure of your original data (that 16 GB text file), the implementation will differ, but since you own the file you can lay it out however is convenient. I will start with 2-point FFTs and use the 8-point example in this Wikipedia picture.
1. Do 2-point FFT on x, generating y, the 3rd column of white circles from the left.
read x[0], x[4] from file 'origin'
y[0] = x[0] + x[4]*W(N,0);
y[1] = x[0] - x[4]*W(N,0);
save y[0], y[1] to file 'temp'
remove x[0], x[4], y[0], y[1] from memory
read x[2], x[6] from file 'origin'
y[2] = x[2] + x[6]*W(N,0);
y[3] = x[2] - x[6]*W(N,0);
save y[2], y[3] to file 'temp'
remove x[2], x[6], y[2], y[3] from memory
....
2. Do 2-point FFT on y, generating z, the 5th column of white circles.
3. Do 2-point FFT on z, generating the final result, X.
Basically, the Cooley–Tukey FFT algorithm lets you cut up the data and calculate piece by piece, so it is possible to handle large amounts of data. I know it's not the usual approach, but if you take a look at the Chinese version of that Wikipedia page, you may find a number of pictures that help you understand how it splits up the points.
I've encountered this same problem. I ended up finding a solution in a paper, "Extending sizes of effective convolution algorithms". It essentially involves loading shorter chunks, multiplying by a phase factor and FFT-ing, then loading the next chunk in the series. This gives a sampled version of the FFT of the full signal. The process is then repeated a number of times with different phase factors to fill in the remaining points. I will attempt to summarize here (adapted from Table II in the paper):
1. For a total signal f(j) of length N, decide on a number m of shorter chunks, each of length N/m, that you can store in memory (if needed, zero-pad the signal so that N is a multiple of m).
2. For beta = 0, 1, 2, ..., m - 1, do the following:
   a. Divide the series into m subintervals of N/m successive points.
   b. For each subinterval, multiply the jth element by exp(i*2*pi*j*beta/N). Here, j is the position of the point in the whole data stream, counted from its first point. (The sign of the exponent must match that of the forward transform; with the common exp(-i*2*pi*j*k/N) definition of the DFT, the factor is exp(-i*2*pi*j*beta/N).)
   c. Sum the first elements of each subinterval to produce a single number, sum the second elements, and so forth. This can be done as points are read from the file, so there is no need to have the full set of N points in memory.
   d. Fourier transform the resulting series, which contains N/m points.
   e. This gives F(k) for k = ml + beta, for l = 0, ..., N/m-1. Save these values to disk.
   f. Go to 2 and proceed with the next value of beta.
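As a sanity check of the steps above, here is a small in-memory C++ sketch; it is not the paper's code. The names dft and chunkedDft are hypothetical, and a naive DFT with the exp(-i...) convention stands in for an FFT routine. Because of that convention the phase factor carries a negative sign here, whereas the summary writes it with a positive sign, which corresponds to a transform defined with exp(+i...). In the out-of-core case the inner loop over j would read the signal from disk and each batch of results would be written back out.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive forward DFT, F(k) = sum_j f(j) * exp(-i*2*pi*j*k/N).
std::vector<Complex> dft(const std::vector<Complex>& in) {
    const std::size_t N = in.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(N);
    for (std::size_t k = 0; k < N; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            double angle = -2.0 * pi * double((k * n) % N) / double(N);
            sum += in[n] * std::polar(1.0, angle);
        }
        out[k] = sum;
    }
    return out;
}

// Full N-point DFT of f, computed with transforms of only N/m points each.
std::vector<Complex> chunkedDft(const std::vector<double>& f, std::size_t m) {
    const std::size_t N = f.size();
    assert(m >= 1 && N > 0 && N % m == 0);
    const std::size_t len = N / m;  // length of each subinterval
    const double pi = std::acos(-1.0);
    std::vector<Complex> F(N);
    for (std::size_t beta = 0; beta < m; ++beta) {
        std::vector<Complex> folded(len, Complex(0.0, 0.0));
        for (std::size_t j = 0; j < N; ++j) {
            // phase factor uses the global index j; same sign as dft() above
            double angle = -2.0 * pi * double((j * beta) % N) / double(N);
            // j % len is the position inside the subinterval, so this sums
            // the r-th elements of all m subintervals into folded[r]
            folded[j % len] += f[j] * std::polar(1.0, angle);
        }
        std::vector<Complex> G = dft(folded);  // only N/m points at a time
        for (std::size_t l = 0; l < len; ++l) F[m * l + beta] = G[l];
    }
    return F;
}
```

The reason this works is that for k = m*l + beta the full exponent splits into the per-sample phase factor and an exponent that depends only on the position within the subinterval, so the m subintervals can be summed before transforming.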