locating frequencies in the spectrum after FFT [duplicate] - c++

I've got 16-bit mono audio data in raw format, sampled at 48 kHz. I'm using the Aquila C++ library to get its spectrum, since I need to perform EQ on it. Here's a code snippet:
Aquila::SampleType samples[512];
Aquila::SpectrumType spect;
std::shared_ptr<Aquila::Fft> fft;
// ... 'samples' is filled with audio here ...
fft = Aquila::FftFactory::getFft(512);
spect = fft->fft(samples);
So the audio data is split into blocks of 512 samples, and each block is converted to the frequency domain (FFT). I want to change the "magnitude" of e.g. 2 kHz and to set the magnitude of all the frequencies beyond e.g. 10 kHz to 0 (low-pass filter).
My only problem with this is that I don't know the frequency range of the spectrum generated by Aquila. I mean, I personally know that the sampling rate of the audio was 48 kHz, but the Aquila FFT isn't told this value; it doesn't even need it to perform the FFT. How can I determine exactly which frequency each array entry maps to? E.g. spect[0] = 1 Hz, spect[10] = 126 Hz, spect[511] = 22.13 kHz etc.

As it turns out from the comments, the FFT doesn't have to be told the sampling frequency of the audio signal; you only need it afterwards to label the bins. Another thing to watch for is that, for real input, only the first half of the bins (up to sample_rate / 2) holds independent information; the upper half mirrors it. The frequency of the n-th bin of a 512-point FFT is:
freq = (n * sample_rate) / 512
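The mapping can be sketched in C++ as follows. This is only an illustration: it assumes the spectrum is held in a std::vector<std::complex<double>>, and the names binFrequency and zeroAbove are made up for this example rather than taken from Aquila. Zeroing bins block by block is a crude low-pass and can produce audible artifacts at block boundaries.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Center frequency in Hz of bin n for an fftSize-point FFT.
double binFrequency(std::size_t n, std::size_t fftSize, double sampleRate) {
    return static_cast<double>(n) * sampleRate / static_cast<double>(fftSize);
}

// Crude low-pass: zero every bin whose frequency exceeds cutoffHz.
// Bins above size / 2 mirror the lower half (negative frequencies),
// so bin n in that range corresponds to bin size - n.
void zeroAbove(std::vector<std::complex<double>>& spectrum,
               double sampleRate, double cutoffHz) {
    const std::size_t size = spectrum.size();
    for (std::size_t n = 0; n < size; ++n) {
        const std::size_t mirrored = (n <= size / 2) ? n : size - n;
        if (binFrequency(mirrored, size, sampleRate) > cutoffHz) {
            spectrum[n] = 0.0;
        }
    }
}
```

With a 512-point FFT at 48 kHz the bins are 93.75 Hz apart, so binFrequency(10, 512, 48000.0) is 937.5 Hz and the last independent bin, 256, sits at 24 kHz.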

Related

Ultrafast 2x lossy audio/image compression algorithm?

I'm looking for an audio or image compression algorithm that can compress a torrent of 16-bit samples
by a fairly predictable amount (2-3x)
at very high speed (say, 60 cycles per sample at most: >100MB/s)
with lossiness being acceptable but, of course, undesirable
My data has characteristics of images and audio (2-dimensional, correlated in both dimensions and audiolike in one dimension) so algorithms for audio or images might both be appropriate.
An obvious thing to try would be this one-dimensional algorithm:
break up the data into segments of 64 samples
measure the range of values among those samples (as an example, the samples might be between 3101 and 9779 in one segment, a difference of 6678)
use 2 to 4 additional bytes to encode the range
linearly downsample each 16-bit sample to 8 bits in that segment.
For example, I could store 3101 in 16 bits, and store a scaling factor ceil(6678/256) = 27 in 8 bits, then convert each 16-bit sample to 8-bit as s8 = (s16 - base) / scale where base = 3101 + (27 >> 1), scale = 27, with the obvious decompression "algorithm" of s16 = s8 * 27 + 3101. Compression ratio: 128/67 = 1.91.
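A minimal C++ sketch of this per-block scheme is shown here, with two deliberate deviations that are assumptions of the example and not part of the description above: the scale is computed as range / 256 + 1 and stored in 16 bits so that every quantized value fits in a byte even for extreme ranges, and the rounding offset is added on decode instead of being folded into the base. PackedBlock, packBlock and unpackBlock are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PackedBlock {
    std::uint16_t base = 0;          // smallest sample in the block
    std::uint16_t scale = 1;         // quantization step, always >= 1
    std::vector<std::uint8_t> data;  // one byte per sample
};

// Quantize one block of 16-bit samples to 8 bits relative to its minimum.
PackedBlock packBlock(const std::vector<std::uint16_t>& samples) {
    PackedBlock out;
    if (samples.empty()) return out;
    const auto mm = std::minmax_element(samples.begin(), samples.end());
    const std::uint32_t lo = *mm.first;
    const std::uint32_t range = *mm.second - lo;
    out.base = static_cast<std::uint16_t>(lo);
    out.scale = static_cast<std::uint16_t>(range / 256 + 1);
    out.data.reserve(samples.size());
    for (std::uint16_t s : samples) {
        out.data.push_back(static_cast<std::uint8_t>((s - lo) / out.scale));
    }
    return out;
}

// Reconstruct at the middle of each quantization interval, clamped to 16 bits.
std::vector<std::uint16_t> unpackBlock(const PackedBlock& block) {
    std::vector<std::uint16_t> out;
    out.reserve(block.data.size());
    for (std::uint8_t q : block.data) {
        const std::uint32_t value =
            block.base + std::uint32_t{q} * block.scale + block.scale / 2u;
        out.push_back(static_cast<std::uint16_t>(std::min<std::uint32_t>(value, 65535u)));
    }
    return out;
}
```

For a block spanning 3101 to 9779 this gives base 3101 and scale 27, and each reconstructed sample is within 13 of the original. The division in packBlock is still there; replacing it with a multiply by a precomputed reciprocal is one common way to avoid it.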
I've played with some ideas to avoid the division operation, but hasn't someone by now invented a superfast algorithm that could preserve fidelity better than this one?
Note: this page says that FLAC compresses at 22 million samples per second (44MB/s) at -q6 which is pretty darn good (assuming its implementation is still single-threaded), if not quite enough for my application. Another page says FLAC has similar performance (40MB/s on a 3.4GHz i3-3240, -q5) as 3 other codecs, depending on quality level.
Take a look at the PNG filters for examples of how to tease out your correlations. The most obvious filter is "sub", which simply subtracts successive samples. The differences should be more clustered around zero. You can then run that through a fast compressor like lz4. Other filter choices may result in even better clustering around zero, if they can find advantage in the correlations in your other dimension.
For lossy compression, you can decimate the differences before compressing them, dropping a few low bits until you get the compression you want, and still retain the character of the data that you would like to preserve.
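The "sub" filter with dropped low bits might look like the C++ sketch below. One assumption goes beyond the answer: the differences are taken against the decoder's reconstruction of the previous sample (closed-loop prediction), so the quantization error stays within half a step instead of accumulating. The names deltaEncode and deltaDecode are invented for the example, dropBits is assumed to be small (below 15), and the resulting codes would still need to go through a compressor such as lz4, which is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantized differences between each sample and the reconstructed previous one.
// dropBits = 0 reduces to the lossless "sub" filter.
std::vector<std::int32_t> deltaEncode(const std::vector<std::int16_t>& samples,
                                      unsigned dropBits) {
    const std::int32_t step = std::int32_t{1} << dropBits;
    const std::int32_t half = step / 2;
    std::vector<std::int32_t> codes;
    codes.reserve(samples.size());
    std::int32_t predicted = 0;  // what the decoder will have reconstructed so far
    for (std::int16_t s : samples) {
        const std::int32_t diff = s - predicted;
        // Round the difference to the nearest multiple of step.
        const std::int32_t code =
            (diff >= 0) ? (diff + half) / step : -((-diff + half) / step);
        codes.push_back(code);
        predicted += code * step;
    }
    return codes;
}

// Rebuild the samples by accumulating the scaled codes.
std::vector<std::int32_t> deltaDecode(const std::vector<std::int32_t>& codes,
                                      unsigned dropBits) {
    const std::int32_t step = std::int32_t{1} << dropBits;
    std::vector<std::int32_t> samples;
    samples.reserve(codes.size());
    std::int32_t value = 0;
    for (std::int32_t code : codes) {
        value += code * step;
        samples.push_back(value);
    }
    return samples;
}
```

With dropBits = 3 every reconstructed sample is within 4 of the original, and smooth data yields codes clustered tightly around zero.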

Compute FFT in frequency axis when signal is in rawData in Matlab

I have a signal of frequency 10 MHz sampled at 100 MS/s. How do I compute the FFT in MATLAB in terms of frequency when my signal is in rawData (the length of rawData is 100000)? Also:
What should be the optimum length of NFFT (i.e., on what factors does NFFT depend)?
Why does my amplitude (Y axis) change with NFFT?
What's the difference between NFFT, N and L? How do I compute the length of a signal?
How do I separate noise and signal from a single signal (which is in rawData)?
Here is my code,
t=(1:40);
f=10e6;
fs=100e6;
NFFT=1024;
y=abs(rawData(1:1000,2));
X=abs(fft(y,NFFT));
f=[-fs/2:fs/NFFT:(fs/2-fs/NFFT)];
subplot(1,1,1);
semilogy(f(513:1024),X(513:1024));
axis([0 10e6 0 10]);
As you can find the corresponding frequencies in another post, I will just answer your other questions:
Including all your data is usually the best option. When NFFT is smaller than the input, fft simply truncates the input to the requested length, which is probably not what you want (when it is larger, the input is zero-padded). If you know the period of your input signal, you can truncate it to a whole number of periods. If you don't, a window (e.g. Hann) may help.
If you increase NFFT, you use more data in your FFT calculation, which may slightly change the amplitude at a given frequency. You also calculate the amplitude at more frequencies between 0 and Fs/2 (half of the sampling frequency).
The question is not clear; please provide the definitions of N and L.
It depends on your application. If the noise is at the same frequency as your signal, you cannot separate them. Otherwise, you can use a filter (e.g. a bandpass) to extract the frequencies of interest.
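One concrete reason the raw amplitude moves with the transform length: for a sinusoid that fits a whole number of periods into the block, the magnitude of its bin grows in proportion to the number of samples. The C++ sketch below (C++ rather than MATLAB, with a direct single-bin DFT instead of an FFT, and helper names invented for the example) shows that scaling by 2/N gives a length-independent amplitude for the 10 MHz tone sampled at 100 MHz.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Magnitude of DFT bin k of a real signal, computed directly (O(N) per bin).
double dftBinMagnitude(const std::vector<double>& x, std::size_t k) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const double angle = -2.0 * pi * static_cast<double>(k) *
                             static_cast<double>(i) / static_cast<double>(n);
        sum += x[i] * std::complex<double>(std::cos(angle), std::sin(angle));
    }
    return std::abs(sum);
}

// Unit-amplitude sine of frequency f (Hz) sampled at fs (Hz), n samples long.
std::vector<double> makeSine(std::size_t n, double f, double fs) {
    const double pi = std::acos(-1.0);
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = std::sin(2.0 * pi * f * static_cast<double>(i) / fs);
    }
    return x;
}

// Amplitude estimate for bin k (0 < k < N/2) that does not depend on N.
double normalizedAmplitude(const std::vector<double>& x, std::size_t k) {
    return 2.0 * dftBinMagnitude(x, k) / static_cast<double>(x.size());
}
```

With 100 samples the tone lands in bin 10 with raw magnitude 50; with 200 samples it lands in bin 20 with raw magnitude 100. normalizedAmplitude returns 1 in both cases. When the tone does not fall exactly on a bin, leakage spreads the energy and the estimate drops, which is where windowing comes in.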

Length of FFT and IFFT

I have some signals which I add up to a larger signal, where each signal is located in a different frequency region.
Now, I perform the FFT operation on the big signal with FFTW and cut the concrete FFT bins (where the signals are located) out.
For example: The big signal is FFT transformed with 1024 points,
the sample rate of the signal is fs=200000.
I calculate the concrete bin positions for given start and stop frequencies in the following way:
tIndex.iStartPos = (int64_t) ((tFreqs.i64fstart) / (mSampleRate / uFFTLen));
and e.g. I get for the first signal to be cut out 16 bins.
Now I do the IFFT transformation again with FFTW and get the 16 complex values back (because I reserved the vector for 16 bins).
But when I compare the extracted signal with the original small signal in MATLAB, then I can see that the original signal (is a wav-File) has xxxxx data and my signal (which I saved as raw binary file) has only 16 complex values.
So how do I choose the length of the IFFT so that the signal is transformed back correctly? What is wrong here?
EDIT
The logic itself is split over 3 programs, each line is in a multithreaded environment. For that reason I post here some pseudo-code:
ReadWavFile(); //returns the signal data and the RIFF/FMT header information
CalculateFFT_using_CUFFTW(); //calculates FFT with user given parameters, like FFT length, polyphase factor, and applies polyphased window to reduce leakage effect
GetFFTData(); //copy/get FFT data from CUDA device
SendDataToSignalDetector(); //detects signals and returns center frequency and bandwidth for each signal
Freq2Index(); // calculates positions with the returned data from the signal detector
CutConcreteBins(position);
AddPaddingZeroToConcreteBins(); // adds zeros till next power of 2
ApplyPolyphaseAndWindow(); //appends the signal itself polyphase-factor times and applies polyphased window
PerformIFFT_using_FFTW();
NormalizeFFTData();
Save2BinaryFile();
-->Then analyse data in MATLAB (is at the moment in work).
If you have a real signal consisting of 1024 samples, the contribution from the 16 frequency bins of interest could be obtained by multiplying the frequency spectrum by a rectangular window then taking the IFFT. This essentially amounts to:
filling a buffer with zeros before and after the frequency bins of interest
copying the frequency bins of interest at the same locations in that buffer
if using a full-spectrum representation (that is, if you are using fftw_plan_dft_1d(..., FFTW_BACKWARD, ...) for the inverse transform), filling in the upper half of the spectrum by Hermitian symmetry (or simply using a half-spectrum representation and performing the inverse transform through fftw_plan_dft_c2r_1d).
That said, you would get a better frequency decomposition by using specially designed filters instead of just using a rectangular window in the frequency domain.
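The three steps above can be sketched in C++ as follows. To keep the example self-contained it uses a naive O(N^2) inverse DFT as a stand-in for the FFTW inverse plan; isolateBand and inverseDft are hypothetical helper names, and the band is assumed to lie strictly between DC and the Nyquist bin.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Keep bins [first, last) of the lower half, zero everything else, and rebuild
// the upper half by Hermitian symmetry so that the inverse transform is real.
// Assumes full.size() is the FFT length N and 0 < first <= last <= N / 2.
Spectrum isolateBand(const Spectrum& full, std::size_t first, std::size_t last) {
    const std::size_t n = full.size();
    Spectrum out(n, std::complex<double>(0.0, 0.0));
    for (std::size_t k = first; k < last; ++k) {
        out[k] = full[k];
        out[n - k] = std::conj(full[k]);
    }
    return out;
}

// Naive inverse DFT with 1/N scaling. FFTW's inverse is unnormalized, so with
// FFTW the division by N would have to be done separately.
Spectrum inverseDft(const Spectrum& spectrum) {
    const double pi = std::acos(-1.0);
    const std::size_t n = spectrum.size();
    Spectrum out(n);
    for (std::size_t i = 0; i < n; ++i) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t k = 0; k < n; ++k) {
            const double angle =
                2.0 * pi * static_cast<double>(k * i) / static_cast<double>(n);
            sum += spectrum[k] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        out[i] = sum / static_cast<double>(n);
    }
    return out;
}
```

The point of the sketch is the length: the buffer passed to the inverse transform has all N bins, so the output has N samples at the original sample rate, with imaginary parts that are zero up to rounding.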
The output length of the FT is equal to the input length. I don't know how you got to 16 bins; the FT of 1024 inputs is 1024 bins. For a real (not complex) input, the upper half of the 1024 bins mirrors the lower half as complex conjugates, so your FFT library may return only the lower half for a real input (FFTW's real-to-complex transforms return N/2 + 1 = 513 bins). Still, that's more than 16 bins.
You'll probably need to fill all 1024 bins when doing the IFFT, as a complex inverse transform generally doesn't assume that its output will be a real signal. But that's just a matter of mirroring the lower bins, conjugated, into the upper half.

Drawing audio spectrum with Bass library

How can I draw a spectrum for a given audio file with the Bass library?
I mean the chart similar to what Audacity generates:
I know that I can get the FFT data for given time t (when I play the audio) with:
float fft[1024];
BASS_ChannelGetData(chan, fft, BASS_DATA_FFT2048); // get the FFT data
That way I get 1024 values in the array for each time t. Am I right that the values in that array are signal amplitudes (dB)? If so, how is the frequency (Hz) associated with those values? By the index?
I am a programmer, but I am not experienced with audio processing at all, so I don't know what to do with the data I have in order to plot the needed spectrum.
I am working with C++ version, but examples in other languages are just fine (I can convert them).
From the documentation, that flag will cause the FFT magnitude to be computed, and from the sounds of it, it is the linear magnitude.
dB = 10 * log10(intensity);
dB = 20 * log10(pressure);
(I'm not sure whether audio file samples are a measurement of intensity or pressure. What's a microphone output linearly related to?)
Also, it indicates the length of the input and the length of the FFT match, but half the FFT (corresponding to negative frequencies) is discarded. Therefore the highest FFT frequency will be one-half the sampling frequency. This occurs at N/2. The docs actually say
For example, with a 2048 sample FFT, there will be 1024 floating-point values returned. If the BASS_DATA_FIXED flag is used, then the FFT values will be in 8.24 fixed-point form rather than floating-point. Each value, or "bin", ranges from 0 to 1 (can actually go higher if the sample data is floating-point and not clipped). The 1st bin contains the DC component, the 2nd contains the amplitude at 1/2048 of the channel's sample rate, followed by the amplitude at 2/2048, 3/2048, etc.
That seems pretty clear.
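Put together, turning the returned array into plot coordinates could look like this C++ sketch. It treats the values as linear amplitudes (hence 20 * log10), which is an assumption consistent with the reading of the docs above, and it clamps at an arbitrary floor of -96 dB so that silent bins do not produce minus infinity. fftBinToHz and magnitudeToDb are names made up for the example; nothing here calls into BASS.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Frequency in Hz of a bin, given the FFT size (e.g. 2048 for BASS_DATA_FFT2048)
// and the channel's sample rate.
double fftBinToHz(std::size_t bin, std::size_t fftSize, double sampleRate) {
    return static_cast<double>(bin) * sampleRate / static_cast<double>(fftSize);
}

// Linear amplitude (1.0 = full scale) to decibels, clamped at floorDb.
double magnitudeToDb(double magnitude, double floorDb = -96.0) {
    const double minMagnitude = std::pow(10.0, floorDb / 20.0);
    return 20.0 * std::log10(std::max(magnitude, minMagnitude));
}
```

For the plot, the x coordinate of element i would be fftBinToHz(i, 2048, sampleRate) and the y coordinate magnitudeToDb(fft[i]); at 44.1 kHz the 1024 values then cover 0 Hz up to just under 22.05 kHz in steps of about 21.5 Hz.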

FIR filter design: how to input sine wave form

I am currently taking a class in school and I have to code FIR/IIR filter in C/C++.
The input to the filter is a 2 kHz sine wave with white noise. By feeding this signal to the C/C++ code, I need to observe a clean sine wave at the output. It's all done at the software level.
My problem is that I don't know how to deal with this input/output of a sine wave. For example, I don't know what file format I can use or need to use, or how to generate the sine waveform.
This might be a very trivial question, but I have no clue where to begin.
Does anyone have any experience in this type of question or have any tips?
Any help would be really appreciated.
Generating the sine wave at 2kHz means that you want to generate values over time that, when graphed, follow a sine wave. Pick an amplitude (you didn't mention one), and pick your sample rate. See the graph here (http://en.wikipedia.org/wiki/Sine_wave); you want values that when plotted follow the sine wave graphed in 2D with the X axis being time, and the Y axis being the amplitude of the value you are measuring.
amplitude (volts, degrees, pascals, milliamps, etc)
frequency (2kHz, that is 2000 sine waves/second)
sample rate (how many samples do you want per second)
Suppose you generate a file that has a time value and an amplitude measurement, which you would want to scale to your amplitude (more on this later). So a device might give an 8-bit or 16-bit digital reading which represents either an absolute, or logarithmic measurement against some scale.
struct sample
{
    long usec; //microseconds (1/1,000,000 second)
    short value; //many devices give a value between 0 and 255
};
Suppose you generate exactly 2000 samples/second. If you were actually measuring an external value, you would get the same value every time (see that?), which when graphed would look like a straight line.
So you want a sample rate higher than the frequency. Suppose you sample at 2x the frequency. Then you would see points 180 degrees apart on the sine wave, which might be the peaks, points on the up or down slope, or the zero crossings. A sample rate of 4x the frequency would show a zigzag pattern. As you increase the number of samples, your graph looks closer to the actual sine wave. This is similar to the pixelation you see in 8-bit game sprites.
How many samples for any given sine wave would you think would give you a good approximation of a sine wave? 8? 16? 100? 500? Suppose you sampled 1,000,000 times per second, then you would have 1,000,000/2,000 = 500 samples per sine wave.
pick your sample rate (500 samples per sine wave, i.e. 1,000,000 samples/second)
define your frequency (2000)
decide how long to record your samples (5 seconds?)
define your amplitude (device measures 0-255, but what is measured max?)
Here is code to generate some samples,
#include <math.h> //sin
#include <stdlib.h> //drand48 (POSIX)
#define MAXJITTER (10)
#define MAXNOISE (20)
#define PI (3.14159265358979323846)
int
generate_samples( long duration, //duration in microseconds
                  int amplitude, //scaled peak measurement from device
                  int frequency, //Hz > 0
                  int samplerate ) //how many samples/second > 0
{
    long sdelay; //sample delay in usec
    int count = 0; //number of samples generated
    if(frequency<1) frequency=1; //keep the frequency positive
    if(samplerate<1) samplerate=1; //avoid division by zero
    sdelay = 1000000/samplerate; //usec delay between each sample
    if(sdelay<1) sdelay=1; //avoid an endless loop above 1 MHz
    struct sample m;
    int jitter, noise; //introduce noise here
    for( long ts=0; ts<duration; ts+=sdelay ) //in usec (microseconds)
    {
        //jitter, sample not exactly sdelay
        jitter = drand48()*MAXJITTER - (MAXJITTER/2); // +/-1/2 MAXJITTER
        //noise is mismeasurement
        noise = drand48()*MAXNOISE - (MAXNOISE/2); // +/-1/2 MAXNOISE
        m.usec = ts + jitter;
        //2PI radians in a full sine wave; ts is in usec, so convert to seconds
        double phase = 2*PI * frequency * (ts/1000000.0);
        m.value = (short)(amplitude * sin( phase ) + noise);
        //write m to file or save m to array/vector
        count++;
    }
    return count; //number of samples generated
}
First generate some samples,
generate_samples( 5*1000000, 100, 2000, 2000*50 );
You could graph the samples generated as a view of the noisy signal.
The above should answer many of your questions about how to record measurements and what format is typically used. It shows how to step through the periods of multiple sine waves, generate samples with jitter and noise, and record them over some time duration.
Building your filter is a second issue. Writing the code to emulate the filter(s) described below is left as an exercise, or a second question as you glean more understanding:
http://en.wikipedia.org/wiki/Finite_impulse_response
http://en.wikipedia.org/wiki/Infinite_impulse_response
The generated sample of the signal (above) would be fed into the code you write to build the filter. Expect that the output of the filter would be a new set of samples, perhaps with jitter, but expect that your filter would eliminate at least some of the noise. You would then be able to graph the samples produced by the filter.
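As a starting point for that exercise, here is one possible FIR low-pass in C++: a windowed-sinc design with a Hamming window, applied by direct convolution. It is a sketch under assumptions of my own (uniformly spaced samples without jitter, an odd tap count of at least 3, a cutoff somewhat above the 2 kHz tone), not the method your course necessarily expects, and designLowPass and firFilter are names invented for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Windowed-sinc low-pass FIR coefficients (Hamming window), normalized so
// that the gain at 0 Hz is 1. taps should be odd and at least 3.
std::vector<double> designLowPass(std::size_t taps, double cutoffHz, double sampleRate) {
    const double pi = std::acos(-1.0);
    const double fc = cutoffHz / sampleRate;  // cutoff in cycles per sample
    const double last = static_cast<double>(taps) - 1.0;
    const double mid = last / 2.0;
    std::vector<double> h(taps);
    double sum = 0.0;
    for (std::size_t i = 0; i < taps; ++i) {
        const double t = static_cast<double>(i) - mid;
        const double sinc =
            (t == 0.0) ? 2.0 * fc : std::sin(2.0 * pi * fc * t) / (pi * t);
        const double window =
            0.54 - 0.46 * std::cos(2.0 * pi * static_cast<double>(i) / last);
        h[i] = sinc * window;
        sum += h[i];
    }
    for (double& c : h) c /= sum;
    return h;
}

// Direct-form FIR: y[n] = sum over k of h[k] * x[n - k], with x taken as 0
// before the first sample.
std::vector<double> firFilter(const std::vector<double>& x, const std::vector<double>& h) {
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            y[n] += h[k] * x[n - k];
        }
    }
    return y;
}
```

For samples taken at 100,000 per second, designLowPass(101, 5000.0, 100000.0) passes the 2 kHz tone almost unchanged and attenuates most of the broadband noise. The output is delayed by (taps - 1) / 2 = 50 samples relative to the input, which matters when you overlay the two graphs.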
You might consider converting the samples into a comma-delimited file, which would let you load them into Excel and graph them. And it might help if you described your electronics background, your trig knowledge, and how much you know about filters, etc.
Good luck!