Create spectrum from KissFFT and QAudioProbe - c++

I'm trying to create a simple spectrum via QAudioProbe but my spectrum does not "feel the beat". every bin in spectrum just goes high and low.
Here is my code processing buffer from QAudioProbe :
void Waveform::bufferReady(QAudioBuffer buffer){
int n = buffer.frameCount();
cfg = kiss_fft_alloc(n, 0/*is_inverse_fft*/, NULL, NULL);
QAudioBuffer::S16U *frames = buffer.data<QAudioBuffer::S16U>();
qDeleteAll(m_finalData);
m_finalData.clear();
kiss_fft_cpx output[n],input[n];
for (int i=0; i < n; i++)
{
// frames[i].right contains the i-th sample from the right channel
// frames[i].left contains the i-th sample from the left channel
// if the signal is mono and not stereo, then only one of the channels will have data
qreal hanawindow = 0.5 * (1 - qCos((2 * M_PI * i) / (n - 1)));
input[i].r = frames[i].right * hanawindow; // WindowFunction
input[i].i = 0;
}
kiss_fft(cfg, input, output); // DO FFT
int step = n/(2*60); // distance to take value for bin from list. Here is 60bins
for(int i=0; i< n/2;i+=step){
qreal magnitude = qSqrt(output[i].i*output[i].i + output[i].r*output[i].r);
qreal amplitude = 0.15 * log10(magnitude);
amplitude = qMax(qreal(0.0), amplitude);
amplitude = qMin(qreal(1.0), amplitude);
m_finalData.append(new Sample(amplitude));
}
qDebug() << "Number of Bins : " << m_finalData.count();
emit dataReady();
}
I don't know what are problems with the above code. I've been trying a lot of other ways but the spectrum still weird.

Related

abs() returns the same output for different FFT inputs

I have a 1024 samples and I chucked it into 32 chunks in order to perform FFT on it, below is the output from FFT:
(3.13704,2.94588) (12.9193,14.7706) (-4.4401,-6.21331) (-1.60103,-2.78147) (-0.84114,-1.86292) (-0.483564,-1.43068) (-0.272469,-1.17551) (-0.130891,-1.00437) (-0.0276415,-0.879568) (0.0523422,-0.782884) (0.117249,-0.704425) (0.171934,-0.638322) (0.219483,-0.580845) (0.261974,-0.529482) (0.300883,-0.48245) (0.337316,-0.438409) (0.372151,-0.396301) (0.40613,-0.355227) (0.439926,-0.314376) (0.474196,-0.27295) (0.509637,-0.23011) (0.54704,-0.184897) (0.587371,-0.136145) (0.631877,-0.0823468) (0.682262,-0.021441) (0.740984,0.0495408) (0.811778,0.135117) (0.900701,0.242606) (1.01833,0.384795) (1.18506,0.586337) (1.44608,0.901859) (1.92578,1.48171)
(-3.48153,2.52948) (-16.9298,9.92273) (6.93524,-3.19719) (3.0322,-1.05148) (1.98753,-0.477165) (1.49595,-0.206915) (1.20575,-0.047374) (1.01111,0.0596283) (0.869167,0.137663) (0.759209,0.198113) (0.669978,0.247168) (0.594799,0.288498) (0.52943,0.324435) (0.471015,0.356549) (0.417524,0.385956) (0.367437,0.413491) (0.319547,0.439819) (0.272834,0.4655) (0.226373,0.491042) (0.17926,0.516942) (0.130538,0.543728) (0.0791167,0.571997) (0.0236714,0.602478) (-0.0375137,0.636115) (-0.106782,0.674195) (-0.18751,0.718576) (-0.284836,0.772081) (-0.407084,0.839288) (-0.568795,0.928189) (-0.798009,1.0542) (-1.15685,1.25148) (-1.81632,1.61402)
(-1.8323,-3.89383) (-6.57464,-18.4893) (1.84103,7.4115) (0.464674,3.17552) (0.0962861,2.04174) (-0.0770633,1.50823) (-0.1794,1.19327) (-0.248036,0.982028) (-0.29809,0.827977) (-0.336865,0.708638) (-0.368331,0.611796) (-0.394842,0.530204) (-0.417894,0.459259) (-0.438493,0.395861) (-0.457355,0.337808) (-0.475018,0.283448) (-0.491906,0.231473) (-0.508378,0.180775) (-0.524762,0.130352) (-0.541376,0.0792195) (-0.558557,0.0263409) (-0.57669,-0.0294661) (-0.596242,-0.089641) (-0.617818,-0.156045) (-0.642245,-0.231222) (-0.670712,-0.318836) (-0.705033,-0.424464) (-0.748142,-0.55714) (-0.805167,-0.732645) (-0.885996,-0.981412) (-1.01254,-1.37087) (-1.24509,-2.08658)
I only included 3 chunks of 32 in order to prove they are each different values.
After taking this output and giving it to abs() function to calculate magnitude I noticed I get the same output for every chunk! (example below)
4.3034 19.6234 7.63673 3.20934 2.04401 1.51019 1.20668 1.01287 0.880002 0.784632 0.714117 0.661072 0.62093 0.590747 0.568584 0.553159 0.543646 0.539563 0.54071 0.547141 0.559178 0.577442 0.602943 0.63722 0.682599 0.742638 0.822946 0.932803 1.08861 1.32218 1.70426 2.42983
4.3034 19.6234 7.63673 3.20934 2.04401 1.51019 1.20668 1.01287 0.880002 0.784632 0.714117 0.661072 0.62093 0.590747 0.568584 0.553159 0.543646 0.539563 0.54071 0.547141 0.559178 0.577442 0.602943 0.63722 0.682599 0.742638 0.822946 0.932803 1.08861 1.32218 1.70426 2.42983
4.3034 19.6234 7.63673 3.20934 2.04401 1.51019 1.20668 1.01287 0.880002 0.784632 0.714117 0.661072 0.62093 0.590747 0.568584 0.553159 0.543646 0.539563 0.54071 0.547141 0.559178 0.577442 0.602943 0.63722 0.682599 0.742638 0.822946 0.932803 1.08861 1.32218 1.70426 2.42983
Why am I getting the exact same output out of different inputs? is this normal?
Here is a part of my code which I'm performing all of these calculations:
int main(int argc, char** argv)
{
int i;
double y;
const double Fs = 100;//How many time points are needed i,e., Sampling Frequency
const double T = 1 / Fs;//# At what intervals time points are sampled
const double f = 4;//frequency
int chuck_size = 32; // chunk size (N / 32=32 chunks)
Complex chuck[32];
int j = 0;
int counter = 0;
for (int i = 0; i < N; i++)
{
t[i] = i * T;
in[i] = { (0.7 * cos(2 * M_PI * f * t[i])), (0.7 * sin(2 * M_PI * f * t[i])) };// generate (complex) sine waveform
chuck[j] = in[i];
//compute FFT for each chunk
if (i + 1 == chuck_size) // for each set of 32 chunks, apply FFT and save it all in a 1d array (magnitude)
{
chuck_size += 32;
CArray data(chuck, 32);
fft(data);
j = 0;
for (int h = 0; h < 32; h++)
{
magnitude[counter] = abs(data[h]);
std::cout << abs(data[h]) << " ";
counter++;
}
printf("\n\n");
}
else
j++;
}
}
spectrogram (normalized):
Your signal is a sine wave. You chop it up. Each segment will have the same frequency components, just a different phase (shift). The FFT gives you both the magnitude and phase for each frequency component, but after abs only the magnitude remains. These magnitudes are necessarily the same for all your chunks.

Detecting linear interpolation of two frequnecies on embedded system

I am trying to recognise a sequence of audio frames on an embedded system - an audio frame being a frequency or interpolation of two frequencies for a variable amount of time. I know the sounds I am trying to recognise (i.e. the start and end frequencies which are being linearly interpolated and the duration of each audio frame), but they are produced by a another embedded system so the microphone and speaker are cheap and somewhat inaccurate. The output is a square wave. Any suggestions how to go about doing this?
What I am trying to do now is to use FFT to get the magnitude of all frequencies, detect the peaks, look at the detection duration/2 ms ago and check if that somewhat matches an audio frame, and finally just checking if any sound I am looking for matched the sequence.
So far I used the FFT to process the microphone input - after applying a Hann window - and then assigning each frequency bin a coefficient that it's a peak based on how many standard deviations is away from the mean. This hasn't worked great since it thought there are peaks when it was silence in the room. Any ideas on how to more accurately detect the peaks? Also I think there are a lot of harmonics because of the square wave / interpolation? Can I do harmonic product spectrum if the peaks don't really line up at double the frequency?
Here I graphed noise (almost silent room) with somewhere in the interpolation of 2226 and 1624 Hz.
https://i.stack.imgur.com/R5Gs2.png
I sample at 91 microseconds -> 10989 Hz. Should I sample more often?
I added here samples of how the interpolation sounds when recorded on my laptop and on the embedded system.
https://easyupload.io/m/5l72b0
#define MIC_SAMPLE_RATE 10989 // Hz
#define AUDIO_SAMPLES_NUMBER 1024
MicroBitAudioProcessor::MicroBitAudioProcessor(DataSource& source) : audiostream(source)
{
arm_rfft_fast_init_f32(&fft_instance, AUDIO_SAMPLES_NUMBER);
buf = (float *)malloc(sizeof(float) * (AUDIO_SAMPLES_NUMBER * 2));
output = (float *)malloc(sizeof(float) * AUDIO_SAMPLES_NUMBER);
mag = (float *)malloc(sizeof(float) * AUDIO_SAMPLES_NUMBER / 2);
}
float henn(int i){
return 0.5 * (1 - arm_cos_f32(2 * 3.14159265 * i / AUDIO_SAMPLES_NUMBER));
}
int MicroBitAudioProcessor::pullRequest()
{
int s;
int result;
auto mic_samples = audiostream.pull();
if (!recording)
return DEVICE_OK;
int8_t *data = (int8_t *) &mic_samples[0];
int samples = mic_samples.length() / 2;
for (int i=0; i < samples; i++)
{
s = (int) *data;
result = s;
data++;
buf[(position++)] = (float)result;
if (position % AUDIO_SAMPLES_NUMBER == 0)
{
position = 0;
float maxValue = 0;
uint32_t index = 0;
// Apply a Henn window
for(int i=0; i< AUDIO_SAMPLES_NUMBER; i++)
buf[i] *= henn(i);
arm_rfft_fast_f32(&fft_instance, buf, output, 0);
arm_cmplx_mag_f32(output, mag, AUDIO_SAMPLES_NUMBER / 2);
}
}
return DEVICE_OK;
}
uint32_t frequencyToIndex(int freq) {
return (freq / ((uint32_t)MIC_SAMPLE_RATE / AUDIO_SAMPLES_NUMBER));
}
float MicroBitAudioProcessor::getFrequencyIntensity(int freq){
uint32_t index = frequencyToIndex(freq);
if (index <= 0 || index >= (AUDIO_SAMPLES_NUMBER / 2) - 1) return 0;
return mag[index];
}

fftw slight peak inaccuracy/drifting

I am using fftw to get spectrum of an Audio Signal. I get Audio Samples in float32 and have also tried other formats using PortAudio. Then I take fft of it using fftw. In all the formats, I have slight drift of frequency peak from actual frequency. My setup is.
Signal Generator to supply sinusoidal signal.
Sampling rate of 44.1KHz.
I am getting correct/relatively accurate readings till 10KHz. However as I gradually increase the frequency of Generator, I start getting offset. For example above 10KHz, this is what happens.
Actual frequency Peak on Spectrum
10KHz 10.5KHz
12KHz 12.9KHz
14KHz 15.5KHz
16KHz 18.2KHz
The fft Code is as like this.
//Take Samples and do Windowing
for( i=0; i<framesPerBuffer; i++ )
{
samples[i] = in[i];
fft->fftIn[i] = (0.54-(0.46*cos(scale_fact*i))) * samples[i];
}
//Zero Padding
for(i=framesPerBuffer; i<fftSize; i++)
{
fft->fftIn[i] = 0.0;
}
//FFTW Code
{ fftSize = fftSize;
this->fftSize = fftSize;
cout << "Plan start " << endl;
outArraySize = fftSize/2+1;
cout << "fft Processor start \n";
fftIn = ((double*) fftw_malloc(sizeof(double) * fftSize));
fftOut = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * outArraySize );
fftOutAbs = (double*) fftw_malloc(sizeof(double) * outArraySize );
fftwPlan = fftw_plan_dft_r2c_1d(fftSize, fftIn, fftOut, FFTW_ESTIMATE);
// fftwPlan = fftw_plan_dft_r2c_1d(fftSize, fftIn, fftOut, FFTW_MEASURE);
}
//Absolute response
int n=fftSize/2+1;
for(int i=0; i < n; i++)
{
fftOutAbs[i] = sqrt(fftOut[i][0]*fftOut[i][0] + fftOut[i][1]*fftOut[i][1]);
}
for(unsigned int i=0; i < n; i++)
{
mainCureYData[i] = 20.0 * log10(one_over_n * fft->fftOutAbs[i]);
}
I need some hints as to where/why this problem could be ?
The Hardware setup look fine as the peak shows correct on sndpeek application.
Thanks,
You don't actually show the code where you calculate the frequency of the peak, but my guess is that the problem is here:
int n=fftSize/2+1;
This should most likely be:
int n=fftSize/2;
See this answer and/or this answer for an explanation of how to determine frequency from FFT output. (TL;DR: f = Fs * i / n)

Calculate frequency from FFT sample?

I'm using the below code in Unreal Engine 4 to capture microphone input and get the resulting FFT.
I'm having trouble calculating the frequency based on this data.
I've tried finding the max amplitude and taking that as the frequency, but that doesn't seem to be correct.
// Additional includes:
#include "Voice.h"
#include "OnlineSubsystemUtils.h"
// New class member:
TSharedPtr<class IVoiceCapture> voiceCapture;
// Initialisation:
voiceCapture = FVoiceModule::Get().CreateVoiceCapture();
voiceCapture->Start();
// Capturing samples:
uint32 bytesAvailable = 0;
EVoiceCaptureState::Type captureState = voiceCapture->GetCaptureState(bytesAvailable);
if (captureState == EVoiceCaptureState::Ok && bytesAvailable > 0)
{
uint8 buf[maxBytes];
memset(buf, 0, maxBytes);
uint32 readBytes = 0;
voiceCapture->GetVoiceData(buf, maxBytes, readBytes);
uint32 samples = readBytes / 2;
float* sampleBuf = new float[samples];
int16_t sample;
for (uint32 i = 0; i < samples; i++)
{
sample = (buf[i * 2 + 1] << 8) | buf[i * 2];
sampleBuf[i] = float(sample) / 32768.0f;
}
// Do fun stuff here
delete[] sampleBuf;
}
I don't see a Fourier transform to be carried out in your code snippet. Any way, using a DFT given N samples at an average sampling frequency R, the frequency corresponding to the bin k is k·R/2N

How do I get most accurate audio frequency data possible from real time FFT on Tizen?

currently i m working on the Tizen IDE.
I had read the input data from the microPhone and apply the FFT on it... but everytime i gets the nan output.
here is my code..
ShortBuffer *pBuffer1 = pData->AsShortBufferN();
fft = new KissFFT(BUFFER_SIZE);
std::vector<short> input(pBuffer1->GetPointer(),
pBuffer1->GetPointer() + BUFFER_SIZE); // this contains audio data
std::vector<float> specturm(BUFFER_SIZE);
fft->spectrum(input, specturm);
applying FFT..
void KissFFT::spectrum(KissFFTO* fft, std::vector<short>& samples2,
std::vector<float>& spectrum) {
int len = fft->numSamples / 2 + 1;
kiss_fft_scalar* samples = (kiss_fft_scalar*) &samples2[0];
kiss_fftr(fft->config, samples, fft->spectrum);
for (int i = 0; i < len; i++) {
float re = scale(fft->spectrum[i].r) * fft->numSamples;
float im = scale(fft->spectrum[i].i) * fft->numSamples;
if (i > 0)
spectrum[i] = sqrtf(re * re + im * im) / (fft->numSamples / 2);
else
spectrum[i] = sqrtf(re * re + im * im) / (fft->numSamples);
AppLog("specturm %d",spectrum[i]); // everytime returns returns nan output
}
}
KissFFTO* KissFFT::create(int numSamples) {
KissFFTO* fft = new KissFFTO();
fft->config = kiss_fftr_alloc(numSamples/2, 0, NULL, NULL);
fft->spectrum = new kiss_fft_cpx[numSamples / 2 + 1];
fft->numSamples = numSamples;
return fft;
}
In fft->config there should be some parameters about the size of FFT like 2048, 4096, i.e. powers of 2. If you increase this value, you can get more resolution in frequency.