Computing the analytic signal using FFT in C++

I'm currently lost trying to figure out how to implement an equivalent version of MATLAB's hilbert() function in C++. I'm very new to signal processing, but, ultimately, I would like to figure out a way to phase shift any given signal by 90 degrees. I was attempting to follow the method suggested in this question on MATLAB central, which appears to work based on tests using GNU Octave.
I have what I believe to be working implementations of both the FFT and the inverse FFT, and I have tried implementing the method described in the answer to this post to compute the analytic signal: apply the FFT, set the upper half of the array to zero, and then apply the inverse FFT. However, based on graphs I made of the output from a test, there must be a problem with the way I have implemented finding the analytic signal.
What would be a suitable way to implement the hilbert() function from MATLAB in C++ given a working implementation of FFT and inverse FFT? Is there a better way of achieving the 90 degree phase shift?

Checking the MATLAB implementation, the following should return the same result as the hilbert function. You'll obviously have to modify it to match your specific implementation; I'm assuming a signal class of some sort exists.
signal hilbert(const signal &x)
{
    int limit1, limit2;
    signal xfreq = fft(x);

    if (x.numel % 2 == 0) {
        limit1 = x.numel/2;
        limit2 = limit1 + 1;
    } else {
        limit1 = (x.numel + 1)/2;
        limit2 = limit1;
    }

    // multiply the first half by 2 (except the first element, the DC term)
    for (int i = 1; i < limit1; ++i) {
        xfreq[i].real *= 2;
        xfreq[i].imag *= 2;
    }

    // zero out the second (negative-frequency) half
    for (int i = limit2; i < x.numel; ++i) {
        xfreq[i].real = 0;
        xfreq[i].imag = 0;
    }

    return ifft(xfreq);
}
Edit: Forgot to set the second half to zeros.
Edit2: Fixed a logical error. I coded the following up in MATLAB which matches hilbert.
function h = hil(x)
    n = numel(x);
    if (mod(n,2) == 0)
        limit1 = n/2;
        limit2 = limit1 + 2;
    else
        limit1 = (n+1)/2;
        limit2 = limit1 + 1;
    end
    xfreq = fft(x);
    % double the positive frequencies (excluding DC and, for even n, Nyquist)
    for i = 2:limit1
        xfreq(i) = xfreq(i)*2;
    end
    % zero the negative frequencies
    for i = limit2:n
        xfreq(i) = 0;
    end
    h = ifft(xfreq);
end
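To address the second part of the question: once you have the analytic signal, the 90 degree phase shift comes for free. The imaginary part of hilbert(x) is the Hilbert transform of x, i.e. every frequency component of x shifted by 90 degrees, while the real part reproduces x. A minimal usage sketch, reusing the hypothetical signal class assumed in the answer above:
// Sketch only; 'signal' is the assumed class from the answer above, and the
// element access follows the same .real/.imag convention used there.
signal x = /* your real-valued input */;
signal h = hilbert(x);   // analytic signal of x
// The real part of h reproduces x; the imaginary part is the Hilbert transform,
// i.e. x with every frequency component phase-shifted by 90 degrees.
for (int i = 0; i < x.numel; ++i) {
    double shifted_sample = h[i].imag;   // 90-degree shifted version of x[i]
    // ... use shifted_sample ...
}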

Related

Getting values for specific frequencies in a short time fourier transform

I'm trying to use C++ to recreate the spectrogram function used by MATLAB. The function uses a Short-Time Fourier Transform (STFT). I found some C++ code here that performs an STFT. The code seems to work perfectly for all frequencies, but I only want a few. I found this post for a similar question with the following answer:
Just take the inner product of your data with a complex exponential at the frequency of interest. If g is your data, then just substitute for f the value of the frequency you want (e.g., 1, 3, 10, ...)
Having no background in mathematics, I can't figure out how to do this. The inner product part seems simple enough from the Wikipedia page, but I have absolutely no idea what he means by "a complex exponential at the frequency of interest" (with regard to the formula for a DFT).
Could someone explain how I might be able to do this? My data structure after the STFT is a matrix filled with complex numbers. I just don't know how to extract my desired frequencies.
Relevant function, where the window is a Hamming window; the vector of desired frequencies isn't yet an input because I don't know what to do with it:
Matrix<complex<double>> ShortTimeFourierTransform::Calculate(const vector<double> &signal,
                                                             const vector<double> &window, int windowSize, int hopSize)
{
    // data, fft_result and plan_forward are FFTW buffers and a plan set up elsewhere (class members).
    int signalLength = signal.size();
    int nOverlap = hopSize;
    int cols = (signal.size() - nOverlap) / (windowSize - nOverlap);
    Matrix<complex<double>> results(window.size(), cols);

    int chunkPosition = 0;
    int readIndex;
    // Should we stop reading in chunks?
    bool shouldStop = false;
    int numChunksCompleted = 0;
    int i;

    // Process each chunk of the signal
    while (chunkPosition < signalLength && !shouldStop)
    {
        // Copy the chunk into our buffer
        for (i = 0; i < windowSize; i++)
        {
            readIndex = chunkPosition + i;
            if (readIndex < signalLength)
            {
                // Note the windowing!
                data[i][0] = signal[readIndex] * window[i];
                data[i][1] = 0.0;
            }
            else
            {
                // we have read beyond the signal, so zero-pad it!
                data[i][0] = 0.0;
                data[i][1] = 0.0;
                shouldStop = true;
            }
        }

        // Perform the FFT on our chunk
        fftw_execute(plan_forward);

        // Copy the first (windowSize/2 + 1) data points into your spectrogram.
        // We do this because the FFT output is mirrored about the nyquist
        // frequency, so the second half of the data is redundant. This is how
        // Matlab's spectrogram routine works.
        for (i = 0; i < windowSize / 2 + 1; i++)
        {
            double real = fft_result[i][0];
            double imaginary = fft_result[i][1];
            results(i, numChunksCompleted) = complex<double>(real, imaginary);
        }

        chunkPosition += hopSize;
        numChunksCompleted++;
    } // end of the while loop

    return results;
}
Look up the Goertzel algorithm or filter for example code that uses the computational equivalent of an inner product against a complex exponential to measure the presence or magnitude of a specific stationary sinusoidal frequency in a signal. Performance or resolution will depend on the length of the filter and your signal.
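For reference, here is a minimal sketch of both ideas: the direct inner product with a complex exponential (which is exactly one DFT bin evaluated at the frequency of interest) and the Goertzel recurrence, which computes the same bin using only real arithmetic. The function names and the freqHz/sampleRate parameters are illustrative, not part of the code above:
#include <cmath>
#include <complex>
#include <vector>

static const double PI = 3.141592653589793;

// Direct inner product of the (windowed) samples with exp(-j*2*pi*f*n/fs):
// exactly one DFT bin evaluated at the frequency of interest.
std::complex<double> singleBinDFT(const std::vector<double> &x, double freqHz, double sampleRate)
{
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        double phase = -2.0 * PI * freqHz * static_cast<double>(n) / sampleRate;
        acc += x[n] * std::complex<double>(std::cos(phase), std::sin(phase));
    }
    return acc; // magnitude/phase of this value describe the component at freqHz
}

// Goertzel algorithm: the same quantity computed with one real multiply per sample.
double goertzelMagnitude(const std::vector<double> &x, double freqHz, double sampleRate)
{
    const std::size_t N = x.size();
    const double k = std::round(freqHz * N / sampleRate);  // nearest DFT bin
    const double omega = 2.0 * PI * k / static_cast<double>(N);
    const double coeff = 2.0 * std::cos(omega);
    double s1 = 0.0, s2 = 0.0;                             // two-sample delay line
    for (double sample : x) {
        double s0 = sample + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    double power = s1 * s1 + s2 * s2 - coeff * s1 * s2;    // squared magnitude of the bin
    return std::sqrt(power);
}
To restrict the STFT above to a few frequencies, you would call one of these per frame and per frequency of interest on the windowed chunk, instead of keeping the full FFT output.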

Multinomial Naive Bayes Classifier sliding window (MOA implementation, weka)

I am facing the following problem:
I am trying to implement an MNB classifier over a sliding window. I implemented a LinkedList of the size of the window and store all instances of the stream which have to be considered in it. When a new instance arrives that doesn't fit in the window anymore, the first instance is removed. To remove the corresponding word counts I implemented the following method, which is basically trainOnInstanceImpl() from MOA run in reverse:
private void removeInstance(Instance instToRemove) {
    int classIndex = instToRemove.classIndex();
    int classValue = (int) instToRemove.value(classIndex);
    double w = instToRemove.weight();

    m_probOfClass[classValue] -= w;
    m_classTotals[classValue] -= w * totalSize(instToRemove);
    double total = m_classTotals[classValue];

    for (int i = 0; i < instToRemove.numValues(); i++) {
        int index = instToRemove.index(i);
        if (index != classIndex && !instToRemove.isMissing(i)) {
            double laplaceCorrection = 0.0;
            if (m_wordTotalForClass[classValue].getValue(index) == w * instToRemove.valueSparse(i) + this.laplaceCorrectionOption.getValue()) {
                laplaceCorrection = this.laplaceCorrectionOption.getValue(); // 1.0
            }
            m_wordTotalForClass[classValue].addToValue(index,
                    (-1) * (w * instToRemove.valueSparse(i) + laplaceCorrection));
        }
    }
}
Now if I output m_wordTotalForClass[classValue], I get different results for a classical MNB trained on a stream of 3000 instances (looking at instances 2000-3000) than for the sliding-window MNB above with a window size of 1000. The only differences are that it outputs a 1 instead of a 0 at some points, but not always. I guess this has something to do with the Laplace correction. Maybe there is a problem with rounding in the if statement:
if (m_wordTotalForClass[classValue].getValue(index) == w*instToRemove.valueSparse(i) + this.laplaceCorrectionOption.getValue())
so that we do not always enter the branch where the Laplace value is set.
Does anybody have an idea?
I am kind of going crazy as I have been thinking about where the problem may be for the last three days. Any help would be greatly appreciated!
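As a general illustration of the floating-point issue suspected above: sums that are mathematically equal are often not bit-identical, so an exact == test can fail intermittently. A minimal sketch of the usual remedy, comparing against a small tolerance instead (written in C++ for brevity, not MOA/weka code; the epsilon value is an assumption to be tuned):
#include <cmath>
#include <cstdio>

int main() {
    double stored = 0.0;
    // Accumulate ten contributions of 0.1; the result is usually not exactly 1.0,
    // so '==' can fail even though the values are equal for all practical purposes.
    for (int i = 0; i < 10; ++i) stored += 0.1;
    double expected = 1.0;
    std::printf("exact comparison:     %d\n", stored == expected);                   // often 0
    double eps = 1e-9;
    std::printf("tolerance comparison: %d\n", std::fabs(stored - expected) < eps);   // 1
    return 0;
}
Applied to the method above, that would mean testing whether the absolute difference between m_wordTotalForClass[classValue].getValue(index) and w*instToRemove.valueSparse(i) + laplaceCorrectionOption.getValue() is below a small epsilon, rather than testing them for exact equality.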

sqrt Matlab and C++ numerical differences

I am seeing numerical differences between MATLAB and C++ code. The discrepancy seems to stem from different output of the sqrt function in MATLAB and C++. For very small numbers (< 10^-5) the relative difference seems to be quite large.
Which approach would you suggest to
make sure the differences come from sqrt
tune the C++ code so that it replicates the MATLAB results to float precision
EDIT
Here are more details about the code.
float* buttonVar = new float[nBut];
for (int_T ibut = 0; ibut < nBut; ibut++)
{
    for (int_T id = start_idx; id <= stop_idx; id++)
    {
        inputArray[id - start_idx] = arr[ibut * nDepth + id];
    }
    reduceVector(inputArray, reducedArray, inputarray_size, d1, d2);

    buttonMean[ibut] = 0;
    buttonVar[ibut] = 0;
    for (int_T id = 0; id < min(nd, nDepth); id++)
    {
        buttonMean[ibut] += reducedArray[id] / float(nd);
    }
    for (int_T id = 0; id < min(nd, nDepth); id++)
    {
        buttonVar[ibut] += (reducedArray[id] - buttonMean[ibut])
                         * (reducedArray[id] - buttonMean[ibut]);
    }
    buttonVar[ibut] = sqrtf(buttonVar[ibut] / float(nd));
}
In MATLAB, I convert the number to be square-rooted to single precision. The discrepancy appears in buttonVar.
The final results that are compared in Google Tests come from several more operations with no other calls to mathematical functions. Those additional operations are in methods which were thoroughly Google-tested, and their outputs match to float precision.
The numerical difference in buttonVar is up to 15% relative difference (= 100*abs(cpp_res - matlab_res)/matlab_res). The significant relative differences occur when buttonVar is of the order of magnitude 10e-6.
Converting to double in C++ inside the computation solves the difference problem. We reach a very satisfactory match after converting to double.
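Following that conclusion, here is a minimal sketch of the same loop body with the accumulation done in double. The variable names are taken from the snippet above (min, reducedArray, nd, buttonMean, buttonVar, ibut are assumed to exist as before), and std::sqrt from <cmath> replaces sqrtf:
// Accumulate mean and variance in double, then store the float result.
double mean = 0.0;
for (int_T id = 0; id < min(nd, nDepth); id++)
    mean += static_cast<double>(reducedArray[id]) / double(nd);

double var = 0.0;
for (int_T id = 0; id < min(nd, nDepth); id++)
{
    double d = static_cast<double>(reducedArray[id]) - mean;
    var += d * d;
}

buttonMean[ibut] = static_cast<float>(mean);
buttonVar[ibut]  = static_cast<float>(std::sqrt(var / double(nd)));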

Implementing FFT low-pass filter in C with FFTW

I am trying to create a very simple C++ program that, given an argument in the range [0-100], applies a low-pass filter to a grayscale image, "compressing" it proportionally to the value of the given argument.
I am using the FFTW library.
I have some doubts about how I define the frequency threshold, cut. Is there a more effective way to define such a value?
// fftw_complex *fft
// double[] magnitude
// . . .
int percent = 100;
if (percent < 0 || percent > 100) {
    cerr << "Compression rate must be a value between 0 and 100." << endl;
    return -1;
}
double cut = (double)(w*h) * ((double)percent / (double)100);
for (i = 0; i < (w * h); i++) {
    magnitude[i] = sqrt(pow(fft[i][0], 2.0) + pow(fft[i][1], 2.0));
    if (magnitude[i] < cut) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
Update1:
I've changed my code to this, but again I'm not sure this is a proper way to filter frequencies. The image is surely compressed, but non-square images are messed up and setting compression to 100% isn't the real maximum compression available (I can go up to ~140%).
Here you can find an image of what I see now.
int cX = w/2;
int cY = h/2;
cout << "TEST " << ((double)percent/(double)100)*h << endl;
for (i = 0; i < (w*h); i++) {
    int row = i/s;
    int col = i%s;
    int distance = sqrt((col-cX)*(col-cX) + (row-cY)*(row-cY));
    if (distance < ((double)percent/(double)100)*min(cX,cY)) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
This is not a low-pass filter at all. A low-pass filter passes low frequencies, i.e. it removes fine details (blurring). You obviously need a 2D FFT for that.
This code just removes random bits, essentially.
[edit]
The new code looks a lot more like a low-pass filter. The 141% setting is expected: the diagonal of a square is sqrt(2)=1.41 times its side. Converting an index into a row/column pair should use the image width, not some random unexplained s.
I don't know where your zero frequency is located. That should be easy to spot (largest value), but it might be at (0,0) instead of (w/2,h/2).
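To make those two points concrete, here is a minimal sketch of a radial low-pass mask, assuming the full complex FFT output is stored row-major with width w and the zero frequency is at index (0,0) (no fftshift). The mapping from percent to a cutoff radius is an assumption: 100 keeps everything inside the largest circle that fits in the spectrum, smaller values blur more.
double cut = ((double)percent / 100.0) * (min(w, h) / 2.0);
for (i = 0; i < (w*h); i++) {
    int row = i / w;   // use the image width here, not s
    int col = i % w;
    // Wrap indices to signed frequencies: (0,0) is DC and the highest
    // frequencies sit near +/- w/2 and +/- h/2.
    int fx = (col <= w/2) ? col : col - w;
    int fy = (row <= h/2) ? row : row - h;
    double distance = sqrt((double)fx*fx + (double)fy*fy);
    if (distance > cut) {   // pass low frequencies, zero the rest
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}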

generating correct spectrogram using fftw and window function

For a project I need to be able to generate a spectrogram from a .WAV file. I've read the following should be done:
Get N (transform size) samples
Apply a window function
Do a Fast Fourier Transform using the samples
Normalise the output
Generate spectrogram
On the image below you see two spectrograms of a 10000 Hz sine wave, both using the Hanning window function. On the left is a spectrogram generated by Audacity and on the right my version. As you can see, my version has a lot more lines/noise. Is this leakage into different bins? How would I get a clear image like the one Audacity generates? Should I do some post-processing? I have not yet done any normalisation because I do not fully understand how to do so.
update
I found this tutorial explaining how to generate a spectrogram in C++. I compiled the source to see what differences I could find.
My math is very rusty, to be honest, so I'm not sure what the normalisation does here:
for(i = 0; i < half; i++){
    out[i][0] *= (2./transform_size);
    out[i][6] *= (2./transform_size);
    processed[i] = out[i][0]*out[i][0] + out[i][7]*out[i][8];
    //sets values between 0 and 1?
    processed[i] = 10. * (log (processed[i] + 1e-6)/log(10)) /-60.;
}
After doing this I got this image (by the way, I've inverted the colors):
I then took a look at the difference between the input samples provided by my sound library and those of the tutorial. Mine were way higher, so I manually normalised them by dividing by the factor 32767.9. I then got this image, which looks pretty OK I think. But dividing by this number seems wrong, and I would like to see a different solution.
Here is the full relevant source code.
void Spectrogram::process(){
    int i;
    int transform_size = 1024;
    int half = transform_size/2;
    int step_size = transform_size/2;
    double in[transform_size];
    double processed[half];
    fftw_complex *out;
    fftw_plan p;

    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);

    for(int x=0; x < wavFile->getSamples()/step_size; x++){
        int j = 0;
        for(i = step_size*x; i < (x * step_size) + transform_size - 1; i++, j++){
            in[j] = wavFile->getSample(i)/32767.9;
        }

        //apply window function
        for(i = 0; i < transform_size; i++){
            in[i] *= windowHanning(i, transform_size);
            // in[i] *= windowBlackmanHarris(i, transform_size);
        }

        p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);
        fftw_execute(p); /* repeat as needed */

        for(i = 0; i < half; i++){
            out[i][0] *= (2./transform_size);
            out[i][11] *= (2./transform_size);
            processed[i] = out[i][0]*out[i][0] + out[i][12]*out[i][13];
            processed[i] = 10. * (log (processed[i] + 1e-6)/log(10)) /-60.;
        }

        for (i = 0; i < half; i++){
            if(processed[i] > 0.99)
                processed[i] = 1;
            In->setPixel(x,(half-1)-i,processed[i]*255);
        }
    }

    fftw_destroy_plan(p);
    fftw_free(out);
}
This is not exactly an answer as to what is wrong, but rather a step-by-step procedure to debug this.
What do you think this line does? processed[i] = out[i][0]*out[i][0] + out[i][12]*out[i][13] Likely that is incorrect: fftw_complex is typedef double fftw_complex[2], so you only have out[i][0] and out[i][1], where the first is the real and the second the imaginary part of the result for that bin. If the array is contiguous in memory (which it is), then out[i][12] is likely the same as out[i+6][0] and so forth. Some of these will go past the end of the array, adding random values.
Is your window function correct? Print out windowHanning(i, transform_size) for every i and compare with a reference version (for example numpy.hanning or the matlab equivalent). This is the most likely cause, what you see looks like a bad window function, kind of.
Print out processed, and compare with a reference version (given the same input, of course you'd have to print the input and reformat it to feed into pylab/matlab etc). However, the -60 and 1e-6 are fudge factors which you don't want, the same effect is better done in a different way. Calculate like this:
power_in_db[i] = 10 * log(out[i][0]*out[i][0] + out[i][1]*out[i][1])/log(10)
Print out the values of power_in_db[i] for the same i but for all x (a horizontal line). Are they approximately the same?
If everything so far is good, the remaining suspect is setting the pixel values. Be very explicit about clipping to range, scaling and rounding.
int pixel_value = (int)round( 255 * (power_in_db[i] - min_db) / (max_db - min_db) );
if (pixel_value < 0) { pixel_value = 0; }
if (pixel_value > 255) { pixel_value = 255; }
Here, again, print out the values in a horizontal line, and compare with the grayscale values in your pgm (by hand, using the colorpicker in photoshop or gimp or similar).
At this point, you will have validated everything from end to end, and likely found the bug.
The code you produced was almost correct, so you didn't leave me much to correct:
void Spectrogram::process(){
    int transform_size = 1024;
    int half = transform_size/2;
    int step_size = transform_size/2;
    double in[transform_size];
    double processed[half];
    fftw_complex *out;
    fftw_plan p;

    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * transform_size);

    for (int x=0; x < wavFile->getSamples()/step_size; x++) {
        // Fill the transformation array with a sample frame and apply the window function.
        // Normalization is performed later
        // (One error was here: you didn't set the last value of the array in)
        for (int j = 0, i = x * step_size; i < x * step_size + transform_size; i++, j++)
            in[j] = wavFile->getSample(i) * windowHanning(j, transform_size);

        p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);
        fftw_execute(p); /* repeat as needed */

        for (int i=0; i < half; i++) {
            // (Here were some flaws concerning the access of the complex values)
            out[i][0] *= (2./transform_size);   // real parts
            out[i][1] *= (2./transform_size);   // imaginary parts
            processed[i] = out[i][0]*out[i][0] + out[i][1]*out[i][1];   // power spectrum
            processed[i] = 10./log(10.) * log(processed[i] + 1e-6);     // dB

            // The resulting spectral values in 'processed' are in dB and related to a maximum
            // value of about 96dB. Normalization to a value range between 0 and 1 can be done
            // in several ways. I would suggest to set values below 0dB to 0dB and divide by 96dB:
            // Transform all dB values to a range between 0 and 1:
            if (processed[i] <= 0) {
                processed[i] = 0;
            } else {
                processed[i] /= 96.;   // Reduce the divisor if you prefer darker peaks
                if (processed[i] > 1)
                    processed[i] = 1;
            }
            In->setPixel(x,(half-1)-i,processed[i]*255);
        }

        // This should be called each time fftw_plan_dft_r2c_1d()
        // was called to avoid a memory leak:
        fftw_destroy_plan(p);
    }

    fftw_free(out);
}
The two corrected bugs were most probably responsible for the slight variation of successive transformation results. The Hanning window is very well suited to minimize the "noise", so a different window would not have solved the problem. (Actually, @Alex I already pointed to the 2nd bug in his point 2. But in his point 3 he introduced a potential -Inf bug, as log(0) is not defined, which can happen if your wave file contains a stretch of exact 0 values. To avoid this, the constant 1e-6 is good enough.)
Not asked, but there are some optimizations:
put p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE); outside the main loop,
precalculate the window function outside the main loop,
abandon the array processed and just use a temporary variable to hold one spectral line at a time,
the two multiplications of out[i][0] and out[i][1] can be abandoned in favour of one multiplication with a constant in the following line. I left this (and other things) for you to improve; a rough sketch of the first two points follows below.
Thanks to @Maxime Coorevits, a memory leak could additionally be avoided: "Each time you call fftw_plan_dft_r2c_1d(), memory is allocated by FFTW3. In your code, you only call fftw_destroy_plan() outside the outer loop. But in fact, you need to call this each time you request a plan."
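Here is a rough sketch of the first two optimization points (window and plan computed once, outside the frame loop), which also keeps the plan handling leak-free since there is now only one plan request and one matching fftw_destroy_plan(). The names come from the corrected code above; everything not shown is assumed unchanged:
// Precompute the window once.
double window[transform_size];
for (int i = 0; i < transform_size; i++)
    window[i] = windowHanning(i, transform_size);

// Create the plan once; it is reused with the same in/out buffers for every frame.
fftw_plan p = fftw_plan_dft_r2c_1d(transform_size, in, out, FFTW_ESTIMATE);

for (int x = 0; x < wavFile->getSamples()/step_size; x++) {
    for (int j = 0, i = x * step_size; i < x * step_size + transform_size; i++, j++)
        in[j] = wavFile->getSample(i) * window[j];
    fftw_execute(p);   // same plan, new contents of 'in'
    // ... scale, convert to dB and set the pixels exactly as above ...
}

fftw_destroy_plan(p);  // one destroy matching the single plan creation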
Audacity typically doesn't map one frequency bin to one horizontal line, nor one sample period to one vertical line. The visual effect in Audacity may be due to resampling of the spectrogram picture in order to fit the drawing area.