Difference between logspace generators - C++

Looking through ncmpcpp's spectrum visualizer code, I found a method that generates a "logspace," a vector used to group frequencies into log-scaled bins after applying an FFT.
Here is the (isolated) code:
// Lowest frequency in display
const double HZ_MIN = 20;
// Highest frequency in display
const double HZ_MAX = 20000;
// Number of bars in spectrum
const size_t width = 100;
std::vector<double> dft_logspace;
void GenLogspace() {
    // Calculate number of extra bins needed between 0 HZ and HZ_MIN
    const size_t left_bins = (log10(HZ_MIN) - width*log10(HZ_MIN)) / (log10(HZ_MIN) - log10(HZ_MAX));
    // Generate logspaced frequencies
    dft_logspace.resize(width);
    const double log_scale = log10(HZ_MAX) / (left_bins + dft_logspace.size() - 1);
    for (size_t i = left_bins; i < dft_logspace.size() + left_bins; ++i) {
        dft_logspace[i - left_bins] = pow(10, i * log_scale);
    }
}
I spent a while trying to understand how this works... and it seems to be an awfully complicated way to get the same result as the following function, which works the way you'd expect:
Given limits a and b so that a < b, divide the interval [log10(a), log10(b)] into equal subintervals and exponential-map your way back.
// a = HZ_MIN, and
// b = HZ_MAX
void my_GenLogspace() {
    dft_logspace.resize(width);
    // Generate log-scaled frequency bins between HZ_MAX and HZ_MIN
    for (size_t i = 0; i < width; i++) {
        dft_logspace[i] = HZ_MIN * pow((HZ_MAX/HZ_MIN), ((double) i/(width-1)));
    }
}
I'm fairly sure that these are mathematically identical.
Are they? Is there any reason to use the original method over my rewrite? Does the author of the commit that introduced this code know something I don't?
Edit: (width-1), per Bob__'s suggestion

Got it. If anyone happens to need this later...
// Generate log-scaled vector of frequencies from HZ_MIN to HZ_MAX
void GenLogspace() {
    // Prepare vector
    dft_logspace.resize(width);
    // In logspace, divide the region between MAX and MIN into
    // w - 1 equal segments (by fencepost, this gives us w separators)
    const double d = (log10(HZ_MAX) - log10(HZ_MIN)) / (width - 1);
    // Calculate the number of extra bins needed between 0 HZ and HZ_MIN:
    // count how many of these segments will fit between
    // 0 and MIN (note that we're still in logspace).
    // This is how many log-scaled intervals are outside
    // our desired range of frequencies.
    const size_t skip_bins = log10(HZ_MIN) / d;
    // Calculate log scale size.
    // We can't use the value of d here, because d is "anchored" to both MIN and MAX.
    // The last bin should be equal to MAX, but there may not be a bin that is equal to MIN.
    //
    // So, we re-partition our logspace:
    // Divide the distance between 0 and MAX into equal partitions.
    const double log_scale = log10(HZ_MAX) / (skip_bins + width - 1);
    // Exponential-map bins out of logspace, skipping those that are outside our range.
    // Note that the first (skipped) bin is ALWAYS 1, since 10^0 = 1.
    // The last bin ALWAYS equals MAX.
    for (size_t i = skip_bins; i < width + skip_bins; ++i) {
        dft_logspace[i - skip_bins] = pow(10, i * log_scale);
    }
}
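For anyone comparing the two approaches, here is a quick side-by-side check (a sketch added for illustration, assuming the corrected GenLogspace above and my_GenLogspace from the question are both in scope). It makes visible the point in the comments above: the first bin of GenLogspace need not be exactly HZ_MIN, since skip_bins truncates, while my_GenLogspace pins both endpoints.
#include <cstdio>
int main() {
    GenLogspace();
    std::vector<double> skipped = dft_logspace; // result of the skip-bins method
    my_GenLogspace();                           // overwrites dft_logspace
    for (size_t i = 0; i < width; ++i)
        std::printf("%3zu: %12.4f vs %12.4f\n", i, skipped[i], dft_logspace[i]);
    return 0;
}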

Related

Detecting linear interpolation of two frequencies on embedded system

I am trying to recognise a sequence of audio frames on an embedded system, an audio frame being a frequency or an interpolation of two frequencies played for a variable amount of time. I know the sounds I am trying to recognise (i.e. the start and end frequencies, which are linearly interpolated, and the duration of each audio frame), but they are produced by another embedded system, so the microphone and speaker are cheap and somewhat inaccurate. The output is a square wave. Any suggestions on how to go about doing this?
What I am trying to do now is use the FFT to get the magnitude of all frequencies, detect the peaks, look at the detection from duration/2 ms ago and check whether that roughly matches an audio frame, and finally check whether any sound I am looking for matches the sequence.
So far I have used the FFT to process the microphone input (after applying a Hann window) and then assigned each frequency bin a coefficient indicating how likely it is to be a peak, based on how many standard deviations it lies from the mean. This hasn't worked great, since it reported peaks even when the room was silent. Any ideas on how to detect the peaks more accurately? Also, I think there are a lot of harmonics because of the square wave / interpolation. Can I use a harmonic product spectrum if the peaks don't really line up at double the frequency?
Here I graphed noise (an almost silent room) together with a point somewhere in the interpolation between 2226 and 1624 Hz.
https://i.stack.imgur.com/R5Gs2.png
I sample every 91 microseconds -> 10989 Hz. Should I sample more often?
I added here samples of how the interpolation sounds when recorded on my laptop and on the embedded system.
https://easyupload.io/m/5l72b0
#define MIC_SAMPLE_RATE 10989 // Hz
#define AUDIO_SAMPLES_NUMBER 1024

MicroBitAudioProcessor::MicroBitAudioProcessor(DataSource& source) : audiostream(source)
{
    arm_rfft_fast_init_f32(&fft_instance, AUDIO_SAMPLES_NUMBER);
    buf = (float *)malloc(sizeof(float) * (AUDIO_SAMPLES_NUMBER * 2));
    output = (float *)malloc(sizeof(float) * AUDIO_SAMPLES_NUMBER);
    mag = (float *)malloc(sizeof(float) * AUDIO_SAMPLES_NUMBER / 2);
}

float hann(int i){
    return 0.5 * (1 - arm_cos_f32(2 * 3.14159265 * i / AUDIO_SAMPLES_NUMBER));
}

int MicroBitAudioProcessor::pullRequest()
{
    int s;
    int result;
    auto mic_samples = audiostream.pull();
    if (!recording)
        return DEVICE_OK;
    int8_t *data = (int8_t *) &mic_samples[0];
    int samples = mic_samples.length() / 2;
    for (int i = 0; i < samples; i++)
    {
        s = (int) *data;
        result = s;
        data++;
        buf[position++] = (float)result;
        if (position % AUDIO_SAMPLES_NUMBER == 0)
        {
            position = 0;
            float maxValue = 0;
            uint32_t index = 0;
            // Apply a Hann window
            for (int i = 0; i < AUDIO_SAMPLES_NUMBER; i++)
                buf[i] *= hann(i);
            arm_rfft_fast_f32(&fft_instance, buf, output, 0);
            arm_cmplx_mag_f32(output, mag, AUDIO_SAMPLES_NUMBER / 2);
        }
    }
    return DEVICE_OK;
}

uint32_t frequencyToIndex(int freq) {
    return (freq / ((uint32_t)MIC_SAMPLE_RATE / AUDIO_SAMPLES_NUMBER));
}

float MicroBitAudioProcessor::getFrequencyIntensity(int freq){
    uint32_t index = frequencyToIndex(freq);
    if (index == 0 || index >= (AUDIO_SAMPLES_NUMBER / 2) - 1) return 0;
    return mag[index];
}

How to find the pixel value that corresponds to a specific number of pixels?

Assume that I have a grayscale image in OpenCV.
I want to find a value such that 5% of the pixels in the image have a value greater than it.
I can iterate over the pixels, count how many have each value, and then from the result find the value that 5% of pixels are above, but I am looking for a faster way to do this. Is there any such technique in OpenCV?
I think histogram would help, but I am not sure how I can use it.
You need to:
Compute the cumulative histogram of your pixel values
Find the first bin whose value is greater than 95% (100 - 5) of the total number of pixels.
Given an image with uniformly random pixel values, you get a roughly flat histogram, and a cumulative histogram that grows roughly linearly; in the cumulative histogram you need to find the first bin whose value is over the 95% threshold.
Then you need to find the proper bin. You can use the std::lower_bound function to find the correct value, and std::distance to find the corresponding bin number (aka the value you want to find). (Please note that with lower_bound you'll find the element whose value is greater than or equal to the given value. You can use upper_bound to find the element whose value is strictly greater than the given value.)
In this case it turns out to be 242, which makes sense for a uniform distribution from 0 to 255, since 255 * 0.95 = 242.25.
Check the full code:
#include <opencv2/opencv.hpp>
#include <vector>
#include <algorithm>
using namespace std;
using namespace cv;
void drawHist(const vector<int>& data, Mat3b& dst, int binSize = 3, int height = 0, int ref_value = -1)
{
    int max_value = *max_element(data.begin(), data.end());
    int rows = 0;
    int cols = 0;
    float scale = 1;
    if (height == 0) {
        rows = max_value + 10;
    }
    else {
        rows = height;
        scale = float(height) / (max_value + 10);
    }
    cols = data.size() * binSize;
    dst = Mat3b(rows, cols, Vec3b(0, 0, 0));
    for (int i = 0; i < data.size(); ++i)
    {
        int h = rows - int(scale * data[i]);
        rectangle(dst, Point(i*binSize, h), Point((i + 1)*binSize - 1, rows), (i % 2) ? Scalar(0, 100, 255) : Scalar(0, 0, 255), CV_FILLED);
    }
    if (ref_value >= 0)
    {
        int h = rows - int(scale * ref_value);
        line(dst, Point(0, h), Point(cols, h), Scalar(255, 0, 0));
    }
}
int main()
{
    Mat1b src(100, 100);
    randu(src, Scalar(0), Scalar(255));
    int percent = 5; // percent % of pixel values are above a val
    int val;         // I need to find this value
    int n = src.rows * src.cols; // Total number of pixels
    int th = cvRound((100 - percent) / 100.f * n); // Number of pixels below val
    // Histogram
    vector<int> hist(256, 0);
    for (int r = 0; r < src.rows; ++r) {
        for (int c = 0; c < src.cols; ++c) {
            hist[src(r, c)]++;
        }
    }
    // Cumulative histogram
    vector<int> cum = hist;
    for (int i = 1; i < hist.size(); ++i) {
        cum[i] = cum[i - 1] + hist[i];
    }
    // lower_bound returns an iterator pointing to the first element
    // that is not less than (i.e. greater than or equal to) th.
    val = distance(cum.begin(), lower_bound(cum.begin(), cum.end(), th));
    // Plot histograms
    Mat3b plotHist, plotCum;
    drawHist(hist, plotHist, 3, 300);
    drawHist(cum, plotCum, 3, 300, *lower_bound(cum.begin(), cum.end(), th));
    cout << "Value: " << val;
    imshow("Hist", plotHist);
    imshow("Cum", plotCum);
    waitKey();
    return 0;
}
Note
The histogram drawing function is an upgrade from a former version I posted here
You can use calcHist to compute the histograms, but I personally find it easier to use the aforementioned method for 1D histograms.
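For completeness, a minimal sketch of the calcHist alternative (assuming the same single-channel 8-bit src as in the code above):
// Sketch: the same 256-bin histogram via cv::calcHist.
Mat hist;
int histSize = 256;
int channels[] = { 0 };
float range[] = { 0, 256 };
const float* ranges[] = { range };
calcHist(&src, 1, channels, Mat(), hist, 1, &histSize, ranges);
// hist is a 256x1 CV_32F Mat; accumulate and search it just like the
// vector<int> version above.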
1) Determine the height and the width of the image, h and w.
2) Determine what 5% of the total number of pixels is (X)...
X = int(h * w * 0.05)
3) Start at the brightest bin in the histogram. Set total T = 0.
4) Add the number of pixels in this bin to your total T. If T is greater than X, you are finished and the value you want is the lower limit of the range of the current histogram bin.
5) Move to the next darker bin in your histogram. Go to step 4. (A sketch of this walk follows below.)
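In code, that walk might look like this (a sketch reusing hist and src from the first answer, not part of the original post):
int X = int(src.rows * src.cols * 0.05); // step 2: 5% of all pixels
int T = 0;                               // step 3: running total
int val;
for (val = 255; val >= 0; --val) {       // start at the brightest bin
    T += hist[val];                      // step 4: add this bin's count
    if (T > X) break;                    // finished: val is the value you want
}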

Getting values for specific frequencies in a short time fourier transform

I'm trying to use C++ to recreate the spectrogram function used by Matlab. The function uses a Short Time Fourier Transform (STFT). I found some C++ code here that performs an STFT. The code seems to work perfectly for all frequencies, but I only want a few.
Just take the inner product of your data with a complex exponential at
the frequency of interest. If g is your data, then just substitute for
f the value of the frequency you want (e.g., 1, 3, 10, ...)
Having no background in mathematics, I can't figure out how to do this. The inner product part seems simple enough from the Wikipedia page but I have absolutely no idea what he means by (with regard to the formula for a DFT)
a complex exponential at frequency of interest
Could someone explain how I might be able to do this? My data structure after the STFT is a matrix filled with complex numbers. I just don't know how to extract my desired frequencies.
Relevant function, where the window is a Hamming window, and the vector of desired frequencies isn't yet an input because I don't know what to do with it:
Matrix<complex<double>> ShortTimeFourierTransform::Calculate(const vector<double> &signal,
    const vector<double> &window, int windowSize, int hopSize)
{
    int signalLength = signal.size();
    int nOverlap = hopSize;
    int cols = (signal.size() - nOverlap) / (windowSize - nOverlap);
    Matrix<complex<double>> results(window.size(), cols);
    int chunkPosition = 0;
    int readIndex;
    // Should we stop reading in chunks?
    bool shouldStop = false;
    int numChunksCompleted = 0;
    int i;
    // Process each chunk of the signal
    while (chunkPosition < signalLength && !shouldStop)
    {
        // Copy the chunk into our buffer
        for (i = 0; i < windowSize; i++)
        {
            readIndex = chunkPosition + i;
            if (readIndex < signalLength)
            {
                // Note the windowing!
                data[i][0] = signal[readIndex] * window[i];
                data[i][1] = 0.0;
            }
            else
            {
                // we have read beyond the signal, so zero-pad it!
                data[i][0] = 0.0;
                data[i][1] = 0.0;
                shouldStop = true;
            }
        }
        // Perform the FFT on our chunk
        fftw_execute(plan_forward);
        // Copy the first (windowSize/2 + 1) data points into your spectrogram.
        // We do this because the FFT output is mirrored about the Nyquist
        // frequency, so the second half of the data is redundant. This is how
        // Matlab's spectrogram routine works.
        for (i = 0; i < windowSize / 2 + 1; i++)
        {
            double real = fft_result[i][0];
            double imaginary = fft_result[i][1];
            results(i, numChunksCompleted) = complex<double>(real, imaginary);
        }
        chunkPosition += hopSize;
        numChunksCompleted++;
    }
    return results;
}
Look up the Goertzel algorithm or filter for example code that uses the computational equivalent of an inner product against a complex exponential to measure the presence or magnitude of a specific stationary sinusoidal frequency in a signal. Performance or resolution will depend on the length of the filter and your signal.
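To make that concrete, here is a minimal Goertzel sketch (an illustrative example with assumed names, not drop-in code for the Matrix-based function above):
#include <cmath>
#include <vector>
// Sketch: squared magnitude of one target frequency in a block of samples.
double goertzelMagnitudeSquared(const std::vector<double>& block,
                                double targetHz, double sampleRateHz)
{
    const double N = (double)block.size();
    const double k = std::floor(0.5 + N * targetHz / sampleRateHz); // nearest bin
    const double omega = 2.0 * 3.14159265358979323846 * k / N;
    const double coeff = 2.0 * std::cos(omega);
    double s1 = 0.0, s2 = 0.0;
    for (double x : block) {
        const double s0 = x + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2; // |X(k)|^2
}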

Implementing FFT low-pass filter in C with FFTW

I am trying to create a very simple C++ program that, given an argument in the range [0-100], applies a low-pass filter to a grayscale image that should "compress" it proportionally to the value of the given argument.
I am using the FFTW library.
I have some doubts about how I define the frequency threshold, cut. Is there a more effective way to define such a value?
// fftw_complex *fft
// double[] magnitude
// . . .
int percent = 100;
if (percent < 0 || percent > 100) {
    cerr << "Compression rate must be a value between 0 and 100." << endl;
    return -1;
}
double cut = (double)(w * h) * ((double)percent / 100.0);
for (i = 0; i < (w * h); i++) {
    magnitude[i] = sqrt(pow(fft[i][0], 2.0) + pow(fft[i][1], 2.0));
    if (magnitude[i] < cut) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
Update1:
I've changed my code to this, but again I'm not sure this is a proper way to filter frequencies. The image is surely compressed, but non-square images are messed up and setting compression to 100% isn't the real maximum compression available (I can go up to ~140%).
Here you can find an image of what I see now.
int cX = w / 2;
int cY = h / 2;
cout << "TEST " << ((double)percent / 100.0) * h << endl;
for (i = 0; i < (w * h); i++) {
    int row = i / s;
    int col = i % s;
    int distance = sqrt((col - cX) * (col - cX) + (row - cY) * (row - cY));
    if (distance < ((double)percent / 100.0) * min(cX, cY)) {
        fft[i][0] = 0.0;
        fft[i][1] = 0.0;
    }
}
This is not a low-pass filter at all. A low-pass filter passes low frequencies, i.e. it removes fine details (blurring). You obviously need a 2D FFT for that.
This code just removes random bits, essentially.
[edit]
The new code looks a lot more like a low-pass filter. The 141% setting is expected: the diagonal of a square is sqrt(2)=1.41 times its side. Converting an index into a row/column pair should use the image width, not some random unexplained s.
I don't know where your zero frequency is located. It should be easy to spot (largest value), but it might be in (0,0) instead of (w/2,h/2).
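If the zero frequency does sit at (0,0), as it does for unshifted FFTW output, the distance computation can wrap the indices instead of shifting the image. A sketch under that assumption (w, h, percent and fft as in the question; this keeps low frequencies and zeroes the rest, i.e. an actual low-pass):
double cutoff = ((double)percent / 100.0) * 0.5 * min(w, h);
for (int row = 0; row < h; row++) {
    for (int col = 0; col < w; col++) {
        // Map indices to signed frequencies so bin (0,0) has distance 0
        int fr = (row <= h / 2) ? row : row - h;
        int fc = (col <= w / 2) ? col : col - w;
        double distance = sqrt((double)(fr * fr + fc * fc));
        if (distance > cutoff) { // keep low frequencies, zero the rest
            fft[row * w + col][0] = 0.0;
            fft[row * w + col][1] = 0.0;
        }
    }
}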

Fast percentile in C++

My program calculates a Monte Carlo simulation for the value-at-risk metric. To simplify as much as possible, I have:
1/ simulated daily cashflows
2/ to get a sample of a possible 1-year cashflow,
I need to draw 365 random daily cashflows and sum them
Hence, the daily cashflows are an empirically given distribution function to be sampled 365 times. For this, I
1/ sort the daily cashflows into an array called *this->distro*
2/ calculate 365 percentiles corresponding to random probabilities
I need to do this simulation of a yearly cashflow, say, 10K times to get a population of simulated yearly cashflows to work with. Having the distribution function of daily cashflows prepared, I do the sampling like...
for ( unsigned int idxSim = 0; idxSim < _g.xSimulationCount; idxSim++ )
{
    generatedVal = 0.0;
    for ( register unsigned int idxDay = 0; idxDay < 365; idxDay++ )
    {
        prob = (FLT_TYPE)fastrand();     // prob [0,1]
        dIdx = prob * dMaxDistroIndex;   // scale prob to distro function size
                                         // to get an index into distro array
        _floor = ((FLT_TYPE)(long)dIdx); // fast version of floor
        _ceil = _floor + 1.0f;           // 'fast' ceil:)
        iIdx1 = (unsigned int)( _floor );
        iIdx2 = iIdx1 + 1;
        // interpolation per se
        generatedVal += this->distro[iIdx1] * (_ceil - dIdx);
        generatedVal += this->distro[iIdx2] * (dIdx - _floor);
    }
    this->yearlyCashflows[idxSim] = generatedVal;
}
The code inside both for loops does linear interpolation. If, say, USD 1000 corresponds to prob=0.01 and USD 10000 corresponds to prob=0.1, then if I don't have an empirical number for p=0.05, I want to get USD 5000 by interpolation.
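Spelled out with those numbers: 1000 + (0.05 - 0.01) / (0.1 - 0.01) * (10000 - 1000) = 1000 + 4000 = 5000.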
The question: this code runs correctly, though the profiler says that the program spends about 60% of its runtime on the interpolation per se. So my question is: how can I make this task faster? Sample runtimes reported by VTune for specific lines are as follows:
prob = (FLT_TYPE)fastrand(); // 0.727s
dIdx = prob * dMaxDistroIndex; // 1.435s
_floor = ((FLT_TYPE)(long)dIdx); // 0.718s
_ceil = _floor + 1.0f; // -
iIdx1 = (unsigned int)( _floor ); // 4.949s
iIdx2 = iIdx1 + 1; // -
// interpolation per se
generatedVal += this->distro[iIdx1]*(_ceil - dIdx ); // -
generatedVal += this->distro[iIdx2]*(dIdx - _floor); // 12.704s
Dashes mean the profiler reports no runtimes for those lines.
Any hint will be greatly appreciated.
Daniel
EDIT:
Both c.fogelklou and MSalters have pointed out great enhancements. The best code in line with what c.fogelklou said is
converter = distroDimension / ((FLT_TYPE)RAND_MAX + 1); // cast before +1 to avoid int overflow
for ( unsigned int idxSim = 0; idxSim < _g.xSimulationCount; idxSim++ )
{
    generatedVal = 0.0;
    for ( register unsigned int idxDay = 0; idxDay < 365; idxDay++ )
    {
        dIdx = (FLT_TYPE)fastrand() * converter;
        iIdx1 = (unsigned long)dIdx;
        _floor = (FLT_TYPE)iIdx1;
        generatedVal += this->distro[iIdx1] + this->diffs[iIdx1] * (dIdx - _floor);
    }
}
and the best I have along MSalters' lines is
normalizer = 1.0 / ((FLT_TYPE)RAND_MAX + 1); // cast before +1 to avoid int overflow
for ( unsigned int idxSim = 0; idxSim < _g.xSimulationCount; idxSim++ )
{
    generatedVal = 0.0;
    for ( register unsigned int idxDay = 0; idxDay < 365; idxDay++ )
    {
        dIdx = (FLT_TYPE)fastrand() * normalizer;
        iIdx1 = fastrand() % _g.xDayCount;
        generatedVal += this->distro[iIdx1];
        generatedVal += this->diffs[iIdx1] * dIdx;
    }
}
The second code is approx. 30 percent faster. Now, of 95s of total runtime, the last line consumes 68s. The last-but-one line consumes only 3.2s, hence the double*double multiplication must be the devil. I thought of SSE - saving the last three operands into an array and then carrying out a vector multiplication of this->diffs[i]*dIdx[i] and adding this to this->distro[i] - but this code ran 50 percent slower. Hence, I think I hit the wall.
Many thanks to all.
D.
This is a proposal for a small optimization, removing the need for ceil, two casts, and one of the multiplies. If you are running on a fixed point processor, that would explain why the multiplies and casts between float and int are taking so long. In that case, try using fixed point optimizations or turning on floating point in your compiler if the CPU supports it!
for ( unsigned int idxSim = 0; idxSim < _g.xSimulationCount; idxSim++ )
{
    generatedVal = 0.0;
    for ( register unsigned int idxDay = 0; idxDay < 365; idxDay++ )
    {
        prob = (FLT_TYPE)fastrand();   // prob [0,1]
        dIdx = prob * dMaxDistroIndex; // scale prob to distro function size
                                       // to get an index into distro array
        iIdx1 = (long)dIdx;
        _floor = (FLT_TYPE)iIdx1;      // fast version of floor
        iIdx2 = iIdx1 + 1;
        // interpolation per se
        {
            const FLT_TYPE diff = this->distro[iIdx2] - this->distro[iIdx1];
            const FLT_TYPE interp = this->distro[iIdx1] + diff * (dIdx - _floor);
            generatedVal += interp;
        }
    }
    this->yearlyCashflows[idxSim] = generatedVal;
}
I would recommend fixing fastrand. Floating-point code isn't the fastest in the world, but what is especially slow is switching between floating-point and integer code. Since you need an integer index, use an integer random function.
It may even be advantageous to pre-generate all 365 random values in a loop. Since you need only log2(dMaxDistroIndex) bits of randomness per value, you may be able to reduce the number of RNG calls.
You would subsequently pick a random number between 0 and 1 for the interpolation fraction.
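As an illustration of squeezing several indices out of one RNG call (a sketch assuming a 32-bit fastrand() and a power-of-two distro size of 1024 = 2^10 entries; both are assumptions, not facts from the post):
// Sketch: three 10-bit indices from a single random draw.
unsigned int r = (unsigned int)fastrand();
unsigned int iIdxA = r & 1023;          // bits 0..9
unsigned int iIdxB = (r >> 10) & 1023;  // bits 10..19
unsigned int iIdxC = (r >> 20) & 1023;  // bits 20..29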