Swift. frequency of sound got from vDSP.DCT output differs from iPhone and iPad - swiftui

I'm trying to figure out the amplitude of each frequency of sound captured by microphone.
Just like this example https://developer.apple.com/documentation/accelerate/visualizing_sound_as_an_audio_spectrogram
I captured sample from microphone to sample buffer, copy to a circle buffer, and then performed ForwardDCT on it, just like this:
func processData(values: [Int16]) {
vDSP.convertElements(of: values,
to: &timeDomainBuffer)
vDSP.multiply(timeDomainBuffer,
hanningWindow,
result: &timeDomainBuffer)
forwardDCT.transform(timeDomainBuffer,
result: &frequencyDomainBuffer)
vDSP.absolute(frequencyDomainBuffer,
result: &frequencyDomainBuffer)
vDSP.convert(amplitude: frequencyDomainBuffer,
toDecibels: &frequencyDomainBuffer,
zeroReference: Float(Microphone.sampleCount))
if frequencyDomainValues.count > Microphone.sampleCount {
frequencyDomainValues.removeFirst(Microphone.sampleCount)
}
frequencyDomainValues.append(contentsOf: frequencyDomainBuffer)
}
the timeDomainBuffer is the float16 Array contains samples counting sampleCount,
while the frequencyDomainBuffer is the amplitude of each frequency, frequency is denoted as it's array index with it's value expressing amplitude of this frequency.
I'm trying to get amplitude of each frequency, just like this:
for index in frequencyDomainBuffer{
let frequency = index * (AVAudioSession().sampleRate/Double(Microphone.sampleCount)/2)
}
I supposed the index of freqeuencyDomainBuffer will be linear to the actual frequency, so sampleRate divided by half of sampleCount will be correct. (sampleCount is the timeDomainBuffer length)
The result is correct when running on my iPad, but the frequency got 10% higher when on iPhone.
I'm dubious whether AVAudioSession().sampleRate can be used on iPhone?
Of course I can add a condition like if iPhone, but I'd like to know why and will it be correct on other devices I haven't tested on?

If you're seeing a consistent 10% difference, I'm betting it's actually an 8.9% difference. I haven't studied your code, but I'd look for a hard-coded 44.1kHz somewhere. The sample rate on iPhones is generally 48kHz.
Remember also that the bins are (as you suspect) proportional to the sampling rate. So at different sampling rates the center of the bins are different. Depending on the number of bins you're using, this could represent a large difference (not really an "error" since the bins are ranges, but if you assume it's precisely the center frequency, this could match your 10%).

Related

Basic example of how to do numerical integration in C++

I think most people know how to do numerical derivation in computer programming, (as limit --> 0; read: "as the limit approaches zero").
//example code for derivation of position over time to obtain velocity
float currPosition, prevPosition, currTime, lastTime, velocity;
while (true)
{
prevPosition = currPosition;
currPosition = getNewPosition();
lastTime = currTime;
currTime = getTimestamp();
// Numerical derivation of position over time to obtain velocity
velocity = (currPosition - prevPosition)/(currTime - lastTime);
}
// since the while loop runs at the shortest period of time, we've already
// achieved limit --> 0;
This is the basic building block for most derivation programming.
How can I do this with integrals? Do I use a for loop and add or what?
Numerical derivation and integration in code for physics, mapping, robotics, gaming, dead-reckoning, and controls
Pay attention to where I use the words "estimate" vs "measurement" below. The difference is important.
Measurements are direct readings from a sensor.
Ex: a GPS measures position (meters) directly, and a speedometer measures speed (m/s) directly.
Estimates are calculated projections you can obtain through integrating and derivating (deriving) measured values.
Ex: you can derive position measurements (m) with respect to time to obtain speed or velocity (m/s) estimates, and you can integrate speed or velocity measurements (m/s) with respect to time to obtain position or displacement (m) estimates.
Wait, aren't all "measurements" actually just "estimates" at some fundamental level?
Yeah--pretty much! But, they are not necessarily produced through derivations or integrations with respect to time, so that is a bit different.
Also note that technically, virtually nothing can truly be measured directly. All sensors get reduced down to a voltage or a current, and guess how you measure a current?--a voltage!--either as a voltage drop across a tiny resistance, or as a voltage induced through an inductive coil due to current flow. So, everything boils down to a voltage. Even devices which "measure speed directly" may be using pressure (pitot-static tube on airplane), doppler/phase shift (radar or sonar), or looking at distance over time and then outputting speed. Fluid speed, or speed with respect to fluid such as air or water, can even be measured via a hot wire anemometer by measuring the current required to keep a hot wire at a fixed temperature, or by measuring the temperature change of the hot wire at a fixed current. And how is that temperature measured? Temperature is just a thermo-electrically-generated voltage, or a voltage drop across a diode or other resistance.
As you can see, all of these "measurements" and "estimates", at the low level, are intertwined. However, if a given device has been produced, tested, and calibrated to output a given "measurement", then you can accept it as a "source of truth" for all practical purposes and call it a "measurement". Then, anything you derive from that measurement, with respect to time or some other variable, you can consider an "estimate". The irony of this is that if you calibrate your device and output derived or integrated estimates, someone else could then consider your output "estimates" as their input "measurements" in their system, in a sort of never-ending chain down the line. That's being pedantic, however. Let's just go with the simplified definitions I have above for the time being.
The following table is true, for example. Read the 2nd line, for instance, as: "If you take the derivative of a velocity measurement with respect to time, you get an acceleration estimate, and if you take its integral, you get a position estimate."
Derivatives and integrals of position
Measurement, y Derivative Integral
Estimate (dy/dt) Estimate (dy*dt)
----------------------- ----------------------- -----------------------
position [m] velocity [m/s] - [m*s]
velocity [m/s] acceleration [m/s^2] position [m]
acceleration [m/s^2] jerk [m/s^3] velocity [m/s]
jerk [m/s^3] snap [m/s^4] acceleration [m/s^2]
snap [m/s^4] crackle [m/s^5] jerk [m/s^3]
crackle [m/s^5] pop [m/s^6] snap [m/s^4]
pop [m/s^6] - [m/s^7] crackle [m/s^5]
For jerk, snap or jounce, crackle, and pop, see: https://en.wikipedia.org/wiki/Fourth,_fifth,_and_sixth_derivatives_of_position.
1. numerical derivation
Remember, derivation obtains the slope of the line, dy/dx, on an x-y plot. The general form is (y_new - y_old)/(x_new - x_old).
In order to obtain a velocity estimate from a system where you are obtaining repeated position measurements (ex: you are taking GPS readings periodically), you must numerically derivate your position measurements over time. Your y-axis is position, and your x-axis is time, so dy/dx is simply (position_new - position_old)/(time_new - time_old). A units check shows this might be meters/sec, which is indeed a unit for velocity.
In code, that would look like this, for a system where you're only measuring position in 1-dimension:
double position_new_m = getPosition(); // m = meters
double position_old_m;
// `getNanoseconds()` should return a `uint64_t timestamp in nanoseconds, for
// instance
double time_new_sec = NS_TO_SEC((double)getNanoseconds());
double time_old_sec;
while (true)
{
position_old_m = position_new_m;
position_new_m = getPosition();
time_old_sec = time_new_sec;
time_new_sec = NS_TO_SEC((double)getNanoseconds());
// Numerical derivation of position measurements over time to obtain
// velocity in meters per second (mps)
double velocity_mps =
(position_new_m - position_old_m)/(time_new_sec - time_old_sec);
}
2. numerical integration
Numerical integration obtains the area under the curve, dy*dx, on an x-y plot. One of the best ways to do this is called trapezoidal integration, where you take the average dy reading and multiply by dx. This would look like this: (y_old + y_new)/2 * (x_new - x_old).
In order to obtain a position estimate from a system where you are obtaining repeated velocity measurements (ex: you are trying to estimate distance traveled while only reading the speedometer on your car), you must numerically integrate your velocity measurements over time. Your y-axis is velocity, and your x-axis is time, so (y_old + y_new)/2 * (x_new - x_old) is simply velocity_old + velocity_new)/2 * (time_new - time_old). A units check shows this might be meters/sec * sec = meters, which is indeed a unit for distance.
In code, that would look like this. Notice that the numerical integration obtains the distance traveled over that one tiny time interval. To obtain an estimate of the total distance traveled, you must sum all of the individual estimates of distance traveled.
double velocity_new_mps = getVelocity(); // mps = meters per second
double velocity_old_mps;
// `getNanoseconds()` should return a `uint64_t timestamp in nanoseconds, for
// instance
double time_new_sec = NS_TO_SEC((double)getNanoseconds());
double time_old_sec;
// Total meters traveled
double distance_traveled_m_total = 0;
while (true)
{
velocity_old_mps = velocity_new_mps;
velocity_new_mps = getVelocity();
time_old_sec = time_new_sec;
time_new_sec = NS_TO_SEC((double)getNanoseconds());
// Numerical integration of velocity measurements over time to obtain
// a distance estimate (in meters) over this time interval
double distance_traveled_m =
(velocity_old_mps + velocity_new_mps)/2 * (time_new_sec - time_old_sec);
distance_traveled_m_total += distance_traveled_m;
}
See also: https://en.wikipedia.org/wiki/Numerical_integration.
Going further:
high-resolution timestamps
To do the above, you'll need a good way to obtain timestamps. Here are various techniques I use:
In C++, use my uint64_t nanos() function here.
If using Linux in C or C++, use my uint64_t nanos() function which uses clock_gettime() here. Even better, I have wrapped it up into a nice timinglib library for Linux, in my eRCaGuy_hello_world repo here:
timinglib.h
timinglib.c
Here is the NS_TO_SEC() macro from timing.h:
#define NS_PER_SEC (1000000000L)
/// Convert nanoseconds to seconds
#define NS_TO_SEC(ns) ((ns)/NS_PER_SEC)
If using a microcontroller, you'll need to read an incrementing periodic counter from a timer or counter register which you have configured to increment at a steady, fixed rate. Ex: on Arduino: use micros() to obtain a microsecond timestamp with 4-us resolution (by default, it can be changed). On STM32 or others, you'll need to configure your own timer/counter.
use high data sample rates
Taking data samples as fast as possible in a sample loop is a good idea, because then you can average many samples to achieve:
Reduced noise: averaging many raw samples reduces noise from the sensor.
Higher-resolution: averaging many raw samples actually adds bits of resolution in your measurement system. This is known as oversampling.
I write about it on my personal website here: ElectricRCAircraftGuy.com: Using the Arduino Uno’s built-in 10-bit to 16+-bit ADC (Analog to Digital Converter).
And Atmel/Microchip wrote about it in their white-paper here: Application Note AN8003: AVR121: Enhancing ADC resolution by oversampling.
Taking 4^n samples increases your sample resolution by n bits of resolution. For example:
4^0 = 1 sample at 10-bits resolution --> 1 10-bit sample
4^1 = 4 samples at 10-bits resolution --> 1 11-bit sample
4^2 = 16 samples at 10-bits resolution --> 1 12-bit sample
4^3 = 64 samples at 10-bits resolution --> 1 13-bit sample
4^4 = 256 samples at 10-bits resolution --> 1 14-bit sample
4^5 = 1024 samples at 10-bits resolution --> 1 15-bit sample
4^6 = 4096 samples at 10-bits resolution --> 1 16-bit sample
See:
So, sampling at high sample rates is good. You can do basic filtering on these samples.
If you process raw samples at a high rate, doing numerical derivation on high-sample-rate raw samples will end up derivating a lot of noise, which produces noisy derivative estimates. This isn't great. It's better to do the derivation on filtered samples: ex: the average of 100 or 1000 rapid samples. Doing numerical integration on high-sample-rate raw samples, however, is fine, because as Edgar Bonet says, "when integrating, the more samples you get, the better the noise averages out." This goes along with my notes above.
Just using the filtered samples for both numerical integration and numerical derivation, however, is just fine.
use reasonable control loop rates
Control loop rates should not be too fast. The higher the sample rates, the better, because you can filter them to reduce noise. The higher the control loop rate, however, not necessarily the better, because there is a sweet spot in control loop rates. If your control loop rate is too slow, the system will have a slow frequency response and won't respond to the environment fast enough, and if the control loop rate is too fast, it ends up just responding to sample noise instead of to real changes in the measured data.
Therefore, even if you have a sample rate of 1 kHz, for instance, to oversample and filter the data, control loops that fast are not needed, as the noise from readings of real sensors over very small time intervals will be too large. Use a control loop anywhere from 10 Hz ~ 100 Hz, perhaps up to 400+ Hz for simple systems with clean data. In some scenarios you can go faster, but 50 Hz is very common in control systems. The more-complicated the system and/or the more-noisy the sensor measurements, generally, the slower the control loop must be, down to about 1~10 Hz or so. Self-driving cars, for instance, which are very complicated, frequently operate at control loops of only 10 Hz.
loop timing and multi-tasking
In order to accomplish the above, independent measurement and filtering loops, and control loops, you'll need a means of performing precise and efficient loop timing and multi-tasking.
If needing to do precise, repetitive loops in Linux in C or C++, use the sleep_until_ns() function from my timinglib above. I have a demo of my sleep_until_us() function in-use in Linux to obtain repetitive loops as fast as 1 KHz to 100 kHz here.
If using bare-metal (no operating system) on a microcontroller as your compute platform, use timestamp-based cooperative multitasking to perform your control loop and other loops such as measurements loops, as required. See my detailed answer here: How to do high-resolution, timestamp-based, non-blocking, single-threaded cooperative multi-tasking.
full, numerical integration and multi-tasking example
I have an in-depth example of both numerical integration and cooperative multitasking on a bare-metal system using my CREATE_TASK_TIMER() macro in my Full coulomb counter example in code. That's a great demo to study, in my opinion.
Kalman filters
For robust measurements, you'll probably need a Kalman filter, perhaps an "unscented Kalman Filter," or UKF, because apparently they are "unscented" because they "don't stink."
See also
My answer on Physics-based controls, and control systems: the many layers of control

C++ mathematical function generation

In working on a project I came across the need to generate various waves, accurately. I thought that a simple sine wave would be the easiest to begin with, but it appears that I am mistaken. I made a simple program that generates a vector of samples and then plays those samples back so that the user hears the wave, as a test. Here is the relevant code:
vector<short> genSineWaveSample(int nsamples, float freq, float amp) {
vector<short> samples;
for(float i = 0; i <= nsamples; i++) {
samples.push_back(amp * sinx15(freq*i));
}
return samples;
}
I'm not sure what the issue with this is. I understand that there could be some issue with the vector being made of shorts, but that's what my audio framework wants, and I am inexperienced with that kind of library and so do not know what to expect.
The symptoms are as follows:
frequency not correct
ie: given freq=440, A4 is not the note played back
strange distortion
Most frequencies do not generate a clean wave. 220, 440, 880 are all clean, most others are distorted
Most frequencies are shifted upwards considerably
Can anyone give advice as to what I may be doing wrong?
Here's what I've tried so far:
Making my own sine function, for greater accuracy.
I used a 15th degree Taylor Series expansion for sin(x)
Changed the sample rate, anything from 256 to 44100, no change can be heard given the above errors, the waves are simply more distorted.
Thank you. If there is any information that can help you, I'd be obliged to provide it.
I suspect that you are passing incorrect values to your sin15x function. If you are familiar with the basics of signal processing the Nyquist frequency is the minimum frequency at which you can faithful reconstruct (or in your case construct) a sampled signal. The is defined as 2x the highest frequency component present in the signal.
What this means for your program is that you need at last 2 values per cycle of the highest frequency you want to reproduce. At 20Khz you'd need 40,000 samples per second. It looks like you are just packing a vector with values and letting the playback program sort out the timing.
We will assume you use 44.1Khz as your playback sampling frequency. This means that a snipet of code producing one second of a 1kHz wave would look like
DataStructure wave = new DataStructure(44100) // creates some data structure of 44100 in length
for(int i = 0; i < 44100; i++)
{
wave[i] = sin(2*pi * i * (frequency / 44100) + pi / 2) // sin is in radians, frequency in Hz
}
You need to divide by the frequency, not multiply. To see this, take the case of a 22,050 Hz frequency value is passed. For i = 0, you get sin(0) = 1. For i = 1, sin(3pi/2) = -1 and so on are so forth. This gives you a repeating sequence of 1, -1, 1, -1... which is the correct representation of a 22,050Hz wave sampled at 44.1Khz. This works as you go down in frequency but you get more and more samples per cycle. Interestingly though this does not make a difference. A sinewave sampled at 2 samples per cycle is just as accurately recreated as one that is sampled 1000 times per second. This doesn't take into account noise but for most purposes works well enough.
I would suggest looking into the basics of digital signal processing as it a very interesting field and very useful to understand.
Edit: This assumes all of those parameters are evaluated as floating point numbers.
Fundamentally, you're missing a piece of information. You don't specify the amount of time over which you want your samples taken. This could also be thought of as the rate at which the samples will be played by your system. Something roughly in this direction will get you closer, for now, though.
samples.push_back(amp * std::sin(M_PI / freq *i));

Drawing audio spectrum with Bass library

How can I draw an spectrum for an given audio file with Bass library?
I mean the chart similar to what Audacity generates:
I know that I can get the FFT data for given time t (when I play the audio) with:
float fft[1024];
BASS_ChannelGetData(chan, fft, BASS_DATA_FFT2048); // get the FFT data
That way I get 1024 values in array for each time t. Am I right that the values in that array are signal amplitudes (dB)? If so, how the frequency (Hz) is associated with those values? By the index?
I am an programmer, but I am not experienced with audio processing at all. So I don't know what to do, with the data I have, to plot the needed spectrum.
I am working with C++ version, but examples in other languages are just fine (I can convert them).
From the documentation, that flag will cause the FFT magnitude to be computed, and from the sounds of it, it is the linear magnitude.
dB = 10 * log10(intensity);
dB = 20 * log10(pressure);
(I'm not sure whether audio file samples are a measurement of intensity or pressure. What's a microphone output linearly related to?)
Also, it indicates the length of the input and the length of the FFT match, but half the FFT (corresponding to negative frequencies) is discarded. Therefore the highest FFT frequency will be one-half the sampling frequency. This occurs at N/2. The docs actually say
For example, with a 2048 sample FFT, there will be 1024 floating-point values returned. If the BASS_DATA_FIXED flag is used, then the FFT values will be in 8.24 fixed-point form rather than floating-point. Each value, or "bin", ranges from 0 to 1 (can actually go higher if the sample data is floating-point and not clipped). The 1st bin contains the DC component, the 2nd contains the amplitude at 1/2048 of the channel's sample rate, followed by the amplitude at 2/2048, 3/2048, etc.
That seems pretty clear.

Audio Visualizer from wav looks wrong

I'm having trouble making an audio visualizer look accurate. The bins that have a significant amount of sound tend to draw correctly, but the problem I'm having is that all the frequencies with no significant sound seem to be coming back with a value that usually bounces between -60dB and -40dB. This forms a flat bouncing line (usually in the higher freqencies).
I want to display 512 bins or less at 30 frames per second. I've been reading up on FFT and audio non stop for a couple weeks now, and so far my process has been:
Load pcm data from wav file. This comes in as 44100 samples per second that have a range of -/+ 32767. I'm assuming I treat these as real numbers when passing them to the FFT.
Divide these samples up into 1470 per frame. (446 are ignored)
Take 1024 samples and apply a Hann window.
Pass the samples to FFT as an array of real[1024] as well as another array of the same size filled with zeros for the imaginary part.
Get the magnitude by looping through the (samples/2) bins and do a sqrt(real[i]*real[i] + img[i]*img[i]).
Taking 20 * log(magnitude) to get the decibel level of each bin
Draw a rectangle for each bin. Draw these bins for each frame.
I've tested it with a couple songs, and a wav file I generated that just plays a tone at 440Hz. With the wav file, I do get a spike at the 440 bin, but all the other bins form a line that isn't much shorter than the 440 bin. Also every other frame, the bins apart from 440 look like a graphed log function with a dip on some other bin.
I'm writing this in c++. Using STK to only load left channel from the audio file:
//put every sample in the song into a temporary vector
for (int i = 0; i < stkObject->getSize(); i++)
{
standardVector.push_back(stkObject->tick(LEFT));
}
I'm using FFTReal to perform the FFT:
std::vector<std::vector <double> > leftChannelData;
int numberOfFrames = stkObject->getSize()/samplesPerFrame;
leftChannelData.resize(numberOfFrames);
for(int i = 0; i < numberOfFrames; i++)
{
for(int j = 0; j < FFT_SAMPLE_LENGTH; j++)
{
real[j] = standardVector[j + (i*samplesPerFrame)];
}
applyHannWindow(real, FFT_SAMPLE_LENGTH);
fft_object.do_fft(imaginary,real);
//FFTReal instructions say to run this after an fft
fft_object.rescale(real);
leftChannelData[i].resize(FFT_SAMPLE_LENGTH/2);
for (int j = 0; j < FFT_SAMPLE_LENGTH/2; j++)
{
double magnitude = sqrt(real[j]*real[j] + imaginary[j]*imaginary[j]);
double dbValue = 20 * log(magnitude/maxMagnitude);
leftChannelData[i].at(j) = dbValue;
}
}
I'm at a loss as to what's causing this. I've tried various ways to pull those 446 samples I'm ignoring, but the results don't seem to change. I think I may be doing something fundamentally wrong. I've tried normalizing the pcm data before handing it to the fft and I've tried normalizing the magnitude before finding the decibels, but it doesn't seem to be working. Any thoughts?
EDIT: I don't see any difference between log(magnitude) and log(magnitude/maxMagnitude). All it seems to do is shift all of the bin's values evenly downwards.
EDIT2:
Here's a what they look like to get a visual:
Song playing low sounds - with log(mag)
Song playing low sounds - same but with log(mag/maxMag)
Again, log(mag) and log(mag/maxMag) generally look the same, but with values spanning in the negative. Like MSalters said, the decibel can approach -infinite, so I can clamp those values to -100dB. Then take log(mag/maxMag) and add 100. That way the rectangle's height range from 0 to 100 instead of -100 to 0.
Is this what I should do? I've tried this, but it still looks wrong. Maybe it's just a scaling issue? When I do this, a lot of the bars don't make it above the line when it sounds like they should. And if they do make it above 0, they do so just barely.
You have to understand that you're not taking the Fourier Transform of an infinite signal, but the FT of an windowed version thereof. And your window isn't even a plain Hann window. Discarding 446 points is effectively a rectangular window function. The FT of the window functions will both show up in your output.
Secondly, the dB scale is logarithmic. That indeed means it can go quite low in the absence of a signal. You mention -60 dB, but it in fact could hit minus infinity. The only thing that would save you from that is the window function, which will introduce smear at about -110 dB.
The noise (stop band ripple) produced by a quantized Von Hann window of length 1024 could well be around -40 to -60 dB. So one strategy is to just set a threshold, and ignore (don't plot) all values below that threshold.
Also, try removing the rescale(real) function, as that could distort your complex vector before you take the log magnitude.
Also, make sure you are actually loading the audio samples into your real vector correctly (sign, number of bits and endianess).

FIR filter design: how to input sine wave form

I am currently taking a class in school and I have to code FIR/IIR filter in C/C++.
As an input to the filter, 2kHz sine wave with white noise is used. Then, by inputting the sine wave to the C/C++ code, I need to observe the clean sine wave output. It's all done in software level.
My problem is that I don't know how to deal with this input/output of sine wave. For example, I don't know what type of file format I can use or need to use, I don't know how to make the sine wave form and etc.
This might be a very trivial question, but I have no clue where to begin.
Does anyone have any experience in this type of question or have any tips?
Any help would be really appreciated.
Generating the sine wave at 2kHz means that you want to generate values over time that, when graphed, follow a sine wave. Pick an amplitude (you didn't mention one), and pick your sample rate. See the graph here (http://en.wikipedia.org/wiki/Sine_wave); you want values that when plotted follow the sine wave graphed in 2D with the X axis being time, and the Y axis being the amplitude of the value you are measuring.
amplitude (volts, degrees, pascals, milliamps, etc)
frequency (2kHz, that is 2000 sine waves/second)
sample rate (how many samples do you want per second)
Suppose you generate a file that has a time value and an amplitude measurement, which you would want to scale to your amplitude (more on this later). So a device might give an 8-bit or 16-bit digital reading which represents either an absolute, or logarithmic measurement against some scale.
struct sample
{
long usec; //microseconds (1/1,000,000 second)
short value; //many devices give a value between 0 and 255
}
Suppose you generate exactly 2000 samples/second. If you were actually measuring an external value, you would get the same value every time (see that?), which when graphed would look like a straight line.
So you want a sample rate higher than the frequency. Suppose you sample as 2x the frequency. Then you would see points 180deg off on the sine wave, which might be peaks, up or down slope, or where sine wave crosses zero. A sample rate 4x the frequency would show a sawtooth pattern. And as you increase the number of samples, your graph looks closer to the actual sine wave. This is similar to the pixelization you see in 8-bit game sprites.
How many samples for any given sine wave would you think would give you a good approximation of a sine wave? 8? 16? 100? 500? Suppose you sampled 1,000,000 times per second, then you would have 1,000,000/2,000 = 500 samples per sine wave.
pick your sample rate (500)
define your frequency (2000)
decide how long to record your samples (5 seconds?)
define your amplitude (device measures 0-255, but what is measured max?)
Here is code to generate some samples,
#define MAXJITTER (10)
#define MAXNOISE (20)
int
generate_samples( long duration, //duration in microseconds
int amplitude, //scaled peak measurement from device
int frequency, //Hz > 0
int samplerate ) //how many samples/second > 0
{
long ts; //timestamp in microseconds, usec
long sdelay; //sample delay in usec
if(frequency<1) frequency1=1; //avoid division by zero
if(samplerate<1) samplerate=1; //avoid division by zero
sdelay = 1000000/samplerate; //usec delay between each sample
sample m;
int jitter, noise; //introduce noise here
for( long ts=0; ts<duration; ts+=sdelay ) // //in usec (microseconds)
{
//jitter, sample not exactly sdelay
jitter = drand48()*MAXJITTER - (MAXJITTER/2); // +/-1/2 MAXJITTER
//noise is mismeasurement
noise = drand48()*MAXNOISE - (MAXNOISE/2); // +/-1/2 MAXNOISE
m.usec = ts + jitter;
//2PI in a full sine wave
float period = 2*PI * (ts*1.0/frequency);
m.value = sin( period );
//write m to file or save me to array/vector
}
return 0; //return number of samples, or sample array, etc
}
First generate some samples,
generate_samples( 5*1000000, 100, 2000, 2000*50 );
You could graph the samples generated as a view of the noisy signal.
The above certainly answers many of your questions about how to record measurements, and what format is typically used. And it shows how transit through the period of multiple sine waves, generate random samples with jitter and noise, and record samples over some time duration.
Building your filter is a second issue. Writing the code to emulate the filter(s) described below is left as an exercise, or a second question as you glean more understanding,
http://en.wikipedia.org/wiki/Finite_impulse_response
http://en.wikipedia.org/wiki/Infinite_impulse_response
The generated sample of the signal (above) would be fed into the code you write to build the filter. Expect that the output of the filter would be a new set of samples, perhaps with jitter, but expect that your filter would eliminate at least some of the noise. You would then be able to graph the samples produced by the filter.
You might consider that converting the samples into a comma delimited file would enable you to load them into excel and graph them. And it might help if you elucidated your electronics background, your trig knowledge, and how much you know about filters, etc.
Good luck!