Runtime Sound Generation in C++ on Windows

Runtime Sound Generation in C++ on Windows - c++

How might one generate audio at runtime using C++? I'm just looking for a starting point. Someone on a forum suggested I try to make a program play a square wave of a given frequency and amplitude.
I've heard that modern computers encode audio using PCM samples: At a give rate for a specific unit of time (eg. 48 kHz), the amplitude of a sound is recorded at a given resolution (eg. 16-bits). If I generate such a sample, how do I get my speakers to play it? I'm currently using windows. I'd prefer to avoid any additional libraries if at all possible but I'd settle for a very light one.
Here is my attempt to generate a square wave sample using this principal:
signed short* Generate_Square_Wave(
signed short a_amplitude ,
signed short a_frequency ,
signed short a_sample_rate )
{
signed short* sample = new signed short[a_sample_rate];
for( signed short c = 0; c == a_sample_rate; c++ )
{
if( c % a_frequency < a_frequency / 2 )
sample[c] = a_amplitude;
else
sample[c] = -a_amplitude;
}
return sample;
}
Am I doing this correctly? If so, what do I do with the generated sample to get my speakers to play it?

Your loop has to use c < a_sample_rate to avoid a buffer overrun.
To output the sound you call waveOutOpen and other waveOut... functions. They are all listed here:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd743834(v=vs.85).aspx

The code you are using generates a wave that is truly square, binary kind of square, in short the type of waveform that does not exist in real life. In reality most (pretty sure all) of the sounds you hear are a combination of sine waves at different frequencies.
Because your samples are created the way they are they will produce aliasing, where a higher frequency masquerades as a lower frequency causing audio artefacts. To demonstrate this to yourself write a little program which sweeps the frequency of your code from 20-20,000hz. You will hear that the sound does not go up smoothly as it raises in frequency. You will hear artefacts.
Wikipedia has an excellent article on square waves: https://en.m.wikipedia.org/wiki/Square_wave
One way to generate a square wave is to perform an inverse Fast Fourier Transform which transforms a series of frequency measurements into a series of time based samples. Then generating a square wave is a matter of supplying the routine with a collection of the measurements of sin waves at different frequencies that make up a square wave and the output is a buffer with a single cycle of the waveform.
To generate audio waves is computationally expensive so what is often done is to generate arrays of audio samples and play them back at varying speeds to play different frequencies. This is called wave table synthesis.
Have a look at the following link:
https://www.earlevel.com/main/2012/05/04/a-wavetable-oscillator%E2%80%94part-1/
And some more about band limiting a signal and why it’s necessary:
https://dsp.stackexchange.com/questions/22652/why-band-limit-a-signal

Related

C++ mathematical function generation

In working on a project I came across the need to generate various waves, accurately. I thought that a simple sine wave would be the easiest to begin with, but it appears that I am mistaken. I made a simple program that generates a vector of samples and then plays those samples back so that the user hears the wave, as a test. Here is the relevant code:
vector<short> genSineWaveSample(int nsamples, float freq, float amp) {
vector<short> samples;
for(float i = 0; i <= nsamples; i++) {
samples.push_back(amp * sinx15(freq*i));
}
return samples;
}
I'm not sure what the issue with this is. I understand that there could be some issue with the vector being made of shorts, but that's what my audio framework wants, and I am inexperienced with that kind of library and so do not know what to expect.
The symptoms are as follows:
frequency not correct
ie: given freq=440, A4 is not the note played back
strange distortion
Most frequencies do not generate a clean wave. 220, 440, 880 are all clean, most others are distorted
Most frequencies are shifted upwards considerably
Can anyone give advice as to what I may be doing wrong?
Here's what I've tried so far:
Making my own sine function, for greater accuracy.
I used a 15th degree Taylor Series expansion for sin(x)
Changed the sample rate, anything from 256 to 44100, no change can be heard given the above errors, the waves are simply more distorted.
Thank you. If there is any information that can help you, I'd be obliged to provide it.

I suspect that you are passing incorrect values to your sin15x function. If you are familiar with the basics of signal processing the Nyquist frequency is the minimum frequency at which you can faithful reconstruct (or in your case construct) a sampled signal. The is defined as 2x the highest frequency component present in the signal.
What this means for your program is that you need at last 2 values per cycle of the highest frequency you want to reproduce. At 20Khz you'd need 40,000 samples per second. It looks like you are just packing a vector with values and letting the playback program sort out the timing.
We will assume you use 44.1Khz as your playback sampling frequency. This means that a snipet of code producing one second of a 1kHz wave would look like
DataStructure wave = new DataStructure(44100) // creates some data structure of 44100 in length
for(int i = 0; i < 44100; i++)
{
wave[i] = sin(2*pi * i * (frequency / 44100) + pi / 2) // sin is in radians, frequency in Hz
}
You need to divide by the frequency, not multiply. To see this, take the case of a 22,050 Hz frequency value is passed. For i = 0, you get sin(0) = 1. For i = 1, sin(3pi/2) = -1 and so on are so forth. This gives you a repeating sequence of 1, -1, 1, -1... which is the correct representation of a 22,050Hz wave sampled at 44.1Khz. This works as you go down in frequency but you get more and more samples per cycle. Interestingly though this does not make a difference. A sinewave sampled at 2 samples per cycle is just as accurately recreated as one that is sampled 1000 times per second. This doesn't take into account noise but for most purposes works well enough.
I would suggest looking into the basics of digital signal processing as it a very interesting field and very useful to understand.
Edit: This assumes all of those parameters are evaluated as floating point numbers.

Fundamentally, you're missing a piece of information. You don't specify the amount of time over which you want your samples taken. This could also be thought of as the rate at which the samples will be played by your system. Something roughly in this direction will get you closer, for now, though.
samples.push_back(amp * std::sin(M_PI / freq *i));

Drawing audio spectrum with Bass library

How can I draw an spectrum for an given audio file with Bass library?
I mean the chart similar to what Audacity generates:
I know that I can get the FFT data for given time t (when I play the audio) with:
float fft[1024];
BASS_ChannelGetData(chan, fft, BASS_DATA_FFT2048); // get the FFT data
That way I get 1024 values in array for each time t. Am I right that the values in that array are signal amplitudes (dB)? If so, how the frequency (Hz) is associated with those values? By the index?
I am an programmer, but I am not experienced with audio processing at all. So I don't know what to do, with the data I have, to plot the needed spectrum.
I am working with C++ version, but examples in other languages are just fine (I can convert them).

From the documentation, that flag will cause the FFT magnitude to be computed, and from the sounds of it, it is the linear magnitude.
dB = 10 * log10(intensity);
dB = 20 * log10(pressure);
(I'm not sure whether audio file samples are a measurement of intensity or pressure. What's a microphone output linearly related to?)
Also, it indicates the length of the input and the length of the FFT match, but half the FFT (corresponding to negative frequencies) is discarded. Therefore the highest FFT frequency will be one-half the sampling frequency. This occurs at N/2. The docs actually say
For example, with a 2048 sample FFT, there will be 1024 floating-point values returned. If the BASS_DATA_FIXED flag is used, then the FFT values will be in 8.24 fixed-point form rather than floating-point. Each value, or "bin", ranges from 0 to 1 (can actually go higher if the sample data is floating-point and not clipped). The 1st bin contains the DC component, the 2nd contains the amplitude at 1/2048 of the channel's sample rate, followed by the amplitude at 2/2048, 3/2048, etc.
That seems pretty clear.

SFML and PN noise (8-bit emulation)

I have had the absurd idea to write a Commodore VIC-20 emulator, my first computer.
Everything has gone quite well until sound emulation time has come! The VIC-20 has 3 voices (square waveform) and a noise speaker. Searching the net I found that it is a PN generator (somewhere is called "white" noise).
I know that white noise is not frequency driven, but you put a specific frequency value into the noise register (POKE 36877,X command). The formula is:
freq = cpu_speed/(127 - x)
(more details on the VIC-20 Programmer's Guida, especially the MOS6560/6561 VIC-I chip)
where x is the 7-bit value of the noise register (bit 8 is noise on/off switch)
I have a 1024 pre-generated buffer of numbers (the pseudo-random sequence), the question is: how can I correlate the frequency (freq) to create a sample buffer to pass to the sound card (in this case to sf::SoundBuffer that accepts sf::Int16 (aka unsigned short) values?
I guess most of you had a Commodore VIC-20 or C64 at home and played with the old POKE instruction... Can anyone of you help me in understanding this step?
EDIT:
Searching on the internet I found the C64 Programmer's Guida that shows the waveform graph of its noise generator. Can anyone recognize this kind of wave/perturbation etc...? The waveform seems to be periodic (with period of freq), but how tu generate such wave?

FIR filter design: how to input sine wave form

I am currently taking a class in school and I have to code FIR/IIR filter in C/C++.
As an input to the filter, 2kHz sine wave with white noise is used. Then, by inputting the sine wave to the C/C++ code, I need to observe the clean sine wave output. It's all done in software level.
My problem is that I don't know how to deal with this input/output of sine wave. For example, I don't know what type of file format I can use or need to use, I don't know how to make the sine wave form and etc.
This might be a very trivial question, but I have no clue where to begin.
Does anyone have any experience in this type of question or have any tips?
Any help would be really appreciated.

Generating the sine wave at 2kHz means that you want to generate values over time that, when graphed, follow a sine wave. Pick an amplitude (you didn't mention one), and pick your sample rate. See the graph here (http://en.wikipedia.org/wiki/Sine_wave); you want values that when plotted follow the sine wave graphed in 2D with the X axis being time, and the Y axis being the amplitude of the value you are measuring.
amplitude (volts, degrees, pascals, milliamps, etc)
frequency (2kHz, that is 2000 sine waves/second)
sample rate (how many samples do you want per second)
Suppose you generate a file that has a time value and an amplitude measurement, which you would want to scale to your amplitude (more on this later). So a device might give an 8-bit or 16-bit digital reading which represents either an absolute, or logarithmic measurement against some scale.
struct sample
{
long usec; //microseconds (1/1,000,000 second)
short value; //many devices give a value between 0 and 255
}
Suppose you generate exactly 2000 samples/second. If you were actually measuring an external value, you would get the same value every time (see that?), which when graphed would look like a straight line.
So you want a sample rate higher than the frequency. Suppose you sample as 2x the frequency. Then you would see points 180deg off on the sine wave, which might be peaks, up or down slope, or where sine wave crosses zero. A sample rate 4x the frequency would show a sawtooth pattern. And as you increase the number of samples, your graph looks closer to the actual sine wave. This is similar to the pixelization you see in 8-bit game sprites.
How many samples for any given sine wave would you think would give you a good approximation of a sine wave? 8? 16? 100? 500? Suppose you sampled 1,000,000 times per second, then you would have 1,000,000/2,000 = 500 samples per sine wave.
pick your sample rate (500)
define your frequency (2000)
decide how long to record your samples (5 seconds?)
define your amplitude (device measures 0-255, but what is measured max?)
Here is code to generate some samples,
#define MAXJITTER (10)
#define MAXNOISE (20)
int
generate_samples( long duration, //duration in microseconds
int amplitude, //scaled peak measurement from device
int frequency, //Hz > 0
int samplerate ) //how many samples/second > 0
{
long ts; //timestamp in microseconds, usec
long sdelay; //sample delay in usec
if(frequency<1) frequency1=1; //avoid division by zero
if(samplerate<1) samplerate=1; //avoid division by zero
sdelay = 1000000/samplerate; //usec delay between each sample
sample m;
int jitter, noise; //introduce noise here
for( long ts=0; ts<duration; ts+=sdelay ) // //in usec (microseconds)
{
//jitter, sample not exactly sdelay
jitter = drand48()*MAXJITTER - (MAXJITTER/2); // +/-1/2 MAXJITTER
//noise is mismeasurement
noise = drand48()*MAXNOISE - (MAXNOISE/2); // +/-1/2 MAXNOISE
m.usec = ts + jitter;
//2PI in a full sine wave
float period = 2*PI * (ts*1.0/frequency);
m.value = sin( period );
//write m to file or save me to array/vector
}
return 0; //return number of samples, or sample array, etc
}
First generate some samples,
generate_samples( 5*1000000, 100, 2000, 2000*50 );
You could graph the samples generated as a view of the noisy signal.
The above certainly answers many of your questions about how to record measurements, and what format is typically used. And it shows how transit through the period of multiple sine waves, generate random samples with jitter and noise, and record samples over some time duration.
Building your filter is a second issue. Writing the code to emulate the filter(s) described below is left as an exercise, or a second question as you glean more understanding,
http://en.wikipedia.org/wiki/Finite_impulse_response
http://en.wikipedia.org/wiki/Infinite_impulse_response
The generated sample of the signal (above) would be fed into the code you write to build the filter. Expect that the output of the filter would be a new set of samples, perhaps with jitter, but expect that your filter would eliminate at least some of the noise. You would then be able to graph the samples produced by the filter.
You might consider that converting the samples into a comma delimited file would enable you to load them into excel and graph them. And it might help if you elucidated your electronics background, your trig knowledge, and how much you know about filters, etc.
Good luck!

Plotting waveform of the .wav file

I wanted to plot the wave-form of the .wav file for the specific plotting width.
Which method should I use to display correct waveform plot ?
Any Suggestions , tutorial , links are welcomed....

Basic algorithm:
Find number of samples to fit into draw-window
Determine how many samples should be presented by each pixel
Calculate RMS (or peak) value for each pixel from a sample block. Averaging does not work for audio signals.
Draw the values.
Let's assume that n(number of samples)=44100, w(width)=100 pixels:
then each pixel should represent 44100/100 == 441 samples (blocksize)
for (x = 0; x < w; x++)
draw_pixel(x_offset + x,
y_baseline - rms(&mono_samples[x * blocksize], blocksize));
Stuff to try for different visual appear:
rms vs max value from block
overlapping blocks (blocksize x but advance x/2 for each pixel etc)
Downsampling would not probably work as you would lose peak information.

Either use RMS, BlockSize depends on how far you are zoomed in!
float RMS = 0;
for (int a = 0; a < BlockSize; a++)
{
RMS += Samples[a]*Samples[a];
}
RMS = sqrt(RMS/BlockSize);
or Min/Max (this is what cool edit/Audtion Uses)
float Max = -10000000;
float Min = 1000000;
for (int a = 0; a < BlockSize; a++)
{
if (Samples[a] > Max) Max = Samples[a];
if (Samples[a] < Min) Min = Samples[a];
}

Almost any kind of plotting is platform specific. That said, .wav files are most commonly used on Windows, so it's probably a fair guess that you're interested primarily (or exclusively) in code for Windows as well. In this case, it mostly depends on your speed requirements. If you want a fairly static display, you can just draw with MoveTo and (mostly) LineTo. If that's not fast enough, you can gain a little speed by using something like PolyLine.
If you want it substantially faster, chances are that your best bet is to use something like OpenGL or DirectX graphics. Either of these does the majority of real work on the graphics card. Given that you're talking about drawing a graph of sound waves, even a low-end graphics card with little or no work on optimizing the drawing will probably keep up quite easily with almost anything you're likely to throw at it.
Edit: As far as reading the .wav file itself goes, the format is pretty simple. Most .wav files are uncompressed PCM samples, so drawing them is a simple matter of reading the headers to figure out the sample size and number of channels, then scaling the data to fit in your window.
Edit2: You have a couple of choices for handling left and right channels. One is to draw them in two separate plots, typically one above the other. Another is to draw them superimposed, but in different colors. Which is more suitable depends on what you're trying to accomplish -- if it's mostly to look cool, a superimposed, multi-color plot will probably work nicely. If you want to allow the user to really examine what's there in detail, you'll probably want two separate plots.

What exactly do you mean by a waveform? Are you trying to plot the level of the frequency components in the signal a.k.a the spectrum, most commonly seen in musci visualizers, car stereos, boomboxes? If so, you should use the Fast Fourier Transform. FFT is a standard technique to split a time domain signal into its individual frequencies. There are tons of good FFT library routines available.
In C++, you can use the openFrameworks library to set up a music player for wav, extract the FFT and draw it.
You can also use Processing with the Minim library to do the same. I have tried it and it is pretty straightforward.
Processing even has support for OpenGL and it is a snap to use.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js