Changing gain of audio signal while it's playing causes artifacts - c++

I am playing back audio files in a program, and in the audio rendering callbacks, I apply a gain multiplier to the input signal and add it to the output buffer. Here's some pseudocode to illustrate what I'm doing:
void audioCallback(AudioOutputBuffer* ao, AudioInput* ai, int startSample, int numSamples)
{
    for (int i = startSample; i < startSample + numSamples; i++)
    {
        ao[i] = ai[i] * gain;
    }
}
Basically I just multiply the data by some multiplier. In this case, gain is a float member that is being adjusted via a GUI callback. If I adjust this value while the audio is still playing, I can hear that the audio is getting softer or louder when I move the slider, but I hear lots of little pops and clicks.
Not really sure what the deal is. I know about interpolation, and I do that if the audio is pitch shifted, but I'm not sure if I need to do any extra interpolation or something if the gain is being adjusted in real time before the audio file is finished playing.
If I adjust the slider before the audio starts playing, the gain is set properly and I get no clicks.
Am I missing something here? How else is gain implemented but a multiplier on the input signal?

Question: how does the multiplication operator know which operand is the audio signal and which one is the gain? Answer: it doesn't. They're both audio signals, and anything audible in either one will be audible in the output.
A flat, unchanging signal doesn't produce any audible sounds. As long as the gain remains constant, it won't introduce any sound of its own.
A signal that changes abruptly will be very audible: it sounds like a click, because it contains lots of high frequencies.
As you've determined on your own, one way to reduce the high frequency content and thus the audibility is to stretch out the change over a number of samples, using a constant slope. This would certainly suffice in an application where you have lots of time to make the gain change.
Another way would be to run a low-pass filter on the gain signal and use that as the input to the multiplication.
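For illustration, here is a minimal sketch of the low-pass approach using a one-pole smoother; it assumes plain float buffers, and targetGain, smoothedGain and the coefficient alpha are illustrative names rather than anything from the original code:

#include <cstddef>

// One-pole low-pass smoothing of the gain signal (illustrative sketch).
// targetGain is written by the GUI thread; smoothedGain evolves per sample.
float targetGain   = 1.0f;
float smoothedGain = 1.0f;
const float alpha  = 0.001f;   // smaller = slower, smoother gain changes

void audioCallback(float* out, const float* in, int startSample, int numSamples)
{
    for (int i = startSample; i < startSample + numSamples; ++i)
    {
        // move a small fraction of the way toward the target each sample
        smoothedGain += alpha * (targetGain - smoothedGain);
        out[i] = in[i] * smoothedGain;
    }
}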

I fixed it by spreading the gain change out in increments of the amount changed. For instance, if the gain multiplier was set to 1.0 and then changed to 0.8, that's a difference of 0.2. For each sample in the callback, add difference / numSamples to the previous gain to create a gradual, slewed gain change.
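A rough sketch of that fix, again assuming plain float buffers; previousGain is an illustrative member holding the gain value used at the end of the last callback:

// Spread the gain change evenly across the block (illustrative sketch).
void audioCallback(float* out, const float* in, int startSample, int numSamples)
{
    const float step = (gain - previousGain) / (float) numSamples;
    float g = previousGain;
    for (int i = startSample; i < startSample + numSamples; ++i)
    {
        g += step;               // add difference / numSamples each sample
        out[i] = in[i] * g;
    }
    previousGain = gain;         // the next block ramps from here
}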

Related

Accurate Signal Generator

I want to create an accurate signal generator in Qt.
For example, a square signal that is high (255) for 10 µs (microseconds) and low (0) for 10 ms (milliseconds).
I'm using usleep() in my thread, but it sleeps for about 1 ms! When I searched about it, I found that this is due to CPU context switching.
// fp: frequency of the signal  // t: time the output stays high (at amp)  // n: number of periods to generate
void Thread::rectGenerator(double fp, double t, double amp, double n)
{
    double result;
    double T = 1000000 / fp;   // period in microseconds
    for (double i = 0, ii = 0; i < n * T; i += _Interval, ii += _Interval)
    {
        if (ii >= T)
            ii = 0;
        if (ii <= t)
            result = amp;
        else
            result = 0;
        th.usleep(1);
        qDebug() << i << "\t" << result;
    }
}
As a result, rectGenerator(200, 20, 255, 12) executes in 12 seconds, but it should execute in 60 ms!
So what is the best way to generate an accurate signal?
Normally what you would do is allocate a buffer that represents a certain amount of real time, fill this buffer with your generated signal, then schedule it to be played or saved or streamed. (You don't specify what you're doing with the signal, but since you're doing it with threads, I'll assume it is approximately real-time).
Assume then that your target sampling rate is 48kHz (standard for professional audio). Then you would allocate a buffer of 48000 samples of floats to store 1 second of audio. (Using double is almost certainly overkill; high quality audio is 16-bit or maybe 24-bit, and 32-bit if you're mastering top-flight systems, so float is more than enough precision; double is wasting bits).
Then you would fill this buffer with your signal using a looping function very similar to what you have pasted above. But you don't use sleep or anything like that; for now, you're only preparing the data which will be played later.
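As a rough sketch of that looping function (names and parameters are illustrative, not any specific API), filling one second of a rectangular pulse train at 48 kHz might look like this; note that a 10 µs pulse is shorter than one sample at 48 kHz, so that particular case needs a much higher rate or dedicated hardware:

#include <cmath>
#include <vector>

// Generate one second of a rectangular pulse train into a buffer that will be
// handed to the audio API later; no sleeping is involved here.
std::vector<float> makePulseTrain(double sampleRate,  // e.g. 48000.0
                                  double periodSec,   // full period of the signal
                                  double highSec,     // time the output stays high
                                  float  amplitude)   // e.g. 1.0f for full scale
{
    const int numSamples = (int) sampleRate;          // one second of audio
    std::vector<float> buffer(numSamples);
    for (int n = 0; n < numSamples; ++n)
    {
        const double tInPeriod = std::fmod(n / sampleRate, periodSec);
        buffer[n] = (tInPeriod < highSec) ? amplitude : 0.0f;
    }
    return buffer;
}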
So once you have the audio buffer prepared, you need to schedule it to be played. This generally involves sending the buffer to the system to be played back at a certain time. Depending on the API or device, you will get a callback to fill up the buffer when a low-water mark has been reached, and so on.
If your signal never changes, you can just generate the one buffer and keep reusing it by rescheduling it to be played. Depending on the period of the signal, you may need to adjust the buffer size to maintain the correct frequency.
(Note that a pure square wave as you describe requires theoretically infinite bandwidth to reproduce, with its brick wall edges; you should probably apply a low pass filter to band-limit the signal, which depends on your output device.)
For accuracy in the 10 µs range, your best bet may be to offload the signal generation to dedicated hardware (an FPGA, or a microcontroller with a real-time OS or no OS).
Since you need a 100 kHz signal, by far the easiest solution is to use a device that's designed to create signals in that range. A good soundcard will achieve this and is quite easy to program. Just load your sample in and tell it to play. Its internal hardware will do all the timing.

What is the accepted timing strategy when using Vertical Synchronisation?

Coming from a basic understanding of OpenGL programming, all required drawing operations are performed in a sequence, once per frame redraw. The performance of the hardware essentially dictates how fast this happens. As I understand it, a game will attempt to draw as quickly as possible, so redraw operations are essentially wrapped in a while loop. The graphics operations (graphics engine) will then be optimised to ensure the frame rate is acceptable for the application.
Graphics hardware supporting Vertical Synchronisation, however, locks frame rates to the display rate. A first question would be how should a graphics engine interact with the hardware synchronisation? Is this even possible, or does the renderer work at maximum speed while the hardware selectively picks up the latest frame, discarding all unused previous frames?
The motivation for this question is not that I am immediately intending to write a graphics engine; instead I am debugging an issue with an existing system where the graphics of a moving scene appear to stutter onscreen. Symptomatically, the stutter is slight when VSync is turned off; when it is turned on, either there is a significant and periodic stutter or the stutter is resolved entirely. I am somewhat clutching at straws as to what is happening or why, and want to understand more of the background on graphics systems.
In summary, the question is how one is expected to interact with hardware redraw events, and whether that is even possible. However, any additional information would be welcome.
A first question would be how should a graphics engine interact with the hardware synchronisation?
To avoid flicker, modern rendering systems use double buffering, i.e. there are two color plane buffers, and after drawing to one has finished, the display readout pointer is switched to the finished buffer plane. This buffer swap can happen synchronized or non-synchronized. With V-Sync enabled the buffer swap will be synchronized, and the rendering thread blocks until the buffer swap has happened.
Since double buffering mandates buffer swaps, this implicitly introduces a synchronization mechanism. This is how interactive rendering systems lock onto the display refresh.
Symptomatically, the stutter is slight when VSync is turned off; when it is turned on, either there is a significant and periodic stutter or the stutter is resolved entirely.
This sounds like a badly written animation loop that assumes a constant framerate locked to the display refresh rate, i.e. that frames always render faster than a display refresh interval and that the buffer swap can be issued in time for the next retrace.
The only robust way to deal with vertical synchronization is to actually measure the time between frame renderings and advance the rendering loop by that amount of time.
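A minimal sketch of such a loop, where running, updateAnimation, drawFrame and swapBuffers are hypothetical stand-ins for whatever the application actually provides:

#include <chrono>

// Measure how long the last frame actually took and advance the animation by
// that amount, instead of assuming one fixed step per display refresh.
void renderLoop()
{
    using clock = std::chrono::steady_clock;
    auto previous = clock::now();
    while (running)
    {
        const auto now = clock::now();
        const double dt = std::chrono::duration<double>(now - previous).count();
        previous = now;

        updateAnimation(dt);  // advance the animation by the measured time
        drawFrame();
        swapBuffers();        // blocks until the retrace when V-Sync is on
    }
}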
This is a guess, but:
The Problem Isn't Vertical Synchronization
I don't know what OS you're working with, but there are various ways to get information about the monitor and how fast the screen is refreshing (for the purposes of this answer, we'll assume your monitor is somewhat recent and redraws at a rate of 60 Hz, or 60 times every second, or once every 16.66666... milliseconds).
Renderers are usually paired up with a "Logic" side to the application: input, UI calculations, running the simulation, etc. It seems like the logic side of your application is running fast enough, but the Rendering side - i.e., the Draw Call, as it's commonly summed up - is bounding the speed of your application.
Vertical Synchronization can exacerbate this in that if your Draw Call is made to happen every 16.66666 milliseconds - but it takes much longer than 16.666666 milliseconds - then you perceive a frame rate drop (i.e. frames will "stutter" because they're taking too long to produce a single frame). VSync - and the enabling or disabling thereof - is not something that bottlenecks your code: it just says "hey, since the Hardware is only going to take 1 frame from us every 16.666666 milliseconds, why make more draw calls than just one every 16.66666 milliseconds? As long as we do one draw call once for every passing of this time, our application will look as fluid as possible, and we don't have to waste time making more calls than that!"
The problem with that is that it assumes your code is going to run fast enough to make it in those 16.6666 milliseconds. If it does not, stuttering, lagging, visual artifacts, frozen frames, and other things manifest themselves on screen.
When you turn off VSync, you're telling your Render Call to be called as often as possible, as fast as possible. This may give it some extra wiggle room alongside the Logic call to get a frame rendered, so that when the Hardware Says "I'm gonna take a picture and put it on the screen now!" it's all prettied up, just in time, to get into posture and say cheese! (though by what you say, it barely makes it).
What To Do:
Start by profiling your code. Find out what functions are taking the most time. Judging by the stutter, something in your code is taking longer than expected and is giving you undesirable performance. Make sure to profile first to find the critical sections where you're burning away time, then figure out how to keep them correct while making them faster. You may want to figure out what's being called in the Render Call and profile the time it takes to complete one cycle of that specifically. Then time the Logic call(s) and see how long it takes to execute those as well. Then, chop away.
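For instance, a crude way to time the two sides separately with std::chrono; updateLogic and render here are stand-ins for whatever your application actually calls:

#include <chrono>
#include <iostream>

// Time the logic pass and the draw call separately and print the results.
void timedFrame()
{
    using clock = std::chrono::steady_clock;

    auto t0 = clock::now();
    updateLogic();           // stand-in for your logic/simulation pass
    auto t1 = clock::now();
    render();                // stand-in for your draw call
    auto t2 = clock::now();

    std::cout << "logic: "
              << std::chrono::duration<double, std::milli>(t1 - t0).count() << " ms, "
              << "render: "
              << std::chrono::duration<double, std::milli>(t2 - t1).count() << " ms\n";
}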
Good luck!

How can I reduce or remove the noise created by changing the 'volume' of a sample from 16bit PCM

I'm currently working on a small project where I'm loading 16-bit wave files with a sample rate of 44100 Hz. In normal playback the audio seems fine, but as soon as I start to play with things like the amplitude to change the volume, it starts producing a little bit of static noise.
What I'm doing is getting a sample from the buffer (in the case of this 16-bit format, a short) and converting it to a float in the range of -1 to 1 to start doing mixing and other effects. At this point I also change the volume: when I just multiply by 1, giving the same output, it's fine, but as soon as I start to change the volume I hear the static noise. It happens when going above 1.0 as well as below 1.0, and it gets worse the further the scale is from 1.
Does anyone have an idea how to reduce or remove the noise?
"Static", otherwise known as "clicks and pops" are the result of discontinuities in the output signal. Here is a perfect example of a discontinuity:
http://en.wikipedia.org/wiki/File:Discontinuity_jump.eps.png
If you send a buffer of audio to the system to play back, and then for the next buffer you multiply every sample by 1.1, you can create a discontinuity. For example, consider a buffer that contains a sine wave with values from [-0.5, 0.5]. You send a piece of this wave to the output device, and the last sample happens to be 0.5.
Now on your next buffer you try to adjust the volume by multiplying by 1.1. The first sample of the new buffer will be close to 0.5 (since the previous sample was 0.5). Multiply that by 1.1 and you get 0.55.
A change from one sample to the next of 0.05 will probably sound like a click or a pop. If you create enough of these, it will sound like static.
The solution is to "ramp" your volume change over the buffer. For example, if you want to apply a gain of 1.1 to a buffer of 100 samples, and the previous gain was 1.0, then you would loop over all 100 samples starting with gain 1 and smoothly increase the gain until you reach the last sample, at which point your gain should be 1.1.
If you want an example of this code look at juce::AudioSampleBuffer::applyGainRamp:
http://www.rawmaterialsoftware.com/api/classAudioSampleBuffer.html
I found the flaw: I was abstracting over the different bit-depth data types by accessing their data through a char*, and I did not cast it to the correct datatype pointer when using it. This meant bytes were cut off when writing data, which created the noise and the volume-changing bugs, amongst others.
A flaw of my implementation, and of me not thinking about this when working with the audio data. A tip for anyone doing the same kind of thing: keep a close eye on the data you modify, and check which type your data really is when using abstractions.
Many thanks to the guys trying to help me; the links were really interesting and taught me more about audio programming.

setting max frames per second in openGL

Is there any way to calculate how many updates should be made to reach the desired frame rate, NOT system specific? I found that for Windows, but I would like to know if something like this exists in OpenGL itself. It should be some sort of timer.
Or how else can I prevent the FPS from dropping or rising dramatically? For now I'm testing it by drawing a big number of vertices in a line, and using Fraps I can see the frame rate go from 400 to 200 fps with an evident slowdown of the drawing.
You have two different ways to solve this problem:
Suppose that you have a variable called maximum_fps, which contains the maximum number of frames per second you want to display.
Then you measure the amount of time spent on the last frame (a timer will do).
Now suppose you said that you wanted a maximum of 60 FPS in your application. Then you want the measured time to be no lower than 1/60 of a second. If the measured time is lower, you call sleep() for the amount of time left in the frame.
Or you can have a variable called tick, which contains the current "game time" of the application. With the same timer, you increment it at each iteration of the main loop of your application. Then, in your drawing routines, you calculate the positions based on the tick variable, since it contains the current time of the application.
The big advantage of option 2 is that your application will be much easier to debug, since you can play around with the tick variable, go forward and back in time whenever you want. This is a big plus.
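A minimal sketch of option 1, using std::chrono and std::this_thread; running and drawFrame are placeholders for the application's own loop condition and rendering code:

#include <chrono>
#include <thread>

// Cap the frame rate by sleeping away whatever is left of the frame budget.
void cappedLoop(int maximum_fps)
{
    using clock = std::chrono::steady_clock;
    const std::chrono::duration<double> frameBudget(1.0 / maximum_fps);

    while (running)
    {
        const auto frameStart = clock::now();
        drawFrame();                                     // render one frame
        const auto elapsed = clock::now() - frameStart;
        if (elapsed < frameBudget)
            std::this_thread::sleep_for(frameBudget - elapsed);
    }
}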
Rule #1. Do not make update() or loop() kinds of functions rely on how often they get called.
You can't really get your desired FPS. You could try to boost it by skipping some expensive operations, or slow it down by calling sleep()-like functions. However, even with those techniques, FPS will almost always differ from the exact FPS you want.
The common way to deal with this problem is using elapsed time from previous update. For example,
// Bad
void enemy::update()
{
    position.x += 10; // this enemy's movement speed depends entirely on the FPS and you can't control it
}

// Good
void enemy::update(float elapsedTime)
{
    position.x += speedX * elapsedTime; // now you can control speedX, and it doesn't matter how often update() gets called
}
Is there any way to calculate how many updates should be made to reach the desired frame rate, NOT system specific?
No.
There is no way to precisely calculate how many updates should be called to reach desired framerate.
However, you can measure how much time has passed since the last frame, calculate the current framerate from it, compare it with the desired framerate, and then introduce a bit of sleeping to reduce the current framerate to the desired value. Not a precise solution, but it will work.
I found that for Windows, but I would like to know if something like this exists in OpenGL itself. It should be some sort of timer.
OpenGL is concerned only about rendering stuff, and has nothing to do with timers. Also, using windows timers isn't a good idea. Use QueryPerformanceCounter, GetTickCount or SDL_GetTicks to measure how much time has passed, and sleep to reach desired framerate.
Or how else can I prevent the FPS from dropping or rising dramatically?
You prevent FPS from raising by sleeping.
As for preventing FPS from dropping...
It is an insanely broad topic. Let's see. It goes something like this: use vertex buffer objects or display lists, profile the application, do not use insanely big textures, do not use too much alpha blending, avoid "raw" OpenGL (glVertex3f), do not render invisible objects (even if no polygons are being drawn, processing them takes time), consider learning about BSPs or octrees for rendering complex scenes, and for parametric surfaces and curves do not needlessly use too many primitives (if you render a circle using one million polygons, nobody will notice the difference), disable vsync. In short: reduce to the absolute possible minimum the number of rendering calls, the number of rendered polygons, the number of rendered pixels, and the number of texels read; read every available performance document from NVidia; and you should get a performance boost.
You're asking the wrong question. Your monitor will only ever display at 60 fps (50 fps in Europe, or possibly 75 fps if you're a pro-gamer).
Instead you should be seeking to lock your fps at 60 or 30. There are OpenGL extensions that allow you to do that. However the extensions are not cross platform (luckily they are not video card specific or it'd get really scary).
windows: wglSwapIntervalEXT
x11 (linux): glXSwapIntervalSGI
mac os x: ?
These extensions are closely tied to your monitor's v-sync. Once enabled, calls to swap the OpenGL back-buffer will block until the monitor is ready for it. This is like putting a sleep in your code to enforce 60 fps (or 30, or 15, or some other number if you're not using a monitor which displays at 60 Hz). The difference is that the "sleep" is always perfectly timed, instead of an educated guess based on how long the last frame took.
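For example, on Windows enabling it looks roughly like this (error handling omitted; the extension may be missing on some drivers, and an OpenGL context must already be current):

#include <windows.h>

// Query and call the WGL swap-interval extension to wait for vertical retrace.
typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);

void enableVSync()
{
    auto wglSwapIntervalEXT =
        (PFNWGLSWAPINTERVALEXTPROC) wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)
        wglSwapIntervalEXT(1);   // 1 = one vertical retrace per buffer swap
}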
You absolutely do want to throttle your frame rate; it all depends on what you have going on in that rendering loop and what your application does, especially where physics or networking are involved, or if you're doing any kind of graphics processing with an outside toolkit (Cairo, QPainter, Skia, AGG, ...), unless you want out-of-sync results or 100% CPU usage.
This code may do the job, roughly.
static int redisplay_interval;

void timer(int)
{
    glutPostRedisplay();
    glutTimerFunc(redisplay_interval, timer, 0);
}

void setFPS(int fps)
{
    redisplay_interval = 1000 / fps;
    glutTimerFunc(redisplay_interval, timer, 0);
}
Here is a similar question, with my answer and a worked example.
I also like deft_code's answer, and will be looking into adding what he suggests to my solution.
The crucial part of my answer is:
If you're thinking about slowing down AND speeding up frames, you have to think carefully about whether you mean rendering or animation frames in each case. In this example, render throttling for simple animations is combined with animation acceleration, for any cases when frames might be dropped in a potentially slow animation.
The example is for animation code that renders at the same speed regardless of whether benchmarking mode, or fixed FPS mode, is active. An animation triggered before the change even keeps a constant speed after the change.

Programmatically convert WAV

I'm writing a file compressor utility in C++, and I want it to support PCM WAV files; however, I want to keep the PCM encoding and just convert to a lower sample rate and change from stereo to mono where applicable, to yield a smaller file size.
I understand the WAV file header, however I have no experience or knowledge of how the actual sound data works. So my question is, would it be relatively easy to programmatically manipulate the "data" sub-chunk in a WAV file to convert it to another sample rate and change the channel number, or would I be much better off using an existing library for it? If it is, then how would it be done? Thanks in advance.
PCM merely means that the value of the original signal is sampled at equidistant points in time.
For stereo, there are two sequences of these values. To convert them to mono, you merely take the piecewise average of the two sequences.
Resampling the signal at a lower sampling rate is a little bit more tricky -- you have to filter out high frequencies from the signal so as to prevent aliases (spurious low-frequency signals) from being created.
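For the stereo-to-mono step, a minimal sketch assuming interleaved 16-bit samples (left, right, left, right, ...); the function name is illustrative:

#include <cstdint>
#include <vector>

// Average each left/right pair into one mono sample; summing in a wider type
// avoids overflow before converting back to 16-bit.
std::vector<int16_t> stereoToMono(const std::vector<int16_t>& interleaved)
{
    std::vector<int16_t> mono(interleaved.size() / 2);
    for (std::size_t i = 0; i < mono.size(); ++i)
    {
        const int32_t left  = interleaved[2 * i];
        const int32_t right = interleaved[2 * i + 1];
        mono[i] = (int16_t) ((left + right) / 2);
    }
    return mono;
}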
I agree with avakar and nico, but I'd like to add a little more explanation. Lowering the sample rate of PCM audio is not trivial unless two things are true:
Your signal only contains significant frequencies lower than 1/2 the new sampling rate (Nyquist rate). In this case you do not need an anti-aliasing filter.
You are downsampling by an integer factor. In this case, downsampling by N just requires keeping every Nth sample and dropping the rest.
If these are true, you can just drop samples at a regular interval to downsample. However, they are both probably not true if you're dealing with anything other than a synthetic signal.
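When the two conditions above do hold, the dropping itself is trivial; a sketch (it must only be applied to a signal that is already band-limited, as discussed next):

#include <cstdint>
#include <vector>

// Downsample by an integer factor by keeping every Nth sample. Only valid if
// the input contains no significant content above half of the new sample rate.
std::vector<int16_t> decimate(const std::vector<int16_t>& input, int factor)
{
    std::vector<int16_t> output;
    output.reserve(input.size() / factor);
    for (std::size_t i = 0; i < input.size(); i += (std::size_t) factor)
        output.push_back(input[i]);
    return output;
}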
To address problem one, you will have to filter the audio samples with a low-pass filter to make sure the resulting signal only contains frequency content up to 1/2 the new sampling rate. If this is not done, high frequencies will not be accurately represented and will alias back into the frequencies that can be properly represented, causing major distortion. Check out the critical frequency section of this wikipedia article for an explanation of aliasing. Specifically, see figure 7 that shows 3 different signals that are indistinguishable by just the samples because the sampling rate is too low.
Addressing problem two can be done in multiple ways. Sometimes it is performed in two steps: an upsample followed by a downsample, thereby achieving a rational change in the sampling rate. It may also be done using interpolation or other techniques. Basically, the problem that must be solved is that the samples of the new signal do not line up in time with the samples of the original signal.
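As a very rough sketch of the interpolation idea only (a real resampler must also apply the low-pass filter described above), linear interpolation reads the input at fractional positions determined by the rate ratio:

#include <algorithm>
#include <cstddef>
#include <vector>

// Resample by linear interpolation between neighbouring input samples.
// No anti-aliasing filter is included here; filter before downsampling.
std::vector<float> resampleLinear(const std::vector<float>& in,
                                  double inRate, double outRate)
{
    const std::size_t outLen = (std::size_t) (in.size() * outRate / inRate);
    std::vector<float> out(outLen);
    for (std::size_t i = 0; i < outLen; ++i)
    {
        const double pos     = i * inRate / outRate;
        const std::size_t i0 = (std::size_t) pos;
        const std::size_t i1 = std::min(i0 + 1, in.size() - 1);
        const double frac    = pos - (double) i0;
        out[i] = (float) ((1.0 - frac) * in[i0] + frac * in[i1]);
    }
    return out;
}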
As you can see, resampling audio can be quite involved, so I would take nico's advice and use an existing library. Getting the filter step right will require you to learn a lot about signal processing and frequency analysis. You won't have to be an expert, but it will take some time.
I don't think there's really the need of reinventing the wheel (unless you want to do it for your personal learning).
For instance you can try to use libsnd