Reverb Algorithm [closed] - c++

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
I'm looking for a simple or commented reverb algorithm, even in pseudocode would help a lot.
I've found a couple, but the code tends to be rather esoteric and hard to follow.

Here is a very simple implementation of a "delay line" which will produce a reverb effect in an existing array (C#, buffer is short[]):
int delayMilliseconds = 500; // half a second
int delaySamples =
(int)((float)delayMilliseconds * 44.1f); // assumes 44100 Hz sample rate
float decay = 0.5f;
for (int i = 0; i < buffer.length - delaySamples; i++)
{
// WARNING: overflow potential
buffer[i + delaySamples] += (short)((float)buffer[i] * decay);
}
Basically, you take the value of each sample, multiply it by the decay parameter and add the result to the value in the buffer delaySamples away.
This will produce a true "reverb" effect, as each sound will be heard multiple times with declining amplitude. To get a simpler echo effect (where each sound is repeated only once) you use basically the same code, only run the for loop in reverse.
Update: the word "reverb" in this context has two common usages. My code sample above produces a classic reverb effect common in cartoons, whereas in a musical application the term is used to mean reverberation, or more generally the creation of artificial spatial effects.
A big reason the literature on reverberation is so difficult to understand is that creating a good spatial effect requires much more complicated algorithms than my sample method here. However, most electronic spatial effects are built up using multiple delay lines, so this sample hopefully illustrates the basics of what's going on. To produce a really good effect, you can (or should) also muddy the reverb's output using FFT or even simple blurring.
Update 2: Here are a few tips for multiple-delay-line reverb design:
Choose delay values that won't positively interfere with each other (in the wave sense). For example, if you have one delay at 500ms and a second at 250ms, there will be many spots that have echoes from both lines, producing an unrealistic effect. It's common to multiply a base delay by different prime numbers in order to help ensure that this overlap doesn't happen.
In a large room (in the real world), when you make a noise you will tend to hear a few immediate (a few milliseconds) sharp echoes that are relatively undistorted, followed by a larger, fainter "cloud" of echoes. You can achieve this effect cheaply by using a few backwards-running delay lines to create the initial echoes and a few full reverb lines plus some blurring to create the "cloud".
The absolute best trick (and I almost feel like I don't want to give this one up, but what the hell) only works if your audio is stereo. If you slightly vary the parameters of your delay lines between the left and right channels (e.g. 490ms for the left channel and 513ms for the right, or .273 decay for the left and .2631 for the right), you'll produce a much more realistic-sounding reverb.

Digital reverbs generally come in two flavors.
Convolution Reverbs convolve an impulse response and a input signal. The impulse response is often a recording of a real room or other reverberation source. The character of the reverb is defined by the impulse response. As such, convolution reverbs usually provide limited means of adjusting the reverb character.
Algorithmic Reverbs mimic reverb with a network of delays, filters and feedback. Different schemes will combine these basic building blocks in different ways. Much of the art is in knowing how to tune the network. Algorithmic reverbs usually expose several parameters to the end user so the reverb character can be adjusted to suit.
The A Bit About Reverb post at EarLevel is a great introduction to the subject. It explains the differences between convolution and algorithmic reverbs and shows some details on how each might be implemented.
Physical Audio Signal Processing by Julius O. Smith has a chapter on reverb algorithms, including a section dedicated to the Freeverb algorithm. Skimming over that might help when searching for some source code examples.
Sean Costello's Valhalla blog is full of interesting reverb tidbits.

What you need is the impulse response of the room or reverb chamber which you want to model or simulate. The full impulse response will include all the multiple and multi-path echos. The length of the impulse response will be roughly equal to the length of time (in samples) it takes for an impulse sound to completely decay below audible threshold or given noise floor.
Given an impulse vector of length N, you could produce an audio output sample by vector multiplication of the input vector (made up of the current audio input sample concatenated with the previous N-1 input samples) by the impulse vector, with appropriate scaling.
Some people simplify this by assuming most taps (down to all but 1) in the impulse response are zero, and just using a few scaled delay lines for the remaining echos which are then added into the output.
For even more realistic reverb, you might want to use different impulse responses for each ear, and have the response vary a bit with head position. A head movement of as little as a quarter inch might vary the position of peaks in the impulse response by 1 sample (at 44.1k rates).

You can use GVerb. Get the code from here.GVerb is a LADSPA plug-in, you can go here if you want to know something about LADSPA.
Here is the wiki for GVerb , including explaining of the parameters and some instant reverb settings.
Also we can use it directly in Objc:
ty_gverb *_verb;
_verb = gverb_new(16000.f, 41.f, 40.0f, 7.0f, 0.5f, 0.5f, 0.5f, 0.5f, 0.5f);
AudioSampleType *samples = (AudioSampleType*)dataBuffer.mBuffers[0].mData;//Audio Data from AudioUnit Render or ExtAuidoFileReader
float lval,rval;
for (int i = 0; i< fileLengthFrames; i++) {
float value = (float)samples[i] / 32768.f;//from SInt16 to float
gverb_do(_verb, value, &lval, &rval);
samples[i] = (SInt16)(lval * 32767.f);//float to SInt16
}
GVerb is a mono effect but if you want a stereo effect you could run each channel through the effect separately and then pan and mix the processed signals with the dry signals as required

Related

Where to alter reference code to extract motion vectors from HEVC encoded video

So this question has been asked a few times, but I think my C++ skills are too deficient to really appreciate the answers. What I need is a way to start with an HEVC encoded video and end with CSV that has all the motion vectors. So far, I've compiled and run the reference decoder, everything seems to be working fine. I'm not sure if this matters, but I'm interested in the motion vectors as a convenient way to analyze motion in a video. My plan at first is to average the MVs in each frame to just get a value expressing something about the average amount of movement in that frame.
The discussion here tells me about the TComDataCU class methods I need to interact with to get the MVs and talks about how to iterate over CTUs. But I still don't really understand the following:
1) what information is returned by these MV methods and in what format? With my limited knowledge, I assume that there are going to be something like 7 values associated with the MV: the frame number, an index identifying a macroblock in that frame, the size of the macroblock, the x coordinate of the macroblock (probably the top left corner?), the y coordinate of the macroblock, the x coordinate of the vector, and the y coordinate of the vector.
2) where in the code do I need to put new statements that save the data? I thought there must be some spot in TComDataCU.cpp where I can put lines in that print the data I want to a file, but I'm confused when the values are actually determined and what they are. The variable declarations look like this:
// create motion vector fields
m_pCtuAboveLeft = NULL;
m_pCtuAboveRight = NULL;
m_pCtuAbove = NULL;
m_pCtuLeft = NULL;
But I can't make much sense of those names. AboveLeft, AboveRight, Above, and Left seem like an asymmetric mix of directions?
Any help would be great! I think I would most benefit from seeing some example code. An explanation of the variables I need to pay attention to would also be very helpful.
At TEncSlice.cpp, you can access every CTU in loop
for( UInt ctuTsAddr = startCtuTsAddr; ctuTsAddr < boundingCtuTsAddr; ++ctuTsAddr )
then you can choose exact CTU by using address of CTU.
pCtu(TComDataCU class)->getCtuRsAddr().
After that,
pCtu->getCUMvField()
will return CTU's motion vector field. You can extract MV of CTU in that object.
For example,
TComMvField->getMv(g_auiRasterToZscan[y * 16 + x])->getHor()
returns specific 4x4 block MV's Horizontal element.
You can save these data after m_pcCuEncoder->compressCtu( pCtu ) because compressCtu determines all data of CTU such as CU partition and motion estimation, etc.
I hope this information helps you and other people!

How to use buildOpticalFlowPyramid?

I'm using OpenCV 3.3.1. I want to do a semi-dense optical flow operation using cv::calcOpticalFlowPyrLK, but I've been getting some really noticeable slowdown whenever my ROI is pretty big (Partly due to the fact that I am letting the user decide what the winSize should be, ranging from from 10 to 100). Anyways, it seems like cv::buildOpticalFlowPyramid can mitigate the slowdown by building image pyramids? I'm sorta familiar what image pyramids are, but in context of the function, I'm especially confused about what parameters I pass in, and how it impacts my function call to cv::calcOpticalFlowPyrLK. With that in mind, I now have these set of questions:
The output is, according to the documentation, is an OutputArrayOfArrays, which I take it can be a vector of cv::Mat objects. If so, what do I pass in to cv::calcOpticalFlowPyrLK for prevImg and nextImg (assuming that I need to make image pyramids for both)?
According to the docs for cv::buildOpticalFlowPyramid, you need to pass in a winSize parameter in order to calculate required padding for pyramid levels. If so, do you pass in the same winSize value when you eventually call cv::calcOpticalFlowPyrLK?
What exactly are the arguments for pyrBorder and derivBorder doing?
Lastly, and apologies if it sounds newbish, but what is the purpose of this function? I always assumed that cv::calcOpticalFlowPyrLK internally builds the image pyramids. Is it just to speed up the optical flow operation?
I hope my questions were clear, I'm still very new to OpenCV, and computer vision, but this topic is very interesting.
Thank you for your time.
EDIT:
I used the function to see if my guess was correct, so far it has worked, but I've seen no noticeable speed up. Below is how I used it:
// Building pyramids
int maxLvl = 3;
maxLvl = cv::buildOpticalFlowPyramid(imgPrev, imPyr1, cv::Size(searchSize, searchSize), maxLvl, true);
maxLvl = cv::buildOpticalFlowPyramid(tmpImg, imPyr2, cv::Size(searchSize, searchSize), maxLvl, true);
// LK optical flow call
cv::calcOpticalFlowPyrLK(imPyr1, imPyr2, currentPoints, nextPts, status, err,
cv::Size(searchSize, searchSize), maxLvl, termCrit, 0, 0.00001);
So now I'm wondering what's the purpose of preparing the image pyramids if calcOpticalFlowPyrLK does it internally?
So the point of your question is that you are trying to improve speed of optical flow tracking by tuning your input parameters.
If you want dirty and quick answer then here it is
KTL (OpenCV's calcOpticalFlowPyrLK) define a e residual function which are sum of gradient of point inside search window .
The main purpose is to find vector of point that can minimize residual function
So if you increase search window size (winSize) then it is more difficult to find that set of points.
If your really really want to do that then please read the official paper.
See the section 2.4
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.185.585&rep=rep1&type=pdf
I took it from official document
https://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html#bouguet00
Hope that help

Cocos2D BezierBy with increasing speed over time

I'm pretty new to C++/Cocos2d, but I've been making pretty good progress. :)
What I want to do is animate a coin 'falling off' the screen after a player gets it. I've managed to successfully implement it 2 different ways, but each way has major downsides.
The goal: After a player gets a coin, the coin should 'jump', then 'fall' off of the screen. Ideally, the coin acts as if acted upon by gravity, so it jumps up with a fast speed, slows down to a stop, then proceeds to go downward at an increasing rate.
Attempts so far:
void Coin::tick(float dt) {
velocityY += gravity * dt;
float newX = coin->getPositionX() + velocityX;
float newY = coin->getPositionY() + velocityY;
coin->setPosition(newX, newY);
// using MoveBy(dt, Vec2(newX, newY)) has same result
}
// This is run on every 'update' of the main game loop.
This method does exactly what I would like it to do as far as movement, however, the frame rate gets extremely choppy and it starts to 'jump' between frames, sometimes quite significant distances.
ccBezierConfig bz;
bz.controlPoint_1 = Vec2(0, 0);
bz.controlPoint_2 = Vec2(20, 50); // These are just test values. Will normally be randomized to a degree.
bz.endPosition = Vec2(100, -2000);
auto coinDrop = BezierBy::create(2, bz);
coin->runAction(coinDrop);
This one has the benefit of 'perfect' framerate, where there is no choppiness whatsoever, however, it moves at a constant rate which ruins the experience of it falling and just makes it look like it's arbitrarily moving along some set path. (Which, well, it is.)
Has anybody run into a similar situation or know of a fix? Either to better handle the frame rate of the first one (MoveBy/To don't work- still has the choppy effect) or to programmatically set speeds of the second one (change speeds going to/from certain points in the curve)
Another idea I've had is to use a number of different MoveBy actions with different speeds, but that would have awkward 'pointy' curves and awkward changes in speed, so not really a solution.
Any ideas/help are/is greatly appreciated. :)
Yes, I have run into a similar situation. This is where 'easing' comes in handy. There are many built in easing functions that you can use such as Ease In or Ease Out. So your new code would look something like:
coin->runAction(cocos2d::EaseBounceOut::create(coinDrop));
This page shows the graphs for several common easing methods:
http://cocos2d-x.org/docs/programmers-guide/4/index.html
For your purposes (increasing speed over time) I would recommend trying the 'EaseIn' method.

Backpropagation 2-Dimensional Neuron Network C++

I am learning about Two Dimensional Neuron Network so I am facing many obstacles but I believe it is worth it and I am really enjoying this learning process.
Here's my plan: To make a 2-D NN work on recognizing images of digits. Images are 5 by 3 grids and I prepared 10 images from zero to nine. For Example this would be number 7:
Number 7 has indexes 0,1,2,5,8,11,14 as 1s (or 3,4,6,7,9,10,12,13 as 0s doesn't matter) and so on. Therefore, my input layer will be a 5 by 3 neuron layer and I will be feeding it zeros OR ones only (not in between and the indexes depends on which image I am feeding the layer).
My output layer however will be one dimensional layer of 10 neurons. Depends on which digit was recognized, a certain neuron will fire a value of one and the rest should be zeros (shouldn't fire).
I am done with implementing everything, I have a problem in computing though and I would really appreciate any help. I am getting an extremely high error rate and an extremely low (negative) output values on all output neurons and values (error and output) do not change even on the 10,000th pass.
I would love to go further and post my Backpropagation methods since I believe the problem is in it. However to break down my work I would love to hear some comments first, I want to know if my design is approachable.
Does my plan make sense?
All the posts are speaking about ranges ( 0->1, -1 ->+1, 0.01 -> 0.5 etc ), will it work for either { 0 | .OR. | 1 } on the output layer and not a range? if yes, how can I control that?
I am using TanHyperbolic as my transfer function. Does it make a difference between this and sigmoid, other functions.. etc?
Any ideas/comments/guidance are appreciated and thanks in advance
Well, by the description given above, I think that the design and approach taken it's correct! With respect to the choice of the activation function, remember that those functions help to get the neurons which have the largest activation number, also, their algebraic properties, such as an easy derivative, help with the definition of Backpropagation. Taking this into account, you should not worry about your choice of activation function.
The ranges that you mention above, correspond to a process of scaling of the input, it is better to have your input images in range 0 to 1. This helps to scale the error surface and help with the speed and convergence of the optimization process. Because your input set is composed of images, and each image is composed of pixels, the minimum value and and the maximum value that a pixel can attain is 0 and 255, respectively. To scale your input in this example, it is essential to divide each value by 255.
Now, with respect to the training problems, Have you tried checking if your gradient calculation routine is correct? i.e., by using the cost function, and evaluating the cost function, J? If not, try generating a toy vector theta that contains all the weight matrices involved in your neural network, and evaluate the gradient at each point, by using the definition of gradient, sorry for the Matlab example, but it should be easy to port to C++:
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
% Set perturbation vector
perturb(p) = e;
loss1 = J(theta - perturb);
loss2 = J(theta + perturb);
% Compute Numerical Gradient
numgrad(p) = (loss2 - loss1) / (2*e);
perturb(p) = 0;
end
After evaluating the function, compare the numerical gradient, with the gradient calculated by using backpropagation. If the difference between each calculation is less than 3e-9, then your implementation shall be correct.
I recommend to checkout the UFLDL tutorials offered by the Stanford Artificial Intelligence Laboratory, there you can find a lot of information related to neural networks and its paradigms, it's worth to take look at it!
http://ufldl.stanford.edu/wiki/index.php/Main_Page
http://ufldl.stanford.edu/tutorial/

Runtime Sound Generation in C++ on Windows

How might one generate audio at runtime using C++? I'm just looking for a starting point. Someone on a forum suggested I try to make a program play a square wave of a given frequency and amplitude.
I've heard that modern computers encode audio using PCM samples: At a give rate for a specific unit of time (eg. 48 kHz), the amplitude of a sound is recorded at a given resolution (eg. 16-bits). If I generate such a sample, how do I get my speakers to play it? I'm currently using windows. I'd prefer to avoid any additional libraries if at all possible but I'd settle for a very light one.
Here is my attempt to generate a square wave sample using this principal:
signed short* Generate_Square_Wave(
signed short a_amplitude ,
signed short a_frequency ,
signed short a_sample_rate )
{
signed short* sample = new signed short[a_sample_rate];
for( signed short c = 0; c == a_sample_rate; c++ )
{
if( c % a_frequency < a_frequency / 2 )
sample[c] = a_amplitude;
else
sample[c] = -a_amplitude;
}
return sample;
}
Am I doing this correctly? If so, what do I do with the generated sample to get my speakers to play it?
Your loop has to use c < a_sample_rate to avoid a buffer overrun.
To output the sound you call waveOutOpen and other waveOut... functions. They are all listed here:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd743834(v=vs.85).aspx
The code you are using generates a wave that is truly square, binary kind of square, in short the type of waveform that does not exist in real life. In reality most (pretty sure all) of the sounds you hear are a combination of sine waves at different frequencies.
Because your samples are created the way they are they will produce aliasing, where a higher frequency masquerades as a lower frequency causing audio artefacts. To demonstrate this to yourself write a little program which sweeps the frequency of your code from 20-20,000hz. You will hear that the sound does not go up smoothly as it raises in frequency. You will hear artefacts.
Wikipedia has an excellent article on square waves: https://en.m.wikipedia.org/wiki/Square_wave
One way to generate a square wave is to perform an inverse Fast Fourier Transform which transforms a series of frequency measurements into a series of time based samples. Then generating a square wave is a matter of supplying the routine with a collection of the measurements of sin waves at different frequencies that make up a square wave and the output is a buffer with a single cycle of the waveform.
To generate audio waves is computationally expensive so what is often done is to generate arrays of audio samples and play them back at varying speeds to play different frequencies. This is called wave table synthesis.
Have a look at the following link:
https://www.earlevel.com/main/2012/05/04/a-wavetable-oscillator%E2%80%94part-1/
And some more about band limiting a signal and why it’s necessary:
https://dsp.stackexchange.com/questions/22652/why-band-limit-a-signal