I need to play a stream in OpenAL, but I don't understand what I need to do with the buffers and the source. My pseudocode:
FirstTime = true;
while (true)
{
    if (!FirstTime)
    {
        alSourceUnqueueBuffers(alSource, 1, &unbuf);
    }
    // get buffer to play into boost::array buf (882 elements) (MONO16)
    if (NumberOfSampleSet >= 3)
    {
        alBufferData(alSampleSet[NumberOfSampleSet], AL_FORMAT_MONO16, buf.data(), buf.size(), 44100);
        alSourceQueueBuffers(alSource, 1, &alSampleSet[NumberOfSampleSet++]);
        if (NumberOfSampleSet == 4)
        {
            FirstTime = false;
            NumberOfSampleSet = 0;
        }
    }
    alSourcePlay(alSource);
}
What am I doing wrong? All I hear in the speakers is repeated clicking. Please tell me what I need to do with the buffers to play my sound.
Four buffers, each given 882 bytes of data (441 16-bit samples), at 44100 Hz give only 4 * 441 / 44100 ≈ 0.04 seconds of playback - that's just a "click".
To produce longer sounds you should load more data (though two buffers are usually sufficient).
Imagine you have a 100 MB uncompressed .wav file. Just read, say, 22050 samples (that is 44100 bytes of data) and enqueue them into the OpenAL queue associated with the Source. Then read another 22050 samples into the second buffer and enqueue them as well. Then keep switching buffers (like you do now at NumberOfSampleSet == 4) and repeat until the file is finished.
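Here is a minimal sketch of that double-buffered loop, assuming a raw mono 16-bit PCM file at 44100 Hz and a source that is not playing yet (the file handle, buffer size and helper names are just placeholders, not from the question):

#include <AL/al.h>
#include <cstdio>
#include <vector>

const int SAMPLE_RATE        = 44100;
const int SAMPLES_PER_BUFFER = 22050;   // 0.5 s of mono 16-bit audio

// Reads the next chunk of the file into 'bufferID'; returns false at end of file.
bool FillBuffer(ALuint bufferID, FILE *file)
{
    std::vector<short> samples(SAMPLES_PER_BUFFER);
    size_t read = fread(samples.data(), sizeof(short), samples.size(), file);
    if (read == 0)
        return false;
    alBufferData(bufferID, AL_FORMAT_MONO16, samples.data(),
                 (ALsizei)(read * sizeof(short)), SAMPLE_RATE);
    return true;
}

void StreamFile(ALuint source, ALuint buffers[2], FILE *file)
{
    // Pre-fill both buffers, queue them and start playing.
    FillBuffer(buffers[0], file);
    FillBuffer(buffers[1], file);
    alSourceQueueBuffers(source, 2, buffers);
    alSourcePlay(source);

    for (;;)
    {
        ALint processed = 0;
        alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);
        while (processed-- > 0)
        {
            ALuint bufID;
            alSourceUnqueueBuffers(source, 1, &bufID);   // take a finished buffer
            if (!FillBuffer(bufID, file))                // refill it from the file
                return;                                  // file exhausted - stop
            alSourceQueueBuffers(source, 1, &bufID);     // put it back into the queue
        }
        // in real code, sleep for a few milliseconds here instead of busy-waiting
    }
}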
If you want a pure sine wave of, say, 440 Hz, then using the same 22050-sample buffers, just fill them with the values of the sine wave:
const float PI = 3.14159265f;
const int   SampleRate = 44100;     // samples per second
const int   BufferSize = 22050;     // samples per buffer (half a second of mono audio)
const float Omega      = 440.0f;    // sine frequency in Hz

// phase offset carried over between buffers to avoid "clicks" at the seams
int LastOffset = 0;

for (int i = 0; i < BufferSize; i++)
{
    float t  = (2.0f * PI * Omega * (i + LastOffset)) / static_cast<float>(SampleRate);
    // 'volume' is the desired peak amplitude, e.g. 30000 for a loud 16-bit signal
    short VV = (short)(volume * sin(t));
    // 16-bit sample: 2 bytes, little-endian
    buffers[CurrentBuffer][i * 2 + 0] = VV & 0xFF;
    buffers[CurrentBuffer][i * 2 + 1] = (VV >> 8) & 0xFF;
}
LastOffset += BufferSize;
LastOffset %= SampleRate;
EDIT1:
To process something in real time (with noticeable latency, unfortunately) you have to create the buffers, push some initial data, and then keep checking how much data OpenAL has already consumed:
void StreamBuffer( ALuint BufferID )
{
    // Get sound into buffers[CurrentBuffer] somehow:
    // load it from a file, read it from an input channel (queue), generate it, etc.
    // Do the custom sound processing here in buffers[CurrentBuffer].
    // Then submit the data to OpenAL.
    alBufferData( BufferID, Format, buffers[CurrentBuffer].data(),
                  buffers[CurrentBuffer].size(), SamplesPerSec );
}
int main()
{
    ....
    ALuint FBufferID[2];
    alGenBuffers( 2, &FBufferID[0] );

    // Pre-fill both buffers and queue them on the source.
    StreamBuffer( FBufferID[0] );
    StreamBuffer( FBufferID[1] );
    alSourceQueueBuffers( FSourceID, 2, &FBufferID[0] );

    while ( true )
    {
        // Check how many buffers OpenAL has finished playing.
        ALint Processed;
        alGetSourcei( FSourceID, AL_BUFFERS_PROCESSED, &Processed );

        // Refill and re-queue every processed buffer.
        while ( Processed-- )
        {
            ALuint BufID;
            alSourceUnqueueBuffers( FSourceID, 1, &BufID );
            StreamBuffer( BufID );
            alSourceQueueBuffers( FSourceID, 1, &BufID );
        }
    }
    ....
}
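One detail not shown above (my addition, not part of the original snippet): if refilling ever falls behind, the source runs out of queued buffers and stops, so it is worth checking its state inside the while ( true ) loop and restarting it:

// If the source ran dry before new data was queued, it stops playing;
// kick it again so the stream continues after an underrun.
ALint State;
alGetSourcei( FSourceID, AL_SOURCE_STATE, &State );
if ( State != AL_PLAYING )
    alSourcePlay( FSourceID );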
Related
I am trying to extract frames from a stream which I create with GStreamer and save them with FreeImage or QImage (the latter is only for testing).
GstMapInfo bufferInfo;
GstBuffer *sampleBuffer;
GstStructure *capsStruct;
GstSample *sample;
GstCaps *caps;
int width, height;
const int BitsPP = 32;
/* Retrieve the buffer */
g_signal_emit_by_name (sink, "pull-sample", &sample);
if (sample) {

    sampleBuffer = gst_sample_get_buffer(sample);
    gst_buffer_map(sampleBuffer, &bufferInfo, GST_MAP_READ);

    if (!bufferInfo.data) {
        g_printerr("Warning: could not map GStreamer buffer!\n");
        throw;
    }

    caps = gst_sample_get_caps(sample);
    capsStruct = gst_caps_get_structure(caps, 0);
    gst_structure_get_int(capsStruct, "width", &width);
    gst_structure_get_int(capsStruct, "height", &height);

    auto bitmap = FreeImage_Allocate(width, height, BitsPP, 0, 0, 0);
    memcpy(FreeImage_GetBits(bitmap), bufferInfo.data, width * height * (BitsPP / 8));
    // int pitch = ((((BitsPP * width) + 31) / 32) * 4);
    // auto bitmap = FreeImage_ConvertFromRawBits(bufferInfo.data, width, height, pitch, BitsPP, 0, 0, 0);

    FreeImage_FlipHorizontal(bitmap);
    bitmap = FreeImage_RotateClassic(bitmap, 180);

    static int id = 0;
    std::string name = "/home/stadmin/pic/sample" + std::to_string(id++) + ".png";

#ifdef FREE_SAVE
    FreeImage_Save(FIF_PNG, bitmap, name.c_str());
#endif

#ifdef QT_SAVE
    // Format_ARGB32
    QImage image(bufferInfo.data, width, height, QImage::Format_ARGB32);
    image.save(QString::fromStdString(name));
#endif

    fibPipeline.push(bitmap);

    gst_sample_unref(sample);
    gst_buffer_unmap(sampleBuffer, &bufferInfo);

    return GST_FLOW_OK;
}
The color output with FreeImage is totally wrong, just like with Qt's Format_ARGB32 (greens look blue, blues look orange, etc.), but when I test with Qt's Format_RGBA8888 I get correct output. I need to use FreeImage, and I would like to learn how to correct this.
Since you say Qt succeeds with Format_RGBA8888, I can only guess: the GStreamer frame has its bytes in RGBA order, while FreeImage expects ARGB (i.e. BGRA byte order on a little-endian machine).
Quick fix:
// have a buffer the same length as the incoming bytes
size_t length = width * height * (BitsPP / 8);
BYTE *bytes = (BYTE *) malloc(length);

// copy the incoming bytes into it, in the right order:
size_t index = 0;
while (index < length)
{
    bytes[index]     = bufferInfo.data[index + 2]; // B
    bytes[index + 1] = bufferInfo.data[index + 1]; // G
    bytes[index + 2] = bufferInfo.data[index];     // R
    bytes[index + 3] = bufferInfo.data[index + 3]; // A
    index += 4;
}

// fill the bitmap using the buffer
auto bitmap = FreeImage_Allocate(width, height, BitsPP, 0, 0, 0);
memcpy(FreeImage_GetBits(bitmap), bytes, length);

// don't forget to
free(bytes);
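An equivalent sketch that avoids the extra allocation (my own variation, assuming 32-bit RGBA input): copy the frame into the FreeImage bitmap first, then swap the red and blue bytes in place:

#include <algorithm>   // std::swap

auto bitmap = FreeImage_Allocate(width, height, BitsPP, 0, 0, 0);
BYTE *bits = FreeImage_GetBits(bitmap);
size_t length = (size_t)width * height * (BitsPP / 8);

memcpy(bits, bufferInfo.data, length);
for (size_t i = 0; i + 3 < length; i += 4)
    std::swap(bits[i], bits[i + 2]);   // RGBA -> BGRA, which is what FreeImage stores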
I'm currently using these settings with OpenAL and recording from a Mic:
BUFFERSIZE 4410
FREQ 22050 // Sample rate
CAP_SIZE 10000 // How much to capture at a time (affects latency)
AL_FORMAT_MONO16
Is it possible to go lower in recording quality? I've tried reducing the sample rate but the end result is a faster playback speed.
Alright, so this is some of the most hacky code I've ever written, and I truly hope no one in their right mind ever uses it in production... just sooooo many bad things.
But to answer your question, I've been able to get the quality down to 8-bit mono at 11025 Hz. However, everything I've recorded from my mic comes with a significant amount of static, and I'm not entirely sure why. I've generated 8-bit Karplus-Strong string plucks that sound fantastic, so it could just be my recording device.
#include <AL/al.h>
#include <AL/alc.h>
#include <conio.h>
#include <stdio.h>
#include <string.h>   // strlen
#include <vector>
#include <time.h>
void sleep( clock_t wait )
{
    clock_t goal;
    goal = wait + clock();
    while( goal > clock() )
        ;
}
#define BUFFERSIZE 4410
const int SRATE = 11025;
int main()
{
    std::vector<ALchar> vBuffer;
    ALCdevice *pDevice = NULL;
    ALCcontext *pContext = NULL;
    ALCdevice *pCaptureDevice;
    const ALCchar *szDefaultCaptureDevice;
    ALint iSamplesAvailable;
    ALchar Buffer[BUFFERSIZE];
    ALint iDataSize = 0;
    ALint iSize;

    // NOTE : This code does NOT setup the Wave Device's Audio Mixer to select a recording input
    // or a recording level.
    pDevice = alcOpenDevice(NULL);
    pContext = alcCreateContext(pDevice, NULL);
    alcMakeContextCurrent(pContext);

    printf("Capture Application\n");

    if (pDevice == NULL)
    {
        printf("Failed to initialize OpenAL\n");
        //Shutdown code goes here
        return 0;
    }

    // Check for Capture Extension support
    pContext = alcGetCurrentContext();
    pDevice = alcGetContextsDevice(pContext);
    if (alcIsExtensionPresent(pDevice, "ALC_EXT_CAPTURE") == AL_FALSE){
        printf("Failed to detect Capture Extension\n");
        //Shutdown code goes here
        return 0;
    }

    // Get list of available Capture Devices
    const ALchar *pDeviceList = alcGetString(NULL, ALC_CAPTURE_DEVICE_SPECIFIER);
    if (pDeviceList){
        printf("\nAvailable Capture Devices are:-\n");
        while (*pDeviceList)
        {
            printf("%s\n", pDeviceList);
            pDeviceList += strlen(pDeviceList) + 1;
        }
    }

    // Get the name of the 'default' capture device
    szDefaultCaptureDevice = alcGetString(NULL, ALC_CAPTURE_DEFAULT_DEVICE_SPECIFIER);
    printf("\nDefault Capture Device is '%s'\n\n", szDefaultCaptureDevice);

    pCaptureDevice = alcCaptureOpenDevice(szDefaultCaptureDevice, SRATE, AL_FORMAT_MONO8, BUFFERSIZE);
    if (pCaptureDevice)
    {
        printf("Opened '%s' Capture Device\n\n", alcGetString(pCaptureDevice, ALC_CAPTURE_DEVICE_SPECIFIER));

        // Start audio capture
        alcCaptureStart(pCaptureDevice);

        // Wait for any key to get pressed before exiting
        while (!_kbhit())
        {
            // Release some CPU time ...
            sleep(1);

            // Find out how many samples have been captured
            alcGetIntegerv(pCaptureDevice, ALC_CAPTURE_SAMPLES, 1, &iSamplesAvailable);
            printf("Samples available : %d\r", iSamplesAvailable);

            // When we have enough data to fill our BUFFERSIZE byte buffer, grab the samples
            if (iSamplesAvailable > (BUFFERSIZE / 2))
            {
                // Consume Samples
                alcCaptureSamples(pCaptureDevice, Buffer, BUFFERSIZE / 2);

                // Write the audio data to a file
                //fwrite(Buffer, BUFFERSIZE, 1, pFile);
                for(int i = 0; i < BUFFERSIZE / 2; i++){
                    vBuffer.push_back(Buffer[i]);
                }

                // Record total amount of data recorded
                iDataSize += BUFFERSIZE / 2;
            }
        }

        // Stop capture
        alcCaptureStop(pCaptureDevice);

        // Check if any Samples haven't been consumed yet
        alcGetIntegerv(pCaptureDevice, ALC_CAPTURE_SAMPLES, 1, &iSamplesAvailable);
        while (iSamplesAvailable)
        {
            if (iSamplesAvailable > (BUFFERSIZE / 2))
            {
                alcCaptureSamples(pCaptureDevice, Buffer, BUFFERSIZE / 2);
                for(int i = 0; i < BUFFERSIZE/2; i++){
                    vBuffer.push_back(Buffer[i]);
                }
                iSamplesAvailable -= (BUFFERSIZE / 2);
                iDataSize += BUFFERSIZE;
            }
            else
            {
                //TODO::Fix
                alcCaptureSamples(pCaptureDevice, Buffer, iSamplesAvailable);
                for(int i = 0; i < BUFFERSIZE/2; i++){
                    vBuffer.push_back(Buffer[i]);
                }
                iDataSize += iSamplesAvailable * 2;
                iSamplesAvailable = 0;
            }
        }

        alcCaptureCloseDevice(pCaptureDevice);
    }

    //TODO::Make less hacky
    ALuint bufferID;   // The OpenAL sound buffer ID
    ALuint sourceID;   // The OpenAL sound source

    // Create sound buffer and source
    alGenBuffers(1, &bufferID);
    alGenSources(1, &sourceID);

    alListener3f(AL_POSITION, 0.0f, 0.0f, 0.0f);
    alSource3f(sourceID, AL_POSITION, 0.0f, 0.0f, 0.0f);

    alBufferData(bufferID, AL_FORMAT_MONO8, &vBuffer[0], static_cast<ALsizei>(vBuffer.size()), SRATE);

    // Attach sound buffer to source
    alSourcei(sourceID, AL_BUFFER, bufferID);

    // Finally, play the sound!!!
    alSourcePlay(sourceID);

    printf("Press any key to continue...");
    getchar();

    return 0;
}
As you can see from:
alBufferData(bufferID, AL_FORMAT_MONO8, &vBuffer[0], static_cast<ALsizei>(vBuffer.size()), SRATE);
I've verified that this is the case. I'm fine putting this out there as demonstration code, but I would never use it in production.
I'm not sure, but to me FREQ is the output (playback) frequency, not the capture sample rate.
define sampling-rate 48000
See this link: http://supertux.lethargik.org/wiki/OpenAL_Configuration
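If you want to control the playback (mixing) frequency from code instead of a config file, a minimal sketch (my assumption, not taken from the linked page) is to pass ALC_FREQUENCY when creating the context; this is separate from the rate you give alcCaptureOpenDevice:

#include <AL/al.h>
#include <AL/alc.h>

// Request a specific output frequency for the playback context.
ALCdevice *device = alcOpenDevice(NULL);
ALCint attrs[] = { ALC_FREQUENCY, 48000, 0 };   // zero-terminated attribute list
ALCcontext *context = alcCreateContext(device, attrs);
alcMakeContextCurrent(context);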
I wrote two functions which should export an audio float buffer into a .wav file, but I have problems playing the exported file. Audacity plays it like it should be (it sounds exactly like it does within my application); however, Ableton (DAW software) seems to misinterpret some part of the wav, so it sounds really distorted (like a distortion effect).
I guess that Ableton somehow assumes a wrong (smaller) sample depth, so the actual samples blow past the limits.
I have two functions: one creates an int32_t buffer from two float buffers (mixing left and right into one interleaved buffer), the other writes the .wav file, including the format chunk etc. I guess the problem lies somewhere in there.
class members / structs
// statics I use in the export function
static const int FORMAT_PCM = 1;
static const int CHANNEL_COUNT = 2;     // fixed stereo
static const int BYTES_PER_SAMPLE = 4;  // fixed bytes per sample, 32-bit audio

// a function I found on the internet, helps with writing the bytes to the file
template <typename T>
static void write(std::ofstream& stream, const T& t) {
    stream.write((const char*)&t, sizeof(T));
};

// "structure" used to store the buffer
class StereoAudioBuffer {
public:
    StereoAudioBuffer(int length) : sizeInSamples(2*length) {
        samples = new int32_t[2*length];
    };
    ~StereoAudioBuffer() { delete[] samples; };
    int32_t *samples;
    const int sizeInSamples;
};
converting function
StereoAudioBuffer* WaveExport::convertTo32BitStereo(
    float *leftSamples,
    float *rightSamples,
    int length)
{
    StereoAudioBuffer *buffer = new StereoAudioBuffer(length);

    float max = 0;
    // find max sample
    for (int i = 0; i < length; i++) {
        if (abs(leftSamples[i]) > max) {
            max = abs(leftSamples[i]);
        }
        if (abs(rightSamples[i]) > max) {
            max = abs(rightSamples[i]);
        }
    }

    // normalise and scale to size(int32_t)
    float factor = 2147483000.0f / max;
    for (int i = 0; i < length; i++) {
        buffer->samples[2*i]   = leftSamples[i] * factor;
        buffer->samples[2*i+1] = rightSamples[i] * factor;
    }

    return buffer;
}
the exporting function (part of this code comes from the internet; sadly, I can't find the source anymore)
void WaveExport::writeStereoWave(
    const char *path,
    StereoAudioBuffer *buffer,
    int sampleRate)
{
    std::ofstream stream(path, std::ios::binary);

    // RIFF
    stream.write("RIFF", 4);

    // FILE SIZE
    write<int>(stream, 36 + buffer->sizeInSamples * BYTES_PER_SAMPLE); // 32 bits -> 4 bytes

    // WAVE
    stream.write("WAVE", 4);

    // FORMAT CHUNK
    stream.write("fmt ", 4);
    write<int>(stream, 16);
    write<short>(stream, FORMAT_PCM);                                  // Format
    write<short>(stream, CHANNEL_COUNT);                               // Channels
    write<int>(stream, sampleRate);                                    // Sample Rate
    write<int>(stream, sampleRate * CHANNEL_COUNT * BYTES_PER_SAMPLE); // Byte rate
    write<short>(stream, CHANNEL_COUNT * BYTES_PER_SAMPLE);            // Frame size
    write<short>(stream, 8 * BYTES_PER_SAMPLE);                        // Bits per sample

    // SAMPLES
    int dataChunkSize = buffer->sizeInSamples * BYTES_PER_SAMPLE;
    stream.write("data", 4);
    stream.write((const char*)&dataChunkSize, 4);
    stream.write((const char*)buffer->samples, BYTES_PER_SAMPLE * buffer->sizeInSamples);
}
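For reference, this is roughly how the two functions are meant to be called (the buffer names and sample rate here are placeholders, not my real code):

// left/right are float buffers of 'numFrames' samples each, produced elsewhere
StereoAudioBuffer *buffer = WaveExport::convertTo32BitStereo(left, right, numFrames);
WaveExport::writeStereoWave("mixdown.wav", buffer, 44100);
delete buffer;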
Does anybody know how to write .wav files and maybe can tell me what I did wrong or missed?
Thanks!
There was no problem with the file itself. I had written a 32-bit .wav, which simply wasn't supported by the application I used for playback.
I changed the export functions to use int16_t (16-bit depth) and it works fine.
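For completeness, a minimal sketch of the 16-bit conversion I switched to (reconstructed from memory, so the names and exact scale factor are approximations):

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Same idea as convertTo32BitStereo, but targeting the int16_t range;
// the WAV writer then uses BYTES_PER_SAMPLE = 2 and 16 bits per sample.
std::vector<int16_t> convertTo16BitStereo(const float *left, const float *right, int length)
{
    float max = 0.0f;
    for (int i = 0; i < length; i++) {
        max = std::max(max, std::abs(left[i]));
        max = std::max(max, std::abs(right[i]));
    }

    std::vector<int16_t> samples(2 * length);
    float factor = 32760.0f / max;   // stay just inside the int16_t range
    for (int i = 0; i < length; i++) {
        samples[2 * i]     = (int16_t)(left[i]  * factor);
        samples[2 * i + 1] = (int16_t)(right[i] * factor);
    }
    return samples;
}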
I've created two functions:
- One that records the microphone
- One that plays back the recorded sound
It records the microphone for 3 seconds:
#include <iostream>
#include <Windows.h>
#include <vector>
using namespace std;
#pragma comment(lib, "winmm.lib")
short int waveIn[44100 * 3];
void PlayRecord();
void StartRecord()
{
    const int NUMPTS = 44100 * 3;   // 3 seconds
    int sampleRate = 44100;
    // 'short int' is a 16-bit type; I request 16-bit samples below
    // for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types

    HWAVEIN hWaveIn;
    MMRESULT result;

    WAVEFORMATEX pFormat;
    pFormat.wFormatTag = WAVE_FORMAT_PCM;       // simple, uncompressed format
    pFormat.nChannels = 1;                      // 1=mono, 2=stereo
    pFormat.nSamplesPerSec = sampleRate;        // 44100
    pFormat.nAvgBytesPerSec = sampleRate * 2;   // = nSamplesPerSec * nChannels * wBitsPerSample/8
    pFormat.nBlockAlign = 2;                    // = nChannels * wBitsPerSample/8
    pFormat.wBitsPerSample = 16;                // 16 for high quality, 8 for telephone-grade
    pFormat.cbSize = 0;

    // Specify recording parameters
    result = waveInOpen(&hWaveIn, WAVE_MAPPER, &pFormat, 0L, 0L, WAVE_FORMAT_DIRECT);

    WAVEHDR WaveInHdr;
    // Set up and prepare header for input
    WaveInHdr.lpData = (LPSTR)waveIn;
    WaveInHdr.dwBufferLength = NUMPTS * 2;
    WaveInHdr.dwBytesRecorded = 0;
    WaveInHdr.dwUser = 0L;
    WaveInHdr.dwFlags = 0L;
    WaveInHdr.dwLoops = 0L;
    waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));

    // Insert a wave input buffer
    result = waveInAddBuffer(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));

    // Commence sampling input
    result = waveInStart(hWaveIn);

    cout << "recording..." << endl;
    Sleep(3 * 1000);

    // Wait until finished recording
    waveInClose(hWaveIn);

    PlayRecord();
}

void PlayRecord()
{
    const int NUMPTS = 44100 * 3;   // 3 seconds
    int sampleRate = 44100;
    // 'short int' is a 16-bit type; I request 16-bit samples below
    // for 8-bit capture, you'd use 'unsigned char' or 'BYTE' 8-bit types

    HWAVEIN hWaveIn;

    WAVEFORMATEX pFormat;
    pFormat.wFormatTag = WAVE_FORMAT_PCM;       // simple, uncompressed format
    pFormat.nChannels = 1;                      // 1=mono, 2=stereo
    pFormat.nSamplesPerSec = sampleRate;        // 44100
    pFormat.nAvgBytesPerSec = sampleRate * 2;   // = nSamplesPerSec * nChannels * wBitsPerSample/8
    pFormat.nBlockAlign = 2;                    // = nChannels * wBitsPerSample/8
    pFormat.wBitsPerSample = 16;                // 16 for high quality, 8 for telephone-grade
    pFormat.cbSize = 0;

    // Specify recording parameters
    waveInOpen(&hWaveIn, WAVE_MAPPER, &pFormat, 0L, 0L, WAVE_FORMAT_DIRECT);

    WAVEHDR WaveInHdr;
    // Set up and prepare header for input
    WaveInHdr.lpData = (LPSTR)waveIn;
    WaveInHdr.dwBufferLength = NUMPTS * 2;
    WaveInHdr.dwBytesRecorded = 0;
    WaveInHdr.dwUser = 0L;
    WaveInHdr.dwFlags = 0L;
    WaveInHdr.dwLoops = 0L;
    waveInPrepareHeader(hWaveIn, &WaveInHdr, sizeof(WAVEHDR));

    HWAVEOUT hWaveOut;
    cout << "playing..." << endl;
    waveOutOpen(&hWaveOut, WAVE_MAPPER, &pFormat, 0, 0, WAVE_FORMAT_DIRECT);
    waveOutWrite(hWaveOut, &WaveInHdr, sizeof(WaveInHdr));   // Playing the data
    Sleep(3 * 1000);   // Sleep for as long as there was recorded

    waveInClose(hWaveIn);
    waveOutClose(hWaveOut);
}

int main()
{
    StartRecord();
    return 0;
}
How can I change my StartRecord function (and I guess my PlayRecord function as well) to make it record until there's no more input from the microphone?
(So far, those two functions are working perfectly - they record the microphone for 3 seconds, then play the recording back.)
Thanks!
Edit: by no sound, I mean the sound level is too low (i.e. the person probably isn't speaking).
Because sound is a wave, it oscillates between high and low pressure. This waveform is usually recorded as positive and negative numbers, with zero being the neutral pressure. If you take the absolute value of the signal and keep a running average, it should be sufficient.
The average should be taken over a long enough period that it accounts for an appropriate amount of silence. A very cheap way to keep an estimate of the running average is like this:
const double threshold = 50;    // Whatever threshold you need
const int max_samples = 10000;  // The representative running average size
double average = 0;             // The running average
int sample_count = 0;           // When we are building the average

while( sample_count < max_samples || average > threshold ) {
    // New sample arrives, stored in 'sample'

    // Adjust the running absolute average
    if( sample_count < max_samples ) sample_count++;
    average *= double(sample_count - 1) / sample_count;
    average += std::abs(sample) / sample_count;
}
The larger max_samples, the slower average will respond to a signal. After the sound stops, it will slowly trail off. However, it will be slow to rise again too. This would be fine for reasonably continuous sound.
With something like speech, which can have short or long pauses, you may want to use an impulse-based approach. You can just define the number of samples of 'silence' that you expect, and reset it whenever you receive an impulse that exceeds the threshold. Using the running average above with a much shorter window size will give you a simple way of detecting an impulse. Then you just need to count...
const int max_samples = 100;            // Smaller window size for impulse
const int max_silence_samples = 10000;  // Maximum samples below threshold
int silence = 0;                        // Number of samples below threshold

while( silence < max_silence_samples ) {
    // Compute running average as before
    //...

    // Check for silence. If there's a signal, reset the counter.
    if( average > threshold ) silence = 0;
    else ++silence;
}
Adjusting threshold and max_samples will control the sensitivity to pops and clicks, while max_silence_samples gives you control over how much silence is allowed before you stop recording.
There are undoubtedly more technical ways to achieve your goals, but it's always good to try the simple one first. See how you go with this.
I suggest you do it via DirectShow. You should create instances of the microphone capture filter, a SampleGrabber, an audio encoder and a file writer. Your graph should look like this:
Microphone -> SampleGrabber -> Audio Encoder -> File Writer
Every sample passes through the SampleGrabber, so you can read all the raw samples and decide whether you should keep recording or not. This is the best way to both record and check the content at the same time.
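A rough skeleton of setting up such a graph with the DirectShow API (error handling omitted; obtaining the individual filters is only sketched in comments, so treat this as a starting point rather than working code):

#include <dshow.h>

IGraphBuilder         *pGraph = NULL;
ICaptureGraphBuilder2 *pBuild = NULL;
IBaseFilter           *pMic = NULL, *pGrabber = NULL, *pWriter = NULL;

CoInitialize(NULL);
CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
                 IID_IGraphBuilder, (void**)&pGraph);
CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC_SERVER,
                 IID_ICaptureGraphBuilder2, (void**)&pBuild);
pBuild->SetFiltergraph(pGraph);

// pMic:     enumerate CLSID_AudioInputDeviceCategory to get the microphone filter
// pGrabber: create CLSID_SampleGrabber and set your ISampleGrabberCB callback on it
// pWriter:  your audio encoder + file writer filters
pGraph->AddFilter(pMic, L"Microphone");
pGraph->AddFilter(pGrabber, L"Sample Grabber");
// ... add the encoder/writer the same way, then connect the chain:
pBuild->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Audio,
                     pMic, pGrabber, pWriter);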
I am having trouble understanding a particular area of code in the Steinberg VST Synth example.
In this function:
void VstXSynth::processReplacing (float** inputs, float** outputs, VstInt32 sampleFrames)
{
    float* out1 = outputs[0];
    float* out2 = outputs[1];

    if (noteIsOn)
    {
        float baseFreq = freqtab[currentNote & 0x7f] * fScaler;
        float freq1 = baseFreq + fFreq1;   // not really linear...
        float freq2 = baseFreq + fFreq2;
        float* wave1 = (fWaveform1 < .5) ? sawtooth : pulse;
        float* wave2 = (fWaveform2 < .5) ? sawtooth : pulse;
        float wsf = (float)kWaveSize;
        float vol = (float)(fVolume * (double)currentVelocity * midiScaler);
        VstInt32 mask = kWaveSize - 1;

        if (currentDelta > 0)
        {
            if (currentDelta >= sampleFrames)   // future
            {
                currentDelta -= sampleFrames;
                return;
            }
            memset (out1, 0, currentDelta * sizeof (float));
            memset (out2, 0, currentDelta * sizeof (float));
            out1 += currentDelta;
            out2 += currentDelta;
            sampleFrames -= currentDelta;
            currentDelta = 0;
        }

        // loop
        while (--sampleFrames >= 0)
        {
            // this is all very raw, there is no means of interpolation,
            // and we will certainly get aliasing due to non-bandlimited
            // waveforms. don't use this for serious projects...
            (*out1++) = wave1[(VstInt32)fPhase1 & mask] * fVolume1 * vol;
            (*out2++) = wave2[(VstInt32)fPhase2 & mask] * fVolume2 * vol;
            fPhase1 += freq1;
            fPhase2 += freq2;
        }
    }
    else
    {
        memset (out1, 0, sampleFrames * sizeof (float));
        memset (out2, 0, sampleFrames * sizeof (float));
    }
}
The way I understand the function is that if a MIDI note is currently on, we need to copy our wave table into the outputs array to pass back to the VST host. What I don't understand specifically is what the code in the if (currentDelta > 0) conditional block is doing. It seems like it's just writing zeros to the output arrays...
A full version of the file can be found at http://pastebin.com/SdAXkRyW
The incoming MIDI NoteOn event can have an offset relative to the start of the buffers you receive (called deltaFrames). currentDelta keeps track of when the note should start playing relative to the start of the buffers received.
So if currentDelta >= sampleFrames, the note should not play at all during this cycle (it starts in the future) - early exit.
If currentDelta falls within this cycle, the memory is cleared up to the moment the note should start producing output (the memset calls) and the pointers are advanced so the buffers appear to begin right on the spot where the sound should start; the remaining length, sampleFrames, is adjusted accordingly. For example, with sampleFrames = 512 and currentDelta = 100, the first 100 output samples are zeroed and the oscillator loop then fills the remaining 412 samples, so the note starts exactly 100 samples into the block.
Then in the loop the sound is produced.
Hope it helps.
Marc