Fixing Real Time Audio with PortAudio in Windows 10 - c++

A couple of years ago I created an application that processes audio by downmixing a 6-channel or 8-channel signal (5.1 or 7.1) to matrix-encoded stereo. For that purpose I used the PortAudio library with great results. This is an example of the open-stream call and the callback I use to downmix a 7.1 signal:
Pa_OpenStream(&Flujo, &inputParameters, &outParameters, SAMPLE_RATE, 1, paClipOff, ptrFunction, NULL);
Notice the framesPerBuffer value of just one (1). This is my callback function:
int downmixed8channels(const void *input, void *output, unsigned long framesPerBuffer, const PaStreamCallbackTimeInfo * info, PaStreamCallbackFlags state, void * userData)
{
(void)userData;
(void)info;
(void)state;
(void)framesPerBuffer;
float *ptrInput = (float*)input;
float *ptrOutput = (float*)output;
/*This is a struct to identify samples*/
AudioSamples->L = ptrInput[0];
AudioSamples->R = ptrInput[1];
AudioSamples->C = ptrInput[2];
AudioSamples->LFE = ptrInput[3];
AudioSamples->RL = ptrInput[4];
AudioSamples->RR = ptrInput[5];
AudioSamples->SL = ptrInput[6];
AudioSamples->SR = ptrInput[7];
Encoder->8channels(AudioSamples->L,
AudioSamples->R,
AudioSamples->C,
AudioSamples->LFE,
AudioSamples->SL,
AudioSamples->SR,
AudioSamples->RL,
AudioSamples->RR);
ptrOutput[0] = Encoder->getLT();
ptrOutput[1] = Encoder->getRT();
return paContinue;
}
As you can see, each index in the input and output buffers corresponds to a discrete channel; in the case of the output, 0 = left channel and 1 = right channel. This used to work well until Windows 10 version 2004 arrived. Since I updated my system to this new version, my audio glitches and I get artifacts like these:
These are captures of the sound from the channel test window under the Windows audio device panel. From the images it is clear that my program is dropping frames. My first attempt to solve this was to use a buffer larger than one frame to hold samples, process them and then send them; the reason I did not use a larger buffer size in the first place was that the program would drop frames.
But before implementing that I did a proof of concept that does no audio processing at all and simply passes data from input to output. For that I set the output channelCount parameter to 8, just like the input, resulting in something as simple as this:
for (int i = 0; i < FramesPerBuffer /*1000*/; i++)
{
ptrOutput[i] = ptrInput[i];
}
But the program is still dropping samples.
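For anyone trying framesPerBuffer values above one: as noted later in the question, a PortAudio frame holds one sample per channel, so an interleaved buffer puts channel c of frame f at index f * channels + c. Below is a minimal sketch of a per-frame loop under that assumption. The function name and the 0.5 gain are made up for illustration, and it is a plain fold-down, not the matrix encoder used in the question.

```cpp
#include <cstddef>

// Hypothetical helper: plain (non-matrixed) fold-down of interleaved
// 8-channel frames to interleaved stereo. Coefficients are illustrative only.
void downmixInterleaved8to2(const float* in, float* out, std::size_t frames)
{
    constexpr std::size_t kIn = 8;   // input channels per frame
    constexpr std::size_t kOut = 2;  // output channels per frame
    constexpr float g = 0.5f;        // assumed gain for centre/surround channels

    for (std::size_t f = 0; f < frames; ++f) {
        const float* s = in + f * kIn;  // first sample of frame f
        // Channel order taken from the question: L R C LFE RL RR SL SR
        const float left  = s[0] + g * (s[2] + s[4] + s[6]);
        const float right = s[1] + g * (s[2] + s[5] + s[7]);
        out[f * kOut + 0] = left;
        out[f * kOut + 1] = right;
    }
}
```

Inside a callback this would be called with the input and output pointers and the framesPerBuffer argument, so the same code works for a buffer of 1 frame or 1000.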
Next I used two callbacks: one that writes to a buffer, and a second one that reads it and sends it to the output.
(void)info;
(void)userData;
(void)state;
(void)output;
float* ptrInput = (float*)input;
for (int i = 0; i < FRAME_SIZE; i++)
{
buffer_input[i] = ptrInput[i];
}
return paContinue;
Callback to store.
(void)info;
(void)userData;
(void)state;
(void)output;
float* ptrOutput = (float*)output;
for (int i = 0; i < FRAME_SIZE; i++)
{
AudioSamples->L = (buffer_input[i] );
AudioSamples->R = (buffer_input[i++]);
AudioSamples->C = (buffer_input[i++] );
AudioSamples->LFE = (buffer_input[i++]);
AudioSamples->SL = (buffer_input[i++] );
AudioSamples->SR = (buffer_input[i++]);
Encoder->Encoder(AudioSamples->L, AudioSamples->R, AudioSamples->C, AudioSamples->LFE,
AudioSamples->SL, AudioSamples->SR);
bufferTransformed[w] = (Encoder->getLT() );
bufferTransformed[w++] = (Encoder->getRT() );
}
w = 0;
for (int i = 0; i < FRAME_REDUCED; i++)
{
ptrOutput[i] = buffer_Transformed[i];
}
return paContinue;
Callback for processing
The processing callback uses a reduced frames-per-buffer value, since 2 channels is fewer than 8 and, as far as I can tell, a frame in PortAudio is composed of one sample for each audio channel.
This did not work either, since the first problem is how to synchronize the two callbacks. After all of this, what recommendation or advice can you give me to solve this issue?
Notes: the sample rate must be the same for both devices, and I implemented logic in the program to prevent a mismatch. The bit depth is also the same; I am using paFloat32.
The PortAudio build is the modified one used by Audacity, since I wanted to use their implementation of WASAPI loopback.
Thank you very much in advance!
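On the question of how two callbacks could hand samples to each other: one common approach is a single-producer/single-consumer ring buffer, where the input callback only writes and the output callback only reads, so no lock is needed. The sketch below is an assumption about how that could look, not something taken from PortAudio; the class name SpscRing is hypothetical and the capacity must be a power of two.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical single-producer/single-consumer ring buffer of float samples.
// Exactly one thread calls write() and exactly one other thread calls read().
class SpscRing {
public:
    explicit SpscRing(std::size_t capacityPow2)
        : buf_(capacityPow2), mask_(capacityPow2 - 1) {}

    // Copies up to n samples in; returns how many were actually written.
    std::size_t write(const float* src, std::size_t n) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        const std::size_t space = buf_.size() - (head - tail);
        if (n > space) n = space;
        for (std::size_t i = 0; i < n; ++i) buf_[(head + i) & mask_] = src[i];
        head_.store(head + n, std::memory_order_release);
        return n;
    }

    // Copies up to n samples out; returns how many were actually read.
    std::size_t read(float* dst, std::size_t n) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        const std::size_t avail = head - tail;
        if (n > avail) n = avail;
        for (std::size_t i = 0; i < n; ++i) dst[i] = buf_[(tail + i) & mask_];
        tail_.store(tail + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<float> buf_;
    std::size_t mask_;
    std::atomic<std::size_t> head_{0};  // total samples written so far
    std::atomic<std::size_t> tail_{0};  // total samples read so far
};
```

When read() returns fewer samples than requested, the output side has underrun and would need to fill the rest with silence; when write() returns fewer, the input side is ahead and samples are dropped.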

At the end of the day I did not have to change my callback functions in any way. What solved it was increasing the suggestedLatency field of the input and output parameters to 1.0. Even the devices' defaultLowOutputLatency and defaultHighOutputLatency values were causing too much glitching. I tested values until 1.0 turned out to be the sweet spot; higher values did not seem to improve things.
TL;DR: Increase the suggestedLatency until the glitching is gone.
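To get a feel for how large that value is: suggestedLatency is expressed in seconds, so the amount of audio it spans is simply seconds times sample rate. The helper below is only that arithmetic (the name latencyToFrames is made up); how the host API actually sizes its buffers from the suggestion is up to PortAudio and the driver.

```cpp
#include <cmath>

// Hypothetical helper: number of frames that a latency of `seconds` spans
// at the given sample rate, rounded up.
unsigned long latencyToFrames(double seconds, double sampleRate)
{
    return static_cast<unsigned long>(std::ceil(seconds * sampleRate));
}
```

At 48000 Hz a latency of 1.0 corresponds to 48000 frames, which is fine for a loopback downmixer but would be far too much for anything interactive.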

Related

How can I decompress an OGG sound file using FMOD?

Is there a way to decompress sound files using the FMOD library in c++?
I'm developing a sound editor, using the FMOD Engine library, but I got to the problem with compressed audio files, specifically OGG types.
For now I'm just reading the raw data using FMOD::Sound::readData(), and then normalize it and display it to the screen using SFML. This works fine with WAV files, because they are not compressed, but I need to do more steps for compressed formats.
This is what I'm doing now:
FMOD_RESULT r;
FMOD::System* m_fmodSystem = nullptr;
int m_maxChannels = 64;
// Create fmod system
r = FMOD::System_Create(&m_fmodSystem);
FMOD_ERROR_CHECK(r);
// Initialize system
r = m_fmodSystem->init(m_maxChannels, FMOD_INIT_NORMAL, nullptr);
FMOD_ERROR_CHECK(r);
// Create sound
FMOD::Sound* soundResource = nullptr;
FMOD::Channel* channel = nullptr;
r = m_fmodSystem->createSound("640709__chobesha__laser-gun-sound.ogg",
FMOD_DEFAULT | FMOD_OPENONLY,
nullptr,
&soundResource);
FMOD_ERROR_CHECK(r);
// Get sound length in raw bytes
unsigned int audioLength = 0;
FMOD_TIMEUNIT timeUnit = FMOD_TIMEUNIT_RAWBYTES;
r = soundResource->getLength(&audioLength, timeUnit);
FMOD_ERROR_CHECK(r);
// Read sound data
char* audioBuffer = new char[audioLength];
unsigned int readData = 0;
r = soundResource->readData(reinterpret_cast<void*>(audioBuffer),
audioLength,
&readData);
FMOD_ERROR_CHECK(r);
signed short* interpretedData = reinterpret_cast<signed short*>(audioBuffer);
// Analyze data to normalize it
signed short maxValue = -32767;
signed short minValue = 32767;
int interpretedDataSize = readData / sizeof(signed short);
for (int i = 0; i < interpretedDataSize; ++i) {
if (interpretedData[i] > maxValue) {
maxValue = interpretedData[i];
}
if (interpretedData[i] < minValue) {
minValue = interpretedData[i];
}
}
float maxValF = static_cast<float>(maxValue);
float minValF = static_cast<float>(minValue);
float* normalizedArray = new float[interpretedDataSize];
// Normalize data
float maxAbsValF = abs(maxValF);
maxAbsValF = maxAbsValF > abs(minValF) ? maxAbsValF : abs(minValF);
for (int i = 0; i < interpretedDataSize; ++i) {
normalizedArray[i] = interpretedData[i] / maxAbsValF;
}
I read in other posts and in the FMOD documentation that I can use the flag FMOD_CREATESAMPLE to tell the createSound function to decompress the data at load time instead of at play time, but it doesn't work with my current code structure. I'm guessing that is because FMOD_OPENONLY prevents the file from closing, and therefore it doesn't get the chance to decompress, or something like that. That's what I got from the documentation.
The problem with not using the FMOD_OPENONLY flag is that I cannot read the data using the readData function; it returns an error.
While searching, I found that I can use the lock function to help it decompress and to get a pointer to the sound data, but even with all of this, it still appears to be compressed. I don't know if I'm missing something.
This is version 2 of the code, with these modifications:
// Create sound
FMOD::Sound* soundResource = nullptr;
FMOD::Channel* channel = nullptr;
r = m_fmodSystem->createSound("640709__chobesha__laser-gun-sound.ogg",
FMOD_DEFAULT | FMOD_CREATESAMPLE,
nullptr,
&soundResource);
FMOD_ERROR_CHECK(r);
// Get sound length in raw bytes
unsigned int audioLength = 0;
FMOD_TIMEUNIT timeUnit = FMOD_TIMEUNIT_RAWBYTES;
r = soundResource->getLength(&audioLength, timeUnit);
FMOD_ERROR_CHECK(r);
// Read sound data
char* audioBuffer = new char[audioLength];
void* ptr2 = nullptr;
unsigned int len1, len2;
r = soundResource->lock(0, audioLength, reinterpret_cast<void**>(&audioBuffer), &ptr2, &len1, &len2);
FMOD_ERROR_CHECK(r);
r = soundResource->unlock(reinterpret_cast<void*>(audioBuffer), ptr2, len1, len2);
FMOD_ERROR_CHECK(r);
This is the graph I get for a WAV sound
The left side is the sound loaded in the Audacity app, and the right side is my graph.
This is the graph for the first try with the OGG file
And this is the graph for the OGG file with the modifications
From what I can see, the first is the same as the second, so I'm assuming both are compressed and what I changed did nothing.
Does someone know a better way to decompress and read the raw data of a sound, preferably using the FMOD library? If it's not possible with FMOD, what is the best way to decompress a sound file, knowing its format?
This answer is just a couple minority-position takes and a sketchy description of a process I once used. Maybe the thoughts are worth consideration.
One thought: a person who is editing sound (your target audience?) has the know-how to decompress files (e.g., using Audacity), so perhaps adding this capability (handling all possible incoming audio formats) is a lower priority?
Another thought: there are likely many libraries for decompressing sound available. You could employ one of them prior to presenting the results to FMOD. I just did a search on github for "ogg c++" and was shown 51 repositories.
In my own experience, for an application I wrote about seven years ago, I tweaked some code from a Vorbis decoder source so that it output PCM rather than outputting as a .wav. With OGG, the .wav data is converted to PCM prior to compression. So, it decompresses back to PCM before converting that to a .wav. I found the point in the code where the conversion happens and edited that out, leaving the data in a decompressed PCM form.
My application was built to accept PCM, so I actually ended up saving an intermediate step.

How can I access the contents of a MTLBuffer after GPU rendering?

I'm working on an OpenFX plugin to process images in grading/post-production software.
All my processing is done in a series of Metal kernel functions. The image is sent to the GPU as buffers (float array), one for the input and one for the output.
The output is then used by the OpenFX framework for display inside the host application, so up till then I didn't have to take care of it.
I now need to be able to read the output values once the GPU has processed the commands. I have tried to use the "contents" method applied on the buffer but my plugin keeps crashing (in the worst case), or gives me very weird values when it "works" (I'm not supposed to have anything over 1 and under 0, but I get very large numbers, 0 or negative 0, nan... So I assume I have a memory access issue of sorts).
At first I thought it was an issue with Private/Shared memory, so I tried to modify the buffer to be shared. But I'm still struggling!
Full disclosure: I have no specific training in MSL, I'm learning as I go with this project so I might be doing and-or saying very stupid things. I have looked around for hours before deciding to ask for help. Thanks to all who will help out in any way!
Below is the code (without everything that doesn't concern my current issue). If it is lacking anything of interest please let me know.
id < MTLBuffer > srcDeviceBuf = reinterpret_cast<id<MTLBuffer> >(const_cast<float*>(p_Input)) ;
//Below is the destination Image buffer creation the way it used to be done before my edits
//id < MTLBuffer > dstDeviceBuf = reinterpret_cast<id<MTLBuffer> >(p_Output);
//My attempt at creating a Shared memory buffer
MTLResourceOptions bufferOptions = MTLResourceStorageModeShared;
int bufferLength = sizeof(float)*1920*1080*4;
id <MTLBuffer> dstDeviceBuf = [device newBufferWithBytes:p_Output length:bufferLength options:bufferOptions];
id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
commandBuffer.label = [NSString stringWithFormat:@"RunMetalKernel"];
id<MTLComputeCommandEncoder> computeEncoder = [commandBuffer computeCommandEncoder];
//First method to be computed
[computeEncoder setComputePipelineState:_initModule];
int exeWidth = [_initModule threadExecutionWidth];
MTLSize threadGroupCount = MTLSizeMake(exeWidth, 1, 1);
MTLSize threadGroups = MTLSizeMake((p_Width + exeWidth - 1) / exeWidth,
p_Height, 1);
[computeEncoder setBuffer:srcDeviceBuf offset: 0 atIndex: 0];
[computeEncoder setBuffer:dstDeviceBuf offset: 0 atIndex: 8];
//encodes first module to be executed
[computeEncoder dispatchThreadgroups:threadGroups threadsPerThreadgroup: threadGroupCount];
//Modules encoding
if (p_lutexport_on) {
//Fills the image with patch values for the LUT computation
[computeEncoder setComputePipelineState:_LUTExportModule];
[computeEncoder dispatchThreadgroups:threadGroups threadsPerThreadgroup: threadGroupCount];
}
[computeEncoder endEncoding];
[commandBuffer commit];
if (p_lutexport_on) {
//Here is where I try to read the buffer values (and insert them into a custom object, "p_lutexp_lut")
float* result = static_cast<float*>([dstDeviceBuf contents]);
//Retrieve the output values and populate the LUT with them
int lutLine = 0;
float3 out;
for (int index(0); index < 35937 * 4; index += 4) {
out.x = result[index];
out.y = result[index + 1];
out.z = result[index + 2];
p_lutexp_lut->setValuesAtLine(lutLine, out);
lutLine++;
}
p_lutexp_lut->toFile();
}
If a command buffer includes write or read operations on a given MTLBuffer, you must ensure that these operations complete before reading the buffer's contents. You can use the addCompletedHandler: method, the waitUntilCompleted method, or custom semaphores to signal that a command buffer has completed execution.
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> cb) {
/* read or write buffer here */
}];
[commandBuffer commit];

C++ ASIO, accessing buffers

I have no experience in audio programming, and C++ is quite a low-level language, so I am having a few problems with it. I work with ASIO SDK 2.3 downloaded from http://www.steinberg.net/en/company/developers.html.
I am writing my own host based on the example inside the SDK.
So far I've managed to go through the whole sample and it looks like it's working. I have an external sound card connected to my PC. I've successfully loaded the driver for this device, configured it, handled callbacks, converted data from analog to digital, and done the other common stuff.
And this is the part where I am stuck now:
When I play a track through my device I can see bars moving in the mixer (the device's software), so the device is connected the right way. In my code I've picked the inputs and outputs with the names of the bars that are moving in the mixer. I've also used ASIOCreateBuffers() to create a buffer for each input/output.
Now correct me if I am wrong:
When ASIOStart() is called and the driver is in the running state, and I feed a sound signal into my external device, I believe the buffers get filled with data, right?
I am reading the documentation but I am a bit lost: how can I access the data being sent by the device to the application, stored in the INPUT buffers? Or the signal? I need it for signal analysis or maybe recording in the future.
EDIT: If I have made it too complicated, then in a nutshell my question is: how can I access input stream data from code? I don't see any objects or callbacks in the documentation that let me do so.
The hostsample in the ASIO SDK is pretty close to what you need. In the bufferSwitchTimeInfo callback there is some code like this:
for (int i = 0; i < asioDriverInfo.inputBuffers + asioDriverInfo.outputBuffers; i++)
{
int ch = asioDriverInfo.bufferInfos[i].channelNum;
if (asioDriverInfo.bufferInfos[i].isInput == ASIOTrue)
{
char* buf = asioDriver.bufferInfos[i].buffers[index];
....
Inside of that if block asioDriver.bufferInfos[i].buffers[index] is a pointer to the raw audio data (index is a parameter to the method).
The format of the buffer depends on the driver and can be discovered by testing asioDriverInfo.channelInfos[i].type. The formats include 32-bit int LSB first, 32-bit int MSB first, and so on. You can find the list of values in the ASIOSampleType enum in asio.h. At this point you'll want to convert the samples to some common format for downstream signal processing code. If you're doing signal processing you'll probably want to convert to double. The file host\asioconvertsample.cpp will give you some idea of what's involved in the conversion. The most common format you're going to encounter is probably Int32 LSB. Here is how you'd convert it to double.
for (int i = 0; i < asioDriverInfo.inputBuffers + asioDriverInfo.outputBuffers; i++)
{
int ch = asioDriverInfo.bufferInfos[i].channelNum;
if (asioDriverInfo.bufferInfos[i].isInput == ASIOTrue)
{
switch (asioDriverInfo.channelInfos[i].type)
{
case ASIOSTInt32LSB:
{
double* pDoubleBuf = new double[_bufferSize];
for (int i = 0 ; i < _bufferSize ; ++i)
{
pDoubleBuf[i] = *(int*)asioDriverInfo.bufferInfos.buffers[index] / (double)0x7fffffff;
}
// now pDoubleBuf contains one channels worth of samples in the range of -1.0 to 1.0.
break;
}
// and so on...
Thank you very much. Your answer helped quite a lot, but as I am a bit inexperienced with C++ :P I find it a bit problematic.
In general I've written my own host based on hostsample. I didn't implement the asioDriverInfo structure and use plain variables for now.
My first problem was:
char* buf = asioDriver.bufferInfos[i].buffers[index];
as I got an error that I can't cast (void*) to char*, but this probably solved the problem:
char* buf = static_cast<char*>(bufferInfos[i].buffers[doubleBufferIndex]);
My second problem is with the data conversion. I've checked the file you recommended but I find it a bit of black magic. For now I am trying to follow your example:
for (int i = 0; i < inputBuffers + outputBuffers; i++)
{
if (bufferInfos[i].isInput)
{
switch (channelInfos[i].type)
{
case ASIOSTInt32LSB:
{
double* pDoubleBuf = new double[buffSize];
for (int j = 0 ; j < buffSize ; ++j)
{
pDoubleBuf[j] = bufferInfos[i].buffers[doubleBufferIndex] / (double)0x7fffffff;
}
break;
}
}
}
I get error there:
pDoubleBuf[j] = bufferInfos[i].buffers[doubleBufferIndex] / (double)0x7fffffff;
which is:
error C2296: '/' : illegal, left operand has type 'void *'
What I don't get is that in your example there is no index there: asioDriverInfo.bufferInfos.buffers[index] has nothing after bufferInfos. And even if I fix that, what type should I cast it to in order to make it work?
PS. I am sure the ASIOSTInt32LSB data type is right for my PC.
The ASIO input and output buffers are exposed as void pointers. Using memcpy or memmove to access an I/O buffer creates a memory copy, which is to be avoided if you are doing real-time processing. I would suggest casting the pointer to int* so you can access the samples directly.
Converting samples one by one is also very slow in real-time processing when you have 100+ audio channels, given that AVX2 is supported on most CPUs.
_mm256_loadu_si256() and _mm256_cvtepi32_ps() will do the conversion much faster.
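For the "what type should I cast it to" question above, here is a scalar sketch of the cast-then-index approach. It assumes the channel type is ASIOSTInt32LSB on a little-endian PC, so the samples can be read as native 32-bit integers. The function name int32ToDouble is made up, and raw stands for the void pointer you get from bufferInfos[i].buffers[doubleBufferIndex].

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: convert one channel's buffer of native-endian 32-bit
// integer samples into doubles, using the same 0x7fffffff scale as the
// answer above. The caller supplies `out`, so nothing is allocated here.
void int32ToDouble(const void* raw, double* out, std::size_t sampleCount)
{
    // Cast the void pointer once, then index sample by sample.
    const std::int32_t* in = static_cast<const std::int32_t*>(raw);
    for (std::size_t j = 0; j < sampleCount; ++j) {
        out[j] = in[j] / static_cast<double>(0x7fffffff);
    }
}
```

The compile error in the follow-up comes from dividing the void pointer itself; the division has to be applied to each sample read through the typed pointer. Writing into a preallocated buffer also avoids calling new inside the audio callback.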

Granular Synthesis in iOS 6 using AudioFileServices

I have a question regarding a sound synthesis app that I'm working on. I am trying to read in an audio file, create randomized 'grains' using granular synthesis techniques, place them into an output buffer and then be able to play that back to the user using OpenAL. For testing purposes, I am simply writing the output buffer to a file that I can then listen back to.
Judging by my results, I am on the right track but am getting some aliasing issues and playback sounds that just don't seem quite right. There is usually a rather loud pop in the middle of the output file and volume levels are VERY loud at times.
Here are the steps that I have taken to get the results I need, but I'm a little bit confused about a couple of things, namely formats that I am specifying for my AudioStreamBasicDescription.
Read in an audio file from my mainBundle, which is a mono file in .aiff format:
ExtAudioFileRef extAudioFile;
CheckError(ExtAudioFileOpenURL(loopFileURL,
&extAudioFile),
"couldn't open extaudiofile for reading");
memset(&player->dataFormat, 0, sizeof(player->dataFormat));
player->dataFormat.mFormatID = kAudioFormatLinearPCM;
player->dataFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
player->dataFormat.mSampleRate = S_RATE;
player->dataFormat.mChannelsPerFrame = 1;
player->dataFormat.mFramesPerPacket = 1;
player->dataFormat.mBitsPerChannel = 16;
player->dataFormat.mBytesPerFrame = 2;
player->dataFormat.mBytesPerPacket = 2;
// tell extaudiofile about our format
CheckError(ExtAudioFileSetProperty(extAudioFile,
kExtAudioFileProperty_ClientDataFormat,
sizeof(AudioStreamBasicDescription),
&player->dataFormat),
"couldnt set client format on extaudiofile");
SInt64 fileLengthFrames;
UInt32 propSize = sizeof(fileLengthFrames);
ExtAudioFileGetProperty(extAudioFile,
kExtAudioFileProperty_FileLengthFrames,
&propSize,
&fileLengthFrames);
player->bufferSizeBytes = fileLengthFrames * player->dataFormat.mBytesPerFrame;
Next I declare my AudioBufferList and set some more properties
AudioBufferList *buffers;
UInt32 ablSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
buffers = (AudioBufferList *)malloc(ablSize);
player->sampleBuffer = (SInt16 *)malloc(sizeof(SInt16) * player->bufferSizeBytes);
buffers->mNumberBuffers = 1;
buffers->mBuffers[0].mNumberChannels = 1;
buffers->mBuffers[0].mDataByteSize = player->bufferSizeBytes;
buffers->mBuffers[0].mData = player->sampleBuffer;
My understanding is that .mData will hold whatever was specified in the formatFlags (in this case, type SInt16). Since it is of type (void *), I want to convert it to float data, which is the obvious choice for audio manipulation. Previously I set up a for loop that iterated through the buffer and cast each sample to a float. This seemed unnecessary, so now I pass my .mData buffer to a function I created which then granularizes the audio:
float *theOutBuffer = [self granularizeWithData:(float *)buffers->mBuffers[0].mData with:framesRead];
In this function, I dynamically allocate some buffers, create random size grains, place them in my out buffer after windowing them using a hamming window and return that buffer (which is float data). Everything is cool up to this point.
Next I set up all my output file ASBD and such:
AudioStreamBasicDescription outputFileFormat;
bzero(audioFormatPtr, sizeof(AudioStreamBasicDescription));
outputFileFormat->mFormatID = kAudioFormatLinearPCM;
outputFileFormat->mSampleRate = 44100.0;
outputFileFormat->mChannelsPerFrame = numChannels;
outputFileFormat->mBytesPerPacket = 2 * numChannels;
outputFileFormat->mFramesPerPacket = 1;
outputFileFormat->mBytesPerFrame = 2 * numChannels;
outputFileFormat->mBitsPerChannel = 16;
outputFileFormat->mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
UInt32 flags = kAudioFileFlags_EraseFile;
ExtAudioFileRef outputAudioFileRef = NULL;
NSString *tmpDir = NSTemporaryDirectory();
NSString *outFilename = @"Decomp.caf";
NSString *outPath = [tmpDir stringByAppendingPathComponent:outFilename];
NSURL *outURL = [NSURL fileURLWithPath:outPath];
AudioBufferList *outBuff;
UInt32 abSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
outBuff = (AudioBufferList *)malloc(abSize);
outBuff->mNumberBuffers = 1;
outBuff->mBuffers[0].mNumberChannels = 1;
outBuff->mBuffers[0].mDataByteSize = abSize;
outBuff->mBuffers[0].mData = theOutBuffer;
CheckError(ExtAudioFileCreateWithURL((__bridge CFURLRef)outURL,
kAudioFileCAFType,
&outputFileFormat,
NULL,
flags,
&outputAudioFileRef),
"ErrorCreatingURL_For_EXTAUDIOFILE");
CheckError(ExtAudioFileSetProperty(outputAudioFileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof(outputFileFormat),
&outputFileFormat),
"ErrorSettingProperty_For_EXTAUDIOFILE");
CheckError(ExtAudioFileWrite(outputAudioFileRef,
framesRead,
outBuff),
"ErrorWritingFile");
The file is written correctly, in CAF format. My question is this: am I handling the .mData buffer correctly in that I am casting the samples to float data, manipulating (granulating) various window sizes and then writing it to a file using ExtAudioFileWrite (in CAF format)? Is there a more elegant way to do this such as declaring my ASBD formatFlag as kAudioFlagIsFloat? My output CAF file has some clicks in it and when I open it in Logic, it looks like there is a lot of aliasing. This makes sense if I am trying to send it float data but there is some kind of conversion happening which I am unaware of.
Thanks in advance for any advice on the matter! I have been an avid reader of pretty much all the source material online, including the Core Audio Book, various blogs, tutorials, etc. The ultimate goal of my app is to play the granularized audio in real time to a user with headphones so the writing to file thing is just being used for testing at the moment. Thanks!
What you say about step 3 suggests to me you are interpreting an array of shorts as an array of floats? If that is so, we found the reason for your trouble. Can you assign the short values one by one into an array of floats? That should fix it.
It looks like mData is a void * pointing to an array of shorts. Casting this pointer to a float * doesn't change the underlying data into float but your audio processing function will treat them as if they were. However, float and short values are stored in totally different ways, so the math you do in that function will operate on very different values that have nothing to do with your true input signal. To investigate this experimentally, try the following:
short data[4] = {-27158, 16825, 23024, 15};
void *pData = data;
The void pointer doesn't indicate what kind of data it points to, so one can wrongly assume it points to float values. Note that a short is 2 bytes wide, but a float is 4 bytes wide. It is a coincidence that your code did not crash with an access violation. Interpreted as float, the array above is only long enough for two values. Let's just look at the first value:
float *pfData = (float *)pData;
printf("%d == %f\n", data[0], pfData[0]);
The output of this will be -27158 == 23.198200 illustrating how instead of the expected -27158.0f you obtain roughly 23.2f. Two problematic things happened. First, sizeof(float) is not sizeof(short). Second, the "ones and zeros" of a floating point number are stored very differently than an integer. See http://en.wikipedia.org/wiki/Single_precision_floating-point_format.
How to solve the problem? There are at least two simple solutions. First, you could convert each element of the array before you feed it into your audio processor:
int k;
float *pfBuf = (float *)malloc(n_data * sizeof(float));
short *psiBuf = (short *)buffers->mBuffers[0].mData;
for (k = 0; k < n_data; k ++)
{
pfBuf[k] = psiBuf[k];
}
[self granularizeWithData:pfBuf with:framesRead];
for (k = 0; k < n_data; k ++)
{
psiBuf[k] = pfBuf[k];
}
free(pfBuf);
You see that most likely you will have to convert everything back to short after your call to granularizeWithData: with:. So the second solution would be to do all processing in short although from what you write, I imagine you would not like that latter approach.
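A variation on the first solution, as a sketch in C++: scale to the [-1, 1) range on the way in, and round and clamp on the way back, so that processing which overshoots does not wrap around when stored as 16-bit again. The function names and the 32768 scale factor are my own choices here, not something the question's code prescribes.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: 16-bit PCM to float in [-1, 1).
void int16ToFloat(const std::int16_t* in, float* out, std::size_t n)
{
    for (std::size_t k = 0; k < n; ++k) {
        out[k] = in[k] / 32768.0f;
    }
}

// Hypothetical helper: float back to 16-bit PCM, rounding to the nearest
// integer and clamping anything outside the representable range.
void floatToInt16(const float* in, std::int16_t* out, std::size_t n)
{
    for (std::size_t k = 0; k < n; ++k) {
        float scaled = std::round(in[k] * 32768.0f);
        scaled = std::min(32767.0f, std::max(-32768.0f, scaled));
        out[k] = static_cast<std::int16_t>(scaled);
    }
}
```

Without the clamp, a float sample above full scale converts to an out-of-range integer, which is one plausible source of loud pops in a 16-bit output file.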

Generating Sounds at Runtime with C++

So I'm picking up C++ after a long hiatus and I had the idea to create a program which can generate music based upon strings of numbers at runtime (was inspired by the composition of Pi done by some people) with the eventual goal being some sort of procedural music generation software.
So far I have been able to make a really primitive version of this with the Beep() function and just feeding through the first so and so digits of Pi as a test. Works like a charm.
What I'm looking for now is how I could kick it up a notch and get some higher quality sound being made (because Beep() literally is the most primitive sound... ever) and I realized I have absolutely no idea how to do this. What I need is either a library or some sort of API that can:
1) Generate sound without pre-existing file. I want the result to be 100% generated by code and not rely on any samples, optimally.
2) If I could get something going that would be capable of playing multiple sounds at a time, like be able to play chords or a melody with a beat, that would be nice.
3) and If I could in any way control the wave it plays (kinda like chiptune mixers can) via equation or some other sort of data, that'd be super helpful.
I don't know if this is a weird request or I just researched it using the wrong terms, but I just wasn't able to find anything along these lines or at least nothing that was well documented at all. :/
If anyone can help, I'd really appreciate it.
EDIT: Also, apparently I'm just super not used to asking stuff on forums, my target platform is Windows (7, specifically, although I wouldn't think that matters).
I use portaudio (http://www.portaudio.com/). It will let you create PCM streams in a portable way. Then you just push the samples into the stream, and they will play.
Edit: using PortAudio is pretty easy. You initialize the library. I use floating-point samples to make it super easy. I do it like this:
PaError err = Pa_Initialize();
if ( err != paNoError )
return false;
mPaParams.device = Pa_GetDefaultOutputDevice();
if ( mPaParams.device == paNoDevice )
return false;
mPaParams.channelCount = NUM_CHANNELS;
mPaParams.sampleFormat = paFloat32;
mPaParams.suggestedLatency =
Pa_GetDeviceInfo( mPaParams.device )->defaultLowOutputLatency;
mPaParams.hostApiSpecificStreamInfo = NULL;
Then later, when you want to play sounds, you create a stream: 2 channels for stereo, at 44.1 kHz, good for MP3 audio:
PaError err = Pa_OpenStream( &mPaStream,
NULL, // no input
&mPaParams,
44100, // params
NUM_FRAMES, // frames per buffer
0,
sndCallback,
this
);
Then you implement the callback to fill the PCM audio stream. The callback is a C function, but I just call through to my C++ class to handle the audio. I ripped this from my code, and it may not be 100% correct now as I removed a ton of stuff you won't care about. But it works kind of like this:
static int sndCallback( const void* inputBuffer,
void* outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void* userData )
{
Snd* snd = (Snd*)userData;
return snd->callback( (float*)outputBuffer, framesPerBuffer );
}
u32 Snd::callback( float* outbuf, u32 nFrames )
{
mPlayMutex.lock(); // use mutexes because this is async code!
// clear the output buffer
memset( outbuf, 0, nFrames * NUM_CHANNELS * sizeof( float ));
// mix all the sounds.
if ( mChannels.size() )
{
// I have multiple audio sources I'm mixing. That's what mChannels is.
for ( s32 i = mChannels.size(); i > 0; i-- )
{
for ( u32 j = 0; j < nFrames * NUM_CHANNELS; j++ )
{
float f = outbuf[j] + getNextSample( i ); // <------------------- your code here!!!
if ( f > 1.0 ) f = 1.0; // clamp it so you don't get clipping.
if ( f < -1.0 ) f = -1.0;
outbuf[j] = f;
}
}
}
mPlayMutex.unlock_p();
return paContinue; // when you are done playing audio return paComplete.
}
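For the "your code here" part, the samples themselves can be computed rather than loaded. As a minimal sketch, assuming mono float output, this renders a chord as the sum of sine waves; renderChord is a made-up name and the equal per-note gain is just one simple way to keep the sum inside [-1, 1].

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: render several simultaneous notes as summed sines.
std::vector<float> renderChord(const std::vector<double>& freqsHz,
                               double sampleRate, std::size_t frames)
{
    const double twoPi = 6.283185307179586;
    std::vector<float> out(frames, 0.0f);
    if (freqsHz.empty()) return out;

    // Equal share per note so the mix cannot exceed full scale.
    const double gain = 1.0 / static_cast<double>(freqsHz.size());
    for (double f : freqsHz) {
        for (std::size_t n = 0; n < frames; ++n) {
            const double phase = twoPi * f * static_cast<double>(n) / sampleRate;
            out[n] += static_cast<float>(gain * std::sin(phase));
        }
    }
    return out;
}
```

Something like renderChord({440.0, 554.37, 659.25}, 44100.0, 44100) gives one second of an A major triad; swapping std::sin for another periodic function (square, saw, triangle) is how chiptune-style waveforms would be obtained.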
I answered a very similar question on this earlier this week: Note Synthesis, Harmonics (Violin, Piano, Guitar, Bass), Frequencies, MIDI . In your case if you don't want to rely on samples then the wavetable method is out. So your simplest option would be to dynamically vary the frequency and amplitude of sinusoids over time, which is easy but will sound pretty terrible (like a cheap Theremin). Your only real option would be a more sophisticated synthesis algorithm such as one of the Physical Modelling ones (eg Karplus-Strong). That would be an interesting project, but be warned that it does require something of a mathematical background.
You can indeed use something like Portaudio as Rafael has mentioned to physically get the sound out of the PC, in fact I think Portaudio is the best option for that. But generating the data so that it sounds musical is by far your biggest challenge.
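To give an idea of what Karplus-Strong involves, here is a minimal sketch of the basic plucked-string version: a delay line one period long is filled with noise, then repeatedly low-pass filtered by averaging neighbouring samples and fed back. The function name, the 0.996 decay factor and the fixed seed are assumptions for illustration.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch of basic Karplus-Strong plucked-string synthesis.
std::vector<float> karplusStrong(double freqHz, double sampleRate,
                                 std::size_t frames, unsigned seed = 1)
{
    // The delay-line length sets the pitch: one period of the note.
    std::size_t period = static_cast<std::size_t>(sampleRate / freqHz);
    if (period < 2) period = 2;

    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> noise(-1.0f, 1.0f);
    std::vector<float> delay(period);
    for (float& v : delay) v = noise(rng);  // the "pluck": a burst of noise

    const float decay = 0.996f;  // assumed damping per pass through the line
    std::vector<float> out(frames);
    std::size_t pos = 0;
    for (std::size_t n = 0; n < frames; ++n) {
        const std::size_t next = (pos + 1) % period;
        out[n] = delay[pos];
        // Averaging two neighbours acts as a low-pass filter, so high
        // harmonics die out faster than the fundamental, like a real string.
        delay[pos] = decay * 0.5f * (delay[pos] + delay[next]);
        pos = next;
    }
    return out;
}
```

The returned samples are in the [-1, 1] range and could be pushed into a float PortAudio stream as-is; the integer delay length means the pitch is only approximate at high frequencies.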