Create CMSampleBufferRef from an AudioInputIOProc - c++

I have an AudioInputIOProc that I'm getting an AudioBufferList from. I need to convert this AudioBufferList to a CMSampleBufferRef.
Here's the code I've written so far:
- (void)handleAudioSamples:(const AudioBufferList*)samples numSamples:(UInt32)numSamples hostTime:(UInt64)hostTime {
// Create a CMSampleBufferRef from the list of samples, which we'll own
AudioStreamBasicDescription monoStreamFormat;
memset(&monoStreamFormat, 0, sizeof(monoStreamFormat));
monoStreamFormat.mSampleRate = 44100;
monoStreamFormat.mFormatID = kAudioFormatMPEG4AAC;
monoStreamFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked | kAudioFormatFlagIsNonInterleaved;
monoStreamFormat.mBytesPerPacket = 4;
monoStreamFormat.mFramesPerPacket = 1;
monoStreamFormat.mBytesPerFrame = 4;
monoStreamFormat.mChannelsPerFrame = 2;
monoStreamFormat.mBitsPerChannel = 16;
CMFormatDescriptionRef format = NULL;
OSStatus status = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &monoStreamFormat, 0, NULL, 0, NULL, NULL, &format);
if (status != noErr) {
// really shouldn't happen
return;
}
mach_timebase_info_data_t tinfo;
mach_timebase_info(&tinfo);
UInt64 _hostTimeToNSFactor = (double)tinfo.numer / tinfo.denom;
uint64_t timeNS = (uint64_t)(hostTime * _hostTimeToNSFactor);
CMTime presentationTime = CMTimeMake(timeNS, 1000000000);
CMSampleTimingInfo timing = { CMTimeMake(1, 44100), kCMTimeZero, kCMTimeInvalid };
CMSampleBufferRef sampleBuffer = NULL;
status = CMSampleBufferCreate(kCFAllocatorDefault, NULL, false, NULL, NULL, format, numSamples, 1, &timing, 0, NULL, &sampleBuffer);
if (status != noErr) {
// couldn't create the sample buffer
NSLog(#"Failed to create sample buffer");
CFRelease(format);
return;
}
// add the samples to the buffer
status = CMSampleBufferSetDataBufferFromAudioBufferList(sampleBuffer,
kCFAllocatorDefault,
kCFAllocatorDefault,
0,
samples);
if (status != noErr) {
NSLog(#"Failed to add samples to sample buffer");
CFRelease(sampleBuffer);
CFRelease(format);
NSLog(#"Error status code: %d", status);
return;
}
[self addAudioFrame:sampleBuffer];
NSLog(#"Original sample buf size: %ld for %d samples from %d buffers, first buffer has size %d", CMSampleBufferGetTotalSampleSize(sampleBuffer), numSamples, samples->mNumberBuffers, samples->mBuffers[0].mDataByteSize);
NSLog(#"Original sample buf has %ld samples", CMSampleBufferGetNumSamples(sampleBuffer));
}
Now, I'm unsure how to calculate the numSamples given this function definition of an AudioInputIOProc:
OSStatus AudioTee::InputIOProc(AudioDeviceID inDevice, const AudioTimeStamp *inNow, const AudioBufferList *inInputData, const AudioTimeStamp *inInputTime, AudioBufferList *outOutputData, const AudioTimeStamp *inOutputTime, void *inClientData)
This definition exists in the AudioTee.cpp file in WavTap.
The error I'm getting is a CMSampleBufferError_RequiredParameterMissing error with the error code -12731 when I try to call CMSampleBufferSetDataBufferFromAudioBufferList.
Update:
To clarify on the problem a bit, the following is the format of the audio data I'm getting from the AudioDeviceIOProc:
Channels: 2, Sample Rate: 44100, Precision: 32-bit, Sample Encoding: 32-bit Signed Integer PCM, Endian Type: little, Reverse Nibbles: no, Reverse Bits: no
I'm getting an AudioBufferList* that holds all the audio data (30 seconds' worth) that I need to convert to CMSampleBufferRefs and append to a video (30 seconds long) that is being written to disk via an AVAssetWriterInput.

Three things look wrong:
You declare that the format ID is kAudioFormatMPEG4AAC, but configure it as LPCM. So try
monoStreamFormat.mFormatID = kAudioFormatLinearPCM;
You also call the format "mono" when it's configured as stereo.
Why use mach_timebase_info which could leave gaps in your audio presentation timestamps? Use sample count instead:
CMTime presentationTime = CMTimeMake(numSamplesProcessed, 44100);
Your CMSampleTimingInfo looks wrong, and you're not using presentationTime: you set the buffer's duration to 1 sample when it could be numSamples, and its presentation time to zero, which can't be right. Something like this would make more sense:
CMSampleTimingInfo timing = { CMTimeMake(numSamples, 44100), presentationTime, kCMTimeInvalid };
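Putting those pieces together, here is a rough sketch (not a drop-in fix) of an AudioStreamBasicDescription that matches the 32-bit signed integer, stereo, 44.1 kHz stream described in the update, plus the sample-count-based presentation time. Interleaved data is assumed here; keep kAudioFormatFlagIsNonInterleaved only if the IOProc really hands you one buffer per channel. numSamplesProcessed is a running frame counter the caller is assumed to maintain:
AudioStreamBasicDescription streamFormat = {0};
streamFormat.mSampleRate       = 44100;
streamFormat.mFormatID         = kAudioFormatLinearPCM;
streamFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagsNativeEndian | kAudioFormatFlagIsPacked;
streamFormat.mChannelsPerFrame = 2;
streamFormat.mBitsPerChannel   = 32;   // matches the 32-bit device format reported above
streamFormat.mFramesPerPacket  = 1;
streamFormat.mBytesPerFrame    = streamFormat.mChannelsPerFrame * (streamFormat.mBitsPerChannel / 8); // 8 bytes
streamFormat.mBytesPerPacket   = streamFormat.mBytesPerFrame;

// presentation time derived from how many frames have been handled so far
CMTime presentationTime = CMTimeMake(numSamplesProcessed, 44100);
CMSampleTimingInfo timing = { CMTimeMake(numSamples, 44100), presentationTime, kCMTimeInvalid };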
And some questions:
Does your AudioBufferList have the expected 2 AudioBuffers?
Do you have a runnable version of this?
p.s. I'm guilty of it myself, but allocating memory on the audio thread is considered harmful in audio dev.
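On that last point, one hedged illustration: the stream format never changes between callbacks, so the CMAudioFormatDescriptionRef could be created once at setup time and reused, instead of being created and released on every call to handleAudioSamples:. A sketch, where cachedFormat is a hypothetical instance variable and streamFormat is the description sketched above:
// Create once, e.g. during initialization
if (cachedFormat == NULL) {
    CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &streamFormat, 0, NULL, 0, NULL, NULL, &cachedFormat);
}
// ...then pass cachedFormat straight to CMSampleBufferCreate in the callback,
// instead of building and releasing a new description per buffer.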

Related

ESP32 i2s_read returns empty buffer after calling this function

I am trying to record audio from an INMP441 which is connected to an ESP32, but returning the buffer containing the bytes the microphone read always leads to something which is NULL.
The code for setting up i2s and the microphone is this:
// i2s config
const i2s_config_t i2s_config = {
.mode = i2s_mode_t(I2S_MODE_MASTER | I2S_MODE_RX), // receive
.sample_rate = SAMPLE_RATE, // 44100 (44,1KHz)
.bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT, // 32 bits per sample
.channel_format = I2S_CHANNEL_FMT_ONLY_LEFT, // use right channel
.communication_format = i2s_comm_format_t(I2S_COMM_FORMAT_I2S | I2S_COMM_FORMAT_I2S_MSB),
.intr_alloc_flags = ESP_INTR_FLAG_LEVEL1, // interrupt level 1
.dma_buf_count = 64, // number of buffers
.dma_buf_len = SAMPLES_PER_BUFFER}; // 512
// pin config
const i2s_pin_config_t pin_config = {
.bck_io_num = gpio_sck, // serial clock, sck (gpio 33)
.ws_io_num = gpio_ws, // word select, ws (gpio 32)
.data_out_num = I2S_PIN_NO_CHANGE, // only used for speakers
.data_in_num = gpio_sd // serial data, sd (gpio 34)
};
// config i2s driver and pins
// fct must be called before any read/write
esp_err_t err = i2s_driver_install(I2S_PORT, &i2s_config, 0, NULL);
if (err != ESP_OK)
{
Serial.printf("Failed installing the driver: %d\n", err);
}
err = i2s_set_pin(I2S_PORT, &pin_config);
if (err != ESP_OK)
{
Serial.printf("Failed setting pin: %d\n", err);
}
Serial.println("I2S driver installed! :-)");
Setting up the i2s stuff is no problem at all. The tricky part for me is reading from the i2s:
// 44.1 kHz * bytes per sample * time in seconds = total size in bytes
const size_t recordSize = (SAMPLE_RATE * I2S_BITS_PER_SAMPLE_32BIT / 8) * recordTime; //recordTime = 5s
// size in bytes
size_t totalReadSize = 0;
// 32 bits per sample set in config * 1024 samples per buffers = total bits per buffer
char *samples = (char *)calloc(totalBitsPerBuffer, sizeof(char));
// number of bytes read
size_t bytesRead;
Serial.println("Start recording...");
// read until wanted size is reached
while (totalReadSize < recordSize)
{
// read to buffer
esp_err_t err = i2s_read(I2S_PORT, (void *)samples, totalBitsPerBuffer, &bytesRead, portMAX_DELAY);
// check if an error occurred, if so stop recording
if (err != ESP_OK)
{
Serial.println("Error while recording!");
break;
}
// check if bytes read works → yes
/*
for (int i = 0; i < bytesRead; i++)
{
uint8_t sample = (uint8_t) samples[i];
Serial.print(sample);
} */
// add read size to total read size
totalReadSize += bytesRead;
// Serial.printf("Currently recorded %d%% \n", totalReadSize * 100 / recordSize);
}
// convert bytes to mb
double_t totalReadSizeMB = (double_t)totalReadSize / 1e+6;
Serial.printf("Total read size: %fMb\n", totalReadSizeMB);
Serial.println("Samples deref");
Serial.println(*samples);
Serial.println("Samples");
Serial.println(samples);
return samples;
Using this code leads to the following output:
I2S driver installed! :-)
Start recording...
Total read size: 0.884736Mb
Samples deref
␀
Samples
When I uncomment the part where I iterate over the bytes read, I get something like this:
200224231255255224210022418725525522493000902552550238002241392542552241520020425225508050021624525501286700194120022461104022421711102242271030018010402242510000188970224141930022291022410185022487830021679001127500967200666902241776600246610224895902244757022418353002224802242274302249741022419339009435001223102242432602243322022412120001241402245911022418580084402248325525522461252255044249255224312452552242212372552241272352550342302552241212262552242112212550252216255014621325501682092550112205255224161202255224237198255224235194255224231922552248518725501141832550421812552241951762550144172255018168255034164255224173157255018215525522455152255028148255021014425505214025522487137255014613225522412112825502361252550180120255018011725522451172550252113255224133111255061082550248105255224891042552249910125522439972550138942552242279225503287255224101832552242478125522410178255224231732552244970255224336525501766225501426125502325625522424553255224109492550186[...]
This shows that the microphone is able to record, but I can't return the actual contents of the buffer.
While writing this code I looked at the official documentation and at some code that seems to work elsewhere.
I am also new to C++ and not used to working with pointers.
Does anyone know what the problem could be?
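Not a full answer, but two things in the code above are worth sketching out. samples holds raw binary sample data, not a NUL-terminated string, so Serial.println(samples) stops at the first zero byte and cannot be used to inspect or "return" the recording; and only totalBitsPerBuffer bytes are allocated, so every i2s_read overwrites the same region instead of accumulating the full 5 seconds. A minimal sketch, assuming recordSize and I2S_PORT as defined above:
// Allocate the whole recording up front and append to it on each read
uint8_t *recording = (uint8_t *)malloc(recordSize);
size_t totalReadSize = 0;
while (recording != NULL && totalReadSize < recordSize)
{
    size_t bytesRead = 0;
    esp_err_t err = i2s_read(I2S_PORT, recording + totalReadSize,
                             recordSize - totalReadSize, &bytesRead, portMAX_DELAY);
    if (err != ESP_OK)
        break;
    totalReadSize += bytesRead; // advance the write position
}
// return the pointer together with totalReadSize so the caller knows how many bytes are valid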

FFmpeg - resampled audio with much noise

I'm not familiar with audio resampling. I tried to resample the audio streams from two videos. The first one's output was close to the original but with noise; the other one was almost entirely noise.
Information for the first one
128 kb/s, 48.0 kHz, 2 channels, AAC LC
Information for the second one
384 kb/s, 48.0 kHz, 6 channels, AAC LC
I found that when I set the sample size to 16, the first one worked quite well but still had noise, while the other one sounded very bad but still produced sound. What determines the output sample size, and how do I choose it? When I used channels * av_get_bytes_per_sample((AVSampleFormat)output_fmt) as the output sample size, because I wanted it to be the same as the original, there was no sound at all.
MyResampling.cpp
bool MyResample::open(AVCodecParameters* par) {
if (!par) {
std::cout << "par is null" << std::endl;
return false;
}
audio_context = swr_alloc_set_opts(
audio_context, av_get_default_channel_layout(2), (AVSampleFormat)output_fmt,
par->sample_rate, av_get_default_channel_layout(par->channels), (AVSampleFormat)par->format, par->sample_rate,
0, 0);
avcodec_parameters_free(&par);
int ret = swr_init(audio_context);
if (ret != 0) {
std::cout << "failed to open audio codec" << std::endl;
}
return true;
}
int MyResample::resample(AVFrame* frame, unsigned char* output)
{
if (!frame)
return 0;
if (!output)
av_frame_free(&frame);
uint8_t* data[2] = { 0 };
data[0] = output;
int ret = swr_convert(audio_context, data, frame->nb_samples, (const uint8_t**)frame->data, frame->nb_samples);
//int size = ret * frame->channels * av_get_bytes_per_sample((AVSampleFormat)output_fmt);
int size = av_samples_get_buffer_size(nullptr, frame->channels, frame->nb_samples, (AVSampleFormat)output_fmt, 1);
if (ret < 0)
return ret;
return size;
}
MyAudioPlayer.cpp
bool open()
{
close();
QAudioFormat fmt;
fmt.setSampleRate(sample_rate); // from audioStream->codecpar->sample_rate
fmt.setSampleSize(16); //
fmt.setChannelCount(channels); // from audioStream->codecpar->channels
fmt.setCodec("audio/pcm");
fmt.setByteOrder(QAudioFormat::LittleEndian);
fmt.setSampleType(QAudioFormat::UnSignedInt);
output = new QAudioOutput(fmt);
io = output->start();
if (io)
return true;
return false;
}
bool write(const unsigned char* data, int data_size)
{
if (!data || data_size <= 0)
return false;
if (!output || !io)
{
return false;
}
int size = io->write((char*)data, data_size);
if (data_size != size)
return false;
return true;
}
main.cpp
MyAudioPlayer::open();
unsigned char* pcm = new unsigned char[1024 * 1024];
if (demux.get_media_type() == 1) { // audio
audio_decode.sendPacket(pkt);
AVFrame* frame = audio_decode.receiveFrame();
int len = resample.resample(frame, pcm);
while (len > 0) {
if (MyAudioPlayer::check_space() >= len) {
MyAudioPlayer::write(pcm, len);
break;
}
msleep(1);
}
}
If you have trouble with the final quality and noise, you are probably misunderstanding the proper way to perform resampling, or there is a bug in your configuration.
Take a look at this example: libswresample-example.
I am not that familiar with the FFmpeg API, because for resampling I tend to use libsamplerate.
Regarding the example above, these are the steps to perform a basic resample with FFmpeg:
Start by configuring your resampling context:
//Set up resampling context
SwrContext *swr = swr_alloc();
av_opt_set_channel_layout(swr, "in_channel_layout", AV_CH_LAYOUT_STEREO, 0);
av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
av_opt_set_int(swr, "in_sample_rate", 44100, 0);
av_opt_set_int(swr, "out_sample_rate", 22050, 0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_FLT, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_FLT, 0);
swr_init(swr);
Depending on your input data type and the format you expect as output, you will need to specify the right sample format. These are the equivalences in standard C++ types:
| AV_SAMPLE_FMT_S16 | std::int16_t  |
| AV_SAMPLE_FMT_S32 | std::int32_t  |
| AV_SAMPLE_FMT_FLT | float         |
| AV_SAMPLE_FMT_DBL | double        |
| AV_SAMPLE_FMT_U8P | std::uint8_t  |
| ...               |               |
Get your data from whatever place in the right format and estimate your sampling count.
After that, you can perform the resampling in a few steps:
Estimate the number of output samples
uint8_t* out_samples;
int out_num_samples = av_rescale_rnd(swr_get_delay(swr, in_samplerate) + in_num_samples, out_samplerate, in_samplerate, AV_ROUND_UP);
Allocate the memory for the output file
av_samples_alloc(&out_samples, NULL, out_num_channels, out_num_samples, AV_SAMPLE_FMT_FLT, 0);
Convert the input data into the expected output format
out_num_samples = swr_convert(swr, &out_samples, out_num_samples, &in_samples, in_num_samples);
Do not forget to free your memory
av_freep(&out_samples);
swr_free(&swr);
If you have noise, the input and output formats are probably not the right ones, or the resampling quality is low.
Also, do not panic if you get fewer samples than you expected; that is the usual behavior because of the way the filtering works. To get the remaining trailing samples, you can perform the conversion step again with NULL as input, which will flush the internal data.
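As a small sketch of that flush step, reusing the swr context and out_samples buffer from the steps above:
// Passing NULL/0 as input drains whatever swr_convert is still holding internally
int flushed = swr_convert(swr, &out_samples, out_num_samples, NULL, 0);
if (flushed > 0) {
    // append these trailing samples to the output as well
}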

Accessing all input streams from an external usb audio interface with CoreAudio

I am working with CoreAudio examples to interface with various audio interfaces. I took the CAPlayThrough example (https://developer.apple.com/library/content/samplecode/CAPlayThrough/Introduction/Intro.html) and have been modifying little things here and there. One of the main things I wanted to accomplish was to access the input streams of all the input channels of my 4-channel audio interface simultaneously during playback. In my particular setup I have a Behringer uphoria 404 interface that has 4 physical input channels which I want to access simultaneously, i.e. if I have mics/instruments plugged into channels 1, 2, 3 and 4 on the external interface, I want to be able to hear the output from all 4 channels like you would in a DAW such as Pro Tools, Logic, etc.
So I took the example from this documentation (https://developer.apple.com/library/content/technotes/tn2091/_index.html) and proceeded to create my own channel map.
So what I did was write this and set the channel map for my input AudioUnit
//Create channel Map
SInt32 *channelMap = NULL;
UInt32 numOfChannels = this->streamFormat.mChannelsPerFrame;
UInt32 mapSize = numOfChannels *sizeof(SInt32);
channelMap = (SInt32 *)malloc(mapSize);
for(UInt32 i=0;i<numOfChannels;i++)
{
channelMap[i]=-1;
}
channelMap[0] = 0;
channelMap[1] = 1;
channelMap[2] = 0;
channelMap[3] = 1;
AudioUnitSetProperty(mInputUnit,
kAudioOutputUnitProperty_ChannelMap,
kAudioUnitScope_Output,
1,
channelMap,
mapSize);
free(channelMap);
The result was that I could only access the data from channels 1 and 2 of my interface. However, when I changed the channel map to this:
channelMap[0] = 2;
channelMap[1] = 3;
I can access data from channels 3 and 4 of my audio interface, but not channels 1 and 2.
So I figured I would have to set the channel map for my output AudioUnit as well:
SInt32 *channelMap = NULL;
UInt32 numOfChannels = this->streamFormat.mChannelsPerFrame;
UInt32 mapSize = numOfChannels *sizeof(SInt32);
channelMap = (SInt32 *)malloc(mapSize);
for(UInt32 i=0;i<numOfChannels;i++)
{
channelMap[i]=-1;
}
channelMap[0] = 0;
channelMap[1] = 1;
channelMap[2] = 0;
channelMap[3] = 1;
AudioUnitSetProperty(this->outputUnit,
kAudioOutputUnitProperty_ChannelMap,
kAudioUnitScope_Output,
1,
channelMap,
mapSize);
free(channelMap);
Still the same results.
Is setting the channel map the correct way of going about this? I've combed through Apple's documentation site, but could not find anything more helpful than the article posted above.
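For what it's worth, here is a hedged sketch of how a map covering all four inputs might look. It assumes the AUHAL's client-side stream format (output scope of the input unit, element 1) is actually set to four channels; the channel map has one entry per client channel, so with only two client channels it can never deliver more than two of the device's inputs at once:
// One entry per client channel; each value is the device input channel to read from
SInt32 fullMap[4] = { 0, 1, 2, 3 };   // identity map: pass device channels 1-4 straight through
AudioUnitSetProperty(mInputUnit,
                     kAudioOutputUnitProperty_ChannelMap,
                     kAudioUnitScope_Output,
                     1,                        // input element of the AUHAL
                     fullMap,
                     4 * sizeof(SInt32));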
UPDATE:
These are some of the things that I have tried so far besides modifying the channel maps:
I thought that maybe each physical input on my 4-channel interface needed its own AudioUnit to represent it, with the channel map choosing which channel to pull the input stream from. But when setting up multiple AudioUnits for the input, the last AudioUnit configured ended up acting as the default input. Furthermore, the AUGraph only allowed me to add one output node.
Using the AudioUnit subtype kAudioUnitSubType_StereoMixer to combine multiple streams with multiple output render callbacks. I was able to set up the render callbacks and properly link them to the stereo mixer AudioUnit, but wasn't sure how to connect the input streams to the output.
UPDATE 2:
I should've added this before. This is from the CAPlayThrough example listed above to set up the kAudioUnitProperty_StreamFormat:
//Get the Stream Format (Output client side)
propertySize = sizeof(asbd_dev1_in);
err = AudioUnitGetProperty(mInputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 1, &asbd_dev1_in, &propertySize);
checkErr(err);
//printf("=====Input DEVICE stream format\n" );
//asbd_dev1_in.Print();
//Get the Stream Format (client side)
propertySize = sizeof(asbd);
err = AudioUnitGetProperty(mInputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &asbd, &propertySize);
checkErr(err);
//printf("=====current Input (Client) stream format\n");
//asbd.Print();
//Get the Stream Format (Output client side)
propertySize = sizeof(asbd_dev2_out);
err = AudioUnitGetProperty(mOutputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 0, &asbd_dev2_out, &propertySize);
checkErr(err);
//printf("=====Output (Device) stream format\n");
//asbd_dev2_out.Print();
//////////////////////////////////////
//Set the format of all the AUs to the input/output devices channel count
//For a simple case, you want to set this to the lower of count of the channels
//in the input device vs output device
//////////////////////////////////////
asbd.mChannelsPerFrame =((asbd_dev1_in.mChannelsPerFrame < asbd_dev2_out.mChannelsPerFrame) ?asbd_dev1_in.mChannelsPerFrame :asbd_dev2_out.mChannelsPerFrame) ;
// We must get the sample rate of the input device and set it to the stream format of AUHAL
propertySize = sizeof(Float64);
AudioObjectPropertyAddress theAddress = { kAudioDevicePropertyNominalSampleRate,
kAudioObjectPropertyScopeGlobal,
kAudioObjectPropertyElementMaster };
err = AudioObjectGetPropertyData(mInputDevice.mID, &theAddress, 0, NULL, &propertySize, &rate);
checkErr(err);
asbd.mSampleRate =rate;
propertySize = sizeof(asbd);
//Set the new formats to the AUs...
err = AudioUnitSetProperty(mInputUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &asbd, propertySize);
checkErr(err);
err = AudioUnitSetProperty(mVarispeedUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &asbd, propertySize);
checkErr(err);
This is how the example sets up both the input callback and the output callback.
Input callback setup:
OSStatus err = noErr;
AURenderCallbackStruct input;
input.inputProc = InputProc;
input.inputProcRefCon = this;
//Setup the input callback.
err = AudioUnitSetProperty(mInputUnit,
kAudioOutputUnitProperty_SetInputCallback,
kAudioUnitScope_Global,
0,
&input,
sizeof(input));
The input callback:
OSStatus CAPlayThrough::InputProc(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *inTimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * ioData)
{
OSStatus err = noErr;
CAPlayThrough *This = (CAPlayThrough *)inRefCon;
if (This->mFirstInputTime < 0.)
This->mFirstInputTime = inTimeStamp->mSampleTime;
//Get the new audio data
err = AudioUnitRender(This->mInputUnit,
ioActionFlags,
inTimeStamp,
inBusNumber,
inNumberFrames, //# of frames requested
This->mInputBuffer);// Audio Buffer List to hold data
checkErr(err);
if(!err) {
err = This->mBuffer->Store(This->mInputBuffer, Float64(inNumberFrames), SInt64(inTimeStamp->mSampleTime));
}
return err;
}
This is how the output callback is set up:
output.inputProc = OutputProc;
output.inputProcRefCon = this;
err = AudioUnitSetProperty(mVarispeedUnit,
kAudioUnitProperty_SetRenderCallback,
kAudioUnitScope_Input,
0,
&output,
sizeof(output));
checkErr(err);
And this is the output proc:
OSStatus CAPlayThrough::OutputProc(void *inRefCon,
AudioUnitRenderActionFlags *ioActionFlags,
const AudioTimeStamp *TimeStamp,
UInt32 inBusNumber,
UInt32 inNumberFrames,
AudioBufferList * ioData)
{
OSStatus err = noErr;
CAPlayThrough *This = (CAPlayThrough *)inRefCon;
Float64 rate = 0.0;
AudioTimeStamp inTS, outTS;
if (This->mFirstInputTime < 0.) {
// input hasn't run yet -> silence
MakeBufferSilent (ioData);
return noErr;
}
//use the varispeed playback rate to offset small discrepancies in sample rate
//first find the rate scalars of the input and output devices
err = AudioDeviceGetCurrentTime(This->mInputDevice.mID, &inTS);
// this callback may still be called a few times after the device has been stopped
if (err)
{
MakeBufferSilent (ioData);
return noErr;
}
err = AudioDeviceGetCurrentTime(This->mOutputDevice.mID, &outTS);
checkErr(err);
rate = inTS.mRateScalar / outTS.mRateScalar;
err = AudioUnitSetParameter(This->mVarispeedUnit,kVarispeedParam_PlaybackRate,kAudioUnitScope_Global,0, rate,0);
checkErr(err);
//get Delta between the devices and add it to the offset
if (This->mFirstOutputTime < 0.) {
This->mFirstOutputTime = TimeStamp->mSampleTime;
Float64 delta = (This->mFirstInputTime - This->mFirstOutputTime);
This->ComputeThruOffset();
//changed: 3865519 11/10/04
if (delta < 0.0)
This->mInToOutSampleOffset -= delta;
else
This->mInToOutSampleOffset = -delta + This->mInToOutSampleOffset;
MakeBufferSilent (ioData);
return noErr;
}
//copy the data from the buffers
err = This->mBuffer->Fetch(ioData, inNumberFrames, SInt64(TimeStamp->mSampleTime - This->mInToOutSampleOffset));
if(err != kCARingBufferError_OK)
{
MakeBufferSilent (ioData);
SInt64 bufferStartTime, bufferEndTime;
This->mBuffer->GetTimeBounds(bufferStartTime, bufferEndTime);
This->mInToOutSampleOffset = TimeStamp->mSampleTime - bufferStartTime;
}
return noErr;
}

RtAudio - Playing samples from wav file

I am currently trying to learn audio programming. My goal is to open a wav file, extract everything and play the samples with RtAudio.
I made a WaveLoader class which lets me extract the samples and metadata. I used this guide to do that, and I checked that everything is correct with 010 editor. Here is a snapshot of 010 editor showing the structure and data.
And this is how I store the raw samples inside the WaveLoader class:
data = new short[wave_data.payloadSize]; // - Allocates memory size of chunk size
if (!fread(data, 1, wave_data.payloadSize, sound_file))
{
throw ("Could not read wav data");
}
If I print out each sample I get: 1, -3, 4, -5 ..., which seems OK.
The problem is that I am not sure how I can play them. This is what I've done:
/*
* Using PortAudio to play samples
*/
bool Player::Play()
{
ShowDevices();
rt.showWarnings(true);
RtAudio::StreamParameters oParameters; //, iParameters;
oParameters.deviceId = rt.getDefaultOutputDevice();
oParameters.firstChannel = 0;
oParameters.nChannels = mAudio.channels;
//iParameters.deviceId = rt.getDefaultInputDevice();
//iParameters.nChannels = 2;
unsigned int sampleRate = mAudio.sampleRate;
// Use a buffer of 512, we need to feed callback with 512 bytes everytime!
unsigned int nBufferFrames = 512;
RtAudio::StreamOptions options;
options.flags = RTAUDIO_SCHEDULE_REALTIME;
options.flags = RTAUDIO_NONINTERLEAVED;
//&parameters, NULL, RTAUDIO_FLOAT64,sampleRate, &bufferFrames, &mCallback, (void *)&rawData
try {
rt.openStream(&oParameters, NULL, RTAUDIO_SINT16, sampleRate, &nBufferFrames, &mCallback, (void*) &mAudio);
rt.startStream();
}
catch (RtAudioError& e) {
std::cout << e.getMessage() << std::endl;
return false;
}
return true;
}
/*
* RtAudio Callback
*
*/
int mCallback(void * outputBuffer, void * inputBuffer, unsigned int nBufferFrames, double streamTime, RtAudioStreamStatus status, void * userData)
{
unsigned int i = 0;
short *out = static_cast<short*>(outputBuffer);
auto *data = static_cast<Player::AUDIO_DATA*>(userData);
// if i is more than our data size, we are done!
if (i > data->dataSize) return 1;
// First time callback is called data->ptr is 0, this means that the offset is 0
// Second time data->ptr is 1, this means offset = nBufferFrames (512) * 1 = 512
unsigned int offset = nBufferFrames * data->ptr++;
printf("Offset: %i\n", offset);
// First time callback is called offset is 0, we are starting from 0 and looping nBufferFrames (512) times, this gives us 512 bytes
// Second time, the offset is 1, we are starting from 512 bytes and looping to 512 + 512 = 1024
for (i = offset; i < offset + nBufferFrames; ++i)
{
short sample = data->rawData[i]; // Get raw sample from our struct
*out++ = sample; // Pass to output buffer for playback
printf("Current sample value: %i\n", sample); // this is showing 1, -3, 4, -5 check 010 editor
}
printf("Current time: %f\n", streamTime);
return 0;
}
Inside the callback function, when I print out the sample values I get exactly what 010 editor shows. So why isn't RtAudio playing them? What is wrong here? Do I need to normalize the sample values to between -1 and 1?
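Not a definitive fix, but for comparison, a sketch of how an interleaved RTAUDIO_SINT16 output callback usually looks. No normalization to [-1, 1] is needed for RTAUDIO_SINT16; the 16-bit samples are written out as-is. The field names here (rawData, dataSize as a total sample count, ptr as a count of buffers already played, channels) loosely follow the question's AUDIO_DATA struct and are assumptions:
int mCallback(void *outputBuffer, void * /*inputBuffer*/, unsigned int nBufferFrames,
              double /*streamTime*/, RtAudioStreamStatus /*status*/, void *userData)
{
    auto *data = static_cast<Player::AUDIO_DATA *>(userData);
    short *out = static_cast<short *>(outputBuffer);
    // Offset in samples: buffers already played * frames per buffer * channels per frame
    unsigned long pos = (unsigned long)data->ptr * nBufferFrames * data->channels;
    unsigned int samplesPerBuffer = nBufferFrames * data->channels;
    for (unsigned int i = 0; i < samplesPerBuffer; ++i)
        out[i] = (pos + i < data->dataSize) ? data->rawData[pos + i] : 0; // pad the tail with silence
    data->ptr++;
    // Ask RtAudio to stop only after the last sample has been delivered
    return (pos + samplesPerBuffer >= data->dataSize) ? 1 : 0;
}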
Edit:
The wav file I am trying to play:
Chunksize: 16
Format: 1
Channel: 1
SampleRate: 48000
ByteRate: 96000
BlockAlign: 2
BitPerSample: 16
Size of raw samples total: 2217044 bytes
For some reason it works when I pass input parameters to openStream():
RtAudio::StreamParameters oParameters, iParameters;
oParameters.deviceId = rt.getDefaultOutputDevice();
oParameters.firstChannel = 0;
//oParameters.nChannels = mAudio.channels;
oParameters.nChannels = mAudio.channels;
iParameters.deviceId = rt.getDefaultInputDevice();
iParameters.nChannels = 1;
unsigned int sampleRate = mAudio.sampleRate;
// Use a buffer of 512, we need to feed callback with 512 bytes everytime!
unsigned int nBufferFrames = 512;
RtAudio::StreamOptions options;
options.flags = RTAUDIO_SCHEDULE_REALTIME;
options.flags = RTAUDIO_NONINTERLEAVED;
//&parameters, NULL, RTAUDIO_FLOAT64,sampleRate, &bufferFrames, &mCallback, (void *)&rawData
try {
rt.openStream(&oParameters, &iParameters, RTAUDIO_SINT16, sampleRate, &nBufferFrames, &mCallback, (void*) &mAudio);
rt.startStream();
}
catch (RtAudioError& e) {
std::cout << e.getMessage() << std::endl;
return false;
}
return true;
It was quite random: I was trying to play back my mic, left the input parameters in, and my wav file was suddenly playing. Is this a bug?

WaveOutWrite callback creates choppy audio

I have four buffers that I am using for audio playback in a synthesizer. I submit two buffers initially, and then in the callback routine I write data into the next buffer and then submit that buffer.
When I generate each buffer I'm just putting a sine wave into it whose period is a multiple of the buffer length.
When I execute I hear brief pauses between each buffer. I've increased the buffer size to 16K samples at 44100 Hz so I can clearly hear that the whole buffer is playing, but there is an interruption between each.
What I think is happening is that the callback function is only called when ALL buffers that have been written are complete. I need the synthesis to stay ahead of the playback so I need a callback when each buffer is completed.
How do people usually solve this problem?
Update: I've been asked to add code. Here's what I have:
First I connect to the WaveOut device:
// Always grab the mapped wav device.
UINT deviceId = WAVE_MAPPER;
// This is an excellent tutorial:
// http://planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=4422&lngWId=3
WAVEFORMATEX wfx;
wfx.nSamplesPerSec = 44100;
wfx.wBitsPerSample = 16;
wfx.nChannels = 1;
wfx.cbSize = 0;
wfx.wFormatTag = WAVE_FORMAT_PCM;
wfx.nBlockAlign = (wfx.wBitsPerSample >> 3) * wfx.nChannels;
wfx.nAvgBytesPerSec = wfx.nBlockAlign * wfx.nSamplesPerSec;
_waveChangeEventHandle = CreateMutex(NULL,false,NULL);
MMRESULT res;
res = waveOutOpen(&_wo, deviceId, &wfx, (DWORD_PTR)WavCallback,
(DWORD_PTR)this, CALLBACK_FUNCTION);
I initialize the four frames I'll be using:
for (int i=0; i<_numFrames; ++i)
{
WAVEHDR *header = _outputFrames+i;
ZeroMemory(header, sizeof(WAVEHDR));
// Block size is in bytes. We have 2 bytes per sample.
header->dwBufferLength = _codeSpec->OutputNumSamples*2;
header->lpData = (LPSTR)malloc(2 * _codeSpec->OutputNumSamples);
ZeroMemory(header->lpData, 2*_codeSpec->OutputNumSamples);
res = waveOutPrepareHeader(_wo, header, sizeof(WAVEHDR));
if (res != MMSYSERR_NOERROR)
{
printf("Error preparing header: %d\n", res - MMSYSERR_BASE);
}
}
SubmitBuffer();
SubmitBuffer();
Here is the SubmitBuffer code:
void Vodec::SubmitBuffer()
{
WAVEHDR *header = _outputFrames+_curFrame;
MMRESULT res;
res = waveOutWrite(_wo, header, sizeof(WAVEHDR));
if (res != MMSYSERR_NOERROR)
{
if (res == WAVERR_STILLPLAYING)
{
printf("Cannot write when still playing.");
}
else
{
printf("Error calling waveOutWrite: %d\n", res-WAVERR_BASE);
}
}
_curFrame = (_curFrame+1)&0x3;
if (_pointQueue != NULL)
{
RenderQueue();
_nextFrame = (_nextFrame + 1) & 0x3;
}
}
And here is my callback code:
void CALLBACK Vodec::WavCallback(HWAVEOUT hWaveOut,
UINT uMsg,
DWORD dwInstance,
DWORD dwParam1,
DWORD dwParam2 )
{
// Only listen for end of block messages.
if(uMsg != WOM_DONE) return;
Vodec *instance = (Vodec *)dwInstance;
instance->SubmitBuffer();
}
The RenderQueue code is pretty simple - just copies a piece of a template buffer into the output buffer:
void Vodec::RenderQueue()
{
double white = _pointQueue->White;
white = 10.0; // For now just override with a constant value
int numSamples = _codeSpec->OutputNumSamples;
signed short int *data = (signed short int *)_outputFrames[_nextFrame].lpData;
for (int i=0; i<numSamples; ++i)
{
Sample x = white * _noise->Samples[i];
data[i] = (signed short int)(x);
}
_sampleOffset += numSamples;
if (_sampleOffset >= _pointQueue->DurationInSamples)
{
_sampleOffset = 0;
_pointQueue = _pointQueue->next;
}
}
UPDATE: Mostly solved the issue. I need to increment _nextFrame along with _curFrame (not conditionally). The playback buffer was getting ahead of the writing buffer.
However, when I decrease the playback buffer to 1024 samples, it gets choppy again. At 2048 samples it is clear. This happens for both Debug and Release builds.
1024 samples is just about 23 ms of audio data. waveOut is a pretty high-level API from Windows Vista onwards; if you want low-latency audio playback, you should use the Windows Core Audio APIs (WASAPI). You can get latencies down to about 10 ms in shared mode and 3 ms in exclusive mode. Also, the audio depends on the processes currently running on your system; in other words, it depends on how frequently your audio thread gets to run to produce data. You should also look at the Multimedia Class Scheduler Service and the AvSetMmThreadCharacteristics function.
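For the MMCSS suggestion, a minimal sketch of raising the priority of the thread that fills and submits the buffers (declared in avrt.h, link against avrt.lib); "Pro Audio" is one of the standard MMCSS task names:
#include <windows.h>
#include <avrt.h>

DWORD taskIndex = 0;
HANDLE mmcss = AvSetMmThreadCharacteristics(TEXT("Pro Audio"), &taskIndex);
if (mmcss == NULL)
{
    // MMCSS not available or the call failed; GetLastError() has the details
}
// ... generate and submit audio buffers on this thread ...
if (mmcss != NULL)
    AvRevertMmThreadCharacteristics(mmcss);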