I have a task to write a module to unpack National Imagery Transmission Format (NITF) images and pass around the data in memory to various processing modules. I have chosen to use the NITRO library. I am trying to figure out how to read the image and access the pixel values, but I am having trouble. I am using the C++ bindings.
I successfully compiled the library. Now, I am trying to use the unit tests to understand how to use the library, namely read an image and access the pixel values. There are also some examples here. However, the unit tests and code snippets don't perform this task directly.
My toy example is below. I've tried variations of the code below, but I almost always get some error in image_reader.read(). The code below results in an error about too many bands, but if I limit the number of bands, then I don't get an error but the buffer doesn't seem to have any values in it.
I would be grateful to anyone who could give me some guidance or tips on how to use this library to access pixel values.
#include "stdafx.h"
#define IMPORT_NITRO_API
#include <import/nitf.hpp>
int _tmain(int argc, _TCHAR* argv[])
{
const std::string filename = "my_image.NTF";
nitf::Reader reader;
nitf::IOHandle io(filename.c_str());
nitf::Record record = reader.read(io);
nitf::List images = record.getImages();
nitf::ListIterator iter = images.begin(); // NITF can store more than one image - just try the first
nitf::ImageSegment segment = *iter;
nitf::SubWindow window; // define a subwindow for reading - try to read the whole image although it might be slow
unsigned int numRows = segment.getSubheader().getNumRows();
unsigned int numCols = segment.getSubheader().getNumCols();
const int band_count = segment.getSubheader().getBandCount();
window.setNumRows(numRows);
window.setNumCols(numCols);
window.setNumBands(band_count);
nitf::Uint32* band_list = new nitf::Uint32[band_count];
for (nitf::Uint32 band_number = 0; band_number < band_count; band_number++)
band_list[band_number] = band_number;
window.setBandList(band_list);
auto image_reader = reader.newImageReader(1); // 1 seems to be the image number: nitro-master\c\nitf\tests\test_create_xmltre.c
std::vector< std::vector<nitf::Uint8> > buffer(band_count); // User-defined data buffers for read
for (nitf::Uint32 band_number = 0; band_number < band_count; band_number++)
buffer[band_number].resize(numRows * numCols);
int padded = 0; // Returns TRUE if pad pixels may have been read
image_reader.read(window, (nitf::Uint8**)&buffer[0], &padded);
return 0;
}
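One detail in the listing above may be related to the empty buffer: `&buffer[0]` is the address of a `std::vector` object, not of an array of per-band pointers, so casting it to `nitf::Uint8**` does not give the reader what it expects. Below is a minimal sketch of the usual workaround, using only the standard library. `std::uint8_t` stands in for `nitf::Uint8` (assumed here to be an 8-bit unsigned type), and `make_band_pointers` is a hypothetical helper, not part of NITRO.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns one raw pointer per band, suitable for an API
// that expects a `uint8_t**` (an array of per-band buffers).
std::vector<std::uint8_t*> make_band_pointers(
    std::vector<std::vector<std::uint8_t>>& bands)
{
    std::vector<std::uint8_t*> pointers;
    pointers.reserve(bands.size());
    for (auto& band : bands) {
        assert(!band.empty());            // each band must already be sized
        pointers.push_back(band.data());  // points into the band's storage
    }
    return pointers;
}
```

With this, the read call would presumably take `pointers.data()` in place of `(nitf::Uint8**)&buffer[0]`, with both `buffer` and `pointers` kept alive until the read returns.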
I'm working on a project that involves a large JSON file, basically a multidimensional array dumped in JSON form, but the overall size would be larger than the amount of memory I have. If I load it in as a string and then parse the string, that will consume all of the memory.
Are there any methods to limit the memory consumption, such as only retrieving data between specific indices? Could I implement that using solely the Nlohmann json library/the standard libraries?
RapidJSON and others can do it. Here's an example program using RapidJSON's "SAX" (streaming) API: https://github.com/Tencent/rapidjson/blob/master/example/simplereader/simplereader.cpp
This way, you'll get an event (callback) for each element encountered during parsing. The memory consumption of the parsing itself will be quite small.
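To illustrate the streaming idea with the standard library only, here is a hand-rolled sketch. It is not RapidJSON's API: `for_each_number` is a hypothetical helper, and it assumes the document contains nothing but (possibly nested) arrays of numbers, with no strings, objects, or literals such as `true`. It reads the input one character at a time and fires a callback per number, so only one numeric token is held in memory.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>   // std::istringstream, used in the usage note
#include <string>

// Hypothetical helper: calls `on_number` for every number found in `in`
// and returns how many numbers were seen.
std::size_t for_each_number(std::istream& in,
                            const std::function<void(double)>& on_number)
{
    assert(in.good());
    std::size_t count = 0;
    std::string token;   // characters of the number currently being read
    auto flush = [&] {
        if (!token.empty()) {
            on_number(std::stod(token));
            token.clear();
            ++count;
        }
    };
    char c;
    while (in.get(c)) {
        const bool numeric = std::isdigit(static_cast<unsigned char>(c)) ||
                             c == '-' || c == '+' || c == '.' ||
                             c == 'e' || c == 'E';
        if (numeric) {
            token.push_back(c);
        } else {
            flush();     // brackets, commas and whitespace end a number
        }
    }
    flush();
    return count;
}
```

For example, wrapping `"[[1, 2.5], [-3e1, 4]]"` in a `std::istringstream` and summing in the callback visits four numbers. For a real file, a `std::ifstream` could be passed instead, and a full parser should be used for anything beyond plain numeric arrays.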
Could you please specify the context of your question?
Which programming language are you using (NodeJS, vanilla JavaScript, Java, React)?
In which environment is your code running (monolithic app on a server, AWS Lambda, serverless)?
Processing large JSON files can consume a lot of memory on a server and may crash your app.
In my experience, manipulating large JSON files with a NodeJS script on my local computer with 8 GB of RAM is not a problem. However, running those same large JSON payloads through an application on a server has given me problems too.
I hope this helps.
Using DAW JSON Link, https://github.com/beached/daw_json_link , you can create an iterator pair/range and iterate over the JSON array 1 record at a time. The library also has routines for working with JSONL, which is common in large datasets.
For opening the file, I would use something like mmap/virtual alloc to handle that for us. The examples in the library use this via the daw::filesystem::memory_mapped_file_t type that abstracts the file mapping.
With that, the memory mapped file allows the OS to page the data in/out as needed, and the iterator like interface keeps the memory requirement to that of one array element at a time.
The following demonstrates this, using a simple record type:
struct Point {
double x;
double y;
};
The program to do this looks like
#include <cassert>
#include <daw/daw_memory_mapped_file.h>
#include <daw/json/daw_json_iterator.h>
#include <daw/json/daw_json_link.h>
#include <iostream>
struct Point {
double x;
double y;
};
namespace daw::json {
template<>
struct json_data_contract<Point> {
using type =
json_member_list<json_number<"x">, json_number<"y">>;
};
}
int main( int argc, char** argv ) {
assert( argc >= 2 ); // argv[1] must hold the path to the JSON file
auto json_doc = daw::filesystem::memory_mapped_file_t<char>( argv[1] );
assert( json_doc.size( ) > 2 );
auto json_range = daw::json::json_array_range<Point>( json_doc );
auto sum_x = 0.0;
auto sum_y = 0.0;
auto count = 0ULL;
for( Point p: json_range ) {
sum_x += p.x;
sum_y += p.y;
++count;
}
sum_x /= static_cast<double>( count );
sum_y /= static_cast<double>( count );
std::cout << "Centre Point (" << sum_x << ", " << sum_y << ")\n";
}
https://jsonlink.godbolt.org/z/xoxEd1z6G
Is there a way to decompress sound files using the FMOD library in C++?
I'm developing a sound editor using the FMOD Engine library, but I've run into a problem with compressed audio files, specifically OGG files.
For now I'm just reading the raw data using FMOD::Sound::readData(), then normalizing it and displaying it on screen using SFML. This works fine with WAV files, because they are not compressed, but compressed formats need extra steps.
This is what I'm doing now:
FMOD_RESULT r;
FMOD::System* m_fmodSystem = nullptr;
int m_maxChannels = 64;
// Create fmod system
r = FMOD::System_Create(&m_fmodSystem);
FMOD_ERROR_CHECK(r);
// Initialize system
r = m_fmodSystem->init(m_maxChannels, FMOD_INIT_NORMAL, nullptr);
FMOD_ERROR_CHECK(r);
// Create sound
FMOD::Sound* soundResource = nullptr;
FMOD::Channel* channel = nullptr;
r = m_fmodSystem->createSound("640709__chobesha__laser-gun-sound.ogg",
FMOD_DEFAULT | FMOD_OPENONLY,
nullptr,
&soundResource);
FMOD_ERROR_CHECK(r);
// Get sound length in raw bytes
unsigned int audioLength = 0;
FMOD_TIMEUNIT timeUnit = FMOD_TIMEUNIT_RAWBYTES;
r = soundResource->getLength(&audioLength, timeUnit);
FMOD_ERROR_CHECK(r);
// Read sound data
char* audioBuffer = new char[audioLength];
unsigned int readData = 0;
r = soundResource->readData(reinterpret_cast<void*>(audioBuffer),
audioLength,
&readData);
FMOD_ERROR_CHECK(r);
signed short* interpretedData = reinterpret_cast<signed short*>(audioBuffer);
// Analyze data to normalize it
signed short maxValue = -32767;
signed short minValue = 32767;
int interpretedDataSize = readData / sizeof(signed short);
for (int i = 0; i < interpretedDataSize; ++i) {
if (interpretedData[i] > maxValue) {
maxValue = interpretedData[i];
}
if (interpretedData[i] < minValue) {
minValue = interpretedData[i];
}
}
float maxValF = static_cast<float>(maxValue);
float minValF = static_cast<float>(minValue);
float* normalizedArray = new float[interpretedDataSize];
// Normalize data
float maxAbsValF = abs(maxValF);
maxAbsValF = maxAbsValF > abs(minValF) ? maxAbsValF : abs(minValF);
for (int i = 0; i < interpretedDataSize; ++i) {
normalizedArray[i] = interpretedData[i] / maxAbsValF;
}
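The normalization step in the listing above can also be written as a small standalone function. This is only a sketch under assumptions: `normalize_peak` is a hypothetical name, the input is taken to be already-decoded 16-bit PCM, and an all-zero input is returned as silence instead of dividing by zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Scales 16-bit PCM samples so that the largest magnitude maps to 1.0.
std::vector<float> normalize_peak(const std::int16_t* samples, std::size_t count)
{
    assert(samples != nullptr || count == 0);
    int peak = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Widen to int first so that -32768 has a representable magnitude.
        const int magnitude = std::abs(static_cast<int>(samples[i]));
        if (magnitude > peak) peak = magnitude;
    }
    std::vector<float> normalized(count, 0.0f);
    if (peak == 0) return normalized;   // silence: avoid dividing by zero
    for (std::size_t i = 0; i < count; ++i)
        normalized[i] = static_cast<float>(samples[i]) / static_cast<float>(peak);
    return normalized;
}
```

Returning a `std::vector` avoids the manual `new[]` bookkeeping of the original listing.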
I read in other posts and in the FMOD documentation that I can use the FMOD_CREATESAMPLE flag to tell createSound to decompress the data at load time instead of at play time, but it doesn't work with my current code structure. I'm guessing that's because FMOD_OPENONLY prevents the sound from being closed, so it never gets the chance to decompress, or something along those lines. That's what I gathered from the documentation.
The problem with dropping the FMOD_OPENONLY flag is that I can no longer read the data using readData; it returns an error.
While searching, I found that I can use the lock function to help it decompress and to get a pointer to the sound's data, but even with all of this, the data still appears to be compressed. I don't know if I'm missing something.
This is version 2 of the code, with these modifications:
// Create sound
FMOD::Sound* soundResource = nullptr;
FMOD::Channel* channel = nullptr;
r = m_fmodSystem->createSound("640709__chobesha__laser-gun-sound.ogg",
FMOD_DEFAULT | FMOD_CREATESAMPLE,
nullptr,
&soundResource);
FMOD_ERROR_CHECK(r);
// Get sound length in raw bytes
unsigned int audioLength = 0;
FMOD_TIMEUNIT timeUnit = FMOD_TIMEUNIT_RAWBYTES;
r = soundResource->getLength(&audioLength, timeUnit);
FMOD_ERROR_CHECK(r);
// Read sound data
char* audioBuffer = new char[audioLength];
void* ptr2 = nullptr;
unsigned int len1, len2;
r = soundResource->lock(0, audioLength, reinterpret_cast<void**>(&audioBuffer), &ptr2, &len1, &len2);
FMOD_ERROR_CHECK(r);
r = soundResource->unlock(reinterpret_cast<void*>(audioBuffer), ptr2, len1, len2);
FMOD_ERROR_CHECK(r);
This is the graph I get for a WAV sound
The left side is the sound loaded in the Audacity app, and the right side is my graph.
This is the graph for the first try with the OGG file
And this is the graph for the OGG file with the modifications
From what I can see, the first OGG graph is the same as the second, so I'm assuming both are still compressed and my changes did nothing.
Does anyone know a better way to decompress and read the raw data of a sound, preferably using FMOD? If it's not possible with FMOD, what is the best way to decompress a sound file, given its format?
This answer is just a couple minority-position takes and a sketchy description of a process I once used. Maybe the thoughts are worth consideration.
One thought: a person who is editing sound (your target audience?) has the know-how to decompress files (e.g., using Audacity), so perhaps adding this capability (handling all possible incoming audio formats) is a lower priority?
Another thought: there are likely many libraries for decompressing sound available. You could employ one of them prior to presenting the results to FMOD. I just did a search on github for "ogg c++" and was shown 51 repositories.
In my own experience, for an application I wrote about seven years ago, I tweaked some code from a Vorbis decoder source so that it output PCM rather than outputting as a .wav. With OGG, the .wav data is converted to PCM prior to compression. So, it decompresses back to PCM before converting that to a .wav. I found the point in the code where the conversion happens and edited that out, leaving the data in a decompressed PCM form.
My application was built to accept PCM, so I actually ended up saving an intermediate step.
A couple of years ago I created an application that processes audio by downmixing a 6-channel or 8-channel signal (a.k.a. 5.1 or 7.1) to matrix-encoded stereo. For that purpose I used the PortAudio library, with great results. This is the open-stream call I use with the callback that downmixes a 7.1 signal:
Pa_OpenStream(&Flujo, &inputParameters, &outParameters, SAMPLE_RATE, 1, paClipOff, ptrFunction, NULL);
Notice the framesPerBuffer value of just one (1). This is my callback function:
int downmixed8channels(const void *input, void *output, unsigned long framesPerBuffer, const PaStreamCallbackTimeInfo * info, PaStreamCallbackFlags state, void * userData)
{
(void)userData;
(void)info;
(void)state;
(void)framesPerBuffer;
float *ptrInput = (float*)input;
float *ptrOutput = (float*)output;
/*This is a struct to identify samples*/
AudioSamples->L = ptrInput[0];
AudioSamples->R = ptrInput[1];
AudioSamples->C = ptrInput[2];
AudioSamples->LFE = ptrInput[3];
AudioSamples->RL = ptrInput[4];
AudioSamples->RR = ptrInput[5];
AudioSamples->SL = ptrInput[6];
AudioSamples->SR = ptrInput[7];
Encoder->8channels(AudioSamples->L,
AudioSamples->R,
AudioSamples->C,
AudioSamples->LFE,
AudioSamples->SL,
AudioSamples->SR,
AudioSamples->RL,
AudioSamples->RR);
ptrOutput[0] = Encoder->gtLT();
ptrOutput[1] = Encoder->gtRT();
return paContinue;
}
As you can see, each index in the input and output buffers corresponds to a discrete channel;
in the output, 0 = left channel and 1 = right channel. This used to work well until Windows 10 version 2004. Since I updated my system to this version, my audio glitches and I get artifacts like these:
These are captures of the sound from the channel test window in the Windows audio device panel. From the images it is clear that my program is dropping frames, so my first attempt at a fix was to use a buffer larger than one frame to hold the samples, process them and then send them. The reason I did not use a larger buffer in the first place was that the program would drop frames.
But before implementing that, I did a proof of concept with no audio processing at all, simply passing data from input to output. For that I set the output channelCount parameter to 8, just like the input, resulting in something as simple as this:
for (int i = 0; i < FramesPerBuffer /*1000*/; i++)
{
ptrOutput[i] = ptrInput[i];
}
But the program is still dropping samples.
Next I used two callbacks: one to write to a buffer, and a second one to read it and send it to the output.
(void)info;
(void)userData;
(void)state;
(void)output;
float* ptrInput = (float*)input;
for (int i = 0; i < FRAME_SIZE; i++)
{
buffer_input[i] = ptrInput[i];
}
return paContinue;
Callback to store.
(void)info;
(void)userData;
(void)state;
(void)output;
float* ptrOutput = (float*)output;
for (int i = 0; i < FRAME_SIZE; i++)
{
AudioSamples->L = (buffer_input[i] );
AudioSamples->R = (buffer_input[i++]);
AudioSamples->C = (buffer_input[i++] );
AudioSamples->LFE = (buffer_input[i++]);
AudioSamples->SL = (buffer_input[i++] );
AudioSamples->SR = (buffer_input[i++]);
Encoder->Encoder(AudioSamples->L, AudioSamples->R, AudioSamples->C, AudioSamples->LFE,
AudioSamples->SL, AudioSamples->SR);
bufferTransformed[w] = (Encoder->getLT() );
bufferTransformed[w++] = (Encoder->getRT() );
}
w = 0;
for (int i = 0; i < FRAME_REDUCED; i++)
{
ptrOutput[i] = buffer_Transformed[i];
}
return paContinue;
Callback for processing
The processing callback uses a reduced frames-per-buffer value, since 2 channels are fewer than 8 and, in PortAudio, a frame seems to consist of one sample for each audio channel.
This did not work either. The first problem is: how do I synchronize the two callbacks? After all of this, what recommendation or advice can you give me to solve this issue?
Notes: the sample rate must be the same for both devices; I implemented logic in the program to enforce this. The bit depth is also the same; I am using paFloat32.
The PortAudio build is the modified one used by Audacity, since I wanted to use their implementation of WASAPI loopback.
Thank you very much in advance!
In the end I did not have to change my callback functions in any way. What solved it was increasing the suggestedLatency field of the input and output parameters to 1.0. Even the devices' defaultLowOutputLatency and defaultHighOutputLatency values were causing too much glitching. I tested until I found that 1.0 was the sweet spot; higher values did not seem to improve things.
TL;DR: Increase suggestedLatency until the glitching is gone.
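On the separate question of how to hand samples from one callback to another, a common general-purpose approach is a single-producer/single-consumer ring buffer. The sketch below is not taken from the original program and does not use PortAudio itself. `SampleRing` is a hypothetical class, and it assumes exactly one writer and one reader.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-producer/single-consumer queue of float samples.
// One thread (or callback) may call push(), another may call pop().
class SampleRing {
public:
    explicit SampleRing(std::size_t capacity)
        : buffer_(capacity + 1), head_(0), tail_(0)   // one slot stays empty
    {
        assert(capacity > 0);
    }

    // Producer side: returns false (dropping the sample) when full.
    bool push(float sample)
    {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % buffer_.size();
        if (next == tail_.load(std::memory_order_acquire)) return false;
        buffer_[head] = sample;
        head_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer side: returns false when no sample is available.
    bool pop(float& sample)
    {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire)) return false;
        sample = buffer_[tail];
        tail_.store((tail + 1) % buffer_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<float> buffer_;
    std::atomic<std::size_t> head_;   // next slot to write
    std::atomic<std::size_t> tail_;   // next slot to read
};
```

The input callback would push interleaved samples and the output callback would pop them. Neither side blocks, which matters inside an audio callback. When the ring runs empty or full the caller has to decide what to do, for example output silence or drop input.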
I have no experience in audio programming, and C++ is quite a low-level language, so I am having some trouble with it. I work with ASIO SDK 2.3 downloaded from http://www.steinberg.net/en/company/developers.html.
I am writing my own host based on the example inside the SDK.
So far I've managed to work through the whole sample and it looks like it's working. I have an external sound card connected to my PC. I've successfully loaded the driver for this device, configured it, handled callbacks, cast data from analog to digital, etc.: the common stuff.
And here is the part where I am stuck:
When I play a track via my device I can see bars moving in the mixer (the device's software), so the device is connected correctly. In my code I've picked the inputs and outputs with the names of the bars that are moving in the mixer. I've also used ASIOCreateBuffers() to create a buffer for each input/output.
Now correct me if I am wrong:
When ASIOStart() is called and the driver is in the running state, and I feed a sound signal into my external device, I believe the buffers get filled with data, right?
I am reading the documentation but I am a bit lost: how can I access the data sent by the device to the application, stored in the INPUT buffers? Or the signal? I need it for signal analysis, and maybe recording in the future.
EDIT: If I have made it too complicated, then in a nutshell my question is: how can I access the input stream data from code? I don't see any objects or callbacks in the documentation that let me do so.
The hostsample in the ASIO SDK is pretty close to what you need. In the bufferSwitchTimeInfo callback there is some code like this:
for (int i = 0; i < asioDriverInfo.inputBuffers + asioDriverInfo.outputBuffers; i++)
{
int ch = asioDriverInfo.bufferInfos[i].channelNum;
if (asioDriverInfo.bufferInfos[i].isInput == ASIOTrue)
{
char* buf = asioDriver.bufferInfos[i].buffers[index];
....
Inside of that if block asioDriver.bufferInfos[i].buffers[index] is a pointer to the raw audio data (index is a parameter to the method).
The format of the buffer depends on the driver and can be discovered by testing asioDriverInfo.channelInfos[i].type. The formats include 32-bit int LSB first, 32-bit int MSB first, and so on. You can find the list of values in the ASIOSampleType enum in asio.h. At this point you'll want to convert the samples to some common format for downstream signal processing code. If you're doing signal processing you'll probably want to convert to double. The file host\asioconvertsample.cpp will give you some idea of what's involved in the conversion. The most common format you're going to encounter is probably 32-bit int LSB. Here is how you'd convert it to double.
for (int i = 0; i < asioDriverInfo.inputBuffers + asioDriverInfo.outputBuffers; i++)
{
int ch = asioDriverInfo.bufferInfos[i].channelNum;
if (asioDriverInfo.bufferInfos[i].isInput == ASIOTrue)
{
switch (asioDriverInfo.channelInfos[i].type)
{
case ASIOSTInt32LSB:
{
double* pDoubleBuf = new double[_bufferSize];
for (int i = 0 ; i < _bufferSize ; ++i)
{
pDoubleBuf[i] = *(int*)asioDriverInfo.bufferInfos.buffers[index] / (double)0x7fffffff;
}
// now pDoubleBuf contains one channel's worth of samples in the range of -1.0 to 1.0.
break;
}
// and so on...
Thank you very much. Your answer helped quite a lot, but as I am a bit inexperienced with C++ :P I find it somewhat problematic.
In general I've written my own host based on hostsample. I didn't implement the asioDriverInfo structure and use plain variables for now.
My first problem was:
char* buf = asioDriver.bufferInfos[i].buffers[index];
as I got an error that I can't cast (void*) to char*, but this probably solved the problem:
char* buf = static_cast<char*>(bufferInfos[i].buffers[doubleBufferIndex]);
My second problem is with the data conversion. I've checked the file you recommended, but I find it a bit like black magic. For now I am trying to follow your example:
for (int i = 0; i < inputBuffers + outputBuffers; i++)
{
if (bufferInfos[i].isInput)
{
switch (channelInfos[i].type)
{
case ASIOSTInt32LSB:
{
double* pDoubleBuf = new double[buffSize];
for (int j = 0 ; j < buffSize ; ++j)
{
pDoubleBuf[j] = bufferInfos[i].buffers[doubleBufferIndex] / (double)0x7fffffff;
}
break;
}
}
}
I get an error there:
pDoubleBuf[j] = bufferInfos[i].buffers[doubleBufferIndex] / (double)0x7fffffff;
which is:
error C2296: '/' : illegal, left operand has type 'void *'
What I don't get is that in your example there is no array index after bufferInfos (asioDriverInfo.bufferInfos.buffers[index]), and even if I fix that... what type should I cast it to in order to make it work? :P
PS. I am sure the ASIOSTInt32LSB data type is fine for my PC.
The ASIO input and output buffers are accessible through void pointers, but using memcpy or memmove to access an I/O buffer creates a memory copy, which is to be avoided if you are doing real-time processing. I would suggest casting the pointer to int* so you can access the samples directly.
Converting samples one by one is also slow for real-time processing when you have 100+ audio channels. Since AVX2 is supported on most CPUs,
_mm256_loadu_si256() and _mm256_cvtepi32_ps() will do the conversion much faster.
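The pointer cast suggested above can be sketched in scalar form as follows. This is an illustration under assumptions, not code from the ASIO SDK: `int32_to_double` is a hypothetical helper, the host is assumed to be little-endian so that ASIOSTInt32LSB matches the native 32-bit integer layout, and dividing by 2^31 is one possible scaling choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts one channel's buffer of native-endian 32-bit
// integer samples (handed out as a void*) to doubles in [-1.0, 1.0).
std::vector<double> int32_to_double(const void* raw, std::size_t sample_count)
{
    assert(raw != nullptr || sample_count == 0);
    // The void* must be cast to the sample type before it can be indexed.
    const std::int32_t* samples = static_cast<const std::int32_t*>(raw);
    std::vector<double> out(sample_count);
    for (std::size_t j = 0; j < sample_count; ++j)
        out[j] = samples[j] / 2147483648.0;   // divide by 2^31
    return out;
}
```

In the follow-up code above this would correspond to something like `int32_to_double(bufferInfos[i].buffers[doubleBufferIndex], buffSize)`. Inside a real-time callback it would be better to write into a preallocated buffer than to allocate a vector on every call.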
I have a question regarding a sound synthesis app that I'm working on. I am trying to read in an audio file, create randomized 'grains' using granular synthesis techniques, place them into an output buffer and then be able to play that back to the user using OpenAL. For testing purposes, I am simply writing the output buffer to a file that I can then listen back to.
Judging by my results, I am on the right track but am getting some aliasing issues and playback sounds that just don't seem quite right. There is usually a rather loud pop in the middle of the output file and volume levels are VERY loud at times.
Here are the steps that I have taken to get the results I need, but I'm a little bit confused about a couple of things, namely formats that I am specifying for my AudioStreamBasicDescription.
Read in an audio file from my mainBundle, which is a mono file in .aiff format:
ExtAudioFileRef extAudioFile;
CheckError(ExtAudioFileOpenURL(loopFileURL,
&extAudioFile),
"couldn't open extaudiofile for reading");
memset(&player->dataFormat, 0, sizeof(player->dataFormat));
player->dataFormat.mFormatID = kAudioFormatLinearPCM;
player->dataFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
player->dataFormat.mSampleRate = S_RATE;
player->dataFormat.mChannelsPerFrame = 1;
player->dataFormat.mFramesPerPacket = 1;
player->dataFormat.mBitsPerChannel = 16;
player->dataFormat.mBytesPerFrame = 2;
player->dataFormat.mBytesPerPacket = 2;
// tell extaudiofile about our format
CheckError(ExtAudioFileSetProperty(extAudioFile,
kExtAudioFileProperty_ClientDataFormat,
sizeof(AudioStreamBasicDescription),
&player->dataFormat),
"couldnt set client format on extaudiofile");
SInt64 fileLengthFrames;
UInt32 propSize = sizeof(fileLengthFrames);
ExtAudioFileGetProperty(extAudioFile,
kExtAudioFileProperty_FileLengthFrames,
&propSize,
&fileLengthFrames);
player->bufferSizeBytes = fileLengthFrames * player->dataFormat.mBytesPerFrame;
Next I declare my AudioBufferList and set some more properties
AudioBufferList *buffers;
UInt32 ablSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
buffers = (AudioBufferList *)malloc(ablSize);
player->sampleBuffer = (SInt16 *)malloc(sizeof(SInt16) * player->bufferSizeBytes);
buffers->mNumberBuffers = 1;
buffers->mBuffers[0].mNumberChannels = 1;
buffers->mBuffers[0].mDataByteSize = player->bufferSizeBytes;
buffers->mBuffers[0].mData = player->sampleBuffer;
My understanding is that .mData will hold whatever was specified in the formatFlags (in this case, SInt16 samples). Since it is of type (void *), I want to convert this to float data, which is the obvious choice for audio manipulation. Previously I set up a for loop that just iterated through the buffer and cast each sample to a float*. This seemed unnecessary, so now I pass my .mData buffer to a function I created, which then granularizes the audio:
float *theOutBuffer = [self granularizeWithData:(float *)buffers->mBuffers[0].mData with:framesRead];
In this function, I dynamically allocate some buffers, create random size grains, place them in my out buffer after windowing them using a hamming window and return that buffer (which is float data). Everything is cool up to this point.
Next I set up all my output file ASBD and such:
AudioStreamBasicDescription outputFileFormat;
bzero(audioFormatPtr, sizeof(AudioStreamBasicDescription));
outputFileFormat->mFormatID = kAudioFormatLinearPCM;
outputFileFormat->mSampleRate = 44100.0;
outputFileFormat->mChannelsPerFrame = numChannels;
outputFileFormat->mBytesPerPacket = 2 * numChannels;
outputFileFormat->mFramesPerPacket = 1;
outputFileFormat->mBytesPerFrame = 2 * numChannels;
outputFileFormat->mBitsPerChannel = 16;
outputFileFormat->mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked;
UInt32 flags = kAudioFileFlags_EraseFile;
ExtAudioFileRef outputAudioFileRef = NULL;
NSString *tmpDir = NSTemporaryDirectory();
NSString *outFilename = @"Decomp.caf";
NSString *outPath = [tmpDir stringByAppendingPathComponent:outFilename];
NSURL *outURL = [NSURL fileURLWithPath:outPath];
AudioBufferList *outBuff;
UInt32 abSize = offsetof(AudioBufferList, mBuffers[0]) + (sizeof(AudioBuffer) * 1);
outBuff = (AudioBufferList *)malloc(abSize);
outBuff->mNumberBuffers = 1;
outBuff->mBuffers[0].mNumberChannels = 1;
outBuff->mBuffers[0].mDataByteSize = abSize;
outBuff->mBuffers[0].mData = theOutBuffer;
CheckError(ExtAudioFileCreateWithURL((__bridge CFURLRef)outURL,
kAudioFileCAFType,
&outputFileFormat,
NULL,
flags,
&outputAudioFileRef),
"ErrorCreatingURL_For_EXTAUDIOFILE");
CheckError(ExtAudioFileSetProperty(outputAudioFileRef,
kExtAudioFileProperty_ClientDataFormat,
sizeof(outputFileFormat),
&outputFileFormat),
"ErrorSettingProperty_For_EXTAUDIOFILE");
CheckError(ExtAudioFileWrite(outputAudioFileRef,
framesRead,
outBuff),
"ErrorWritingFile");
The file is written correctly, in CAF format. My question is this: am I handling the .mData buffer correctly in that I am casting the samples to float data, manipulating (granulating) various window sizes and then writing it to a file using ExtAudioFileWrite (in CAF format)? Is there a more elegant way to do this such as declaring my ASBD formatFlag as kAudioFlagIsFloat? My output CAF file has some clicks in it and when I open it in Logic, it looks like there is a lot of aliasing. This makes sense if I am trying to send it float data but there is some kind of conversion happening which I am unaware of.
Thanks in advance for any advice on the matter! I have been an avid reader of pretty much all the source material online, including the Core Audio Book, various blogs, tutorials, etc. The ultimate goal of my app is to play the granularized audio in real time to a user with headphones so the writing to file thing is just being used for testing at the moment. Thanks!
What you say about step 3 suggests to me you are interpreting an array of shorts as an array of floats? If that is so, we found the reason for your trouble. Can you assign the short values one by one into an array of floats? That should fix it.
It looks like mData is a void * pointing to an array of shorts. Casting this pointer to a float * doesn't change the underlying data into float but your audio processing function will treat them as if they were. However, float and short values are stored in totally different ways, so the math you do in that function will operate on very different values that have nothing to do with your true input signal. To investigate this experimentally, try the following:
short data[4] = {-27158, 16825, 23024, 15};
void *pData = data;
The void pointer doesn't indicate what kind of data it points to, so one can erroneously assume it points to float values. Note that a short is 2 bytes wide, but a float is 4 bytes wide. It is a coincidence that your code did not crash with an access violation. Interpreted as float, the array above is only long enough for two values. Let's just look at the first value:
float *pfData = (float *)pData;
printf("%d == %f\n", data[0], pfData[0]);
The output of this will be -27158 == 23.198200 illustrating how instead of the expected -27158.0f you obtain roughly 23.2f. Two problematic things happened. First, sizeof(float) is not sizeof(short). Second, the "ones and zeros" of a floating point number are stored very differently than an integer. See http://en.wikipedia.org/wiki/Single_precision_floating-point_format.
How to solve the problem? There are at least two simple solutions. First, you could convert each element of the array before you feed it into your audio processor:
int k;
float *pfBuf = (float *)malloc(n_data * sizeof(float));
short *psiBuf = (short *)buffers->mBuffers[0].mData;
for (k = 0; k < n_data; k ++)
{
pfBuf[k] = psiBuf[k];
}
[self granularizeWithData:pfBuf with:framesRead];
for (k = 0; k < n_data; k ++)
{
psiBuf[k] = pfBuf[k];
}
free(pfBuf);
You see that most likely you will have to convert everything back to short after your call to granularizeWithData: with:. So the second solution would be to do all processing in short although from what you write, I imagine you would not like that latter approach.
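The convert, process, convert-back round trip described above can be sketched in C++ as below. The function names are hypothetical, and the scaling to the range [-1.0, 1.0) and the clamping on the way back are assumptions that go beyond the plain element-wise copy shown earlier. Clamping keeps out-of-range results from wrapping around when they are narrowed back to 16 bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts 16-bit PCM to floats in [-1.0, 1.0) by value, not by pointer cast.
std::vector<float> pcm16_to_float(const std::int16_t* in, std::size_t count)
{
    assert(in != nullptr || count == 0);
    std::vector<float> out(count);
    for (std::size_t k = 0; k < count; ++k)
        out[k] = in[k] / 32768.0f;
    return out;
}

// Converts floats back to 16-bit PCM, clamping anything outside the range.
std::vector<std::int16_t> float_to_pcm16(const std::vector<float>& in)
{
    std::vector<std::int16_t> out(in.size());
    for (std::size_t k = 0; k < in.size(); ++k) {
        float scaled = std::round(in[k] * 32768.0f);
        if (scaled > 32767.0f) scaled = 32767.0f;
        if (scaled < -32768.0f) scaled = -32768.0f;
        out[k] = static_cast<std::int16_t>(scaled);
    }
    return out;
}
```

The processing function would then operate on the float vector between the two calls.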