Im trying to learn the basics of ffmpeg writing (reading already works), so im just trying to take in an input .ts file and write out/passthrough the exact same h264 stream to a new output file. I dont get any compilation errors, but for some reason i cant figure out why my output file's framerate is very wrong. Also when i read in my output file, i get printouts saying "Packet corrupt (stream = 0, dts = #)"
I followed the instructions in the ffmpeg library comments so im not sure what im missing. I call initOutStream(), then initH264encoder(), then during the reading/decoding readH264Packet() is called repeatedly. (Removed code for readability sake, left relevant sections below);
Edit: If i put my output file through the actuall ffmpeg cmd app, the framerate issue seems to get fixed. Wonder where im messing up
void test::initOutStream() {
//create muxing context
outstreamContext = avformat_alloc_context();
//oformat
AVOutputFormat *guessFormat; //Populate oformat
guessFormat = av_guess_format(NULL, inputVideoUrl.c_str(), NULL);
outstreamContext->oformat = guessFormat;
outstreamContext->oformat->video_codec = AV_CODEC_ID_H264;
//outstreamContext->bit_rate = 400000; //No affect;
//pb
AVIOContext *outAVIOContext = nullptr;
//int result = avio_open(&outAVIOContext, outputVideoUrl.c_str(), AVIO_FLAG_WRITE);
int result = avio_open2(&outAVIOContext, outputVideoUrl.c_str(), AVIO_FLAG_WRITE, NULL, NULL); //Documentain said to use this method
outstreamContext->pb = outAVIOContext;
}
.
void test::initH264encoder() { //Frame -> packet
int result;
h264OutCodec = avcodec_find_encoder(AV_CODEC_ID_H264);
h264OutStream = avformat_new_stream(outstreamContext, h264OutCodec);
h264OutStream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
h264OutStream->codecpar->codec_id = AV_CODEC_ID_H264;
h264OutStream->codecpar->width = 640;
h264OutStream->codecpar->height = 480;
h264OutStream->id = H264_STREAM_ID;
h264OutStream->codecpar->color_range = AVCOL_RANGE_MPEG;
//h264OutStream->codecpar->bit_rate = 400000;
h264OutContext = avcodec_alloc_context3(h264OutCodec);
h264OutContext->width = 640;
h264OutContext->height = 480;
h264OutContext->time_base = (AVRational){1,static_cast<int>(29.97)};
h264OutContext->pix_fmt = AV_PIX_FMT_YUV420P;
result = avcodec_open2(h264OutContext, h264OutCodec, nullptr);
//Alloc packet + finish
outPacket = av_packet_alloc();
//Write header
result = avformat_write_header(outstreamContext, NULL);
}
Assume that reading was set up correctly
void test::readH264Packet(__unused uint64_t tick) {
//...av_read_frame(streamContext, inPacket);
//...avcodec_send_packet(h264Context, inPacket);
//...avcodec_receive_frame(h264Context, yuvFrame)
//My passthrough:
if(shouldOutputH264Stream){
result = avcodec_send_frame(h264OutContext, yuvFrame); //1. Encode frame to packet
result = avcodec_receive_packet(h264OutContext, outPacket2); //2. get encoded packet
result = av_interleaved_write_frame(outstreamContext, outPacket2); //3. write packet
//Write trailer and free happens later
}
}
Related
I am currently messing the first time with iOS and Objective-C++. Im coming from C/C++ so please excuse my bad coding in the below examples.
I am trying to live stream the microphone audio of my iOS device over tcp, the iOS device is acting as server and sends the data to all clients that connect.
To do so, I am first using AVCaptureDevice and requestAccessForMediaType:AVMediaTypeAudio to request access to the microphone (along with the needed entry in the Info.plist).
Then I create a AVCaptureSession* using the below function:
AVCaptureSession* createBasicARecordingSession(aReceiver* ObjectReceivingAudioFrames){
AVCaptureSession* s = [[AVCaptureSession alloc] init];
AVCaptureDevice* aDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
AVCaptureDeviceInput* aInput = NULL;
if([aDevice lockForConfiguration:NULL] == YES && aDevice){
aInput = [AVCaptureDeviceInput deviceInputWithDevice:aDevice error:nil];
[aDevice unlockForConfiguration];
}
else if(!aDevice){
fprintf(stderr, "[d] could not create device. (%p)\n", aDevice);
return NULL;
}
else{
fprintf(stderr, "[d] could not lock device.\n");
return NULL;
}
if(!aInput){
fprintf(stderr, "[d] could not create input.\n");
return NULL;
}
AVCaptureAudioDataOutput* aOutput = [[AVCaptureAudioDataOutput alloc] init];
dispatch_queue_t aQueue = dispatch_queue_create("aQueue", NULL);
if(!aOutput){
fprintf(stderr, "[d] could not create output.\n");
return NULL;
}
[aOutput setSampleBufferDelegate:ObjectReceivingAudioFrames queue:aQueue];
// the below line does only work on macOS
//aOutput.audioSettings = settings;
[s beginConfiguration];
if([s canAddInput:aInput]){
[s addInput:aInput];
}
else{
fprintf(stderr, "[d] could not add input.\n");
return NULL;
}
if([s canAddOutput:aOutput]){
[s addOutput:aOutput];
}
else{
fprintf(stderr, "[d] could not add output.\n");
return NULL;
}
[s commitConfiguration];
return s;
}
The aReceiver* class (?) is defined below and receives the audio frames provided by the AVCaptureAudioDataOutput* object. The frames are stored inside a std::vector.
(im adding the code as image as I could not get it formatted right...)
Then I start the AVCaptureSession* using [audioSession start]
When a tcp client connects I first create a AudioConverterRef and two AudioStreamBasicDescription to convert the audio frames to AAC, see below:
AudioStreamBasicDescription asbdIn, asbdOut;
AudioConverterRef converter;
asbdIn.mFormatID = kAudioFormatLinearPCM;
//asbdIn.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
asbdIn.mFormatFlags = 12;
asbdIn.mSampleRate = 44100;
asbdIn.mChannelsPerFrame = 1;
asbdIn.mFramesPerPacket = 1;
asbdIn.mBitsPerChannel = 16;
//asbdIn.mBytesPerFrame = (asbdIn.mBitsPerChannel / 8) * asbdIn.mBitsPerChannel;
asbdIn.mBytesPerFrame = 2;
asbdIn.mBytesPerPacket = asbdIn.mBytesPerFrame;
asbdIn.mReserved = 0;
asbdOut.mFormatID = kAudioFormatMPEG4AAC;
asbdOut.mFormatFlags = 0;
asbdOut.mSampleRate = 44100;
asbdOut.mChannelsPerFrame = 1;
asbdOut.mFramesPerPacket = 1024;
asbdOut.mBitsPerChannel = 0;
//asbdOut.mBytesPerFrame = (asbdOut.mBitsPerChannel / 8) * asbdOut.mBitsPerChannel;
asbdOut.mBytesPerFrame = 0;
asbdOut.mBytesPerPacket = asbdOut.mBytesPerFrame;
asbdOut.mReserved = 0;
OSStatus err = AudioConverterNew(&asbdIn, &asbdOut, &converter);
Then I create a AudioBufferList* to store the encoded frames:
while(audioInput.locked){ // audioInput is my aReceiver*
usleep(0.2 * 1000000);
}
audioInput.locked = true;
UInt32 RequestedPackets = 8192;
//AudioBufferList* aBufferList = (AudioBufferList*)malloc(sizeof(AudioBufferList));
AudioBufferList* aBufferList = static_cast<AudioBufferList*>(calloc(1, offsetof(AudioBufferList, mBuffers) + (sizeof(AudioBuffer) * 1)));
aBufferList->mNumberBuffers = 1;
aBufferList->mBuffers[0].mNumberChannels = asbdIn.mChannelsPerFrame;
aBufferList->mBuffers[0].mData = static_cast<void*>(calloc(RequestedPackets, asbdIn.mBytesPerFrame));
aBufferList->mBuffers[0].mDataByteSize = asbdIn.mBytesPerFrame * RequestedPackets;
Then I go through the frames stored in the std::vector mentioned earlier and pass them to AudioConverterFillComplexBuffer(). After conversion, i concat all encoded frames into one NSMutableData which I then write() to the socket connected to the client.
long aBufferListSize = audioInput.aBufferList.size();
while(aBufferListSize > 0){
err = AudioConverterFillComplexBuffer(converter, feedAFrames, static_cast<void*>(&audioInput.aBufferList[audioInput.aBufferList.size() - aBufferListSize]), &RequestedPackets, aBufferList, NULL);
NSMutableData* encodedData = [[NSMutableData alloc] init];
long encodedDataLen = 0;
for(int i = 0; i < aBufferList->mNumberBuffers; i++){
Float32* frame = (Float32*)aBufferList->mBuffers[i].mData;
[encodedData appendBytes:frame length:aBufferList->mBuffers[i].mDataByteSize];
encodedDataLen += aBufferList->mBuffers[i].mDataByteSize;
}
unsigned char* encodedDataBytes = (unsigned char*)[encodedData bytes];
fprintf(stderr, "[d] got %li encoded bytes to send...\n", encodedDataLen);
long bytes = write(Client->GetFD(), encodedDataBytes, encodedDataLen);
fprintf(stderr, "[d] written %li of %li bytes.\n", bytes, encodedDataLen);
usleep(0.2 * 1000000);
aBufferListSize--;
}
audioInput.aBufferList.clear();
audioInput.locked = false;
Below is the feedAFrames() callback used in the AudioConverterFillComplexBuffer() call:
(again this is an image of the code, same reason as above)
Step 5 to 7 are repeated until the tcp connection is closed.
Each step runs without any noticeable error (I know I could include way better error handling here), and I do get data out of step 3 and 7. However it does not seem to be AAC what comes out at the end.
As im rather new to all of this, im really not sure what my error is, im sure there are several things I made wrong. It is kind of hard to find suitable example code of what I am trying to do, and the above is the best I could come up with until now with all that I have found paired with the apple dev documentation.
I hope someone might take some time to explain me what I did wrong and how I can get this to work. Thanks for reading until here!
i using this code to encode video stream using vp8 and i decided to give vp9 a try so i changed every thing with starts with vp_* from 8 to 9.
but the vp9 encoder always return a null packet although the encoder doesn't return any error.
here is the code i'am using for configuring.
vpx_codec_err_t error = vpx_codec_enc_config_default(vpx_codec_vp9_cx(), &enc_cfg, 0);
if(error != VPX_CODEC_OK)
return error;
enc_cfg.g_timebase.den = fps;
enc_cfg.rc_undershoot_pct = 95;
enc_cfg.rc_target_bitrate = bitrate;
enc_cfg.g_error_resilient = 1;
enc_cfg.kf_max_dist = 999999;
enc_cfg.rc_buf_initial_sz = 4000;
enc_cfg.rc_buf_sz = 6000;
enc_cfg.rc_buf_optimal_sz = 5000;
enc_cfg.rc_end_usage = VPX_CBR;
enc_cfg.g_h = height;
enc_cfg.g_w = width;
enc_cfg.rc_min_quantizer = 4;
enc_cfg.rc_max_quantizer = 56;
enc_cfg.g_threads = 4;
enc_cfg.g_pass = VPX_RC_ONE_PASS;
error = vpx_codec_enc_init(&codec, vpx_codec_vp9_cx(), &enc_cfg, 0);
if(error != VPX_CODEC_OK)
return error;
vpx_img_alloc(&vpx_image,VPX_IMG_FMT_I420 , width, height, 1);
configured = true;
return VPX_CODEC_OK;
and the code for the encoding
libyuv::RAWToI420(frame, vpx_image.d_w * 3, vpx_image.planes[VPX_PLANE_Y],vpx_image.stride[VPX_PLANE_Y],
vpx_image.planes[VPX_PLANE_U], vpx_image.stride[VPX_PLANE_U], vpx_image.planes[VPX_PLANE_V],
vpx_image.stride[VPX_PLANE_V], vpx_image.d_w, vpx_image.d_h);
const vpx_codec_cx_pkt_t *pkt;
vpx_codec_err_t error = vpx_codec_encode(&codec, &vpx_image, 0, 1, 0, VPX_DL_GOOD_QUALITY);
if(error != VPX_CODEC_OK)
return vector<byte>();
vpx_codec_iter_t iter = NULL;
if((pkt = vpx_codec_get_cx_data(&codec, &iter)))//always return null ?
{
if(pkt->kind == VPX_CODEC_CX_FRAME_PKT)
{
int length = pkt->data.frame.sz;
byte* buf = (byte*) pkt->data.frame.buf;
vector<byte> data(buf, buf + length);
return data;
}
return vector<byte>();
}
return vector<byte>();
the code is fully working if i'am using vp8 instead of 9, any help is welcomed
Just came across this post because I faced the same problem. Just for other to know: I solved it with setting
enc_cfg.g_lag_in_frames = 0;
This basically disallows the encoder to consume up to default 25 frames until it produces any output.
I've been trying to write a class that derives from FramedSource in Live555 that will allow me to stream live data from my D3D9 application to an MP4 or similar.
What I do each frame is grab the backbuffer into system memory as a texture, then convert it from RGB -> YUV420P, then encode it using x264, then ideally pass the NAL packets on to Live555. I made a class called H264FramedSource that derived from FramedSource basically by copying the DeviceSource file. Instead of the input being an input file, I've made it a NAL packet which I update each frame.
I'm quite new to codecs and streaming, so I could be doing everything completely wrong. In each doGetNextFrame() should I be grabbing the NAL packet and doing something like
memcpy(fTo, nal->p_payload, nal->i_payload)
I assume that the payload is my frame data in bytes? If anybody has an example of a class they derived from FramedSource that might at least be close to what I'm trying to do I would love to see it, this is all new to me and a little tricky to figure out what's happening. Live555's documentation is pretty much the code itself which doesn't exactly make it easy for me to figure out.
Ok, I finally got some time to spend on this and got it working! I'm sure there are others who will be begging to know how to do it so here it is.
You will need your own FramedSource to take each frame, encode, and prepare it for streaming, I will provide some of the source code for this soon.
Essentially throw your FramedSource into the H264VideoStreamDiscreteFramer, then throw this into the H264RTPSink. Something like this
scheduler = BasicTaskScheduler::createNew();
env = BasicUsageEnvironment::createNew(*scheduler);
framedSource = H264FramedSource::createNew(*env, 0,0);
h264VideoStreamDiscreteFramer
= H264VideoStreamDiscreteFramer::createNew(*env, framedSource);
// initialise the RTP Sink stuff here, look at
// testH264VideoStreamer.cpp to find out how
videoSink->startPlaying(*h264VideoStreamDiscreteFramer, NULL, videoSink);
env->taskScheduler().doEventLoop();
Now in your main render loop, throw over your backbuffer which you've saved to system memory to your FramedSource so it can be encoded etc. For more info on how to setup the encoding stuff check out this answer How does one encode a series of images into H264 using the x264 C API?
My implementation is very much in a hacky state and is yet to be optimised at all, my d3d application runs at around 15fps due to the encoding, ouch, so I will have to look into this. But for all intents and purposes this StackOverflow question is answered because I was mostly after how to stream it. I hope this helps other people.
As for my FramedSource it looks a little something like this
concurrent_queue<x264_nal_t> m_queue;
SwsContext* convertCtx;
x264_param_t param;
x264_t* encoder;
x264_picture_t pic_in, pic_out;
EventTriggerId H264FramedSource::eventTriggerId = 0;
unsigned H264FramedSource::FrameSize = 0;
unsigned H264FramedSource::referenceCount = 0;
int W = 720;
int H = 960;
H264FramedSource* H264FramedSource::createNew(UsageEnvironment& env,
unsigned preferredFrameSize,
unsigned playTimePerFrame)
{
return new H264FramedSource(env, preferredFrameSize, playTimePerFrame);
}
H264FramedSource::H264FramedSource(UsageEnvironment& env,
unsigned preferredFrameSize,
unsigned playTimePerFrame)
: FramedSource(env),
fPreferredFrameSize(fMaxSize),
fPlayTimePerFrame(playTimePerFrame),
fLastPlayTime(0),
fCurIndex(0)
{
if (referenceCount == 0)
{
}
++referenceCount;
x264_param_default_preset(¶m, "veryfast", "zerolatency");
param.i_threads = 1;
param.i_width = 720;
param.i_height = 960;
param.i_fps_num = 60;
param.i_fps_den = 1;
// Intra refres:
param.i_keyint_max = 60;
param.b_intra_refresh = 1;
//Rate control:
param.rc.i_rc_method = X264_RC_CRF;
param.rc.f_rf_constant = 25;
param.rc.f_rf_constant_max = 35;
param.i_sps_id = 7;
//For streaming:
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(¶m, "baseline");
encoder = x264_encoder_open(¶m);
pic_in.i_type = X264_TYPE_AUTO;
pic_in.i_qpplus1 = 0;
pic_in.img.i_csp = X264_CSP_I420;
pic_in.img.i_plane = 3;
x264_picture_alloc(&pic_in, X264_CSP_I420, 720, 920);
convertCtx = sws_getContext(720, 960, PIX_FMT_RGB24, 720, 760, PIX_FMT_YUV420P, SWS_FAST_BILINEAR, NULL, NULL, NULL);
if (eventTriggerId == 0)
{
eventTriggerId = envir().taskScheduler().createEventTrigger(deliverFrame0);
}
}
H264FramedSource::~H264FramedSource()
{
--referenceCount;
if (referenceCount == 0)
{
// Reclaim our 'event trigger'
envir().taskScheduler().deleteEventTrigger(eventTriggerId);
eventTriggerId = 0;
}
}
void H264FramedSource::AddToBuffer(uint8_t* buf, int surfaceSizeInBytes)
{
uint8_t* surfaceData = (new uint8_t[surfaceSizeInBytes]);
memcpy(surfaceData, buf, surfaceSizeInBytes);
int srcstride = W*3;
sws_scale(convertCtx, &surfaceData, &srcstride,0, H, pic_in.img.plane, pic_in.img.i_stride);
x264_nal_t* nals = NULL;
int i_nals = 0;
int frame_size = -1;
frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
static bool finished = false;
if (frame_size >= 0)
{
static bool alreadydone = false;
if(!alreadydone)
{
x264_encoder_headers(encoder, &nals, &i_nals);
alreadydone = true;
}
for(int i = 0; i < i_nals; ++i)
{
m_queue.push(nals[i]);
}
}
delete [] surfaceData;
surfaceData = NULL;
envir().taskScheduler().triggerEvent(eventTriggerId, this);
}
void H264FramedSource::doGetNextFrame()
{
deliverFrame();
}
void H264FramedSource::deliverFrame0(void* clientData)
{
((H264FramedSource*)clientData)->deliverFrame();
}
void H264FramedSource::deliverFrame()
{
x264_nal_t nalToDeliver;
if (fPlayTimePerFrame > 0 && fPreferredFrameSize > 0) {
if (fPresentationTime.tv_sec == 0 && fPresentationTime.tv_usec == 0) {
// This is the first frame, so use the current time:
gettimeofday(&fPresentationTime, NULL);
} else {
// Increment by the play time of the previous data:
unsigned uSeconds = fPresentationTime.tv_usec + fLastPlayTime;
fPresentationTime.tv_sec += uSeconds/1000000;
fPresentationTime.tv_usec = uSeconds%1000000;
}
// Remember the play time of this data:
fLastPlayTime = (fPlayTimePerFrame*fFrameSize)/fPreferredFrameSize;
fDurationInMicroseconds = fLastPlayTime;
} else {
// We don't know a specific play time duration for this data,
// so just record the current time as being the 'presentation time':
gettimeofday(&fPresentationTime, NULL);
}
if(!m_queue.empty())
{
m_queue.wait_and_pop(nalToDeliver);
uint8_t* newFrameDataStart = (uint8_t*)0xD15EA5E;
newFrameDataStart = (uint8_t*)(nalToDeliver.p_payload);
unsigned newFrameSize = nalToDeliver.i_payload;
// Deliver the data here:
if (newFrameSize > fMaxSize) {
fFrameSize = fMaxSize;
fNumTruncatedBytes = newFrameSize - fMaxSize;
}
else {
fFrameSize = newFrameSize;
}
memcpy(fTo, nalToDeliver.p_payload, nalToDeliver.i_payload);
FramedSource::afterGetting(this);
}
}
Oh and for those who want to know what my concurrent queue is, here it is, and it works brilliantly http://www.justsoftwaresolutions.co.uk/threading/implementing-a-thread-safe-queue-using-condition-variables.html
Enjoy and good luck!
The deliverFrame method lacks the following check at its start:
if (!isCurrentlyAwaitingData()) return;
see DeviceSource.cpp in LIVE
I tryed to use ffmpeg for encoding video/ But it fails on initialization of AVCodecContext annd AVCodec.
What I do:
_codec = avcodec_find_encoder(CODEC_ID_H264);
_codecContext = avcodec_alloc_context3(_codec);
_codecContext->coder_type = 0;
_codecContext->me_cmp|= 1;
_codecContext->me_method=ME_HEX;
_codecContext->me_subpel_quality = 0;
_codecContext->me_range = 16;
_codecContext->gop_size = 12;
_codecContext->scenechange_threshold = 40;
_codecContext->i_quant_factor = 0.71;
_codecContext->b_frame_strategy = 1;
_codecContext->qcompress = 0.5;
_codecContext->qmin = 2;
_codecContext->qmax = 31;
_codecContext->max_qdiff = 4;
_codecContext->max_b_frames = 3;
_codecContext->refs = 3;
_codecContext->trellis = 1;
_codecContext->width = format.biWidth;
_codecContext->height = format.biHeight;
_codecContext->time_base.num = 1;
_codecContext->time_base.den = 30;
_codecContext->pix_fmt = PIX_FMT_YUV420P;
_codecContext->chromaoffset = 0;
_codecContext->thread_count =1;
_codecContext->bit_rate = (int)(128000.f * 0.80f);
_codecContext->bit_rate_tolerance = (int) (128000.f * 0.20f);
int error = avcodec_open2(_codecContext, _codec, NULL);
if(error< )
{
std::cout<<"Open codec fail. Error "<<error<<"\n";
return NULL;
}
In such way ii fails on avopen_codec2() with:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xae1fdb70 (LWP 30675)]
0xb2eb2cbb in x264_param_default () from /usr/lib/libx264.so.120
If i comment all AVCodecContext parameters settins - I have :
[libx264 # 0xac75edd0] invalid width x height (0x0)
And avcodec_open retunrs negative value. Which steps, I'm doing, are wrong?
Thanks for any help (ffmpeg 0.10 && libx264 daily snapshot for yesterday)
In my experience you should give FFMPEG the least amount of information when initialising your codec as possible. This may seem counter intuitive but it means that FFMPEG will use it's default settings that are more likely to work than your own guesses. See what I would include below:
AVStream *st;
m_video_codec = avcodec_find_encoder(AV_CODEC_ID_H264);
st = avformat_new_stream(_outputCodec, m_video_codec);
_outputCodecContext = st->codec;
_outputCodecContext->codec_id = m_fmt->video_codec;
_outputCodecContext->bit_rate = m_AVIMOV_BPS; //Bits Per Second
_outputCodecContext->width = m_AVIMOV_WIDTH; //Note Resolution must be a multiple of 2!!
_outputCodecContext->height = m_AVIMOV_HEIGHT; //Note Resolution must be a multiple of 2!!
_outputCodecContext->time_base.den = m_AVIMOV_FPS; //Frames per second
_outputCodecContext->time_base.num = 1;
_outputCodecContext->gop_size = m_AVIMOV_GOB; // Intra frames per x P frames
_outputCodecContext->pix_fmt = AV_PIX_FMT_YUV420P;//Do not change this, H264 needs YUV format not RGB
As in previous answers, here is a working example of the FFMPEG library encoding RGB frames to a H264 video:
http://www.imc-store.com.au/Articles.asp?ID=276
An extra thought on your code though:
Have you called register all like below?
avcodec_register_all();
av_register_all();
If you don't call these two functions near the start of your code your subsequent calls to FFMPEG will fail and you'll most likely seg-fault.
Have a look at the linked example, I tested it on VC++2010 and it works perfectly.
I get the AudioBufferList from a wav file(Which Sample Rate is 44100HZ and the long time is 2second).
But I can't get 44100*2=88200 samples. In actually I got an AudiobufferList which contain 512 nNumberBuffers.
How can I get the sample from the AudioBufferList?
here is my method to get the samples from a wav file
-(float *) GetSoundFile:(NSString *)fileName
{
NSBundle *bundle = [NSBundle mainBundle];
NSString *path = [bundle pathForResource:[fileName stringByDeletingPathExtension]
ofType:[fileName pathExtension]];
NSURL *audioURL = [NSURL fileURLWithPath:path];
if (!audioURL)
{
NSLog(#"file: %# not found.", fileName);
}
OSStatus err = 0;
theFileLengthInFrames = 0; //this is global
AudioStreamBasicDescription theFileFormat;
UInt32 thePropertySize = sizeof(theFileFormat);
ExtAudioFileRef extRef = NULL;
void* theData = NULL;
AudioStreamBasicDescription theOutputFormat;
// Open a file with ExtAudioFileOpen()
err = ExtAudioFileOpenURL((__bridge CFURLRef)(audioURL), &extRef);
// Get the audio data format
err = ExtAudioFileGetProperty(extRef, kExtAudioFileProperty_FileDataFormat, &thePropertySize, &theFileFormat);
theOutputFormat.mSampleRate = samplingRate = theFileFormat.mSampleRate;
theOutputFormat.mChannelsPerFrame = numChannels = 1;
theOutputFormat.mFormatID = kAudioFormatLinearPCM;
theOutputFormat.mBytesPerFrame = sizeof(Float32) * theOutputFormat.mChannelsPerFrame ;
theOutputFormat.mFramesPerPacket = theFileFormat.mFramesPerPacket;
theOutputFormat.mBytesPerPacket = theOutputFormat.mFramesPerPacket * theOutputFormat.mBytesPerFrame;
theOutputFormat.mBitsPerChannel = sizeof(Float32) * 8 ;
theOutputFormat.mFormatFlags = 9 | 12;
//set the output property
err = ExtAudioFileSetProperty(extRef, kExtAudioFileProperty_ClientDataFormat, sizeof(AudioStreamBasicDescription), &theOutputFormat);
// Here I Get the total frame count and write it into gloabal variable
thePropertySize = sizeof(theFileLengthInFrames);
err = ExtAudioFileGetProperty(extRef, kExtAudioFileProperty_FileLengthFrames, &thePropertySize, &theFileLengthInFrames);
UInt32 dataSize = (UInt32)(theFileLengthInFrames * theOutputFormat.mBytesPerFrame);
theData = malloc(dataSize);
AudioBufferList theDataBuffer;
if (theData)
{
theDataBuffer.mNumberBuffers = 1;
theDataBuffer.mBuffers[0].mDataByteSize = dataSize;
theDataBuffer.mBuffers[0].mNumberChannels = theOutputFormat.mChannelsPerFrame;
theDataBuffer.mBuffers[0].mData = theData;
// Read the data into an AudioBufferList
err = ExtAudioFileRead(extRef, (UInt32*)&theFileLengthInFrames, &theDataBuffer);
}
return (float*)theDataBuffer.mBuffers[0].mData;
//here is the data that has thefilelengthInframes amount frames
}
AudioBufferList should not be set to a fixed size because is a temporary buffer, hardware dependent.
The size is unpredictable and you can only set a preferable size, moreover is not granted to be the same size between 2 reading calls.
To get 88200 samples (or what you like) you must incrementally fill a new buffer using,
time by time, the samples in AudioBufferList.
I suggest to use this circular buffer https://github.com/michaeltyson/TPCircularBuffer
made for this purpose.
Hope this help
As I know, AudioBufferList must be configured by you to receive data to it and then it must be filled by some reading function (e.g. ExtAudioFileRead()). So you should prepare it by allocating buffers you need (usually 1 or 2), set nNumberBuffers to that number and read audio data to them. AudioBufferList just stores that buffers and they are will contain frames values.