Next iteration of my question:
Thank you for your input; it has helped me understand a bit more about frames and what inputSamples is for.
I've modified my source code with what I learned, but I still have problems, so I may not have fully understood what you meant.
Here is my OpenFile function. Sorry for the name; I'll refactor it later, once it works =)
//-----------------------------------------------------------------------------
/*
This function opens a file containing binary audio data and reads it into a buffer.
*///___________________________________________________________________________
const short* OpenFile(const char* fileName, long& fileSize, WavFormat* wav)
{
// Open the file
ifstream file;
file.open((char*)fileName, ios::binary|ios::in);
if (file.good())
{
// Read the WAV's Header
wav = CheckWavHeader(file, wav);
cout << "chunkID: " << wav->chunkID <<'\n';
cout << "chunkSize: " << wav->chunkSize <<'\n';
cout << "format: " << wav->format <<'\n';
cout << "subChunk1ID: " << wav->subChunk1ID <<'\n';
cout << "subChunk1Size: " << wav->subChunk1Size <<'\n';
cout << "audioFormat: " << wav->audioFormat <<'\n'; // audioFormat == 1, so 16-bit PCM
cout << "numChannels: " << wav->numChannels <<'\n';
cout << "sampleRate: " << wav->sampleRate <<'\n';
cout << "byteRate: " << wav->byteRate <<'\n';
cout << "blockAlign: " << wav->blockAlign <<'\n';
cout << "bitsPerSample: " << wav->bitsPerSample <<'\n';
cout << "subChunk2ID: " << wav->subChunk2ID <<'\n';
cout << "subChunk2Size: " << wav->subChunk2Size <<'\n';
// Get the file’s size
file.seekg(0L, ios::end);
fileSize = ((long)file.tellg() - DATA_POS);
file.seekg(DATA_POS, ios::beg); // back to the data.
// Read the Data into the Buffer
uint nbSamples = fileSize / sizeof(short);
short* inputArray = new short[nbSamples];
file.read((char*)inputArray, fileSize);
// Close the file and return the Data
file.close();
return (const short*)inputArray;
}
else
{
exit(-1);
}
}
I open the file, check its size, create a short buffer, read the WAV data into it, and finally return it.
In main, I have commented out the G711 decoder for now.
When I run the application, faacEncOpen gives me 2048 for inputSamples (which makes sense, since the WAV file has 2 channels and FRAME_LEN is 1024).
So if I understood correctly, 1 frame == 2048 samples for my application. For each frame, I call faacEncEncode with tmpInputBuffer, a buffer of inputSamples elements that I fill from the input starting at index inputBuffer[i * inputSamples].
//-----------------------------------------------------------------------------
/*
The Main entry Point of the Application
*///_____________________________________________________________________________
int main()
{
// Get the File's Data
WavFormat* wav = new WavFormat;
long fileSize;
const short* fileInput = OpenFile("audioTest.wav", fileSize, wav);
// G711 mu-Law Decoder
//MuLawDecoder* decoder = new MuLawDecoder();
//short* inputBuffer = decoder->MuLawDecode_shortArray((byte*)fileInput, (int)nbChunk);
short* inputBuffer = (short*)fileInput;
// Info for FAAC
ulong sampleRate = wav->sampleRate;
uint numChannels = wav->numChannels;
ulong inputSamples;
ulong maxOutputBytes;
// Open the encoder and set its configuration.
faacEncHandle hEncoder = faacEncOpen(sampleRate, numChannels, &inputSamples, &maxOutputBytes);
faacEncConfigurationPtr faacConfig = faacEncGetCurrentConfiguration(hEncoder);
faacConfig->inputFormat = FAAC_INPUT_16BIT;
faacConfig->bitRate = 64000;
int result = faacEncSetConfiguration(hEncoder, faacConfig);
/*Input Buffer and Output Buffer*/
byte* outputBuffer = new byte[maxOutputBytes];
int nbBytesWritten = 0;
Sink* sink = new Sink();
uint nbFrame = fileSize / inputSamples;
int32_t* tmpInputBuffer = new int32_t[inputSamples];
for (uint i = 0; i < nbFrame; i++)
{
strncpy((char*)tmpInputBuffer, (const char*)&inputBuffer[i * inputSamples], inputSamples);
nbBytesWritten = faacEncEncode(hEncoder, tmpInputBuffer, inputSamples, outputBuffer, maxOutputBytes);
cout << 100.0 * (float)i / nbFrame << "%\t nbBytesWritten = " << nbBytesWritten << "\n";
if (nbBytesWritten > 0)
{
sink->AddAACStream(outputBuffer, nbBytesWritten);
}
}
sink->WriteToFile("output.aac");
// Close AAC Encoder
faacEncClose(hEncoder);
// Delete all the pointers
delete sink;
//delete decoder;
delete[] fileInput;
//delete[] inputBuffer;
delete[] outputBuffer;
delete[] tmpInputBuffer;
system("pause");
return 0;
}
When the output data is dumped into an .aac file (as raw AAC), I use mp4muxer.exe to create an .mp4 file so I can listen to the converted sound. But the sound is not good at all...
I'm wondering if there is something I'm not seeing or don't understand.
Thank you in advance for your help.
Each call to faacEncEncode encodes inputSamples samples, not just one. Your main loop should read that many samples from the WAV file into the input buffer, then call faacEncEncode once for that buffer, and finally write the output buffer to the AAC file.
It's possible that I've misunderstood what you're doing (if so, it would be useful to know: (1) What's the OpenFile function you're calling, and does it (despite its name) actually read the file as well as opening it? (2) How is inputBuffer set up?) but:
faacEncEncode expects to be given a whole frame's worth of samples. A frame is the number of samples you got passed back in inputSamples when you called faacEncOpen. (You can give it less than a whole frame if you've reached the end of the input, of course.)
So you're getting 460 and 539 bytes for each of two frames -- not for 16 bits in each case. And it looks as if your input-data pointers are actually offset by only one sample each time, so you're handing it badly overlapping frames. (And the wrong number of them; nbChunk is not the number of frames you have.)
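The framing described above can be sketched without FAAC itself. This is only an illustration under assumptions: `forEachFrame` and its callback are hypothetical names, and the callback stands in for the call to faacEncEncode. What it shows is that the frame count is derived from the number of samples (bytes divided by sizeof(short)), and that each frame starts a whole frame after the previous one.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical helper: walks an interleaved 16-bit PCM buffer one frame at a time.
// `samplesPerFrame` plays the role of the inputSamples value returned by faacEncOpen
// (samples across all channels, e.g. 2048 for stereo).
// `encodeFrame` stands in for the call to faacEncEncode.
// Returns the number of frames handed to the callback.
std::size_t forEachFrame(const short* samples,
                         std::size_t totalSamples,
                         std::size_t samplesPerFrame,
                         const std::function<void(const short*, std::size_t)>& encodeFrame)
{
    if (samplesPerFrame == 0)
    {
        return 0;
    }
    std::size_t frames = 0;
    for (std::size_t offset = 0; offset < totalSamples; offset += samplesPerFrame)
    {
        // The last frame may hold fewer samples than a whole frame.
        const std::size_t count = std::min(samplesPerFrame, totalSamples - offset);
        encodeFrame(samples + offset, count);
        ++frames;
    }
    return frames;
}
```

With a buffer read from the file, `totalSamples` would be `fileSize / sizeof(short)`, not `fileSize`, since the file size is in bytes.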
Related
I am trying to read a binary file using fread() and Rcpp seems to be able to read the file, given ftell() returns the proper size. When I try to print the first byte, it either returns a ☐ or nothing at all. Then RStudio crashes. This code runs perfectly fine in VSCode, but not through Rcpp.
This is how I am trying to read the file.
inline void readFile(string filePath){
//read a file using the C fopen function and store to fileData
FILE* file = fopen(filePath.c_str(), "rb");
if (file == NULL){Rcpp::stop("Cannot open file");}
//find size and print it to console
fseek(file, 0, SEEK_END);
int sizeOfFile = ftell(file);
if (sizeOfFile < 1 || sizeOfFile == NULL){Rcpp::stop("Bad File size");}
Rcpp::Rcout << "File size: " << sizeOfFile << endl;
fileData = (char*)malloc(sizeof(char*)*sizeOfFile);
rewind(file);
fread(fileData, sizeOfFile, 1, file);
fclose(file);
arrayPointer = fileData;
end = fileData + sizeOfFile;
if(arrayPointer == NULL){Rcpp::stop("arrayPointer is null");}
Rcpp::Rcout << "ArrayPointer: " << *(uint8_t*)&arrayPointer[0] << endl; //crashes here
// Rcpp::Rcout << "File size: " << sizeOfFile << endl;
}
If I comment out the line where I print the first value in arrayPointer, then the program crashes on the next line after I call this function.
const_array_iterator(string filePath) {
//set up the iterator
readFile(filePath);
//read first 28 bytes of fileData put it into params -> metadata
uint32_t params[7]; //Crashes here too
memcpy(&params, arrayPointer, 28);
arrayPointer+=32; //first delimitor is 4 bytes
Rcpp::Rcout << "Copied params" << endl;
magicByteSize = params[0];
rowType = params[1];
nRows = params[2];
colType = params[3];
nCols = params[4];
valueWidth = params[5];
oldIndexType = params[6];
memcpy(&value, arrayPointer, valueWidth);
arrayPointer += valueWidth;
memcpy(&newIndexWidth, arrayPointer, 1);
arrayPointer++; //this should make it point to first index
}
My R code
library(Rcpp)
library(RcppClock)
library(RcppEigen)
sourceCpp("src\\playground.cpp") # The file the previous code blocks belong to
iteratorBenchmark(10, 10, 5.0)
This code is for use in a custom iterator, and I think it will work fine if these issues are fixed. I tried using an ifstream, but ran into similar issues. I have tried running this on linux as well as windows (WSL), but neither seem to work. I know the file is being read, as ftell() returns the correct amount of bytes. The data just seems to not be read properly from the file.
I want to read a per-frame timecode out of a video file using libav (FFMPEG). I've started by digging into FFProbe. Using this as a starting point for my code, I can get to the AVStream that has the timecode in it. From there, I can use the dictionary to look at the stream's metadata.
int show_stream(WriterContext *w, AVFormatContext *fmt_ctx, int stream_idx, InputStream *ist, int in_program)
{
AVStream *stream = ist->st;
...
auto tcr = av_dict_get(stream->metadata, "timecode", NULL, 0);
std::cerr << "Timecode: " << tcr->value << ", Total Frames: " << stream->nb_frames << "\n";
The time code is the correct one that was embedded into the video. The nb_frames is correctly the total number of video frames that I have. What I can't get is the per-frame timecode. I don't want to compute it if I don't have to, I want to know exactly what was stamped on each frame. Is this possible?
// Pseudocode for what I want
for(const auto& f : allOfMyFrames)
{
std::cerr << "Frame number " << f.number << ", Timecode: " << f.timecode << "\n";
}
Timecode, such as that in a MOV/MP4, is just a single packet with the starting timecode expressed as a rate-adjusted frame count. There is no per-frame timecode.
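If a per-frame value is needed anyway, it has to be derived from the starting timecode plus the frame index. A minimal sketch, assuming non-drop-frame timecode and an integer frame rate; `frameToTimecode` and `timecodeOfFrame` are hypothetical helpers written for this example, not FFmpeg functions:

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of FFmpeg): formats an absolute frame count as HH:MM:SS:FF.
// Assumes non-drop-frame timecode and an integer frame rate.
std::string frameToTimecode(std::int64_t frameCount, int fps)
{
    if (fps <= 0 || frameCount < 0)
    {
        return std::string(); // invalid input: return an empty string
    }
    const std::int64_t ff = frameCount % fps;
    const std::int64_t totalSeconds = frameCount / fps;
    const std::int64_t ss = totalSeconds % 60;
    const std::int64_t mm = (totalSeconds / 60) % 60;
    const std::int64_t hh = (totalSeconds / 3600) % 24; // wraps at 24 hours
    char text[32];
    std::snprintf(text, sizeof(text), "%02lld:%02lld:%02lld:%02lld",
                  static_cast<long long>(hh), static_cast<long long>(mm),
                  static_cast<long long>(ss), static_cast<long long>(ff));
    return std::string(text);
}

// Timecode of the frame at `frameIndex` (0-based), given the start timecode as a frame count.
std::string timecodeOfFrame(std::int64_t startFrameCount, std::int64_t frameIndex, int fps)
{
    return frameToTimecode(startFrameCount + frameIndex, fps);
}
```

Drop-frame rates such as 29.97 fps need an extra correction step that this sketch leaves out.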
I am just getting started with OpenAL for a game engine that I am building. My understanding is that there are some libraries that can help you open and use .wav files. I understand that ALUT is deprecated, but I have heard mention of a more current library called libaudio. I cannot, however, find that library online anywhere.
My question is this: Where can I find libaudio? Or is there a better, more maintained library out there like alut that I can use? I really don't want to have to learn how to open a .wav file if I can help it. Any suggestions would be great.
I broke down and wrote it manually based on this awesome tutorial: https://www.youtube.com/watch?v=tmVRpNFP9ys
Here is the code:
//check big vs little endian machine
static bool IsBigEndian(void)
{
int a = 1;
return !((char*)&a)[0];
}
static int ConvertToInt(char* buffer, int len)
{
int a = 0;
if(!IsBigEndian())
{
for(int i = 0; i < len; ++i)
{
((char*)&a)[i] = buffer[i];
}
}
else
{
for(int i = 0; i < len; ++i)
{
((char*)&a)[3-i] = buffer[i];
}
}
return a;
}
//Location and size of data is found here: http://www.topherlee.com/software/pcm-tut-wavformat.html
static char* LoadWAV(string filename, int& channels, int& sampleRate, int& bps, int& size)
{
char buffer[4];
std::ifstream in(filename.c_str(), std::ios::binary); //binary mode, so bytes are not translated on Windows
in.read(buffer, 4);
if(strncmp(buffer, "RIFF", 4) != 0)
{
std::cout << "Error here, not a valid WAV file, RIFF not found in header\n This was found instead: "
<< buffer[0] << buffer[1] << buffer[2] << buffer[3] << std::endl;
}
in.read(buffer, 4);//size of file. Not used. Read it to skip over it.
in.read(buffer, 4);//Format, should be WAVE
if(strncmp(buffer, "WAVE", 4) != 0)
{
std::cout << "Error here, not a valid WAV file, WAVE not found in header.\n This was found instead: "
<< buffer[0] << buffer[1] << buffer[2] << buffer[3] << std::endl;
}
in.read(buffer, 4);//Format Space Marker. should equal fmt (space)
if(strncmp(buffer, "fmt ", 4) != 0)
{
std::cout << "Error here, not a valid WAV file, Format Marker not found in header.\n This was found instead: "
<< buffer[0] << buffer[1] << buffer[2] << buffer[3] << std::endl;
}
in.read(buffer, 4);//Length of format data. Should be 16 for PCM, meaning uncompressed.
if(ConvertToInt(buffer, 4) != 16)
{
std::cout << "Error here, not a valid WAV file, format length wrong in header.\n This was found instead: "
<< ConvertToInt(buffer, 4) << std::endl;
}
in.read(buffer, 2);//Type of format, 1 = PCM
if(ConvertToInt(buffer, 2) != 1)
{
std::cout << "Error here, not a valid WAV file, file not in PCM format.\n This was found instead: "
<< ConvertToInt(buffer, 2) << std::endl;
}
in.read(buffer, 2);//Get number of channels.
//Assume at this point that we are dealing with a WAV file. This value is needed by OpenAL
channels = ConvertToInt(buffer, 2);
in.read(buffer, 4);//Get sampler rate.
sampleRate = ConvertToInt(buffer, 4);
//Skip Byte Rate and Block Align. Maybe use later?
in.read(buffer, 4);//Byte Rate
in.read(buffer, 2);//Block Align
in.read(buffer, 2);//Get Bits Per Sample
bps = ConvertToInt(buffer, 2);
//Skip character data, which marks the start of the data that we care about.
in.read(buffer, 4);//"data" chunk.
in.read(buffer, 4); //Get size of the data
size = ConvertToInt(buffer, 4);
if(size < 0)
{
std::cout << "Error here, not a valid WAV file, data size is negative.\n This was found instead: "
<< size << std::endl;
}
char* data = new char[size];
in.read(data, size);//Read audio data into buffer, return.
in.close();
return data;
}
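The endianness check can be avoided altogether by assembling the value with shifts, which gives the same result on any host. A minimal sketch of that alternative, with `readLittleEndian` as a hypothetical replacement for `ConvertToInt`; it treats the bytes as unsigned and reads at most 4 of them:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical alternative to ConvertToInt: decodes a little-endian field of up to
// 4 bytes (as used in WAV headers) without checking the host's byte order.
std::uint32_t readLittleEndian(const char* buffer, std::size_t len)
{
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < len && i < 4; ++i)
    {
        // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
        value |= static_cast<std::uint32_t>(static_cast<unsigned char>(buffer[i])) << (8 * i);
    }
    return value;
}
```

For example, the bytes 0x44 0xAC 0x00 0x00 decode to 44100, a common sample rate.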
Even though the text file to which I saved all the samples contains (possibly) proper samples, the sound file generated using the same set of data contains only the noise. The code responsible for writing the wav file:
void Filter::generateFrequencySound()
{
SNDFILE * outfile;
SF_INFO sfinfo;// = {0};
memset (&sfinfo, 0, sizeof (sfinfo)) ;
//preparing output file
sfinfo.format = SF_FORMAT_WAV | SF_FORMAT_PCM_16;
sfinfo.channels = 1;
sfinfo.samplerate = 44100;
std::cout << "Trying to save samples to a file" << std::endl;
const char* path = "FilterInFrequency.wav";
outfile = sf_open(path, SFM_WRITE, &sfinfo);
if(!(outfile))
{
std::cout << "Failed to create output file" << std::endl;
sf_perror(outfile);
return;
}
unsigned long savedSamples = sf_write_double( outfile,
outputOfFrequencyFiltration,
bufferSize);
if(savedSamples > bufferSize)
{
std::cout << "Failed to save all samples into outfile. Number of samples " << savedSamples << std::endl;
sf_close(outfile);
return;
}
sf_close(outfile);
QSound::play("FilterInFrequency.wav");
}
The code responsible for writing samples into a text file:
QFile file("finalResult_1.txt");
if(!file.open(QIODevice::WriteOnly))
{
std::cout << "something went wrong";
exit(16);
}
QTextStream outstream(&file);
for(unsigned long i = 0; i < bufferSize; i++)
{
QString line = QString::number(outputOfFrequencyFiltration[i]);
outstream << line << "\n";
}
file.close();
The divergence between the WAV file and the plotted text file can be seen in the attached images. The plots were created from the same amount of data (~20500 samples, ~10% of the output file). The file size is the same for both plots.
What could be the possible reason for the differences?
[Images: plot of the text-file samples (textfile) and plot of the WAV-file samples (wavfile)]
I have two functions:
an internet-socket function that receives MP3 data and writes it to a file,
a function that decodes MP3 files.
However, rather than writing the data to disk first, I would like the decode function to decode it in memory.
My decode function looks like this, and it is all initialized via
avformat_open_input(AVCodecContext, filename, NULL, NULL)
How can I read in the AVCodecContext without a filename, and instead using only the in-memory buffer?
I thought I would post some code to illustrate how to achieve this, I have tried to comment but am pressed for time, however it should all be relatively straightforward stuff. Return values are based on interpolation of the associated message into a hex version of 1337 speak converted to decimal values, and I have tried to keep it as light as possible in tone:)
#include <iostream>
#include <string>
extern "C"
{
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libavutil/avutil.h>
};
std::string tooManyChannels = "The audio stream (and its frames) has/have too many channels to properly fit in\n to frame->data. Therefore, to access the audio data, you need to use\nframe->extended_data to access the audio data."
"It is a planar store, so\neach channel is in a different element.\n"
" E.G.: frame->extended_data[0] has the data for channel 1\n"
" frame->extended_data[1] has the data for channel 2\n"
"And so on.\n";
std::string nonPlanar = "Either the audio data is not planar, or there is not enough room in\n"
"frame->data to store all the channel data. Either use\n"
"frame->data\n or \nframe->extended_data to access the audio data\n"
"both should just point to the same data in this instance.\n";
std::string information1 = "If the frame is planar, each channel is in a separate element:\n"
"frame->data[0]/frame->extended_data[0] contains data for channel 1\n"
"frame->data[1]/frame->extended_data[1] contains data for channel 2\n";
std::string information2 = "If the frame is in packed format( and therefore not planar),\n"
"then all the data is contained within:\n"
"frame->data[0]/frame->extended_data[0] \n"
"Similar to the manner in which some image formats have RGB(A) pixel data packed together,\n"
"rather than containing separate R G B (and A) data.\n";
void printAudioFrameInfo(const AVCodecContext* codecContext, const AVFrame* frame)
{
/*
This url: http://ffmpeg.org/doxygen/trunk/samplefmt_8h.html#af9a51ca15301871723577c730b5865c5
contains information on the type you will need to utilise to access the audio data.
*/
// format the tabs etc. in this string to suit your font, they line up for mine but may not for yours:)
std::cout << "Audio frame info:\n"
<< "\tSample count:\t\t" << frame->nb_samples << '\n'
<< "\tChannel count:\t\t" << codecContext->channels << '\n'
<< "\tFormat:\t\t\t" << av_get_sample_fmt_name(codecContext->sample_fmt) << '\n'
<< "\tBytes per sample:\t" << av_get_bytes_per_sample(codecContext->sample_fmt) << '\n'
<< "\tPlanar storage format?:\t" << av_sample_fmt_is_planar(codecContext->sample_fmt) << '\n';
std::cout << "frame->linesize[0] tells you the size (in bytes) of each plane\n";
if (codecContext->channels > AV_NUM_DATA_POINTERS && av_sample_fmt_is_planar(codecContext->sample_fmt))
{
std::cout << tooManyChannels;
}
else
{
std::cout << nonPlanar;
}
std::cout << information1 << information2;
}
int main()
{
// You can change the filename for any other filename/supported format
std::string filename = "../my file.ogg";
// Initialize FFmpeg
av_register_all();
AVFrame* frame = avcodec_alloc_frame();
if (!frame)
{
std::cout << "Error allocating the frame. Let's try again shall we?\n";
return 666; // fail at start: 66 = number of the beast
}
// you can change the file name to whatever you need:)
AVFormatContext* formatContext = NULL;
if (avformat_open_input(&formatContext, filename.c_str(), NULL, NULL) != 0)
{
av_free(frame);
std::cout << "Error opening file " << filename<< "\n";
return 800; // cant open file. 800 = Boo!
}
if (avformat_find_stream_info(formatContext, NULL) < 0)
{
av_free(frame);
avformat_close_input(&formatContext);
std::cout << "Error finding the stream information.\nCheck your paths/connections and the details you supplied!\n";
return 57005; // stream info error. 0xDEAD in hex is 57005 in decimal
}
// Find the audio stream
AVCodec* cdc = nullptr;
int streamIndex = av_find_best_stream(formatContext, AVMEDIA_TYPE_AUDIO, -1, -1, &cdc, 0);
if (streamIndex < 0)
{
av_free(frame);
avformat_close_input(&formatContext);
std::cout << "Could not find any audio stream in the file. Come on! I need data!\n";
return 165; // no(0) (a)udio s(5)tream: 0A5 in hex = 165 in decimal
}
AVStream* audioStream = formatContext->streams[streamIndex];
AVCodecContext* codecContext = audioStream->codec;
codecContext->codec = cdc;
if (avcodec_open2(codecContext, codecContext->codec, NULL) != 0)
{
av_free(frame);
avformat_close_input(&formatContext);
std::cout << "Couldn't open the context with the decoder. I can decode but I need to have something to decode.\nAs I couldn't find anything I have surmised the decoded output is 0!\n (Well can't have you thinking I am doing nothing can we?\n";
return 1057; // cant find/open context 1057 = lost
}
std::cout << "This stream has " << codecContext->channels << " channels with a sample rate of " << codecContext->sample_rate << "Hz\n";
std::cout << "The data presented in format: " << av_get_sample_fmt_name(codecContext->sample_fmt) << std::endl;
AVPacket readingPacket;
av_init_packet(&readingPacket);
// Read the packets in a loop
while (av_read_frame(formatContext, &readingPacket) == 0)
{
if (readingPacket.stream_index == audioStream->index)
{
AVPacket decodingPacket = readingPacket;
// Audio packets can have multiple audio frames in a single packet
while (decodingPacket.size > 0)
{
// Try to decode the packet into a frame(s)
// Some frames rely on multiple packets, so we have to make sure the frame is finished
// before utilising it
int gotFrame = 0;
int result = avcodec_decode_audio4(codecContext, frame, &gotFrame, &decodingPacket);
if (result >= 0 && gotFrame)
{
decodingPacket.size -= result;
decodingPacket.data += result;
// et voila! a decoded audio frame!
printAudioFrameInfo(codecContext, frame);
}
else
{
decodingPacket.size = 0;
decodingPacket.data = nullptr;
}
}
}
// You MUST call av_free_packet() after each call to av_read_frame()
// or you will leak so much memory on a large file you will need a memory-plumber!
av_free_packet(&readingPacket);
}
// Some codecs will cause frames to be buffered in the decoding process.
// If the CODEC_CAP_DELAY flag is set, there can be buffered frames that need to be flushed
// therefore flush them now....
if (codecContext->codec->capabilities & CODEC_CAP_DELAY)
{
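av_init_packet(&readingPacket);
// av_init_packet() does not touch data and size; an empty packet asks the decoder to flush
readingPacket.data = nullptr;
readingPacket.size = 0;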
// Decode all the remaining frames in the buffer
int gotFrame = 0;
while (avcodec_decode_audio4(codecContext, frame, &gotFrame, &readingPacket) >= 0 && gotFrame)
{
// Again: a fully decoded audio frame!
printAudioFrameInfo(codecContext, frame);
}
}
// Clean up! (unless you have a quantum memory machine with infinite RAM....)
av_free(frame);
avcodec_close(codecContext);
avformat_close_input(&formatContext);
return 0; // success!!!!!!!!
}
Hope this helps. Let me know if you need more info, and I will try and help out:)
There is also some very good tutorial information available at dranger.com which you may find useful.
Preallocate the format context and set its pb field as suggested in the note of avformat_open_input() documentation.
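A rough sketch of what that involves, under stated assumptions: the usual route is to create an AVIOContext with avio_alloc_context() whose read callback pulls bytes from your buffer, assign it to the pb field of a context obtained from avformat_alloc_context(), and then call avformat_open_input() on that context without a real filename. The block below shows only the callback side and compiles without FFmpeg; `MemoryReader`, `readFromMemory` and `kEndOfData` are hypothetical names, and `kEndOfData` is a stand-in for FFmpeg's AVERROR_EOF.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for AVERROR_EOF so the sketch compiles without FFmpeg headers (assumption).
constexpr int kEndOfData = -1;

// Hypothetical state object handed to the callback through its `opaque` pointer.
struct MemoryReader
{
    const std::uint8_t* data; // start of the in-memory MP3 data
    std::size_t size;         // total number of bytes available
    std::size_t position;     // how many bytes have been consumed so far
};

// Same shape as the read callback taken by avio_alloc_context():
// copies up to bufSize bytes into buf and returns the number of bytes copied,
// or kEndOfData once the buffer is exhausted.
int readFromMemory(void* opaque, std::uint8_t* buf, int bufSize)
{
    MemoryReader* reader = static_cast<MemoryReader*>(opaque);
    if (bufSize <= 0)
    {
        return 0;
    }
    const std::size_t remaining = reader->size - reader->position;
    if (remaining == 0)
    {
        return kEndOfData;
    }
    const std::size_t count = std::min(remaining, static_cast<std::size_t>(bufSize));
    std::memcpy(buf, reader->data + reader->position, count);
    reader->position += count;
    return static_cast<int>(count);
}
```

In real code the callback would return AVERROR_EOF instead of `kEndOfData`, and a pointer to the `MemoryReader` would be passed as the `opaque` argument of avio_alloc_context().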