Opus encode & decode no error but not the same value

I have to use the Opus codec to encode and decode audio data in C++, and I have to encapsulate the functions.
To test this, I pass an array of floats to the Opus encoding function and then decode the result. Unfortunately, the decoded output is not the same as the input: the resulting array contains none of the values from the initial array.
Here is my code.
Encapsulation:
std::vector<float> codec::OpusPlugin::decode(packet_t &packet) {
std::vector<float> out(BUFFER_SIZE * NB_CHANNELS);
int ret = 0;
if (!this->decoder)
throw Exception("Can't decode since there is no decoder.");
ret = opus_decode_float(this->decoder, packet.data.data(), packet.size, reinterpret_cast<float*>(out.data()), FRAME_SIZE, 0);
if (ret < 0)
throw Exception("Error while decoding compressed data.");
return out;
}
// ENCODER
packet_t codec::OpusPlugin::encode(std::vector<float> to_encode) {
std::vector<unsigned char> data(BUFFER_SIZE * NB_CHANNELS * 2);
packet_t packet;
int ret = 0;
if (!this->encoder)
throw Exception("Can't encode since there is no encoder.");
ret = opus_encode_float(this->encoder, reinterpret_cast<float const*>(to_encode.data()), FRAME_SIZE, data.data(), data.size());
if (ret < 0)
throw Exception("Error while encoding data.");
packet.size = ret;
packet.data = data;
return packet;
}
And here is how the functions are called:
packet_t packet;
std::vector<float> floats = {0.23, 0, -0.312, 0.401230, 0.1234, -0.1543};
packet = CodecPlugin->encode(floats);
std::cout << "packet size: " << packet.size << std::endl;
std::vector<float> output = CodecPlugin->decode(packet);
for (int i = 0; i < 10; i++) {
std::cout << output.data()[i] << " ";
}
Here is the packet_t structure, where I store the return value of encode and the unsigned char array (the encoded data):
typedef struct packet_s {
int size;
std::vector<unsigned char> data;
} packet_t;
The output of the program is
-1.44487e-15 9.3872e-16 -1.42993e-14 7.31834e-15 -5.09662e-14 1.53629e-14 -8.36825e-14 3.9531e-14 -8.72754e-14 1.0791e-13
which is not the array I initialized at the beginning.
I have read the documentation and code examples many times, but I can't find where I made a mistake.
I hope you will be able to help me.
Thanks :)

We don't see how you initialize your encoder and decoder, so we don't know their sample rate, complexity, or number of channels. However you initialized them, you will still run into the following problems:
First, Opus does not support arbitrary frame sizes, only 2.5 ms, 5 ms, 10 ms, 20 ms, 40 ms or 60 ms (see RFC 6716, "Definition of the Opus Audio Codec", section 2.1.4). Moreover, Opus supports only 8 kHz, 12 kHz, 16 kHz, 24 kHz and 48 kHz sample rates. Whichever of those you have chosen, your 6-element input array doesn't correspond to any of the supported frame sizes.
Second, Opus is a lossy audio codec. This means that after you encode a signal you will (except perhaps in some edge cases) never be able to reconstruct the original signal exactly by decoding the encoded Opus frame. Opus aims to preserve the perceptual quality of the audio, not the exact sample values. Therefore, if you test it with arbitrary data, you might not get the expected results back even if you implemented the encoding and decoding functions correctly. The best way to test whether your encoder and decoder work is with a real audio sample.
What you can easily do is generate a 2000 Hz sine wave (there are multiple examples on the internet) lasting 20 ms. At a sample rate of 8000 Hz, that is 160 array elements. A 2 kHz sine wave is within the human hearing range, so the encoder is going to preserve it. Then decode it and check whether the elements of the input and output arrays are similar; as already established, it is unlikely that they will be identical.
I am not good at C++, so I can't help you with code examples, but the problems above hold true no matter which language is used.
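The sine-wave test described above can be sketched in C++ as follows. This is only an illustration under stated assumptions: make_sine_frame and rms_error are hypothetical helper names, the signal is mono, and the Opus calls themselves are left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Generate one frame of a mono sine wave as floats in [-1, 1].
// Hypothetical helper, not part of the Opus API.
std::vector<float> make_sine_frame(double freq_hz, int sample_rate, double frame_ms) {
    const double kPi = 3.14159265358979323846;
    const std::size_t n =
        static_cast<std::size_t>(sample_rate * frame_ms / 1000.0 + 0.5);
    std::vector<float> frame(n);
    for (std::size_t i = 0; i < n; ++i)
        frame[i] = static_cast<float>(std::sin(2.0 * kPi * freq_hz * i / sample_rate));
    return frame;
}

// Root-mean-square difference between two buffers of equal length.
// Returns -1.0 if the buffers are empty or differ in length.
double rms_error(const std::vector<float>& a, const std::vector<float>& b) {
    if (a.size() != b.size() || a.empty()) return -1.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return std::sqrt(sum / static_cast<double>(a.size()));
}
```

make_sine_frame(2000.0, 8000, 20.0) yields the 160 samples mentioned above; that buffer and a frame size of 160 would then go to the encoder, and rms_error gives a rough similarity measure for the decoded frame. The decoded signal may also be delayed relative to the input, so a sample-by-sample comparison can look worse than the audio actually sounds.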

Related

swr_convert is trying to write to an empty buffer. Is this a bug or am I doing something wrong?

I'm using FFmpeg's swr_convert to convert AV_SAMPLE_FMT_FLTP audio. I've been successful converting to a different sample format (e.g. AV_SAMPLE_FMT_FLT and AV_SAMPLE_FMT_S16), but I'm running into trouble when I try to keep the AV_SAMPLE_FMT_FLTP sample format and change only the sample rate.
When converting AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_FLTP, swr_convert attempts to write to an empty buffer.
I'm using swr_convert to convert from 22050 Hz AV_SAMPLE_FMT_FLTP to 16000 Hz AV_SAMPLE_FMT_FLTP.
I initialized SwrContext like so:
if (swr_alloc_set_opts2(
&resample_context,
&pAVContext->ch_layout, AV_SAMPLE_FMT_FLTP, 16000,
&pAVContext->ch_layout, AV_SAMPLE_FMT_FLTP, 22050, 0, NULL) < 0)
return ERR_SWR_INIT_FAIL;
if(swr_init(resample_context) < 0)
return ERR_SWR_INIT_FAIL;
and when I call it like this, the program tries to write to a null buffer and crashes.
samples_decoded = swr_convert(ctx->pSwrContext,
&pDecodedAudio, numOutSamples,
(const uint8_t**)&pDecodedFrame->data, pDecodedFrame->nb_samples);
So far I've traced the problem to swr_convert_internal
if(s->int_sample_fmt == s->out_sample_fmt && s->out.planar
&& !(s->out_sample_fmt==AV_SAMPLE_FMT_S32P && (s->dither.output_sample_bits&31))){
//Sample format is planar and input format is same as output format
if(preout==in){
out_count= FFMIN(out_count, in_count);
av_assert0(s->in.planar);
copy(out, in, out_count);
return out_count;
}
else if(preout==postin) preout= midbuf= postin= out;
else if(preout==midbuf) preout= midbuf= out;
else preout= out;
}
That bit of code assigns out to preout, but out's data is uninitialized. Later on, FFmpeg tries to write to the uninitialized block.
I've tested this in 5.1 and in the snapshot build, and it crashes both of them.
So, am I doing something wrong, or is this a bug?

I was doing something wrong. Packed audio is a contiguous block of memory and can be referenced by one pointer, but planar audio has a separate pointer for each channel. To fix this, I created two pointers into my pDecodedAudio block.
uint8_t* convertedData [2] = {
pDecodedAudio ,
pDecodedAudio + (numOutSamples * ctx->output_sample_size)
};
samples_decoded = swr_convert(ctx->pSwrContext,
convertedData, numOutSamples,
pDecodedFrame->data, pDecodedFrame->nb_samples);
See the comments in AVFrame.
/*
* For planar audio, each channel has a separate data pointer, and
* linesize[0] contains the size of each channel buffer.
* For packed audio, there is just one data pointer, and linesize[0]
* contains the total size of the buffer for all channels.
*
* Note: Both data and extended_data should always be set in a valid frame,
* but for planar audio with more channels that can fit in data,
* extended_data must be used in order to access all channels.
*/
uint8_t **extended_data;
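The fix above hard-codes two channels. A more general version of the same idea, building one pointer per channel into a single contiguous block, might look like the sketch below. make_plane_pointers is a hypothetical helper, and the layout (all of channel 0, then all of channel 1, and so on) is an assumption about how the output block is meant to be organized.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build one pointer per channel into a contiguous block laid out as
// [channel 0 samples][channel 1 samples]...
// The caller must ensure the block holds at least
// channels * samples_per_channel * bytes_per_sample bytes.
std::vector<std::uint8_t*> make_plane_pointers(std::uint8_t* block,
                                               int channels,
                                               int samples_per_channel,
                                               int bytes_per_sample) {
    std::vector<std::uint8_t*> planes(static_cast<std::size_t>(channels));
    const std::size_t plane_bytes =
        static_cast<std::size_t>(samples_per_channel) *
        static_cast<std::size_t>(bytes_per_sample);
    for (int ch = 0; ch < channels; ++ch)
        planes[static_cast<std::size_t>(ch)] =
            block + static_cast<std::size_t>(ch) * plane_bytes;
    return planes;
}
```

The result of planes.data() has the pointer-to-pointer shape that swr_convert expects for its output argument when the output format is planar.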

Convert QAudioBuffer to QByteArray : loss of information?

I have a WAV file that I decode with the QAudioDecoder. As a result I have a QAudioBuffer object. I want to store the data stored in QAudioBuffer in a QByteArray for my QIODevice derived class. I want to use this data in the ReadData method of my derived class for audio output. I now have 2 questions:
How do I get a QByteArray from a QAudioBuffer?
I used the following code, but unfortunately it is not correct. Each sample in the QAudioBuffer is coded on 2 bytes, but each element of a QByteArray is 1 byte (right?). Don't we lose information there? To test whether the QByteArray contains the original data from the WAV file, I save it to a TXT file.
Is this approach appropriate? I actually want to apply some operations (e.g. filters) to the data stored in the QAudioBuffer and listen to the result in real time.
Thanks in advance.
Here is the code
QAudioFormat *format_decoder;
format_decoder = new QAudioFormat;
format_decoder->setSampleRate(44100);
format_decoder->setChannelCount(1);
format_decoder->setSampleFormat(QAudioFormat::Int16);
QAudioDecoder decoder;
decoder.setSource(filenameSource);
decoder.setAudioFormat(*format_decoder);
decoder.start();
QObject::connect(&decoder, &QAudioDecoder::bufferReady, this, &MainWindow::slot_bufReady);
and the slot
void MainWindow::slot_bufReady(){
QAudioBuffer buffer = m_audioDecoder->read();
QByteArray buffer_ByteArray(buffer.constData<char>(), buffer.byteCount());
QFile file(filenameTest1);
if(!file.open(QIODevice::WriteOnly|QIODevice::Append)) {
qDebug() << "ERROR"; }
QTextStream strem(&file);
for(auto const dat: buffer_ByteArray) {
strem<< qreal(dat)/128.0<< "\r\n";
}
file.close();
}

This looks suspicious:
for(auto const dat: buffer_ByteArray) {
strem<< qreal(dat)/128.0<< "\r\n";
}
Your audio format is 16-bit mono. Reading it byte by byte is a non-starter. Read it sample by sample. That is, read two bytes at a time and convert. More likely this:
int16_t* data = (int16_t*)(buffer.data());
int samples = buffer.sampleCount();
for (int i = 0; i < samples; i++)
{
strem << data[i] << "\r\n";
}
The above will save your samples into a text file, which you could plot with Excel. But as others have said, that's not as useful as saving it as binary. You could prepend a WAV file header so that it can be played and analyzed with other tools.
Update
If your intent is to transcode from 16-bit to 8-bit, this is how you would likely do it:
int16_t* data = (int16_t*)(buffer.data());
int samples = buffer.sampleCount(); // number of 16-bit samples in the buffer
QByteArray buffer_ByteArray(samples, '\0');
for (int i = 0; i < samples; i++) {
buffer_ByteArray[i] = (char)(data[i] / 256); // 16-bit to 8-bit
}
Note: some audio platforms use unsigned integers for 8-bit audio. That is, the zero-amplitude sample is 0x80. This is the case for 8-bit WAV files. If that applies to you, change this line:
buffer_ByteArray[i] = (char)(data[i] / 256); // 16-bit to 8-bit
To this:
char c = (char)(data[i] / 256); // 16-bit to 8-bit signed
const unsigned char mask = 0x80;
buffer_ByteArray[i] = (char)(mask ^ c);
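For reference, the same 16-bit to unsigned 8-bit conversion can be written without Qt types. This is a minimal sketch: to_unsigned8 and convert_16_to_u8 are hypothetical names, and adding 128 is used here as an equivalent of the XOR with 0x80 shown above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert one signed 16-bit sample to unsigned 8-bit, where 0x80 is silence.
// s / 256 lies in [-128, 127], so adding 128 maps it onto [0, 255].
std::uint8_t to_unsigned8(std::int16_t s) {
    return static_cast<std::uint8_t>(s / 256 + 128);
}

// Convert a whole buffer of 16-bit samples.
std::vector<std::uint8_t> convert_16_to_u8(const std::int16_t* data, std::size_t samples) {
    std::vector<std::uint8_t> out(samples);
    for (std::size_t i = 0; i < samples; ++i)
        out[i] = to_unsigned8(data[i]);
    return out;
}
```

Note that this discards the low 8 bits of every sample, so it is a lossy conversion, unlike copying the raw 16-bit data into a QByteArray two bytes per sample.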

libav - Decoding H264 Frame Error

I am trying to decode a H264 frame using the libav library. After initialising the library by allocating frame and context, I am using the following code to decode:
AVPacket pkt;
int got_picture, len;
av_init_packet(&pkt);
pkt.size = size;
pkt.data = buffer;
while(pkt.size > 0) {
if((len = avcodec_decode_video2(context, frame, &got_picture, &pkt)) < 0) {
break;
}
if(got_picture) {
// Do something with the picture...
}
pkt.size -= len;
pkt.data += len;
}
However, whenever I call avcodec_decode_video2 it prints the following error in the console:
[...]
[h264 @ 000000000126db40] AVC: The buffer size 210 is too short to read the nal length size 0 at the offset 210.
[h264 @ 000000000126db40] AVC: The buffer size 283997 is too short to read the nal length size 0 at the offset 283997.
[h264 @ 000000000126db40] AVC: The buffer size 17137 is too short to read the nal length size 0 at the offset 17137.
[...]
What am I missing? I tried searching for threads concerning a similar issue but nothing came up. Or is there a way I can debug the error to get more information about it?

First off, I assume you allocate the output frame correctly.
"And @AntonAngelov, I am using 11.04. Do you know what the error is supposed to say? What buffer is the error talking about?"
I just looked at 11.04's source (in /avcodec/h264.c) and didn't see where this error is generated, although it is present in older versions.
The error seems to say that the NALU packets you send to the decoder have a size of 0.
My guess is that you have to get the SPS and PPS headers from LIVE555 somehow and provide them to the decoder via its extradata (you also have to set extradata_size) before you call avcodec_open2().
Another idea is to dump all the packets you receive into a single .h264 file, then use a tool for parsing H.264 bitstreams (see here for example). Also try to play the file with avplay or VLC to see whether the bitstream is correct.
Edit:
Here a similar question is answered.

AVPacket pkt;
int got_picture, len;
av_init_packet(&pkt);
pkt.size = size;
pkt.data = buffer;
while(pkt.size > 0) {
if((len = avcodec_decode_video2(context, frame, &got_picture, &pkt)) < 0) {
Your code worries me, since you're manually initializing an AVPacket but not telling us where buffer and size come from. Given the error message, I'm almost certain that you're reading raw data from a file, socket or something similar, as if it were a raw Annex B stream.
FFmpeg (or Libav, for that matter) does not accept such data as input to its H.264 decoder. To solve this, use a parser (AVCodecParserContext with av_parser_parse2), as explained previously in this post.
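Before wiring up a parser, it can help to check what the received buffers actually contain. The sketch below scans a buffer for Annex B start codes and reports where each NAL unit begins; find_nal_offsets is a hypothetical helper, not a libav function, and it assumes the data is Annex B at all.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the offset of the first byte after each Annex B start code.
// Matches the 3-byte code 00 00 01, which is also the tail of the
// 4-byte form 00 00 00 01.
std::vector<std::size_t> find_nal_offsets(const std::uint8_t* buf, std::size_t size) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i + 3 <= size; ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            offsets.push_back(i + 3);
            i += 2; // continue scanning after this start code
        }
    }
    return offsets;
}
```

For each offset that is inside the buffer, buf[offset] & 0x1F gives the H.264 NAL unit type; types 7 and 8 are SPS and PPS. If no start codes are found, the data is probably not a raw Annex B stream.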

C++ Is live PCM fft audio processing with OpenAL?

I'm working on a project that involves processing PCM audio data with an FFT as it's being played, preferably in sync. I'm using g++ on Linux and currently read and play audio data using OpenAL.
My question is this: is there a better way to process PCM audio data with an FFT live, as the audio is playing, than using threads? If not, which threading library would be best for this purpose?
This is my function that loads the wave data into an array of bytes, which can later be cast to ints for processing. All I use to play the data is OpenAL.
char* loadWAV(const char* fn, int& chan, int& samplerate, int& bps, int& size){
char buffer[4];
ifstream in(fn, ios::binary);
in.read(buffer, 4); //ChunkID "RIFF"
if(strncmp(buffer, "RIFF", 4) != 0){
cerr << "this is not a valid wave file";
return NULL;
}
in.read(buffer,4); //ChunkSize
in.read(buffer,4); //Format "WAVE"
in.read(buffer,4); // "fmt "
in.read(buffer,4); // 16
in.read(buffer,2); // 1
in.read(buffer,2); // NUMBER OF CHANNELS
chan = convertToInt(buffer,2);
in.read(buffer,4); // SAMPLE RATE
samplerate = convertToInt(buffer,4);
in.read(buffer,4); // ByteRate
in.read(buffer,2); // BlockAlign
in.read(buffer,2); // bits per sample
bps = convertToInt(buffer,2);
in.read(buffer,4); // "data"
in.read(buffer,4);
size = convertToInt(buffer,4);
char * data = new char[size];
in.read(data,size);
return data;
}
Thank you for any and all help.
Edit: to anyone who might be interested, I wrote the function using this as a reference for how a WAV file is formatted.
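The loader above calls convertToInt, which is not shown in the question. A minimal sketch of what such a helper might look like follows, assuming the bytes are little-endian (as WAV header fields are) and at most 4 bytes long. This is a guess at the intent, not the author's actual function.

```cpp
#include <cstdint>

// Interpret the first `len` bytes (at most 4) of `buffer` as a
// little-endian unsigned integer and return it as an int.
int convertToInt(const char* buffer, int len) {
    std::uint32_t value = 0;
    for (int i = 0; i < len && i < 4; ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const std::uint32_t byte = static_cast<unsigned char>(buffer[i]);
        value |= byte << (8 * i);
    }
    return static_cast<int>(value);
}
```

Assembling the value byte by byte gives the same result regardless of the host's endianness, which a plain pointer cast would not.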

Are you hoping to perform the FFT using OpenAL? I don't know if that's possible. Your code will likely be performing the FFT.
You don't need to explicitly set up any threads. However, your audio output library will probably do so on your behalf. I'm not familiar with OpenAL, but the way that a lot of audio libraries operate is by letting you specify a callback that will feed more audio into the output. Thus, your main program will load audio from the audio file, stuff it into a buffer (likely protected by a mutex) for the audio callback to read, compute an FFT over the audio window, and perhaps visualize the data for the user.
Again, the audio library will probably be managing the threading so you don't need to worry about the exact threading implementation or library. But be sure to manage shared data correctly with a mutex.
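The mutex-protected buffer mentioned above could be sketched like this. SharedSampleQueue is a hypothetical class name, 16-bit samples are assumed, and nothing here is specific to OpenAL: one thread would call push as it loads audio, and the thread computing the FFT would call pop to fetch a window.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

// A sample queue that is safe to share between two threads.
class SharedSampleQueue {
public:
    // Append samples to the back of the queue.
    void push(const std::vector<std::int16_t>& samples) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.insert(queue_.end(), samples.begin(), samples.end());
    }

    // Remove and return up to max_samples from the front of the queue.
    std::vector<std::int16_t> pop(std::size_t max_samples) {
        std::lock_guard<std::mutex> lock(mutex_);
        const auto n = static_cast<std::ptrdiff_t>(
            std::min(max_samples, queue_.size()));
        std::vector<std::int16_t> out(queue_.begin(), queue_.begin() + n);
        queue_.erase(queue_.begin(), queue_.begin() + n);
        return out;
    }

private:
    std::mutex mutex_;
    std::deque<std::int16_t> queue_;
};
```

Each call holds the lock only while copying, so neither side blocks the other for long. A real-time audio callback would usually prefer a lock-free ring buffer, but a mutex is the simplest correct starting point.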

WAV file from captured PCM sample data

I have several GB of sample data captured 'in-the-field' at 48ksps using an NI Data Acquisition module. I would like to create a WAV file from this data.
I have done this previously using MATLAB to load the data, normalise it to the 16bit PCM range, and then write it out as a WAV file. However MATLAB baulks at the file size as it does everything 'in-memory'.
I would ideally do this in C++ or C, (C# is an option), or if there is an existing utility I'd use that. Is there a simple way (i.e. an existing library) to take a raw PCM buffer, specify the sample rate, bit depth, and package it into a WAV file?
To handle the large data set, it would need to be able to append data in chunks as it would not necessarily be possible to read the whole set into memory.
I understand that I could do this from scratch using the format specification, but I do not want to re-invent the wheel, or spend time fixing bugs on this if I can help it.

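Interesting: I found a bug in Stack Overflow's code formatting. It doesn't support the \ character at the end of a line, as you can see below. Sad.
//stolen from OGG Vorbis pcm to wav conversion routines, sorry
#include <errno.h>  /* errno */
#include <stdio.h>  /* FILE, fwrite, fseek, printf */
#include <string.h> /* memcpy, strerror */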
#define VERSIONSTRING "OggDec 1.0\n"
static int quiet = 0;
static int bits = 16;
static int endian = 0;
static int raw = 0;
static int sign = 1;
unsigned char headbuf[44]; /* The whole buffer */
#define WRITE_U32(buf, x) *(buf) = (unsigned char)((x)&0xff);\
*((buf)+1) = (unsigned char)(((x)>>8)&0xff);\
*((buf)+2) = (unsigned char)(((x)>>16)&0xff);\
*((buf)+3) = (unsigned char)(((x)>>24)&0xff);
#define WRITE_U16(buf, x) *(buf) = (unsigned char)((x)&0xff);\
*((buf)+1) = (unsigned char)(((x)>>8)&0xff);
/*
* Some of this based on ao/src/ao_wav.c
*/
static int
write_prelim_header (FILE * out, int channels, int samplerate)
{
int knownlength = 0;
unsigned int size = 0x7fffffff;
// int channels = 2;
// int samplerate = 44100;//change this to 48000
int bytespersec = channels * samplerate * bits / 8;
int align = channels * bits / 8;
int samplesize = bits;
if (knownlength)
size = (unsigned int) knownlength;
memcpy (headbuf, "RIFF", 4);
WRITE_U32 (headbuf + 4, size - 8);
memcpy (headbuf + 8, "WAVE", 4);
memcpy (headbuf + 12, "fmt ", 4);
WRITE_U32 (headbuf + 16, 16);
WRITE_U16 (headbuf + 20, 1); /* format */
WRITE_U16 (headbuf + 22, channels);
WRITE_U32 (headbuf + 24, samplerate);
WRITE_U32 (headbuf + 28, bytespersec);
WRITE_U16 (headbuf + 32, align);
WRITE_U16 (headbuf + 34, samplesize);
memcpy (headbuf + 36, "data", 4);
WRITE_U32 (headbuf + 40, size - 44);
if (fwrite (headbuf, 1, 44, out) != 44)
{
printf ("ERROR: Failed to write wav header: %s\n", strerror (errno));
return 1;
}
return 0;
}
static int
rewrite_header (FILE * out, unsigned int written)
{
unsigned int length = written;
length += 44;
WRITE_U32 (headbuf + 4, length - 8);
WRITE_U32 (headbuf + 40, length - 44);
if (fseek (out, 0, SEEK_SET) != 0)
{
printf ("ERROR: Failed to seek on seekable file: %s\n",
strerror (errno));
return 1;
}
if (fwrite (headbuf, 1, 44, out) != 44)
{
printf ("ERROR: Failed to write wav header: %s\n", strerror (errno));
return 1;
}
return 0;
}
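The two functions above implement a common pattern: write a placeholder header, append sample data in chunks, then seek back and rewrite the header with the real length. A self-contained C++ sketch of that pattern follows. The names put_le, write_wav_header and write_wav_chunked are hypothetical, next_chunk is an assumed caller-supplied callback, and the code assumes 16-bit PCM on a little-endian host.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

// Store the low `bytes` bytes of v at p in little-endian order.
static void put_le(unsigned char* p, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i)
        p[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xff);
}

// Write a 44-byte PCM WAV header describing data_bytes bytes of samples.
static bool write_wav_header(std::FILE* out, std::uint32_t channels,
                             std::uint32_t sample_rate, std::uint32_t bits,
                             std::uint32_t data_bytes) {
    unsigned char h[44];
    const std::uint32_t align = channels * bits / 8;
    std::memcpy(h, "RIFF", 4);
    put_le(h + 4, 36 + data_bytes, 4);   // RIFF chunk size = file size - 8
    std::memcpy(h + 8, "WAVEfmt ", 8);
    put_le(h + 16, 16, 4);               // fmt chunk size
    put_le(h + 20, 1, 2);                // format 1 = PCM
    put_le(h + 22, channels, 2);
    put_le(h + 24, sample_rate, 4);
    put_le(h + 28, sample_rate * align, 4); // bytes per second
    put_le(h + 32, align, 2);
    put_le(h + 34, bits, 2);
    std::memcpy(h + 36, "data", 4);
    put_le(h + 40, data_bytes, 4);
    return std::fwrite(h, 1, 44, out) == 44;
}

// Stream 16-bit samples to `path` chunk by chunk. next_chunk fills the
// vector it is given and returns false when there is no more data.
template <typename NextChunk>
bool write_wav_chunked(const char* path, std::uint32_t channels,
                       std::uint32_t sample_rate, NextChunk next_chunk) {
    std::FILE* out = std::fopen(path, "wb");
    if (!out) return false;
    bool ok = write_wav_header(out, channels, sample_rate, 16, 0); // placeholder
    std::uint32_t written = 0;
    std::vector<std::int16_t> chunk;
    while (ok && next_chunk(chunk)) {
        ok = std::fwrite(chunk.data(), sizeof(std::int16_t), chunk.size(), out)
             == chunk.size();
        written += static_cast<std::uint32_t>(chunk.size() * sizeof(std::int16_t));
    }
    // Go back and store the real data length in the header.
    ok = ok && std::fseek(out, 0, SEEK_SET) == 0 &&
         write_wav_header(out, channels, sample_rate, 16, written);
    return std::fclose(out) == 0 && ok;
}
```

Only one chunk is held in memory at a time, so the input can be far larger than RAM. The 32-bit length fields still cap the file at 4 GB, so very large captures would need to be split across several files.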

I think you can use libsox for this.

I came across a function called WAVAPPEND on Mathworks' File Exchange site a while ago. I never got around to using it, so I'm not sure if it works or is appropriate for what you're trying to do, but perhaps it'll be useful to you.

Okay... I'm 5 years late here... but I just did this for myself and wanted to put the solution out there!
I had the same issue with running out of memory while writing large wav files in MATLAB. I got around this by editing the MATLAB wavwrite function so that it pulls data from your hard drive using memmap instead of variables stored in RAM, then saving it as a new function. This will save you a lot of trouble, as you don't have to worry about dealing with headers when writing the wav file from scratch, and you won't need any external applications.
1) Type edit wavwrite to see the code for the function, then save a copy of it as a new function.
2) I modified the y variable in the wavwrite function from an array containing the wav data to a cell array with strings pointing to the locations for the data of each channel saved on my harddrive. Use fwrite to store your wav data on the harddrive first of course. At the beginning of the function I transformed the file locations stored in y into memmap variables and defined the number of channels and samples like so:
replace these lines:
% If input is a vector, force it to be a column:
if ndims(y) > 2,
error(message('MATLAB:audiovideo:wavwrite:invalidInputFormat'));
end
if size(y,1)==1,
y = y(:);
end
[samples, channels] = size(y);
with this:
% get num of channels
channels = length(y);
%Convert y from strings pointing to wav data to memmap variables allowing access to the data
for i = 1:length(y)
y{i} = memmapfile(y{i},'Writable',false,'Format','int16');
end
samples = length(y{1}.Data);
3) Now you can edit the private function write_wavedat(fid,fmt). This is the function that writes the wav data. Turn it into a nested function so that it can read your y memmap variable as a global variable, instead of passing the value to the function and eating up your RAM, then you can make some changes like this:
replace the lines which write the wav data:
if (fwrite(fid, reshape(data',total_samples,1), dtype) ~= total_samples),
error(message('MATLAB:audiovideo:wavewrite:failedToWriteSamples'));
end
with this:
%Divide data into smaller packets for writing
packetSize = 30*(5e5); %n*5e5 = n Mb of space required
packets = ceil(samples/packetSize);
% Write data to file!
for i=1:length(y)
for j=1:packets
if j == packets
fwrite(fid, y{i}.Data(((j-1)*packetSize)+1:end), dtype);
else
fwrite(fid, y{i}.Data(((j-1)*packetSize)+1:j*packetSize), dtype);
end
disp(['...' num2str(floor(100*((i-1)*packets + j)/(packets*channels))) '% done writing file...']);
end
end
This will incrementally copy the data from each memmap variable into the wav file.
4) That should be it! You can leave the rest of the code as is, as it will write the headers for you. Here's an example of how you'd write a large 2-channel wav file with this function:
wavwriteModified({'c:\wavFileinputCh1' 'c:\wavFileinputCh2'},44100,16,'c:\output2ChanWavFile');
I can verify this approach works, as I just wrote an 800 MB 4-channel wav file with my edited wavwrite function, whereas MATLAB usually throws an out-of-memory error for me when writing wav files larger than 200 MB.

C# would be a good choice for this. FileStreams are easy to work with, and could be used for reading and writing the data in chunks. Also, reading WAV file headers is a relatively complicated task (you have to search for RIFF chunks and so on), but writing them is cake (you just fill out a header structure and write it at the beginning of the file).
There are a number of libraries that do conversions like this, but I'm not sure they can handle the huge data sizes you're talking about. Even if they do, you would probably still have to do some programming work to feed smaller chunks of raw data to these libraries.
For writing your own method, normalization isn't difficult, and even resampling from 48ksps to 44.1ksps is relatively simple (assuming you don't mind linear interpolation). You would also presumably have greater control over the output, so it would be easier to create a set of smaller WAV files, instead of one gigantic one.
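The linear-interpolation resampling mentioned above can be sketched as follows, in C++ rather than C# to match the other examples on this page. resample_linear is a hypothetical name and the input is assumed to be a single channel of float samples.

```cpp
#include <cstddef>
#include <vector>

// Resample one channel from in_rate to out_rate using linear interpolation.
std::vector<float> resample_linear(const std::vector<float>& in,
                                   double in_rate, double out_rate) {
    if (in.empty() || in_rate <= 0.0 || out_rate <= 0.0) return {};
    const std::size_t last = in.size() - 1;
    const std::size_t out_len =
        static_cast<std::size_t>(static_cast<double>(last) * out_rate / in_rate) + 1;
    const double step = in_rate / out_rate; // input samples per output sample
    std::vector<float> out(out_len);
    for (std::size_t i = 0; i < out_len; ++i) {
        const double pos = static_cast<double>(i) * step;
        const std::size_t i0 = static_cast<std::size_t>(pos);
        if (i0 >= last) { // at or past the final input sample
            out[i] = in.back();
            continue;
        }
        const double frac = pos - static_cast<double>(i0);
        out[i] = static_cast<float>(in[i0] * (1.0 - frac) + in[i0 + 1] * frac);
    }
    return out;
}
```

Calling resample_linear(samples, 48000.0, 44100.0) would cover the 48 ksps to 44.1 ksps case. Linear interpolation applies no low-pass filter, so some aliasing is possible when downsampling; that is the trade-off behind "assuming you don't mind linear interpolation". For chunked processing, the last input sample of each chunk would need to be carried over to the next one.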

The current Windows SDK audio capture samples capture data from the microphone and save the captured data to a .WAV file. The code is far from optimal but it should work.
Note that RIFF files (.WAV files are RIFF files) are limited to 4 GB in size, because chunk sizes are stored as 32-bit values.
