libav - Decoding H264 Frame Error - c++

I am trying to decode an H.264 frame using the libav library. After initialising the library and allocating the frame and codec context, I am using the following code to decode:
AVPacket pkt;
int got_picture, len;
av_init_packet(&pkt);
pkt.size = size;
pkt.data = buffer;
while (pkt.size > 0) {
    if ((len = avcodec_decode_video2(context, frame, &got_picture, &pkt)) < 0) {
        break;
    }
    if (got_picture) {
        // Do something with the picture...
    }
    pkt.size -= len;
    pkt.data += len;
}
However, whenever I call avcodec_decode_video2 it prints the following error in the console:
[...]
[h264 @ 000000000126db40] AVC: The buffer size 210 is too short to read the nal length size 0 at the offset 210.
[h264 @ 000000000126db40] AVC: The buffer size 283997 is too short to read the nal length size 0 at the offset 283997.
[h264 @ 000000000126db40] AVC: The buffer size 17137 is too short to read the nal length size 0 at the offset 17137.
[...]
What am I missing? I tried searching for threads concerning a similar issue but nothing came up. Or is there a way I can debug the error to get more information about it?

First off, I assume you allocate the output frame correctly.
And @AntonAngelov, I am using 11.04. Do you know what the error is supposed to say? What buffer is the error talking about?
I just looked at 11.04's source (in libavcodec/h264.c) but I didn't see where this error is generated, while in older versions it is present.
It seems the error says that the size of the NALU packets you send to the decoder is 0.
My guess is that you have to get the SPS and PPS headers somehow from LIVE555 and provide them to the decoder via its extradata field (you also have to set extradata_size) before you call avcodec_open2().
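For illustration, a rough sketch of what that could look like; the sps/pps buffers and their sizes are hypothetical and would come from the RTSP session description (sprop-parameter-sets):

// Hedged sketch: hand the SPS/PPS to the decoder as Annex B extradata before
// avcodec_open2(). `sps`, `pps`, `sps_size` and `pps_size` are hypothetical
// buffers obtained from the LIVE555 session description.
static const uint8_t startcode[4] = {0, 0, 0, 1};
int extradata_size = 4 + sps_size + 4 + pps_size;

context->extradata = (uint8_t *)av_mallocz(extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);
uint8_t *p = context->extradata;
memcpy(p, startcode, 4);  p += 4;
memcpy(p, sps, sps_size); p += sps_size;
memcpy(p, startcode, 4);  p += 4;
memcpy(p, pps, pps_size);
context->extradata_size = extradata_size;

if (avcodec_open2(context, codec, NULL) < 0) {
    // handle error
}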
Another idea is to just dump all the packets you receive into a single .h264 file. Then use a tool for parsing H.264 bitstreams (see here for an example). Also try to play it with avplay or VLC to see if the bitstream is correct.
Edit:
A similar question is answered here.

AVPacket pkt;
int got_picture, len;
av_init_packet(&pkt);
pkt.size = size;
pkt.data = buffer;
while (pkt.size > 0) {
    if ((len = avcodec_decode_video2(context, frame, &got_picture, &pkt)) < 0) {
Your code worries me, since you're manually initializing an AVPacket, but you're not telling us where buffer/size come from. I'm almost certain, given the error message, that you're reading raw data from a file, socket or something like that, as if it were a raw Annex B stream.
FFmpeg (or Libav, for that matter) does not accept such data as input in its H.264 decoder. To solve this, use an AV parser (AVCodecParserContext), as explained previously in this post.
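For illustration, a minimal sketch of such a parsing loop, under the same assumptions as the code above (an opened context, an allocated frame, and a raw buffer/size pair):

// Hedged sketch: feed raw bytes through an H.264 parser, which splits them
// into whole packets the decoder will accept.
AVCodecParserContext *parser = av_parser_init(AV_CODEC_ID_H264);
uint8_t *data = buffer;
int remaining = size;
while (remaining > 0) {
    uint8_t *out_data = NULL;
    int out_size = 0;
    int consumed = av_parser_parse2(parser, context,
                                    &out_data, &out_size,
                                    data, remaining,
                                    AV_NOPTS_VALUE, AV_NOPTS_VALUE, 0);
    data += consumed;
    remaining -= consumed;
    if (out_size > 0) {
        AVPacket pkt;
        av_init_packet(&pkt);
        pkt.data = out_data;
        pkt.size = out_size;
        int got_picture = 0;
        if (avcodec_decode_video2(context, frame, &got_picture, &pkt) >= 0 && got_picture) {
            // Do something with the picture...
        }
    }
}
av_parser_close(parser);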

Related

zlib error -3 while decompressing archive: Incorrect data check

I am writing a C++ library that also decompresses zlib files. For all of the files, the last call to gzread() (or at least one of the last calls) gives error -3 (Z_DATA_ERROR) with the message "incorrect data check". As I have not created the files myself, I am not entirely sure what is wrong.
I found this answer, and if I do
gzip -dc < myfile.gz > myfile.decomp
on the command line, the contents of myfile.decomp seem to be correct, although gzip still prints a CRC error:
gzip: invalid compressed data--crc error
This may or may not be the same problem. My code, pasted below, should be straightforward, but I am not sure how to get the same behavior in code as on the command line above.
How can I achieve the same behavior in code as on the command line?
std::vector<char> decompress(const std::string &path)
{
    gzFile inFileZ = gzopen(path.c_str(), "rb");
    if (inFileZ == NULL)
    {
        printf("Error: gzopen() failed for file %s.\n", path.c_str());
        return {};
    }
    constexpr size_t bufSize = 8192;
    char unzipBuffer[bufSize];
    int unzippedBytes = bufSize;
    std::vector<char> unzippedData;
    unzippedData.reserve(1048576); // 1 MiB is enough in most cases.
    while (unzippedBytes == bufSize)
    {
        unzippedBytes = gzread(inFileZ, unzipBuffer, bufSize);
        if (unzippedBytes == -1)
        {
            // Here the error is -3 / "incorrect data check" for (one of) the last block(s)
            // in the file. The bytes can be correctly decompressed, as demonstrated on the
            // command line, but how can this be achieved in code?
            int errnum;
            const char *err = gzerror(inFileZ, &errnum);
            printf("%s\n", err);
            break;
        }
        if (unzippedBytes > 0)
        {
            unzippedData.insert(unzippedData.end(), unzipBuffer, unzipBuffer + unzippedBytes);
        }
    }
    gzclose(inFileZ);
    return unzippedData;
}
First off, the whole point of the CRC is to detect corrupted data. If the CRC is bad, the data is corrupted: ideally you should go back to where this file came from and get a copy that isn't corrupted; otherwise, discard the input and report an error.
You are not clear on the "behavior" you are trying to reproduce, but if you're trying to recover as much data as possible from a corrupted gzip file, then you will need to use zlib's inflate functions instead of the gzread() interface. int ret = inflateInit2(&strm, 31); will initialize the zlib stream to process a gzip file (31 = 15 for the maximum window size plus 16 to expect the gzip wrapper).
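If you do want to salvage the data anyway, a rough sketch of such an inflate loop follows (a sketch under the assumption that only the trailing CRC/length check fails, not a drop-in replacement):

// Hedged sketch: inflate a gzip file manually so a bad CRC in the trailer
// doesn't discard the data that was already decompressed.
#include <zlib.h>
#include <cstdio>
#include <vector>

std::vector<char> decompressLax(const char *path)
{
    std::vector<char> out;
    FILE *f = std::fopen(path, "rb");
    if (!f)
        return out;

    z_stream strm = {}; // zeroed: zalloc/zfree/next_in all null
    if (inflateInit2(&strm, 31) != Z_OK) // 31 = 15 (window) + 16 (gzip wrapper)
    {
        std::fclose(f);
        return out;
    }

    unsigned char in[8192], buf[8192];
    int ret = Z_OK;
    while (ret != Z_STREAM_END)
    {
        strm.avail_in = std::fread(in, 1, sizeof(in), f);
        if (strm.avail_in == 0)
            break;
        strm.next_in = in;
        do
        {
            strm.avail_out = sizeof(buf);
            strm.next_out = buf;
            ret = inflate(&strm, Z_NO_FLUSH);
            // Output produced before an error is still valid; keep it.
            out.insert(out.end(), buf, buf + (sizeof(buf) - strm.avail_out));
            if (ret == Z_DATA_ERROR)
            {
                // Typically the failing CRC/length check at the end of the file.
                std::fprintf(stderr, "%s\n", strm.msg ? strm.msg : "data error");
                break;
            }
            if (ret == Z_MEM_ERROR || ret == Z_STREAM_ERROR)
                break;
        } while (strm.avail_out == 0);
        if (ret == Z_DATA_ERROR || ret == Z_MEM_ERROR || ret == Z_STREAM_ERROR)
            break;
    }
    inflateEnd(&strm);
    std::fclose(f);
    return out;
}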

How do I clear a stream in C++ for a nanoPB protocol buffer to use?

I'm using nanopb in a project on an ESP32, in PlatformIO. It's an Arduino-flavored C++ codebase.
I'm using some protobufs to encode data for transfer. And I've set up the memory that the protobufs will use at the root level to avoid re-allocating the memory every time a message is sent.
// variables to store the buffer/stream the data will render into...
uint8_t buffer[MESSAGE_BUFFER_SIZE];
pb_ostream_t stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
// object to hold the data on its way into the encode action...
TestMessage abCounts = TestMessage_init_zero;
Then I've got my function that encodes data into this stream via protobufs (using nanoPB)...
void encodeABCounts(int32_t button_a, int32_t button_b, String message)
{
    // populate our data structure...
    abCounts.a_count = button_a;
    abCounts.b_count = button_b;
    strcpy(abCounts.message, message.c_str());
    // encode the data!
    bool status = pb_encode(&stream, TestMessage_fields, &abCounts);
    if (!status)
    {
        Serial.println("Failed to encode");
        return;
    }
    // and here's some debug code I'll discuss below....
    Serial.print("Message Length: ");
    Serial.println(stream.bytes_written);
    for (size_t i = 0; i < stream.bytes_written; i++)
    {
        Serial.printf("%02X", buffer[i]);
    }
    Serial.println("");
}
Ok. So the first time this encode action occurs this is the data I get in the serial monitor...
Message Length: 14
Message: 080110001A087370656369616C41
And that's great - everything looks good. But the second time I call encodeABCounts(), and the third time, and the fourth, I get this...
Message Length: 28
Message: 080110001A087370656369616C41080210001A087370656369616C41
Message Length: 42
Message: 080110001A087370656369616C41080210001A087370656369616C41080310001A087370656369616C41
Message Length: 56
Message: 080110001A087370656369616C41080210001A087370656369616C41080310001A087370656369616C41080410001A087370656369616C41
...etc
So it didn't clear out the buffer/stream when the new data went in. Each time the buffer/stream is just getting longer as new data is appended.
How do I reset the stream/buffer to a state where it's ready for new data to be encoded and stuck in there, without reallocating the memory?
Thanks!
To reset the stream, simply re-create it. Now you have this:
pb_ostream_t stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
You can recreate it by assigning again:
stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
Though you can also move the initial stream declaration to inside encodeABCounts() to create it every time, if you don't have any particular reason to keep it around after use. The stream creation is very lightweight, as it just stores the location and size of the buffer.
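For illustration, a minimal sketch of what the question's encodeABCounts() could look like with the rewind at the top (same globals as in the question; debug printing omitted):

// Minimal sketch: re-create the stream over the same buffer at the start of
// each encode. This only resets the write position and remaining size; the
// buffer memory itself is never reallocated.
void encodeABCounts(int32_t button_a, int32_t button_b, String message)
{
    stream = pb_ostream_from_buffer(buffer, sizeof(buffer)); // rewind to offset 0

    abCounts.a_count = button_a;
    abCounts.b_count = button_b;
    strcpy(abCounts.message, message.c_str());

    if (!pb_encode(&stream, TestMessage_fields, &abCounts))
    {
        Serial.println("Failed to encode");
        return;
    }
    // stream.bytes_written now reflects only this message.
}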

Opus encode & decode no error but not the same value

I have to use the Opus codec to encode & decode audio data in C++, and I have to encapsulate the functions.
So I tried to send an array of floats to encode, and then I decode the result of the Opus encoding function. Unfortunately, the result is not the same, and I get an array that contains none of the values from the initial array.
Here is my code.
Encapsulation:
std::vector<float> codec::OpusPlugin::decode(packet_t &packet) {
    std::vector<float> out(BUFFER_SIZE * NB_CHANNELS);
    int ret = 0;

    if (!this->decoder)
        throw Exception("Can't decode since there is no decoder.");
    ret = opus_decode_float(this->decoder, packet.data.data(), packet.size, out.data(), FRAME_SIZE, 0);
    if (ret < 0)
        throw Exception("Error while decoding compressed data.");
    return out;
}

// ENCODER
packet_t codec::OpusPlugin::encode(std::vector<float> to_encode) {
    std::vector<unsigned char> data(BUFFER_SIZE * NB_CHANNELS * 2);
    packet_t packet;
    int ret = 0;

    if (!this->encoder)
        throw Exception("Can't encode since there is no encoder.");
    ret = opus_encode_float(this->encoder, to_encode.data(), FRAME_SIZE, data.data(), data.size());
    if (ret < 0)
        throw Exception("Error while encoding data.");
    packet.size = ret;
    packet.data = data;
    return packet;
}
And here is how the functions are called:
packet_t packet;
std::vector<float> floats = {0.23, 0, -0.312, 0.401230, 0.1234, -0.1543};

packet = CodecPlugin->encode(floats);
std::cout << "packet size: " << packet.size << std::endl;
std::vector<float> output = CodecPlugin->decode(packet);
for (int i = 0; i < 10; i++) {
    std::cout << output.data()[i] << " ";
}
Here is the packet_t structure, where I store the return value of encode and the unsigned char array (the encoded data):
typedef struct packet_s {
    int size;
    std::vector<unsigned char> data;
} packet_t;
The output of the program is
-1.44487e-15 9.3872e-16 -1.42993e-14 7.31834e-15 -5.09662e-14 1.53629e-14 -8.36825e-14 3.9531e-14 -8.72754e-14 1.0791e-13
which is not the array I initialized at the beginning.
I have read the documentation and code examples many times, but I don't see where I made a mistake.
I hope you will be able to help me.
Thanks :)
We don't see how you initialize your encoder and decoder, so we don't know their sample rate, complexity or number of channels. But no matter how you have initialized them, you are still going to have the following problems:
First, Opus doesn't support arbitrary frame sizes, only 2.5 ms, 5 ms, 10 ms, 20 ms, 40 ms or 60 ms (RFC 6716 - Definition of the Opus Audio Codec, relevant section 2.1.4). Moreover, Opus supports only 8 kHz, 12 kHz, 16 kHz, 24 kHz or 48 kHz sample rates. Whichever of those you have chosen, your six-element array doesn't correspond to any of the supported frame sizes.
Secondly, Opus is a lossy codec. This means that after you encode any signal you will never (except perhaps in some edge cases) be able to reconstruct the original signal exactly by decoding the encoded Opus frame. The best way to test whether your encoder and decoder work is with a real audio sample. Opus encoding preserves the perceptual quality of the audio; if you test it with arbitrary data you might not get the expected results back, even if you implemented the encoding and decoding functions correctly.
What you can easily do is generate a 2000 Hz sine wave (there are multiple examples on the internet) for 20 ms. That means 160 array elements at a sample rate of 8000 Hz, if you wish to use 8 kHz. A 2 kHz sine wave is well within the human hearing range, so the encoder will preserve it. Then decode it back and check whether the elements of the input and output arrays are similar, since we've already established that they are unlikely to be identical.
I am not good in C++ so I can't help you with code examples but the problems above hold true no matter what language is used.
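For illustration, a minimal sketch of such a sine-wave round trip (the constants are illustrative: 8 kHz mono, 20 ms frames, a 2 kHz tone; the opus header path may differ on your system):

// Hedged sketch of the sine-wave round trip described above. Expect the
// decoded samples to be similar to, not identical to, the input: Opus is lossy.
#include <opus/opus.h>
#include <cmath>
#include <cstdio>
#include <vector>

int main()
{
    const int sampleRate = 8000, channels = 1, frameSize = 160; // 20 ms at 8 kHz
    const double kPi = 3.14159265358979323846;
    int err = 0;

    OpusEncoder *enc = opus_encoder_create(sampleRate, channels, OPUS_APPLICATION_AUDIO, &err);
    OpusDecoder *dec = opus_decoder_create(sampleRate, channels, &err);

    // One 20 ms frame of a 2 kHz sine at half amplitude.
    std::vector<float> in(frameSize), out(frameSize);
    for (int i = 0; i < frameSize; ++i)
        in[i] = 0.5f * (float)std::sin(2.0 * kPi * 2000.0 * i / sampleRate);

    unsigned char packet[4000];
    opus_int32 nbytes = opus_encode_float(enc, in.data(), frameSize, packet, sizeof(packet));
    int decoded = opus_decode_float(dec, packet, nbytes, out.data(), frameSize, 0);

    for (int i = 0; i < 8; ++i)
        std::printf("%f -> %f\n", in[i], out[i]);

    opus_encoder_destroy(enc);
    opus_decoder_destroy(dec);
    return (nbytes < 0 || decoded < 0) ? 1 : 0;
}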

C++ Socket Buffer Size

This is more of a request for confirmation than a question, so I'll keep it brief. (I am away from my PC and so can't simply implement this solution to test).
I'm writing a program to send an image file taken via webcam (along with meta data) from a raspberryPi to my PC.
I've worked out that the image is roughly 130 kB, the packet header is 12 bytes and the associated metadata another 24 bytes, though I may increase the image size in future, once I have a working prototype.
At the moment I am not able to retrieve this whole packet successfully: after sending it to the PC I only ever get approximately 64 kB recv'd in the buffer.
I have assumed that this is because, for whatever reason, the default buffer size for a socket declared like:
SOCKET sock = socket(PF_INET, SOCK_STREAM, 0);
is 64 kB (please could someone clarify this if you're in the know).
So, to fix this problem, I intend to increase the socket buffer size to 1024 kB via setsockopt().
Please could someone confirm that my diagnosis of the problem, and proposed solution are correct?
I ask this question as I am away form my PC right now and am unable to try it until I get back home.
This most likely has nothing to do with the socket buffers, but with the fact that recv() and send() do not have to receive and send all the data you ask for in a single call. Check the return value of those function calls; it indicates how many bytes have actually been sent or received.
The best way to deal with "short" reads/writes is to put them in a loop, like so:
char *buf;   // pointer to your data
size_t len;  // length of your data
int fd;      // the socket file descriptor

size_t offset = 0;
ssize_t result;
while (offset < len) {
    result = send(fd, buf + offset, len - offset, 0);
    if (result < 0) {
        // Deal with errors here
    }
    offset += result;
}
Use a similar construction for receiving data. Note that one possible error condition is that the call was interrupted (errno == EINTR); on a non-blocking socket you can also get EAGAIN or EWOULDBLOCK. In those cases you should retry the send; in all other cases you should exit the loop.
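For completeness, a sketch of the matching receive loop (same hypothetical buf/len/fd as above):

char *buf;   // pointer to the receive buffer
size_t len;  // number of bytes you expect
int fd;      // the socket file descriptor

size_t received = 0;
ssize_t result;
while (received < len) {
    result = recv(fd, buf + received, len - received, 0);
    if (result < 0) {
        // Deal with errors here: retry on EINTR, otherwise leave the loop
        break;
    }
    if (result == 0) {
        break; // peer closed the connection
    }
    received += result;
}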

Confusion about PTS in video files and media streams

Is it possible that the PTS of a particular frame in a file is different from the PTS of the same frame in the same file while it is being streamed?
When I read a frame using av_read_frame I store the video stream in an AVStream. After I decode the frame with avcodec_decode_video2, I store the time stamp of that frame in an int64_t using av_frame_get_best_effort_timestamp. Now if the program is getting its input from a file I get a different timestamp from when I stream the input (from the same file) to the program.
To change the input type I simply change the argv argument from "/path/to/file.mp4" to something like "udp://localhost:1234", then I stream the file with ffmpeg on the command line: "ffmpeg -re -i /path/to/file.mp4 -f mpegts udp://localhost:1234". Can it be because the "-f mpegts" argument changes some characteristics of the media?
Below is my code (simplified). By reading the ffmpeg mailing list archives I realized that the time_base that I'm looking for is in the AVStream and not the AVCodecContext. Instead of using av_frame_get_best_effort_timestamp I have also tried using the packet.pts but the results don't change.
I need the time stamps to have a notion of frame number in a streaming video that is being received.
I would really appreciate any sort of help.
//..
//argv[1] = "/file.mp4";
argv[1] = "udp://localhost:7777";
// define AVFormatContext, AVFrame, etc.
// register av, avcodec, avformat_network_init(), etc.
avformat_open_input(&pFormatCtx, argv[1], NULL, NULL);
avformat_find_stream_info(pFormatCtx, NULL);
// find the video stream...
// pointer to the codec context...
// open codec...
pFrame = av_frame_alloc();
while (av_read_frame(pFormatCtx, &packet) >= 0) {
    AVStream *stream = pFormatCtx->streams[videoStream];
    if (packet.stream_index == videoStream) {
        avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
        if (frameFinished) {
            int64_t perts = av_frame_get_best_effort_timestamp(pFrame);
            if (isMyFrame(pFrame)) {
                cout << perts * av_q2d(stream->time_base) << "\n";
            }
        }
    }
    // free allocated space
}
//..
Timestamps are stored at the container level, so changing the container can change the timestamps. In addition, MPEG-TS stores a timestamp for every frame (based on a 90 kHz clock). MP4 only stores the frame durations, with an assumed start time of 0 (this gets more complicated with B-frames, since the first PTS is zero and the first DTS is < 0), so to get a frame's timestamp all the preceding frame durations are added up. MP4 also allows the clock rate to be set; it is often 30000 ticks per second with frame durations of 1001 for 29.97 fps, but it can be set to anything. So av_frame_get_best_effort_timestamp() returns ticks in the stream's time_base units. For MPEG-TS that time_base is always 1/90000.
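For illustration, a hedged sketch of how the question's loop could normalise that timestamp into seconds and an approximate frame index (assuming the stream and pFrame variables from the question; avg_frame_rate may be unset for some streams):

// Hedged sketch: convert the container-dependent timestamp into seconds and
// then into an approximate frame index, so file and UDP inputs are comparable.
int64_t pts = av_frame_get_best_effort_timestamp(pFrame);
if (pts != AV_NOPTS_VALUE) {
    double seconds = pts * av_q2d(stream->time_base);
    AVRational fps = stream->avg_frame_rate;
    if (fps.num > 0 && fps.den > 0) {
        int64_t frameIndex = (int64_t)(seconds * av_q2d(fps) + 0.5);
        cout << "t = " << seconds << " s, frame ~ " << frameIndex << "\n";
    }
}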