Why do I get avc1.000000? (FFmpeg, H.264 Video Encoding, C++)

I have a bunch of bitmaps that I need to encode as H.264 in a fragmented .mp4, using C++.
What could cause my AVC profile to be set to 0, the SPS[] and PPS[] arrays to be empty, and the codec string to be avc1.000000?
Output from mp4info:
File:
minor version: 200
compatible brand: iso6
compatible brand: mp41
fast start: yes
Movie:
duration: 0 ms
time scale: 1000
fragments: yes
Found 1 Tracks
Track 1:
flags: 3 ENABLED IN-MOVIE
id: 1
type: Video
duration: 0 ms
language: und
media:
sample count: 0
timescale: 90000
duration: 0 (media timescale units)
duration: 0 (ms)
bitrate (computed): 412.672 Kbps
sample count with fragments: 35
duration with fragments: 540000
duration with fragments: 6000 (ms)
display width: 1280.000000
display height: 720.000000
Sample Description 0
Coding: avc1 (H.264)
Width: 1280
Height: 720
Depth: 24
AVC Profile: 0
AVC Profile Compat: 0
AVC Level: 0
AVC NALU Length Size: 0
AVC SPS: []
AVC PPS: []
Codecs String: avc1.000000
I'm using things like
if (stream->codecpar->codec_id == AVCodecID.AV_CODEC_ID_H264)
{
err = ffmpeg.av_opt_set(cctx->priv_data, "preset", "ultrafast", 0);
err = ffmpeg.av_opt_set(cctx->priv_data, "tune", "zerolatency", 0);
err = ffmpeg.av_opt_set(cctx->priv_data, "profile", "high", 0);
}
...
AVDictionary* opts = null;
ffmpeg.av_dict_set(&opts, "movflags", "default_base_moof+frag_keyframe+empty_moov", 0);
...
AVPacket* pPacket = ffmpeg.av_packet_alloc();
try
{
int error;
do
{
ffmpeg.avcodec_send_frame(cctx, &convertedFrame).ThrowExceptionIfError();
error = ffmpeg.avcodec_receive_packet(cctx, pPacket);
} while (error == ffmpeg.AVERROR(ffmpeg.EAGAIN));
error.ThrowExceptionIfError();
}
finally
{
ffmpeg.av_packet_rescale_ts(pPacket, cctx->time_base, stream->time_base);
pPacket->stream_index = stream->index;
ffmpeg.av_interleaved_write_frame(ofctx, pPacket);
ffmpeg.av_packet_unref(pPacket);
}
What am I missing? I'm working from examples found on the internet. I thought that if an AVFrame is sent to an H.264 encoder (avcodec_send_frame) configured with a profile and preset, and received back as an AVPacket, the rest would be handled automatically.
This is my first post, please be nice. Thanks in advance for helping.
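A likely culprit, for anyone hitting the same symptom: the avc1.PPCCLL codec string is built from the profile, constraint-flags, and level bytes of the SPS stored in the track's avcC box, so an all-zero string together with empty SPS[]/PPS[] usually means the encoder's extradata never reached the stream's codec parameters. With the MP4 muxer this typically happens when AV_CODEC_FLAG_GLOBAL_HEADER is not set before opening the codec, or when the parameters are copied to the stream before avcodec_open2() has generated the extradata. A minimal sketch of the usual order, written against the plain C API (the names cctx, ofctx and stream mirror the snippets above; the same calls exist in the FFmpeg.AutoGen bindings):
// Ask the encoder to emit global headers (SPS/PPS in extradata) when the
// container needs them, as fragmented MP4 does.
if (ofctx->oformat->flags & AVFMT_GLOBALHEADER)
    cctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
// Open the codec first; libx264 generates the SPS/PPS at this point.
avcodec_open2(cctx, codec, nullptr);
// Copy the parameters only now, so codecpar->extradata carries the SPS/PPS
// and the muxer can write a proper avcC box (and avc1.PPCCLL string).
avcodec_parameters_from_context(stream->codecpar, cctx);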

Related

VLC huge buffering times over rtp for local H264 stream

I'm outputting an H264 stream, encoded by my application using ffmpeg. I can display it using ffplay, but when trying to view the stream in VLC, I only get the first frame, or so it appears.
VLC's message output shows that it is "buffering", taking around a minute to reach 100%, at which point the frame updates.
With ffplay, the latency is about 50-100 ms at worst.
I am sending to rtp://127.0.0.1:6666?pkt_size=1316 with the format rtp_mpegts.
I am new to this and it's highly likely I haven't set the frame up completely correctly. The process is (minus declarations and error checking):
codec_name = "libx264";
codec = avcodec_find_encoder_by_name(codec_name.c_str());
context = avcodec_alloc_context3(codec);
pkt = av_packet_alloc();
context->bit_rate = 5 * Mega;
context->width = info.DisplayWidth;
context->height = info.DisplayHeight;
context->time_base = { 1, FPS };
context->framerate = { FPS, 1 };
context->gop_size = 100;
context->max_b_frames = 1;
context->pix_fmt = AV_PIX_FMT_YUV420P;
if (codec->id == AV_CODEC_ID_H264)
{
check_ret("set option: preset", av_opt_set(context->priv_data, "preset", "fast", 0));
check_ret("set option: tune", av_opt_set(context->priv_data, "tune", "zerolatency", 0));
check_ret("set option: profile", av_opt_set(context->priv_data, "profile", "baseline", 0));
}
check_ret("open codec", avcodec_open2(context, codec, NULL));
// setup the stream
fmt = (AVOutputFormat*)av_guess_format("rtp_mpegts", NULL, NULL);
avformat_alloc_output_context2(&avfctx, fmt, fmt->name,
"rtp://127.0.0.1:6666?pkt_size=1316");
avio_open(&avfctx->pb, avfctx->url, AVIO_FLAG_WRITE);
AVStream* stream = avformat_new_stream(avfctx, codec);
avcodec_parameters_from_context(stream->codecpar, context);
stream->time_base.num = 1;
stream->time_base.den = FPS;
avformat_write_header(avfctx, NULL);
// then the encoding (in an output loop)
<not shown: get frame from swapchain, sws_scale from rgba to yuv>
yuvFrame->pts = i++; // i is incremented every frame
avcodec_send_frame(enc_ctx, yuvFrame);
while (ret >= 0) {
ret = avcodec_receive_packet(enc_ctx, pkt);
//ret = av_interleaved_write_frame(avfctx, pkt); was using this, don't seem to need it
ret = av_write_frame(avfctx, pkt);
av_packet_unref(pkt);
}
The VLC output looks like this:
main debug: using hw decoder module "d3d11va"
avcodec info: Using D3D11VA (NVIDIA GeForce RTX 2080 Super with Max-Q Design, vendor 10de(NVIDIA), device 1e93, revision a1) for hardware decoding
qt debug: Logical video size: 1280x720
main debug: resized to 1280x720
main debug: VoutDisplayEvent 'resize' 1280x720
main debug: Received first picture
main debug: Buffering 1%
main debug: Buffering 2%
main debug: Buffering 3%
main debug: auto hiding mouse cursor
main debug: Buffering 4%
main debug: Buffering 5%
main debug: Buffering 6%
main debug: Buffering 7%
main debug: Buffering 8%
main debug: Buffering 9%
main debug: Buffering 10%
main debug: auto hiding mouse cursor
main debug: Buffering 11%
rtp warning: 1 packet(s) lost
rtp warning: 1 packet(s) lost
rtp warning: 1 packet(s) lost
ts warning: discontinuity received 0x3 instead of 0xd (pid=256)
ts warning: discontinuity received 0x5 instead of 0xf (pid=256)
ts warning: discontinuity received 0x1 instead of 0xb (pid=256)
main debug: Buffering 12%
main debug: Buffering 13%
main debug: Buffering 14%
main debug: Buffering 15%
main debug: Buffering 16%
main debug: Buffering 17%
main debug: Buffering 18%
main debug: auto hiding mouse cursor
main debug: Buffering 19%
main debug: Buffering 20%
The problem with my approach above was that it was based on the ffmpeg example encode_video.c, with some bits for stream output borrowed from Google searches.
Thanks to @Rotem I started putting together a standalone executable and stumbled on the example muxing.c in the ffmpeg examples.
That let me find the steps I was missing: set the stream index on the packet, and rescale the timestamps:
av_packet_rescale_ts(pkt, context->time_base, stream->time_base);
pkt->stream_index = stream->index;
int ret = av_interleaved_write_frame(avfctx, pkt);
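For completeness, here is a sketch of the receive/write loop with those two fixes folded in. It also stops once avcodec_receive_packet() reports AVERROR(EAGAIN) or AVERROR_EOF, which the loop in the question never checked (variable names follow the snippets above; error handling abbreviated):
avcodec_send_frame(context, yuvFrame);
while (true) {
    int ret = avcodec_receive_packet(context, pkt);
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
        break; // encoder needs more input, or is fully drained
    if (ret < 0)
        break; // a real error; handle appropriately
    av_packet_rescale_ts(pkt, context->time_base, stream->time_base);
    pkt->stream_index = stream->index;
    av_interleaved_write_frame(avfctx, pkt);
    av_packet_unref(pkt);
}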

FFMPEG error when finding stream information with custom AVIOContext

I am writing software that takes in a file as a stream and decodes it. I have the following custom AVIO code for stream input:
/* Allocate a 4kb buffer for copying. */
std::uint32_t bufSize = 4096;
struct vidBuf
{
std::byte* ptr;
int size;
};
vidBuf tmpVidBuf = { const_cast<std::byte*>(videoBuffer.data()),
static_cast<int>(videoBuffer.size()) };
AVIOContext *avioContext =
avio_alloc_context(reinterpret_cast<std::uint8_t*>(av_malloc(bufSize)),
bufSize, 0,
reinterpret_cast<void*>(&tmpVidBuf),
[](void *opaque, std::uint8_t *buf, int bufSize) -> int
{
auto &me = *reinterpret_cast<vidBuf*>(opaque);
bufSize = std::min(bufSize, me.size);
std::copy_n(me.ptr, bufSize, reinterpret_cast<std::byte*>(buf));
me.ptr += bufSize;
me.size -= bufSize;
return bufSize;
}, nullptr, nullptr);
auto avFormatPtr = avformat_alloc_context();
avFormatPtr->pb = avioContext;
avFormatPtr->flags |= AVFMT_FLAG_CUSTOM_IO;
//avFormatPtr->probesize = tmpVidBuf.size;
//avFormatPtr->max_analyze_duration = 5000000;
avformat_open_input(&avFormatPtr, nullptr, nullptr, nullptr);
if(auto ret = avformat_find_stream_info(avFormatPtr, nullptr);
ret < 0)
logerror << "Could not open the video file: " << makeAVError(ret) << '\n';
However, when I run this code I get the error:
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55d10736d580] stream 0, offset 0x30: partial file
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55d10736d580] Could not find codec parameters for stream 0 (Video: h264 (avc1 / 0x31637661), none(tv, bt709), 540x360, 649 kb/s): unspecified pixel format
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.76.100
Duration: 00:04:08.41, start: 0.000000, bitrate: N/A
Stream #0:0(und): Video: h264 (avc1 / 0x31637661), none(tv, bt709), 540x360, 649 kb/s, SAR 1:1 DAR 3:2, 29.97 fps, 29.97 tbr, 30k tbn, 60k tbc (default)
Metadata:
handler_name : ISO Media file produced by Google Inc. Created on: 01/10/2021.
vendor_id : [0][0][0][0]
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : ISO Media file produced by Google Inc. Created on: 01/10/2021.
vendor_id : [0][0][0][0]
Assertion desc failed at libswscale/swscale_internal.h:677
Note the absence of the yuv420p part in the video stream info.
This is strange, since if I run my program with a different mp4 file it works perfectly fine; this error only occurs with one specific mp4 file. I know that the mp4 file is valid, since mpv can play it and ffprobe is able to get its metadata:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'heard.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.76.100
Duration: 00:04:08.41, start: 0.000000, bitrate: 724 kb/s
Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 540x360 [SAR 1:1 DAR 3:2], 649 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : ISO Media file produced by Google Inc. Created on: 01/10/2021.
vendor_id : [0][0][0][0]
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 22050 Hz, mono, fltp, 69 kb/s (default)
Metadata:
handler_name : ISO Media file produced by Google Inc. Created on: 01/10/2021.
vendor_id : [0][0][0][0]
As you can see from my code, I also tried setting analyzeduration and probesize, but this did not fix the issue.
I also know that this error is caused by my custom IO, because when I let avformat_open_input open the file directly, it decodes just fine. I am new to ffmpeg, so I might have missed something simple.
As SuRGeoNix pointed out, I had not implemented a seek function for the AVIO context; I think this confused FFmpeg, since it could not figure out the size of the buffer. This is my now-working code:
std::uint32_t bufSize = 4096;
struct vidBuf
{
std::byte* ptr;
std::byte* origPtr;
int size;
int fullSize;
};
vidBuf tmpVidBuf = { const_cast<std::byte*>(videoBuffer.data()),
const_cast<std::byte*>(videoBuffer.data()),
static_cast<int>(videoBuffer.size()),
static_cast<int>(videoBuffer.size()), };
AVIOContext *avioContext =
avio_alloc_context(reinterpret_cast<std::uint8_t*>(av_malloc(bufSize)),
bufSize, 0,
reinterpret_cast<void*>(&tmpVidBuf),
[](void *opaque, std::uint8_t *buf, int bufSize) -> int
{
auto &me = *reinterpret_cast<vidBuf*>(opaque);
bufSize = std::min(bufSize, me.size);
std::copy_n(me.ptr, bufSize, reinterpret_cast<std::byte*>(buf));
me.ptr += bufSize;
me.size -= bufSize;
return bufSize;
},
nullptr,
[](void *opaque, std::int64_t where, int whence) -> std::int64_t
{
auto me = reinterpret_cast<vidBuf*>(opaque);
switch(whence)
{
case AVSEEK_SIZE:
/* Report the total stream size so FFmpeg
can probe without reading to the end. */
return me->fullSize;
break;
case SEEK_SET:
if(me->fullSize > where)
{
me->ptr = me->origPtr + where;
me->size = me->fullSize - where;
}
else
return EOF;
break;
case SEEK_CUR:
if(me->size > where)
{
me->ptr += where;
me->size -= where;
}
else
return EOF;
break;
case SEEK_END:
if(me->fullSize > where)
{
me->ptr = (me->origPtr + me->fullSize) - where;
int curPos = me->ptr - me->origPtr;
me->size = me->fullSize - curPos;
}
else
return EOF;
break;
default:
/* On error, do nothing, return current position of file. */
logerror << "Could not process buffer seek: "
<< whence << ".\n";
break;
}
return me->ptr - me->origPtr;
});
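One caveat to add to this fix (my note, not part of the original answer): FFmpeg may internally replace the buffer passed to avio_alloc_context(), so cleanup should free avioContext->buffer rather than the pointer originally returned by av_malloc(). A sketch of the teardown, assuming the names used above:
avformat_close_input(&avFormatPtr); // frees the demuxer state
if (avioContext)
{
    av_freep(&avioContext->buffer); // whatever buffer FFmpeg currently holds
    avio_context_free(&avioContext);
}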

FFMPEG Implement RTSP Client, high speed playback

I am writing software to play back videos that have been recorded on an NVR. I have completed most of the work, but there is one more feature to add: letting the user change the playback speed, e.g. 0.5x, 2x, 4x, 8x...
I searched the internet all day and still couldn't find any suggestions. Here is a summary of my code:
auto pFormatCtx = avformat_alloc_context();
av_dict_set_int(&opts, "rw_timeout", 5000000, 0);
av_dict_set_int(&opts, "tcp_nodelay", 1, 0);
av_dict_set_int(&opts, "stimeout", 10000000, 0);
av_dict_set(&opts, "user_agent", "Mozilla/5.0", 0);
av_dict_set(&opts, "rtsp_transport", "tcp", 0);
av_dict_set(&opts, "rtsp_flags", "prefer_tcp", 0);
av_dict_set_int(&opts, "buffer_size", BUFSIZE, 0);
int err = avformat_open_input(&pFormatCtx, fullRtspUri, NULL, &opts);
if(err < 0)
return;
err = avformat_find_stream_info(pFormatCtx, NULL);
if (err < 0)
return;
pFormatCtx->flags |= AVFMT_FLAG_NONBLOCK;
pFormatCtx->flags |= AVFMT_FLAG_DISCARD_CORRUPT;
pFormatCtx->flags |= AVFMT_FLAG_NOBUFFER;
av_dump_format(pFormatCtx, 0, fullRtspUri, 0);
int videoStreamInd = -1;
for (int i = 0; i < pFormatCtx->nb_streams; i++)
{
AVStream* stream = pFormatCtx->streams[i];
if (stream->codecpar->codec_type == AVMEDIA_TYPE_VIDEO)
{
if (videoStreamInd == -1)
{
videoStreamInd = i;
break;
}
}
}
if (videoStreamInd == -1)
return;
auto videoStream = pFormatCtx->streams[videoStreamInd];
isRunning = true;
while(isRunning)
{
ret = av_read_frame(pFormatCtx, avPacket);
if (ret < 0)
return;
if (avPacket->stream_index != videoStreamInd)
continue;
//Code for render process here............
}
I have read through this NVR's API documentation and see support for 2x and 4x speed playback, as below:
Play in 2× Speed:
PLAY rtsp://10.17.133.46:554/ISAPI/streaming/tracks/101?starttime=20170313T230652Z&endtime=20170314T025706Z RTSP/1.0
CSeq:6
Authorization: Digest username="admin",
realm="4419b66d2485",
nonce="a0ecd9b1586ff9461f02f910035d0486",
uri="rtsp://10.17.133.46:554/ISAPI/streaming/tracks/101?starttime=20170313T230652Z&endtime=20170314T025706Z",
response="fb986d385a7d839052ec4f0b2b70c631"
Session:2049381566;timeout=60
Scale:2.000
User-Agent:NKPlayer-1.00.00.081112
RTSP/1.0 200 OK
CSeq: 6
Session: 2049381566
Scale: 2.000
RTP-Info: url=trackID=1;seq=1,url=trackID=2;seq=1
Date: Tue, Mar 14 2017 10:57:24 GMT
How can I play RTSP video at speeds of 0.5x, 2x, 4x...?
I'd be very grateful to anyone who can assist me with this.
Somewhere in the "Code for render process here..." section you should have code that figures out the delay between presentation/display times for each frame. Whatever that delay is, halve it to play twice as fast, double it to play at half speed, and so on. You say this is playing from an NVR, which suggests the stream is probably local and accessed like a media file on disk. If so, it's the speed of the disk and of the local network connection to the NVR that will limit how fast you can play. With my own ffmpeg player code, reading a media file from an M.2 NVMe disk can easily reach 400-700 fps once any delay between frames is removed.
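As a rough sketch of that idea (my own illustration, not code from the answer): derive the nominal frame interval from the stream's average frame rate, then divide the render delay by the user-selected speed factor somewhere in the render loop:
#include <chrono>
#include <thread>

// Nominal seconds per frame, taken from the stream opened above.
double fps = av_q2d(videoStream->avg_frame_rate); // e.g. 25.0
double speed = 2.0; // user-selected: 0.5, 2.0, 4.0, 8.0 ...
std::this_thread::sleep_for(
    std::chrono::duration<double>((1.0 / fps) / speed)); // 2x speed = half the delay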

FFMPEG library - transcode raw images to an h264 stream, and the output file does not contain pts and dts info

I am trying to use the ffmpeg C++ library to convert several raw yuyv images into an h264 stream. The images come from memory and are passed as strings at about 24 fps. I do the conversion in the following steps:
1. Init AVFormatContext, AVCodec and AVCodecContext, and create a new AVStream. For this step I mainly refer to ffmpeg-libav-tutorial, and the AVFormatContext uses a customized write_buffer() function (refer to simplest_ffmpeg_mem_handler).
2. Receive the raw frame data, set width and height (1920x1080), and set pts and dts. Here I manually set the output fps to 24 and use a global counter to count frames; the pts is calculated from this counter. Code snippet (video_avs is the AVStream, output_fps is 24, and time_base is 1/24):
input_frame->width = w; // 1920
input_frame->height = h; // 1080
input_frame->pkt_dts = input_frame->pts = global_pts;
global_pts += video_avs->time_base.den/video_avs->time_base.num / output_fps.num * output_fps.den;
3. Convert it from yuyv to yuv422 (because h264 does not support yuyv) and resize it from 1920x1080 to 640x480 (because I need this output resolution), using sws_scale().
4. Use avcodec_send_frame() and avcodec_receive_packet() to get the output packet, set the output_packet duration and stream_index, then use av_write_frame() to write the frame data:
AVPacket *output_packet = av_packet_alloc();
int response = avcodec_send_frame(encoder->video_avcc, frame);
while (response >= 0) {
response = avcodec_receive_packet(encoder->video_avcc, output_packet); // !! here output_packet.size is calculated
if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {
break;
}
else if (response < 0) {
printf("Error while sending packet to decoder"); // ??av_err2str(response)会报错
return response;
}
// duration = next_pts - this_pts = timescale / fps = 1 / timebase / fps
output_packet->duration = (encoder->video_avs->time_base.den / encoder->video_avs->time_base.num) / (output_fps.num / output_fps.den);
output_packet->stream_index = 0;
int write_ret = av_write_frame(encoder->avfc, output_packet); // packet order is not guaranteed
if (write_ret != 0) { printf("Error %d while writing packet", write_ret); return -1; }
}
av_packet_unref(output_packet);
av_packet_free(&output_packet);
In the write_buffer() function, the video stream output is stored in a string variable, which I then write to a file with an ostream, giving it an .mp4 suffix.
After all the above steps, the resulting output.mp4 cannot be played. The output of ffprobe output.mp4 -show_frames is:
Input #0, h264, from '/Users/ming/code/dev/haomo/output.mp4':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264 (High 4:2:2), yuv422p(progressive), 640x480, 24.92 fps, 24 tbr, 1200k tbn, 48 tbc
[FRAME]
media_type=video
stream_index=0
key_frame=1
pkt_pts=N/A
pkt_pts_time=N/A
pkt_dts=N/A
pkt_dts_time=N/A
best_effort_timestamp=N/A
best_effort_timestamp_time=N/A
Note that before and after the call to av_write_frame() in step 4, the output_packet argument contains correct pts and dts info; I cannot figure out why the output stream loses this info.
I figured it out: the output stream is a raw h264 stream, and I was storing it directly into a file with a ".mp4" suffix, so it was not actually a valid mp4 file.
I then stored the stream into an output.h264 file and used ffmpeg to convert it to an mp4 file: ffmpeg -framerate 24 -i output.h264 -c copy output.mp4. The resulting output.mp4 contains the right pts data and can be played.
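The same lesson can be applied inside the program instead of via the CLI: timestamps live in the container, so writing the encoded packets through an mp4 muxer preserves them. A rough sketch of wiring that up (the names video_avcc and output_packet follow the question; error handling omitted):
AVFormatContext *mp4fc = nullptr;
avformat_alloc_output_context2(&mp4fc, nullptr, "mp4", "output.mp4");
AVStream *st = avformat_new_stream(mp4fc, nullptr);
avcodec_parameters_from_context(st->codecpar, video_avcc); // after avcodec_open2()
avio_open(&mp4fc->pb, "output.mp4", AVIO_FLAG_WRITE);
avformat_write_header(mp4fc, nullptr); // the mp4 muxer now owns the timing metadata

// Per packet, as in step 4:
av_packet_rescale_ts(output_packet, video_avcc->time_base, st->time_base);
output_packet->stream_index = st->index;
av_interleaved_write_frame(mp4fc, output_packet);

// At end of stream:
av_write_trailer(mp4fc);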

AVCodecContext::channel_layout 0 for WAV files

I have been successfully loading compressed audio files using FFmpeg and querying their channel_layouts using some code I've written:
AVFormatContext* fmtCxt = nullptr;
avformat_open_input( &fmtCxt, "###/440_sine.wav", nullptr, nullptr );
avformat_find_stream_info( fmtCxt, nullptr );
int ret = av_find_best_stream( fmtCxt, AVMEDIA_TYPE_AUDIO, -1, -1, nullptr, 0 );
AVCodecContext* codecCxt = fmtCxt->streams[ret]->codec;
AVCodec* codec = avcodec_find_decoder( codecCxt->codec_id );
avcodec_open2( codecCxt, codec, nullptr );
std::cout << "Channel Layout: " << codecCxt->channel_layout << std::endl;
av_dump_format( fmtCxt, 0, "###/440_sine.wav", 0 );
I've removed all error checking for brevity. However, for Microsoft WAV files (mono or stereo) the AVCodecContext::channel_layout member is always 0, despite both ffprobe and av_dump_format(..) returning valid information:
Input #0, wav, from '###/440_sine.wav':
Duration: 00:00:00.01, bitrate: 740 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 1 channels, s16, 705 kb/s
Also, codecCxt->channels returns the correct value. Using a flac file (with exactly the same audio data, generated from the same application) gives a channel_layout of 0x4 (AV_CH_FRONT_CENTER).
Your WAV file uses FFmpeg's pcm_s16le codec, which has no information on channel layout; you only get the number of channels. A lot of explanations can be found here.
You get the correct channel_layout with the flac file because FFmpeg's flac codec fills in this field. You can find the correspondence table in the libavcodec/flac.c file, in the flac_channel_layouts array.
If you need to fill channel_layout manually, you can call:
codecCxt->channel_layout = av_get_default_channel_layout( codecCxt->channels );
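In context (my framing, not from the answer), the fallback slots in right after opening the codec:
avcodec_open2( codecCxt, codec, nullptr );
if ( codecCxt->channel_layout == 0 ) // PCM decoders leave this unset
    codecCxt->channel_layout = av_get_default_channel_layout( codecCxt->channels );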