Zstd decompression error - Unknown frame descriptor

I'm trying to decompress a .zst file the following way:
public byte[] decompress() throws IOException {
    byte[] compressedBytes = Files.readAllBytes(Paths.get(PATH_TO_ZST));
    final long size = Zstd.decompressedSize(compressedBytes);
    return Zstd.decompress(compressedBytes, (int) size);
}
and I'm running into this:
com.github.luben.zstd.ZstdException: Unknown frame descriptor
    at com.github.luben.zstd.ZstdDecompressCtx.decompressByteArray(ZstdDecompressCtx.java:157)
    at com.github.luben.zstd.ZstdDecompressCtx.decompress(ZstdDecompressCtx.java:214)
Has anyone faced something similar? Thanks!

That error means zstd doesn't recognize the first 4 bytes of the frame. This can happen because either:
The data is not in zstd format, or
there is excess data at the end of the zstd frame.
You'll also want to check the output of Zstd.decompressedSize() for 0, which means either the frame is corrupted or the size wasn't present in the frame header. See the documentation.
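For reference, here is what the same checks look like against the zstd C API (which the Java binding wraps); this is a minimal sketch, with the error handling as placeholders:

#include <zstd.h>
#include <stdexcept>
#include <vector>

std::vector<char> decompressChecked(const std::vector<char>& src) {
    // "Unknown frame descriptor" means the first 4 bytes were not the
    // zstd frame magic (0xFD2FB528).
    unsigned long long size = ZSTD_getFrameContentSize(src.data(), src.size());
    if (size == ZSTD_CONTENTSIZE_ERROR)
        throw std::runtime_error("not a zstd frame (wrong format or trailing garbage)");
    if (size == ZSTD_CONTENTSIZE_UNKNOWN) // Java's Zstd.decompressedSize() returns 0 here
        throw std::runtime_error("frame header carries no content size; use the streaming API");
    std::vector<char> dst(size);
    size_t written = ZSTD_decompress(dst.data(), dst.size(), src.data(), src.size());
    if (ZSTD_isError(written))
        throw std::runtime_error(ZSTD_getErrorName(written));
    dst.resize(written);
    return dst;
}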

Related

Ffmpeg video output is 0 seconds with correct filesize when uploading to google cloud bucket

I've made a C++ program that lives in GKE and takes some videos as input using ffmpeg, then does something with that input using OpenGL (not relevant), then finally encodes those edited videos as a single output. Normally the program works perfectly fine on my local machine; it encodes just as I want it to, with no warnings or valgrind errors whatsoever. Then, after encoding the said video, I want my program to upload it to Google Cloud Storage. This is where the problem comes in. I have tried 2 methods for this: first, using curl to upload to the cloud with a signed URL; second, mounting the Google storage bucket using gcsfuse (I was already mounting the bucket to access the inputs in question). Both of those methods yielded undefined, weird behaviours ranging from: outputting a 0-byte or 44-byte file; (this is the most common one) encoding the correct file size of ~500 MB but a video that is 0 seconds long; outputting a 0.4-second video; or just encoding the desired output normally (really rare).
From the logs I can't see anything unusual; everything seems to work fine, and neither ffmpeg nor valgrind gives any errors or warnings. Even when I use curl to upload the video to the cloud, the output is perfectly fine when it is first encoded (before sending it with curl), but the video gets messed up when curl uploads it to the cloud.
I'm using the muxing.c example of ffmpeg to encode my video with the only difference being:
void video_encoder::fill_yuv_image(AVFrame *frame, struct SwsContext *sws_context) {
    const int in_linesize[1] = { 4 * width };
    //uint8_t* dest[4] = { rgb_data, NULL, NULL, NULL };
    sws_context = sws_getContext(
        width, height, AV_PIX_FMT_RGBA,
        width, height, AV_PIX_FMT_YUV420P,
        SWS_BICUBIC, 0, 0, 0);
    sws_scale(sws_context, (const uint8_t * const *)&rgb_data, in_linesize, 0,
              height, frame->data, frame->linesize);
}
rgb_data is the data I got after editing the inputs. Again, this works fine and I don't think there are any errors here.
I'm not sure where the error is, and since the code is huge I can't provide a reproducible example. I'm just looking for someone to point me in the right direction.
Running the cloud's output in mplayer yields this result (this is when the video is the right size but is 0 seconds long, the most common case):
MPlayer 1.4 (Debian), built with gcc-11 (C) 2000-2019 MPlayer Team
do_connect: could not connect to socket
connect: No such file or directory
Failed to open LIRC support. You will not be able to use your remote control.
Playing /media/c36c2633-d4ee-4d37-825f-88ae54b86100.
libavformat version 58.76.100 (external)
libavformat file format detected.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f2cba1168e0]moov atom not found
LAVF_header: av_open_input_stream() failed
libavformat file format detected.
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f2cba1168e0]moov atom not found
LAVF_header: av_open_input_stream() failed
RAWDV file format detected.
VIDEO: [DVSD] 720x480 24bpp 29.970 fps 0.0 kbps ( 0.0 kbyte/s)
X11 error: BadMatch (invalid parameter attributes)
Failed to open VDPAU backend libvdpau_nvidia.so: cannot open shared object file: No such file or directory
[vdpau] Error when calling vdp_device_create_x11: 1
==========================================================================
Opening video decoder: [ffmpeg] FFmpeg's libavcodec codec family
libavcodec version 58.134.100 (external)
[dvvideo @ 0x7f2cb987a380]Requested frame threading with a custom get_buffer2() implementation which is not marked as thread safe. This is not supported anymore, make your callback thread-safe.
Selected video codec: [ffdv] vfm: ffmpeg (FFmpeg DV)
==========================================================================
Load subtitles in /media/
==========================================================================
Opening audio decoder: [libdv] Raw DV Audio Decoder
Unknown/missing audio format -> no sound
ADecoder init failed :(
Opening audio decoder: [ffmpeg] FFmpeg/libavcodec audio decoders
[dvaudio @ 0x7f2cb987a380]Decoder requires channel count but channels not set
Could not open codec.
ADecoder init failed :(
ADecoder init failed :(
Cannot find codec for audio format 0x56444152.
Audio: no sound
Starting playback...
[dvvideo @ 0x7f2cb987a380]could not find dv frame profile
Error while decoding frame!
[dvvideo @ 0x7f2cb987a380]could not find dv frame profile
Error while decoding frame!
V: 0.0 2/ 2 ??% ??% ??,?% 0 0
Exiting... (End of file)
Edit: Since the code runs on a VM, I'm using xvfb-run to start my application, but again, even when using xvfb-run it works completely fine when not encoding to the cloud.
Apparently (I'm assuming for security reasons) Google Cloud Storage does not allow multiple continuous operations on a file, just a single read/write operation; notably, the mp4 muxer has to seek back to the start of the file to write the moov atom when encoding finishes, which would explain the "moov atom not found" errors above. So I found a workaround: encode my video to a local file inside the pod and then do a single copy operation to the cloud.
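A minimal sketch of that workaround, with hypothetical paths and a stand-in encode_video() for the existing ffmpeg pass:

#include <filesystem>
#include <string>

void encode_video(const std::string& path); // stand-in for the existing encoder

// Encode to pod-local disk first (ffmpeg can seek freely there, e.g. to write
// the moov atom at trailer time), then push the finished file to the gcsfuse
// mount in one sequential copy.
void encode_then_upload() {
    const std::filesystem::path local = "/tmp/output.mp4";          // hypothetical
    const std::filesystem::path bucket = "/mnt/bucket/output.mp4";  // gcsfuse mount, hypothetical
    encode_video(local.string());
    std::filesystem::copy_file(local, bucket,
                               std::filesystem::copy_options::overwrite_existing);
}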

How do I properly unwrap FLV video into raw and valid h264 segments for gstreamer buffers?

I have written an RTMP server in Rust that successfully allows RTMP publishers to connect and push a video stream, and RTMP clients to connect and watch those video streams.
When a video RTMP packet comes in, I attempt to unwrap the video from the FLV container via:
// TODO: The FLV spec has the AVCPacketType and composition time as the first parts of the
// AVCVIDEOPACKET. It's unclear if these two fields are part of h264 or are FLV specific.
let flv_tag = data.split_to(1);
let is_sequence_header;
let codec = if flv_tag[0] & 0x07 == 0x07 {
    is_sequence_header = data[0] == 0x00;
    VideoCodec::H264
} else {
    is_sequence_header = false;
    VideoCodec::Unknown
};
let is_keyframe = flv_tag[0] & 0x10 == 0x10;
After this runs, data contains the AVCVIDEOPACKET with the FLV tag removed. When I send this video to other RTMP clients I just prepend the correct FLV tag to it and send it off.
Now I am trying to pass the video packets to gstreamer in order to do in-process transcoding. To do this I set up an appsrc ! avdec_h264 pipeline, and gave the appsrc component the following caps:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "nal")
        .field("stream-format", "byte-stream")
        .build(),
));
Now when an RTMP publisher sends a video packet, I take the (attempted) unwrapped video packet and pass it to my appsrc via
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len()).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);
        let mut samples = buffer.map_writable().unwrap();
        {
            let samples = samples.as_mut_slice();
            for index in 0..data.len() {
                samples[index] = data[index];
            }
        }
    }
    self.video_source.push_buffer(buffer).unwrap();
}
When this occurs, the following gstreamer debug output appears:
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #0 (is_sequence_header:true, is_keyframe=true)
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Connection 63397d56-16fb-4b54-a622-d991b5ad2d8e sent audio data
0:00:05.531722000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event bytes segment start=0, offset=0, stop=-1, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0, base=0, position 0, duration -1
0:00:05.533525000 7516 000001C0C04011C0 INFO basesrc gstbasesrc.c:3018:gst_base_src_loop:<video_source> marking pending DISCONT
0:00:05.535385000 7516 000001C0C04011C0 WARN videodecoder gstvideodecoder.c:2818:gst_video_decoder_chain:<video_decode> Received buffer without a new-segment. Assuming timestamps start from 0.
0:00:05.537381000 7516 000001C0C04011C0 INFO GST_EVENT gstevent.c:973:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:00.000000000, duration 99:99:99.999999999
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #1 (is_sequence_header:false, is_keyframe=true)
0:00:05.563445000 7516 000001C0C04011C0 INFO libav :0:: Invalid NAL unit 0, skipping.
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #2 (is_sequence_header:false, is_keyframe=false)
0:00:05.579274000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.581338000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.583337000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
[2022-02-09T18:25:15Z INFO gstreamer_mmids_scratchpad] Pushing packet #3 (is_sequence_header:false, is_keyframe=false)
0:00:05.595253000 7516 000001C0C04011C0 ERROR libav :0:: No start code is found.
0:00:05.597204000 7516 000001C0C04011C0 ERROR libav :0:: Error splitting the input into NAL units.
0:00:05.599262000 7516 000001C0C04011C0 WARN libav gstavviddec.c:2068:gst_ffmpegviddec_handle_frame:<video_decode> Failed to send data for decoding
Based on this I figured this might be caused by the non-data portions of the AVCVIDEOPACKET not being part of the h264 flow, but an FLV specific flow. So I tried ignoring the first 4 bytes (AVCPacketType and CompositionTime fields) of each packet I wrote to the buffer:
pub fn push_video(&self, data: Bytes, timestamp: RtmpTimestamp) {
    let mut buffer = Buffer::with_size(data.len() - 4).unwrap();
    {
        let buffer = buffer.get_mut().unwrap();
        buffer.set_pts(ClockTime::MSECOND * timestamp.value as u64);
        let mut samples = buffer.map_writable().unwrap();
        {
            let samples = samples.as_mut_slice();
            for index in 4..data.len() {
                samples[index - 4] = data[index];
            }
        }
    }
    self.video_source.push_buffer(buffer).unwrap();
}
This essentially gave me the same logging output and errors. This is reproducible with the h264parse plugin as well.
What am I missing in the unwrapping process to pass raw h264 video to gstreamer?
Edit:
Realizing I misread the pad template, I tried the following caps instead:
video_source.set_caps(Some(
    &Caps::builder("video/x-h264")
        .field("alignment", "au")
        .field("stream-format", "avc")
        .build(),
));
This also failed with pretty similar output.
I think I finally figured this out.
The first thing I needed to do was remove the AVCVIDEOPACKET headers (the packet type and composition time fields). These are not part of the h264 format and thus cause parsing errors.
The second thing I needed to do was to not pass the sequence header as a buffer to the source. Instead, the sequence header bytes need to be set as the codec_data field on the appsrc's caps. This now allows for no parsing errors when passing the video data to h264parse, and even gives me a correctly sized window.
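Their code is Rust, but in GStreamer C-API terms the same caps setup looks roughly like this (a sketch; seq_hdr/seq_len are hypothetical names for the AVCDecoderConfigurationRecord bytes that arrived with AVCPacketType == 0):

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

// Hang the FLV sequence header on the caps as codec_data instead of pushing
// it through the pipeline as a normal buffer.
static void set_h264_caps(GstElement *video_source, const guint8 *seq_hdr, gsize seq_len) {
    GstBuffer *codec_data = gst_buffer_new_allocate(NULL, seq_len, NULL);
    gst_buffer_fill(codec_data, 0, seq_hdr, seq_len);
    GstCaps *caps = gst_caps_new_simple("video/x-h264",
        "stream-format", G_TYPE_STRING, "avc",  /* length-prefixed NALUs, as FLV carries them */
        "alignment",     G_TYPE_STRING, "au",
        "codec_data",    GST_TYPE_BUFFER, codec_data,
        NULL);
    gst_app_src_set_caps(GST_APP_SRC(video_source), caps);
    gst_caps_unref(caps);
    gst_buffer_unref(codec_data);
}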
The third thing I was missing was the correct dts and pts values. It turns out the RTMP timestamp I'm given is the dts, and pts = AVCVIDEOPACKET.CompositionTime + dts.
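Putting the three pieces together, the FLV-side unwrap looks roughly like this; a sketch against the FLV spec, not their exact Rust code, taking the full FLV VideoData tag body as input:

#include <cstdint>
#include <cstddef>

struct FlvAvcPacket {
    bool keyframe;
    bool sequence_header;     // AVCDecoderConfigurationRecord -> caps codec_data
    int32_t composition_time; // ms; pts = dts (RTMP timestamp) + composition_time
    const uint8_t* payload;   // length-prefixed NAL units ("stream-format=avc")
    size_t payload_len;
};

bool parse_flv_avc(const uint8_t* d, size_t len, FlvAvcPacket* out) {
    if (len < 5 || (d[0] & 0x0f) != 7)       // low nibble: codec id, 7 = AVC/H.264
        return false;
    out->keyframe = (d[0] >> 4) == 1;        // high nibble: frame type, 1 = keyframe
    out->sequence_header = (d[1] == 0);      // AVCPacketType: 0 = sequence header, 1 = NALU
    int32_t cts = (d[2] << 16) | (d[3] << 8) | d[4]; // CompositionTime: SI24, big-endian
    if (cts & 0x800000) cts -= 0x1000000;    // sign-extend the 24-bit value
    out->composition_time = cts;
    out->payload = d + 5;                    // everything after the 5 header bytes
    out->payload_len = len - 5;
    return true;
}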

Media Foundation video re-encoding producing audio stream sync offset

I'm attempting to write a simple Windows Media Foundation command line tool to use IMFSourceReader and IMFSinkWriter to load in a video, read the video and audio as uncompressed streams and re-encode them to H.264/AAC with some specific hard-coded settings.
The simple program Gist is here
sample video 1
sample video 2
sample video 3
(Note: the videos I've been testing with are all stereo, 48000 Hz sample rate)
The program works; however, in some cases when comparing the newly outputted video to the original in an editing program, I see that the copied video streams match, but the audio stream of the copy is prefixed with some amount of silence and the audio is offset, which is unacceptable in my situation.
audio samples:
original - |[audio1] [audio2] [audio3] [audio4] [audio5] ... etc
copy - |[silence] [silence] [silence] [audio1] [audio2] [audio3] ... etc
In cases like this the first video frames coming in have a non-zero timestamp but the first audio frames have a timestamp of 0.
I would like to be able to produce a copied video whose first frames from the video and audio streams are at 0, so I first attempted to subtract that initial timestamp (videoOffset) from all subsequent video frames, which produced the video I wanted but resulted in this situation with the audio:
original - |[audio1] [audio2] [audio3] [audio4] [audio5] ... etc
copy - |[audio4] [audio5] [audio6] [audio7] [audio8] ... etc
The audio track is now shifted in the other direction by a small amount and still doesn't align. This can also sometimes happen even when a video stream does have a starting timestamp of 0: WMF still cuts off some audio samples at the beginning anyway (see sample video 3)!
I've been able to fix this sync alignment and offset the video stream to start at 0 with the following code inserted at the point of passing the audio sample data to the IMFSinkWriter:
//inside read sample while loop
...
// LONGLONG llDuration has the currently read sample duration
// DWORD audioOffset has the global audio offset, starts as 0
// LONGLONG audioFrameTimeStamp has the currently read sample timestamp

//add some random amount of silence in intervals of 1024 samples
static bool runOnce{ false };
if (!runOnce)
{
    size_t numberOfSilenceBlocks = 1; //how to derive how many I need!? It's arbitrary
    size_t samples = 1024 * numberOfSilenceBlocks;
    audioOffset = samples * 10000000 / audioSamplesPerSecond;
    std::vector<uint8_t> silence(samples * audioChannels * bytesPerSample, 0);
    WriteAudioBuffer(silence.data(), silence.size(), audioFrameTimeStamp, audioOffset);
    runOnce = true;
}

LONGLONG audioTime = audioFrameTimeStamp + audioOffset;
WriteAudioBuffer(dataPtr, dataSize, audioTime, llDuration);
Oddly, this creates an output video file that matches the original.
original - |[audio1] [audio2] [audio3] [audio4] [audio5] ... etc
copy - |[audio1] [audio2] [audio3] [audio4] [audio5] ... etc
The solution was to insert extra silence in block sizes of 1024 samples at the beginning of the audio stream. It doesn't matter what the audio chunk sizes provided by IMFSourceReader are; the padding is in multiples of 1024. (At 48000 Hz, one 1024-sample block is 1024 * 10,000,000 / 48000, roughly 213,333 hns or about 21.3 ms.)
My problem is that there seems to be no detectable reason for the silence offset. Why do I need it? How do I know how much I need? I stumbled across the 1024-sample silence block solution after days of fighting this problem.
Some videos seem to only need 1 padding block, some need 2 or more, and some need no extra padding at all!
My questions here are:
Does anyone know why this is happening?
Am I using Media Foundation incorrectly in this situation, causing this?
If I am correct, how can I use the video metadata to determine if I need to pad an audio stream, and how many 1024-sample blocks of silence the pad needs?
EDIT:
For the sample videos above:
sample video 1: the video stream starts at 0 and needs no extra blocks; passthrough of the original data works fine.
sample video 2: the video stream starts at 834166 (hns) and needs one 1024-sample block of silence to sync.
sample video 3: the video stream starts at 0 and needs two 1024-sample blocks of silence to sync.
UPDATE:
Other things I have tried:
Increasing the duration of the first video frame to account for the offset: Produces no effect.
I wrote another version of your program to handle the NV12 format correctly (yours was not working):
EncodeWithSourceReaderSinkWriter
I use Blender as my video editing tool. Here are my results with Tuning_against_a_window.mov, from the bottom to the top:
Original file
Encoded file
The original file after I changed it by setting the "elst" atoms' number of entries to 0 (I used the Visual Studio hex editor)
Like Roman R. said, the Media Foundation mp4 source doesn't use the "edts/elst" atoms. But Blender and your video editing tools do. Also, the "tmcd" track is ignored by the mp4 source.
"edts/elst" :
Edits Atom ( 'edts' )
Edit lists can be used for hint tracks...
MPEG-4 File Source
The MPEG-4 file source silently ignores hint tracks.
So in fact, the encoding is good. I think there is no audio stream sync offset when comparing against the real audio/video data. For example, you can add "edts/elst" atoms to the encoded file to get the same result.
PS: on the encoded file, I added "edts/elst" for both the audio and video tracks. I also increased the sizes of the trak atoms and the moov atom. I can confirm that Blender shows the same waveform for both the original and the encoded file.
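If it helps, this is roughly what a minimal version-0 'elst' box looks like when built by hand (a sketch against ISO/IEC 14496-12; the put_be32/put_be16 helpers are hypothetical). The box has to be wrapped in an 'edts' box inside 'trak', which is why the enclosing trak/moov sizes had to be patched as well:

#include <cstdint>
#include <vector>

// Hypothetical big-endian writers for the sketch.
static void put_be32(std::vector<uint8_t>& b, uint32_t v) {
    b.push_back(v >> 24); b.push_back(v >> 16); b.push_back(v >> 8); b.push_back(v);
}
static void put_be16(std::vector<uint8_t>& b, uint16_t v) {
    b.push_back(v >> 8); b.push_back(v);
}

std::vector<uint8_t> make_elst(uint32_t segment_duration, int32_t media_time) {
    std::vector<uint8_t> b;
    put_be32(b, 28);                   // box size: 8 header + 4 ver/flags + 4 count + 12 entry
    put_be32(b, 0x656C7374);           // 'elst'
    put_be32(b, 0);                    // version 0, flags 0
    put_be32(b, 1);                    // entry_count
    put_be32(b, segment_duration);     // in movie (mvhd) timescale units
    put_be32(b, (uint32_t)media_time); // in track (mdhd) timescale units; -1 = empty edit
    put_be16(b, 1);                    // media_rate_integer = 1.0
    put_be16(b, 0);                    // media_rate_fraction
    return b;
}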
EDIT
I tried to understand the relation between the mvhd/tkhd/mdhd/elst atoms in the 3 video samples. (Yes, I know, I should read the spec. But I'm lazy...)
You can use an mp4 explorer tool to get the atoms' values, or use the mp4 parser from my H264Dxva2Decoder project:
H264Dxva2Decoder
Tuning_against_a_window.mov
elst (media time) from tkhd video : 20689
elst (media time) from tkhd audio : 1483
GREEN_SCREEN_ANIMALS__ALPACA.mp4
elst (media time) from tkhd video : 2002
elst (media time) from tkhd audio : 1024
GOPR6239_1.mov
elst (media time) from tkhd video : 0
elst (media time) from tkhd audio : 0
As you can see, with GOPR6239_1.mov the media time from elst is 0. That's why there is no video/audio sync problem with this file.
For Tuning_against_a_window.mov and GREEN_SCREEN_ANIMALS__ALPACA.mp4, I tried to calculate the video/audio offset.
I modified my project to take this into account:
EncodeWithSourceReaderSinkWriter
For now, I haven't found a generic calculation for all files.
I just found the video/audio offset needed to encode both files correctly.
For Tuning_against_a_window.mov, I begin encoding after (movie time - video/audio mdhd time).
For GREEN_SCREEN_ANIMALS__ALPACA.mp4, I begin encoding after the video/audio elst media time.
It's OK, but I still need to find the single correct calculation for all files.
So you have 2 options:
encode the file and add the elst atom
encode the file using the right offset calculation
It depends on your needs:
The first option permits you to keep the original file, but you have to add the elst atom.
With the second option, you have to read the atoms from the file before encoding, and the encoded file will lose a few original frames.
If you choose the first option, I will explain how I add the elst atom.
PS: I'm interested in this question, because in my H264Dxva2Decoder project the edts/elst atom is on my todo list.
I parse it, but I don't use it...
PS2 : this link sounds interesting :
Audio Priming - Handling Encoder Delay in AAC
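Tying that link to the elst numbers above: AAC encoders add priming (encoder delay) in whole 1024-sample frames, and the elst media_time is expressed in the track's mdhd timescale. A small sketch of the conversion; the 48000 Hz timescale is an assumption taken from the question's note that the test videos are 48 kHz:

// Convert an 'elst' media_time (track mdhd timescale units) to the 100-ns
// units Media Foundation uses.
long long media_time_to_hns(long long media_time, unsigned int mdhd_timescale) {
    return media_time * 10000000LL / mdhd_timescale;
}

// e.g. an audio elst media_time of 1024 with a 48000 Hz audio timescale:
// 1024 * 10000000 / 48000 = 213333 hns (~21.3 ms), i.e. exactly one
// 1024-sample AAC priming block -- the same unit as the silence padding.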

ffmpeg-based multi-threaded C++ application fails on decoding

I am using the remuxing example from the ffmpeg sources as a reference. I wrote a multi-threaded application based on boost threads to perform a codec copy and remux using the ffmpeg API. That works fine. The problem arises when I try to decode the frame:
ret = avcodec_decode_video2(dec_ctx, frame, &got_frame, &pkt);
if (ret < 0) {
    av_log(NULL, AV_LOG_ERROR, "Error decoding video %s\n", av_make_error_string(errorBuff, 80, ret));
    return -1;
}
I need the decoded frame to convert it to an OpenCV Mat object. For a single instance this code works fine, but as soon as I run multiple threads I start getting decoding errors like these:
left block unavailable for requested intra mode at 0 0
[h264 @ 0x7f9a48115100] error while decoding MB 0 0, bytestream 1479
[h264 @ 0x7f9a480825e0] number of reference frames (0+2) exceeds max (1; probably corrupt input), discarding one
[h264 @ 0x7f9a480ae680] error while decoding MB 13 5, bytestream -20
[h264 @ 0x7f9a48007700] number of reference frames (0+2) exceeds max (1; probably corrupt input), discarding one
[h264 @ 0x7f9a48110340] top block unavailable for requested intra4x4 mode -1 at 31 0
[h264 @ 0x7f9a48110340] error while decoding MB 31 0, bytestream 1226
[h264 @ 0x7f9a48115100] number of reference frames (0+2) exceeds max (1; probably corrupt input), discarding one
[h264 @ 0x7f9a480825e0] top block unavailable for requested intra4x4 mode -1 at 4 0
[h264 @ 0x7f9a480825e0] error while decoding MB 4 0, bytestream 1292
[h264 @ 0x7f9a480ae680] number of reference frames (0+2) exceeds max (1; probably corrupt input), discarding one
All variables used by the ffmpeg API are declared local to the thread function. I am not sure how ffmpeg's frame or context allocations work.
Any help in making the decoding process multi-threaded?
Update:
I have included ff_lockmgr:
static int ff_lockmgr(void **mutex, enum AVLockOp op)
{
    pthread_mutex_t** pmutex = (pthread_mutex_t**) mutex;
    switch (op) {
    case AV_LOCK_CREATE:
        *pmutex = (pthread_mutex_t*) malloc(sizeof(pthread_mutex_t));
        pthread_mutex_init(*pmutex, NULL);
        break;
    case AV_LOCK_OBTAIN:
        pthread_mutex_lock(*pmutex);
        break;
    case AV_LOCK_RELEASE:
        pthread_mutex_unlock(*pmutex);
        break;
    case AV_LOCK_DESTROY:
        pthread_mutex_destroy(*pmutex);
        free(*pmutex);
        break;
    }
    return 0;
}
and registered it as well: av_lockmgr_register(ff_lockmgr);
Now the video is being decoded in all threads, BUT the images saved from the decoded frames (using FFmpeg AVFrame to OpenCV Mat conversion and imwrite) come out garbled (mixed). Part of the frame is from one camera and the rest is from another, or the image doesn't make any sense at all.
Not every format decoder supports multiple threads, and even for the decoders that support it, it might not be supported for a particular file.
For example, consider an MPEG4 file with a single keyframe at the beginning, followed by P-frames. In this case every frame depends on the previous one, and using multiple threads would not likely produce any benefits.
In my app I had to disable multithreaded encoders because of that.
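For the garbled frames in the update: FFmpeg format and codec contexts are not safe to share across threads, so the usual fix is to keep every FFmpeg object thread-local and share nothing between the boost threads except the registered lock manager. A minimal per-thread sketch, using the same era API as the question (the url parameter is a hypothetical per-camera input; av_register_all() and av_lockmgr_register() are assumed to run once in main()):

extern "C" {
#include <libavformat/avformat.h>
}

// Each thread runs this with its own input; no AVFormatContext,
// AVCodecContext, or AVFrame is ever shared between threads.
void decode_thread(const char* url) {
    AVFormatContext* fmt = NULL;
    if (avformat_open_input(&fmt, url, NULL, NULL) < 0) return;
    if (avformat_find_stream_info(fmt, NULL) < 0) { avformat_close_input(&fmt); return; }
    int vs = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (vs < 0) { avformat_close_input(&fmt); return; }
    AVCodecContext* dec = fmt->streams[vs]->codec;   // thread-local decoder context
    avcodec_open2(dec, avcodec_find_decoder(dec->codec_id), NULL);
    AVFrame* frame = av_frame_alloc();
    AVPacket pkt;
    int got_frame = 0;
    while (av_read_frame(fmt, &pkt) >= 0) {
        if (pkt.stream_index == vs &&
            avcodec_decode_video2(dec, frame, &got_frame, &pkt) >= 0 && got_frame) {
            // convert frame->data/linesize to a per-thread cv::Mat here
        }
        av_free_packet(&pkt);
    }
    av_frame_free(&frame);
    avcodec_close(dec);
    avformat_close_input(&fmt);
}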

FMOD using excessive codec memory with non-streamed samples

I'm currently using FMOD Ex 4.40.10. We load OGG samples like this:
uint uiFlags(FMOD_SOFTWARE | FMOD_LOWMEM | FMOD_CREATESAMPLE);
FMOD::Sound* pSound(NULL);
m_pFMODSystem->createSound(sPath, uiFlags, NULL, &pSound);
When looking at the memory usage of this sound via:
FMOD_MEMORY_USAGE_DETAILS usage;
pSound->getMemoryInfo(FMOD_MEMBITS_ALL, 0, NULL, &usage);
usage.codec reports a value greater than 0. This doesn't make sense to me, as the FMOD documentation states that FMOD_MEMORY_USAGE_DETAILS::codec is:
codec
[out] Codecs allocated for streaming
As you can see from how sounds are loaded, there should be no streaming.
With multiple OGG files loaded, when I query the system's memory usage it shows codec being a large number: all of the individual files' codec usages added together. The memory numbers being reported by FMOD match the memory usage that I see from my own memory profiling.
When I load raw PCM data, usage.codec reports as 0.
Why is "codec" greater than 0 when I'm loading non-streaming OGG files? Is there a way to disable this memory usage?
Edit: As a test, after loading the OGG, I extract the PCM data and have FMOD create a new sound. I then free the sound made from the OGG and replace it with the new sound loaded from the PCM data. This works flawlessly. This is further evidence that the codec memory it has allocated is unnecessary.
http://www.fmod.org/forum/viewtopic.php?f=7&t=15762
FMOD confirmed my findings and said the workaround of loading the compressed sound, extracting the PCM data, unloading the sound, and then creating a new one from the extracted PCM data is fine.
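For reference, a sketch of that confirmed workaround against the FMOD Ex API; error checking is omitted, and the whole sample is assumed to land in ptr1:

FMOD::Sound* pOgg = NULL;
m_pFMODSystem->createSound(sPath, FMOD_SOFTWARE | FMOD_LOWMEM | FMOD_CREATESAMPLE, NULL, &pOgg);

// Pull the decoded PCM back out of the in-memory sample.
unsigned int lenBytes = 0;
pOgg->getLength(&lenBytes, FMOD_TIMEUNIT_PCMBYTES);
void *ptr1 = NULL, *ptr2 = NULL;
unsigned int len1 = 0, len2 = 0;
pOgg->lock(0, lenBytes, &ptr1, &ptr2, &len1, &len2);

// Describe the raw PCM so createSound can rebuild it without a codec.
float freq = 0.0f;
int channels = 0;
FMOD_SOUND_FORMAT format = FMOD_SOUND_FORMAT_NONE;
pOgg->getDefaults(&freq, NULL, NULL, NULL);
pOgg->getFormat(NULL, &format, &channels, NULL);

FMOD_CREATESOUNDEXINFO exinfo = { sizeof(exinfo) }; // cbsize first, rest zeroed
exinfo.length = len1;
exinfo.numchannels = channels;
exinfo.defaultfrequency = (int)freq;
exinfo.format = format;

FMOD::Sound* pPcm = NULL;
m_pFMODSystem->createSound((const char*)ptr1,
    FMOD_SOFTWARE | FMOD_CREATESAMPLE | FMOD_OPENMEMORY | FMOD_OPENRAW,
    &exinfo, &pPcm);

pOgg->unlock(ptr1, ptr2, len1, len2);
pOgg->release(); // frees the per-sound codec allocation reported in usage.codec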