I have been using a recorder (based on the muxer example) satisfactorily for quite some time with various formats. Now I need to use uncompressed audio alongside MJPEG video, and I notice that the video speeds up considerably (about 10 times as fast) in the recorded file. The audio is fine, and if I use a compressed audio format (such as MP3) the video is fine as always. Does anyone have an idea why the video speeds up the moment I use uncompressed audio (CODEC_ID_PCM_S16LE)?
Related
I'm creating AVI videos from device-dependent bitmaps (DDBs).
The pipeline is quite simple: a GigE camera delivers frames one at a time, and each frame, a DDB, is piped to an ffmpeg process that creates the final AVI file using H.264 compression.
These videos are scientific in nature, and we would like to store/embed experimental hardware information, such as the states of a few digital lines, with each frame.
This information needs to be available in the final AVI video.
The question is: is this possible?
Looking at this: https://learn.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-bitmap it does not seem possible to add extra data to the DDBs themselves, but I'm not sure.
I have been given the task of building a decoder that generates exactly 1 raw audio frame for each raw video frame from an encoded MPEG-TS network stream, so that users of the API can call getFrames() and receive exactly these two frames.
Currently I read with av_read_frame in a thread and decode packets as they come, audio or video, collecting them until a video packet is hit. The problem is that multiple audio packets are generally received before a video packet is seen.
av_read_frame is blocking and returns once a certain amount of audio data has been collected (1152 samples for MP2). Decoding that packet gives a raw AVFrame with a duration of T (which depends on the sample rate), whereas a video frame generally has a duration larger than T (which depends on the fps), so multiple audio frames are received before it.
My guess is that I have to find a way to merge the collected audio frames into 1 single frame at the moment a video frame is hit. I suppose resampling and setting the timestamp to align with the video are also needed. I don't know if this is even valid, though.
What is the smoothest way to sync video and audio in this manner?
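The merging idea described above can be sketched as a sample FIFO: decoded audio goes in at whatever size the decoder produces, and exactly one video frame's worth of samples comes out per video frame. This is only a minimal illustration in Python under stated assumptions, not FFmpeg code: the class `AudioFifo` and its methods are hypothetical names, a single channel is assumed, and timestamps are not handled. In C, libavutil's AVAudioFifo plays a comparable buffering role.

```python
from fractions import Fraction


class AudioFifo:
    """Hypothetical helper: buffers PCM samples, serves one chunk per video frame."""

    def __init__(self, sample_rate, fps):
        # Exact number of audio samples per video frame, e.g. 48000 / 25 = 1920.
        self.per_frame = Fraction(sample_rate) / Fraction(fps)
        self.samples = []       # buffered samples (one channel, for brevity)
        self.frames_out = 0     # video frames served so far
        self.samples_out = 0    # samples handed out so far

    def push(self, decoded):
        """Append the samples of one decoded audio frame (e.g. 1152 for MP2)."""
        self.samples.extend(decoded)

    def pop_for_video_frame(self):
        """Return the audio covering the next video frame, or None if too few samples."""
        # Working from the cumulative target keeps rounding error from
        # accumulating when samples-per-frame is not an integer.
        target = int(self.per_frame * (self.frames_out + 1))
        need = target - self.samples_out
        if len(self.samples) < need:
            return None
        chunk, self.samples = self.samples[:need], self.samples[need:]
        self.frames_out += 1
        self.samples_out += need
        return chunk
```

With 48 kHz audio and 25 fps video, one 1152-sample MP2 frame is not enough for a video frame (1920 samples are needed), so the first `pop_for_video_frame()` returns None; after a second `push` the call returns 1920 samples and 384 remain buffered for the next video frame.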
Has anyone tried to modify the Cisco openh264 library to take JPEG images as input and compress them into P and I frames (output as frames, NOT video), and similarly to modify the decoder to take compressed P and I frames and generate uncompressed frames?
I have a camera looking at a static scene and taking a picture (1280x720) every 30 seconds. The scene is almost static. Currently I am using JPEG compression to compress each frame individually, which results in an image size of ~270KB. This compressed frame is transferred over the internet to a storage server. Since there is very little motion in the scene, the 'P' frame size will be very small (I think it should be ~20-50KB). So it would be very cost effective to transmit P frames over the internet instead of JPEG images.
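To put the expected saving in numbers, here is a small back-of-the-envelope sketch in Python. It uses only the figures quoted above (one frame every 30 seconds, ~270KB per JPEG, and the poster's estimated 20-50KB per inter-coded frame); the estimate is not a measurement, and the function name `daily_kb` is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
CAPTURE_INTERVAL_S = 30  # one picture every 30 seconds


def daily_kb(frame_kb):
    """Total KB transferred per day for a given per-frame size in KB."""
    frames_per_day = SECONDS_PER_DAY // CAPTURE_INTERVAL_S  # 2880 frames
    return frames_per_day * frame_kb


jpeg_per_day = daily_kb(270)      # 777600 KB, roughly 0.78 GB per day
estimate_low = daily_kb(20)       # 57600 KB
estimate_high = daily_kb(50)      # 144000 KB
```

If the estimate holds, the daily transfer would drop from about 778 MB to somewhere between 58 MB and 144 MB, ignoring the occasional larger keyframe.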
Can anyone guide me to some project or about how to proceed with this task ?
You are describing exactly what a codec does. It takes images and compresses them. Their relationship in time is irrelevant to the compression step. The decoder then decides whether to display them or just write them to disk. You don't need to modify openh264; what you want to do is exactly what it is designed to do.
Has anybody tried upsampling an audio stream from 8 kHz to 44.1 kHz?
I need to resample an 8 kHz input audio stream to 44.1 kHz, since the default audio output device on Mac OS X supports a minimum sampling rate of 44.1 kHz.
I tried upsampling with the FFmpeg swr_convert() API, but the converted audio contains a lot of noise, which is not acceptable.
If anybody has successfully upsampled 8 kHz to 44.1 kHz or 48 kHz, please share how.
A solution using C/C++ library code is preferable. I haven't tried the Core Audio samples.
I tried the swr_convert() code from the following link: https://www.ffmpeg.org/doxygen/2.1/group__lswr.html#details
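One detail of the linked example that is easy to get wrong when the rates differ this much is the size of the output buffer: the example computes it as av_rescale_rnd(swr_get_delay(...) + in_samples, out_rate, in_rate, AV_ROUND_UP). The sketch below reproduces only that arithmetic in Python for the 8 kHz to 44.1 kHz case. The function name `out_samples_needed` is hypothetical, and whether buffer sizing is actually the cause of the noise described here is an assumption, not something the post confirms.

```python
def out_samples_needed(in_samples, in_rate, out_rate, delay=0):
    """Upper bound on output samples for one conversion call.

    Mirrors the rounding-up rescale used in the libswresample example:
    ceil((delay + in_samples) * out_rate / in_rate), where `delay` is the
    number of samples still buffered inside the resampler, expressed at
    the input rate.
    """
    numerator = (delay + in_samples) * out_rate
    return -(-numerator // in_rate)  # integer ceiling division


# 20 ms of 8 kHz audio (160 samples) becomes 882 samples at 44.1 kHz.
samples_20ms = out_samples_needed(160, 8000, 44100)

# 1024 input samples need room for 5645 output samples, over 5x the input.
samples_1024 = out_samples_needed(1024, 8000, 44100)
```

An output buffer sized for the input sample count would be far too small at this ratio, so it is worth checking that first, along with the input and output sample formats passed to the resampler.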
Thanks,
Ramanand
I have two dump files, one of raw video and one of raw audio, from an encoder, and I want to be able to measure the "lip-sync". Imagine a video of a hammer striking an anvil. I want to go frame by frame and see that when the hammer finally hits the anvil, there is a spike in amplitude on the audio track.
Because of the speed at which everything happens, I cannot merely listen to the audio; I need to see the waveform in the time domain.
Are there any tools out there that will let me see both the video and audio?
If you are concerned with validating a decoder, then from a validation perspective the goal is generally to check the audio and video PTS values against a common real-time clock.
Raw YUV and PCM files do not include timestamps. If you know the frame rate and sample rate, you can use a raw YUV file viewer (I wrote my own) to figure out the time (from the start of the file) of a given frame in the video, and a tool like Audacity to figure out the time from the start of the file to the start of a tone in the audio file. This still may not tell you the whole story, since tools usually embed a delay between the audio and video in the TS/PS file. Or you can hook up an oscilloscope and go old school.
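The time calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the audio dump is mono signed 16-bit little-endian PCM, both dumps start at the same instant, and the frame index of the hammer strike has been found by eye in a YUV viewer. All function names are hypothetical.

```python
import struct


def decode_s16le(data):
    """Unpack raw signed 16-bit little-endian PCM bytes into a list of ints."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data[:count * 2]))


def first_spike(samples, threshold):
    """Index of the first sample whose magnitude reaches the threshold, or None."""
    for index, value in enumerate(samples):
        if abs(value) >= threshold:
            return index
    return None


def frame_time(frame_index, fps):
    """Seconds from the start of the file to the start of a video frame."""
    return frame_index / fps


def sample_time(sample_index, sample_rate):
    """Seconds from the start of the file to an audio sample."""
    return sample_index / sample_rate


def av_offset_ms(frame_index, fps, sample_index, sample_rate):
    """Audio time minus video time in milliseconds (positive: audio is late)."""
    return (sample_time(sample_index, sample_rate) - frame_time(frame_index, fps)) * 1000.0
```

For example, if the strike is visible in frame 50 of a 25 fps dump (2.0 s) and the spike starts at sample 96960 of a 48 kHz dump (2.02 s), `av_offset_ms(50, 25, 96960, 48000)` gives an audio lag of about 20 ms. As noted above, this does not account for any delay the muxer embeds between the two streams.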