I'm currently writing a small C++ application that uses the FFmpeg libraries (especially avformat and swresample) to decode audio files.
Now I need the total number of samples in an audio stream. I know the exact number can only be determined by actually decoding all the frames; I just need an estimate. What is the preferred method here? How can I find out the duration of a file?
There's some good info in this question about how to get info out of ffmpeg: FFMPEG Can't Display The Duration Of a Video.
To work out the number of samples in an audio stream, you need three basic bits of info:
The duration (in seconds)
The sample rate (in samples per second)
The number of channels in the stream (e.g. 2 for stereo)
Once you have that info, the total number of samples in your stream is simply [duration] * [rate] * [channels].
Note that this is not equivalent to bytes, as each sample is likely to be at least 16 bits wide, and possibly 24.
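For reference, here is a minimal sketch of that lookup with libavformat (a sketch only: it assumes a recent FFmpeg where the channel count lives in codecpar->ch_layout, on older releases use codecpar->channels, and error handling is abbreviated):
extern "C" {
#include <libavformat/avformat.h>
#include <libavutil/rational.h>
}
#include <cstdint>

// Estimate the total sample count of the best audio stream in a file.
int64_t estimate_total_samples(const char *path) {
    AVFormatContext *fmt = nullptr;
    if (avformat_open_input(&fmt, path, nullptr, nullptr) < 0) return -1;
    if (avformat_find_stream_info(fmt, nullptr) < 0) { avformat_close_input(&fmt); return -1; }
    int idx = av_find_best_stream(fmt, AVMEDIA_TYPE_AUDIO, -1, -1, nullptr, 0);
    if (idx < 0) { avformat_close_input(&fmt); return -1; }
    AVStream *st = fmt->streams[idx];
    // Duration in seconds: prefer the stream's duration (in time_base units),
    // fall back to the container's duration (in AV_TIME_BASE units).
    double seconds = 0.0;
    if (st->duration != AV_NOPTS_VALUE)
        seconds = st->duration * av_q2d(st->time_base);
    else if (fmt->duration != AV_NOPTS_VALUE)
        seconds = fmt->duration / (double)AV_TIME_BASE;
    int64_t total = (int64_t)(seconds * st->codecpar->sample_rate)
                    * st->codecpar->ch_layout.nb_channels;
    avformat_close_input(&fmt);
    return total; // [duration] * [rate] * [channels]
}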
I believe what you need is the formula AUDIORATE / FRAMERATE. For instance, if ar=48000 and the video's frame rate is, say, 50 fps, then 48000 / 50 = 960 samples per frame.
The buffer size is then calculated as samples_per_frame * nChannels * (audiobit / 8).
The audio bit depth is usually 16 bits (24 or 32 bits are also possible). So for 8-channel 16-bit audio at 48 kHz, you'll need 960 * 8 * 2 = 15360 bytes per audio frame.
The official way to do this last calculation is to use the av_samples_get_buffer_size() function:
av_samples_get_buffer_size(NULL, nChannels, SamplesPerFrame, audio_st->codec->sample_fmt, 0)
For example,
av_samples_get_buffer_size(NULL, 8, 960, audio_st->codec->sample_fmt, 0)
will also return 15360 (for the experts: yes, I'm assuming the format is pcm_s16le).
So this answers the first part of your question. Hope that helps.
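Put together as a compilable sketch (AV_SAMPLE_FMT_S16 stands in here for the stream's actual sample format, which is an assumption):
extern "C" {
#include <libavutil/samplefmt.h>
}
#include <cstdio>

int main() {
    int sample_rate = 48000;
    int frame_rate  = 50;                             // video frame rate
    int channels    = 8;
    int samples_per_frame = sample_rate / frame_rate; // 960
    // Bytes for one interleaved audio frame; align = 0 picks the default.
    int bytes = av_samples_get_buffer_size(nullptr, channels, samples_per_frame,
                                           AV_SAMPLE_FMT_S16, 0);
    std::printf("%d bytes per audio frame\n", bytes); // prints 15360
    return 0;
}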
Related
I am trying to make a .mp4 file from the audio and video RTP packets of an IP camera.
When the audio format in the camera is configured as MPEG-2 ADTS, I receive RTP packets with a payload size of 144 bytes, but these are made up of two audio frames of 72 bytes each.
However, when I configure the format as MPEG-1, the payload is made up of only one audio frame.
What is the reason for this difference? Could I get this information from some bits of the payload, as I do for the bitrate, sample rate, etc.? I have read that the theoretical packet size is 144 bytes, so how could I retrieve the frame size and the number of frames in the packet?
Besides, in order to calculate the theoretical frame duration, I am using the following formula:
time = (framesize * 8) / bitrate    (framesize in bytes)
This works well for MPEG-2 with different combinations of bitrate and sample rate. However, it does not seem to work for MPEG-1. Am I doing something wrong here?
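For concreteness, the formula as a small sketch (the 384-byte / 128 kbps figures are invented example values, not taken from the question):
#include <cstdio>

// Frame duration = bits in the frame / bits consumed per second.
double frame_duration_seconds(int frame_size_bytes, int bitrate_bps) {
    return frame_size_bytes * 8.0 / bitrate_bps;
}

int main() {
    // Example: a 384-byte frame at 128 kbps lasts 384 * 8 / 128000 = 0.024 s,
    // which matches 1152 samples at 48 kHz (1152 / 48000 = 24 ms).
    std::printf("%.3f s\n", frame_duration_seconds(384, 128000));
    return 0;
}
Equivalently, frame duration is samples_per_frame / sample_rate, which avoids depending on the bitrate at all.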
I am trying to build a simple transcoder that can take MP3 and WAV files and segment them using the segment format option, while also possibly changing the sample rate, bit rate, and channel layout.
For this, I followed the code in the transcoding.c example. The issue shows up when transcoding from a 32 kHz MP3 to a 48 kHz MP3: the MP3 encoder expects a frame size of 1152 samples, but libavfilter provides me with frames containing 1254 samples. So when I try to do the encoding, I get the message "more samples than frame size". This problem can also be reproduced using the example code; just set the encoder's sample rate to 48 kHz.
One option is to use the asetnsamples filter and set it to 1152; that fixes upsampling to 48 kHz, but then downsampling to 24 kHz won't work, because there the encoder expects frame sizes of 576 samples.
I wouldn't want to set this filter's value depending on the input information; it may become messy later if I support more file types, such as AAC.
Is there any way to make the libavfilter libraries aware of this requirement and trigger proper filtering and transcoding, without having to use lower-level APIs like libswresample or doing frame buffering myself?
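One avenue worth sketching (a sketch under assumptions, not a confirmed fix for this exact flow): libavfilter's buffer sink can be told the encoder's frame size, after which it hands out frames of exactly that many samples. buffersink_ctx and enc_ctx are assumed to come from a transcoding.c-style setup:
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavfilter/buffersink.h>
}

// After the filter graph is configured and the encoder is opened, propagate
// the encoder's required frame size to the audio buffer sink. Encoders with
// AV_CODEC_CAP_VARIABLE_FRAME_SIZE accept any frame size and need no help.
static void sync_frame_size(AVFilterContext *buffersink_ctx,
                            const AVCodecContext *enc_ctx) {
    if (!(enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE))
        av_buffersink_set_frame_size(buffersink_ctx, enc_ctx->frame_size);
}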
The sample rate is 44.1 kHz and there are 16 bits per sample, so the bit rate will be 2 * 16 * 44100 = 1,411,200 bits per second, i.e. 1411.2 kbps.
A ratio is simply one quantity divided by another quantity, assuming both quantities are expressed in the same units.
CD audio does indeed provide two 16-bit samples (one per stereo channel) 44100 times per second, for a total of 1,411,200 bits per second. This is accompanied on disc by a header, but that is usually very small compared to the audio data.
At 96 Kbps, an MP3-format compressed audio file can be assumed to use 96000 bits per second.
Dividing one by the other gives a compression ratio of 14.7:1.
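As a trivial sketch of that arithmetic:
#include <cstdio>

int main() {
    long cd_bps  = 2 * 16 * 44100;  // stereo, 16-bit, 44.1 kHz = 1,411,200 b/s
    long mp3_bps = 96000;           // 96 kbps MP3
    std::printf("compression ratio %.1f:1\n", (double)cd_bps / mp3_bps); // 14.7:1
    return 0;
}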
I have an FPGA board, and I wrote VHDL code that can get images (in binary) from a serial port and save them in the SDRAM on my board. The FPGA then displays the images on a monitor via a VGA cable. My problem is that filling the SDRAM takes too long (about 10 minutes at a 115200 baud rate).
On my computer, I wrote a Python script to send an image (in binary) to the FPGA via the serial port. My code reads the binary file saved on my hard disk and sends it to the FPGA.
My question is: if I use a buffer to hold my images instead of a binary file, would I get a better result? If so, can you help me with how to do that, please? If not, can you suggest a solution, please?
Thanks in advance.
Unless you are significantly compressing before download and decompressing the image after download, the problem is your 115,200 baud transfer rate, not the speed of reading from a file.
At the standard N/8/1 line encoding, each byte requires 10 bits to transfer, so you will be transferring 11,520 bytes per second.
In 10 minutes, you will transfer 11,520 * 60 * 10 = 6,912,000 bytes. At 3 bytes per pixel (for R, G, and B), this is 2,304,000 pixels, which happens to be the number of pixels in a 1920 by 1200 image.
The answer is to (a) increase the baud rate and/or (b) compress your image (using something simple to decompress on the FPGA, like RLE, if the image is amenable to that sort of compression).
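A quick sketch of that arithmetic, if you want to try other baud rates (the image size is an assumed example):
#include <cstdio>

int main() {
    long baud = 115200;
    long bytes_per_sec = baud / 10;       // N/8/1: 10 line bits per data byte
    long image_bytes = 1920L * 1200 * 3;  // example: 1920x1200, 24-bit RGB
    double seconds = (double)image_bytes / bytes_per_sec;
    std::printf("%ld B/s -> %.0f s (%.1f min) per image\n",
                bytes_per_sec, seconds, seconds / 60.0);
    return 0;
}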
I'm trying to write some 4:2:2 raw data received from a capture card into an AVI file. For every pixel, the char buffer contains 2 bytes (16 bits). The order of the data is the same as FOURCC UYVY: YUV 4:2:2 (a Y sample at every pixel; U and V sampled at every second pixel horizontally on each line). A macropixel contains 2 pixels in 1 u_int32.
First I tried the OpenCV VideoWriter, but it is simply too slow for this huge amount of video data (I'm capturing 2 video streams, each in 1080p25 format), so I switched to the Video for Windows library. But even that one doesn't manage to write the file in real time. My last resort is DirectShow. I want to use the AVI Mux and File Writer filters to store my raw data as an AVI file, but I'm not sure how to "give" the AVI Mux my raw data (a char array), which contains just video data in UYVY order and no audio. Maybe you can give me some advice. This is what I've got so far:
CoInitialize(NULL);
// Build the filter graph and get its media control interface.
IGraphBuilder *pGraph = NULL;
CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC_SERVER,
                 IID_IGraphBuilder, (void **)&pGraph);
IMediaControl *pMediaControl = NULL;
pGraph->QueryInterface(IID_IMediaControl, (void **)&pMediaControl);
// Capture graph builder, attached to the graph above (the original code
// never called SetFiltergraph, so the builder would use its own graph).
ICaptureGraphBuilder2 *pCapture = NULL;
CoCreateInstance(CLSID_CaptureGraphBuilder2, NULL, CLSCTX_INPROC,
                 IID_ICaptureGraphBuilder2, (void **)&pCapture);
pCapture->SetFiltergraph(pGraph);
// AVI Mux + File Writer, created and connected in one call.
IBaseFilter *pMux = NULL;
pCapture->SetOutputFileName(&MEDIASUBTYPE_Avi, L"Test.avi", &pMux, NULL);
// NOTE: pCap is never created here. It must be a source filter (e.g. a
// custom push source delivering the UYVY buffers) added to the graph
// before RenderStream can connect it to the mux.
IBaseFilter *pCap = NULL;
pCapture->RenderStream(&PIN_CATEGORY_CAPTURE, &MEDIATYPE_Video,
                       pCap, NULL, pMux);
Thanks a lot and regards,
Valentin
(As you mentioned 10 fps in a previous question, which I assume to be the effective frame rate.) Are you writing dual 1920x1080 16-bits-per-pixel streams at 10 fps into a file? That is roughly 80 megabytes per second; you might simply be hitting your HDD's write-throughput limit.
Choosing a different API is not going to help if your HDD is not fast enough. You need to either compress the data, lower the resolution or FPS, or use faster drives.
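The back-of-the-envelope check as a sketch (10 fps is the assumption carried over from above):
#include <cstdio>

int main() {
    long width = 1920, height = 1080;
    long bytes_per_pixel = 2;    // UYVY 4:2:2 = 16 bits per pixel
    long fps = 10, streams = 2;  // assumed effective rate, dual capture
    long bps = width * height * bytes_per_pixel * fps * streams;
    std::printf("%.1f MB/s sustained write required\n", bps / 1e6); // ~82.9
    return 0;
}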