Resample 8KHz audio sample rate to 44.1KHz using swr_convert[FFMPEG] - c++

Has anybody tried upsampling an audio stream from 8 kHz to 44.1 kHz?
I need to resample an 8 kHz input audio stream to 44.1 kHz, because the default audio output device on Mac OS X supports a minimum sampling rate of 44.1 kHz.
I tried upsampling with the FFmpeg swr_convert() API, but the converted audio contains a lot of noise.
If anybody has successfully upsampled 8 kHz to 44.1 kHz or 48 kHz, please share how.
A solution using C/C++ library code is preferable. I haven't tried the Core Audio samples.
I tried the swr_convert() code from the following link
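One thing worth ruling out when swr_convert() output sounds noisy is the output sample count: 44100/8000 is not an integer ratio, so the number of output samples per call varies, and the resampler also holds some input back between calls. The sketch below mirrors, in plain C++, the arithmetic FFmpeg's own resampling example performs with av_rescale_rnd() and swr_get_delay(). The function name and the buffered_in parameter are hypothetical, and this is only a guess at one possible cause, not a diagnosis of the code in question.

```cpp
#include <cstdint>

// Upper bound on the output samples (per channel) one conversion call can
// produce. Hypothetical helper; buffered_in stands for the input samples the
// resampler is still holding (what swr_get_delay(ctx, in_rate) would report).
std::int64_t max_output_samples(std::int64_t in_samples,
                                std::int64_t buffered_in,
                                int in_rate,
                                int out_rate) {
    const std::int64_t total_in = in_samples + buffered_in;
    // Round up so the output buffer is never too small.
    return (total_in * out_rate + in_rate - 1) / in_rate;
}
```

For example, 160 input samples at 8 kHz (20 ms) with nothing buffered need room for 882 samples at 44.1 kHz, while 1024 input samples need 5645. The value actually returned by swr_convert() is the count to pass on to the audio device.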


Do ffmpeg libs know how to correctly set number of samples per frame, according to the encoder used?

I am trying to build a simple transcoder that can take MP3 and WAV files and segment them using the segment formatting option, while also possibly changing the sample rate, bit rate and channel layout.
For this, I followed the code in the transcoding.c example. The issue appears when transcoding from a 32 kHz MP3 to a 48 kHz MP3: the MP3 encoder expects a frame size of 1152, but libavfilter provides frames that contain 1254 samples. So when I try to encode, I get this message: more samples than frame size. The problem can also be reproduced with the example code by setting the encoder's sample rate to 48 kHz.
One option is to use the asetnsamples filter and set it to 1152. That fixes upsampling to 48 kHz, but then downsampling to 24 kHz won't work, because the encoder expects a frame size of 576.
I would rather not set this filter's value depending on the input information; it may become messy later if I support more file types, such as AAC.
Is there any way to make the libavfilter libraries aware of this flow, so that filtering and transcoding work properly without having to use lower-level APIs like libswresample or do my own frame buffering?
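The mismatch described above (1254-sample frames from the filter, 1152-sample frames wanted by the encoder) comes down to re-chunking a sample stream. A minimal sketch of that idea in plain C++ follows. The class and its methods are hypothetical, it handles a single mono channel of 16-bit samples, and it is not FFmpeg code. Within FFmpeg itself, av_buffersink_set_frame_size() together with the encoder's AVCodecContext::frame_size is worth looking at, since it asks the buffer sink to hand out fixed-size frames, which would avoid hard-coding 1152 or 576.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical FIFO that accepts frames of any length and hands back
// frames of exactly frame_size samples (mono, 16-bit, for simplicity).
class FrameRechunker {
public:
    explicit FrameRechunker(std::size_t frame_size) : frame_size_(frame_size) {}

    // Append a filtered frame of arbitrary length.
    void push(const std::vector<std::int16_t>& samples) {
        fifo_.insert(fifo_.end(), samples.begin(), samples.end());
    }

    // Extract one encoder-sized frame; returns false if not enough is buffered.
    bool pop(std::vector<std::int16_t>& out) {
        if (fifo_.size() < frame_size_) return false;
        const auto end = fifo_.begin() + static_cast<std::ptrdiff_t>(frame_size_);
        out.assign(fifo_.begin(), end);
        fifo_.erase(fifo_.begin(), end);
        return true;
    }

    std::size_t buffered() const { return fifo_.size(); }

private:
    std::size_t frame_size_;
    std::deque<std::int16_t> fifo_;
};
```

Pushing one 1254-sample frame into a rechunker built with a frame size of 1152 yields one full frame and leaves 102 samples waiting for the next push. At end of stream the remainder still has to be flushed as a final short frame.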

Capturing H264 with logitech C920 to OpenCV

I've been trying to capture an H264 stream from my two Logitech C920 cameras with OpenCV (on a Raspberry Pi 2). I have come to the conclusion that this is not possible because it is not yet implemented. I looked in OpenCV/modules/highgui/cap_libv4l.cpp and found that the VideoCapture function always converts the pixel format to BGR24. I tried changing this to H264, but only got a black screen. I guess this is because the stream is not being decoded correctly.
So I made a workaround using the following (you can find the loopback and rtspserver on GitHub):
First I set up a virtual device using v4l2loopback. Then the rtspserver captures in H264 and streams RTSP to my localhost. Then I pick the stream up again with GStreamer and pipe it to the virtual V4L2 video device created by loopback, using the "v4l2sink" option in gst-launch-0.10.
This solution works, and I can connect to the virtual device with OpenCV's VideoCapture and get a full HD picture without overloading the CPU, but it is nowhere near a good enough solution. I get a delay of roughly 3 seconds, which is too high for my stereo vision application, and it uses a ton of bandwidth.
So I was wondering whether anybody knows a way to use the V4L2 capture program from Derek Molloy's boneCV/capture (which I know works) to capture in H264, then pipe it to gst-launch-0.10, and from there to the v4l2sink for my virtual device?
(You can find the capture program here:
The GStreamer command I use is:
“gst-launch-0.10 rtspsrc location=rtsp://admin:pi# ! decodebin ! v4l2sink device=/dev/video4”
Or maybe you know what I would need to change in the OpenCV highgui code to capture H264 directly from my device, without having to use the virtual device? That would be amazingly awesome!
Here are the links to the loopback and the rtspserver that I use:
Sorry about the weird links; I don't have enough reputation yet to post more links.
I don't know exactly what you need to change in OpenCV, but I recently started writing video code on the Raspberry Pi.
I'll share my findings with you.
This is what I have working so far:
I can read the C920 H264 stream directly from the camera using the V4L2 API at 30 FPS (if you try to read YUYV buffers, the driver has a limit of 10 fps, 5 fps or 2 fps from USB...).
I can decode the stream to YUV 4:2:0 buffers using the Raspberry Pi's Broadcom chip through the OpenMAX IL API.
My work-in-progress code is at: GitHub.
Sorry about the code organization, but I think the abstraction I made is more readable than plain V4L2 or OpenMAX code.
Some code examples:
Reading camera h264 using V4L2 Wrapper:
    v4l2_buffer bufferQueue;
    while (!exit_requested) {
        // capture code
        // use the h264 buffer inside bufferPtr[bufferQueue.index]
        device.queueBuffer(bufferQueue.index, &bufferQueue);
    }
Decoding h264 using OpenMax IL:
    BroadcomVideoDecode decoder;
    while (!exit_requested) {
        // capture code start
        // decoding code
        // capture code end
    }
Check out Derek Molloy on YouTube. He's using a BeagleBone, but his approach presumably ticks this box.

recording with uncompressed audio speeds up video

I have been using a recorder (based on the muxer example) satisfactorily for quite some time with various formats. Now I need to use uncompressed audio together with MJPEG video, and I notice that the video speeds up considerably (about 10 times as fast) in the recorded file. The audio is OK, and if I use a compressed audio format (like MP3) the video is fine as always. Does anyone have an idea why the video speeds up the moment I use uncompressed audio (CODEC_ID_PCM_S16LE)?
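The muxer example decides which stream to write next by comparing how far each stream has advanced in time. The sketch below shows that comparison in plain C++, with hypothetical names and a TimeBase struct standing in for AVRational. It assumes that audio position is counted in samples and video position in frames. If the audio position were advanced by the wrong number of samples per packet (PCM encoders have no fixed frame size the way MP3 does), the comparison would favour one stream incorrectly. That is a guess worth checking, not a confirmed diagnosis of the recorder described above.

```cpp
#include <cstdint>

// Seconds per tick as a fraction, e.g. {1, 25} for 25 fps video or
// {1, 44100} for 44.1 kHz audio. Hypothetical stand-in for AVRational.
struct TimeBase {
    std::int64_t num;
    std::int64_t den;
};

// Returns -1, 0 or 1 when position a is before, level with, or after b.
// Cross-multiplies instead of dividing, so no precision is lost.
// No overflow protection: acceptable for a sketch with small values.
int compare_positions(std::int64_t ts_a, TimeBase tb_a,
                      std::int64_t ts_b, TimeBase tb_b) {
    const std::int64_t a = ts_a * tb_a.num * tb_b.den;
    const std::int64_t b = ts_b * tb_b.num * tb_a.den;
    return (a < b) ? -1 : (a > b) ? 1 : 0;
}

// True when audio is behind (or level with) video and should be written next.
bool audio_is_next(std::int64_t audio_samples_written, int sample_rate,
                   std::int64_t video_frames_written, int fps) {
    return compare_positions(audio_samples_written, {1, sample_rate},
                             video_frames_written, {1, fps}) <= 0;
}
```

With 44.1 kHz audio and 25 fps video, one video frame corresponds to 1764 audio samples, so after one frame and 1764 samples the two streams are level and audio goes next. Logging both positions in the recorder's write loop would show whether they stay in step once the audio codec is switched to PCM.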

FFMpeg encoding RGB images to H264

I'm developing a DirectShow filter which has 2 input pins (1 for audio, 1 for video). I'm using FFmpeg's libavcodec/libavformat/libavutil to encode the video to H264 and the audio to AAC, and to mux and stream the result using RTP. So far I have been able to encode video and audio correctly using libavcodec, and now I see that FFmpeg seems to support RTP muxing too. Unfortunately, I can't find any example code that shows how to perform H264 encoding and RTP muxing. Does anybody know of good samples?
Try checking out the code in HandBrake, specifically the file muxmp4.c, which was a gem I found while working with FFmpeg / RTP. Be sure to use av_interleaved_write_frame() and the extradata fields correctly. Those were some key differences I remember for RTP.
Still, I had some stability issues with RTP/RTSP in FFmpeg (I'm sure it's getting better). I had much better luck with live555, and you can look at the code in VLC and MPlayer for good examples of how to use it.

encoding camera with audio source in realtime with WMAsfWriter - jitter problem

I built a DirectShow graph consisting of my video capture filter
(grabbing the screen) and the default audio input filter, both connected
through a splitter to the WM ASF Writer output filter and to a VMR9 renderer.
This means I want real-time audio/video encoding to disk
together with a preview. The problem is that no matter which WM profile I
choose (even a very low resolution profile), the output video file is
always jittery: every few frames there is a delay. The audio is OK;
there is no jitter in the audio. CPU usage is low (< 10%), so I believe
this is not a lack of CPU resources. I think I'm
timestamping my frames correctly.
What could be the reason?
Below is a link to a recorded video showing the problem:
Dominik Tomczak
I have had this problem in the past. Your problem is the volume of data being written to disk. Writing to a faster drive is a great and simple solution. The other thing I've done is place a video compressor into the graph. You need to make sure both input streams are using the same reference clock. I have had a lot of problems keeping a good preview with this compressor scheme: my preview's frame rate dies even if I use an Infinite Tee rather than a Smart Tee, although the result written to disk was fine. It's also worth noting that the more powerful the machine I ran it on, the less of an issue this was, so if you need both preview and recording, the compressor may not be much of a win over simply putting a faster hard disk in the machine.
I don't think this is the issue. The volume of data written is less than 1 MB/s (average compression ratio during encoding). I found the reason: when I build the graph without audio input (the WM ASF Writer has only a video input pin) and my video capture pin is connected through a Smart Tee to the preview pin and to the WM ASF Writer video input pin, there is no glitch in the output movie. I reckon this is a problem with audio-to-video synchronization in my graph. The same happens when I build the graph in GraphEdit: without audio, no glitch; with audio, there is a constant glitch every 1 s. I wonder whether I timestamp my frames wrongly, but I think I'm doing it correctly. What is the general solution for audio-to-video synchronization in DirectShow graphs?
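DirectShow sample times are expressed in REFERENCE_TIME units of 100 ns, so both streams have to be stamped on that common scale for the writer to line them up. The sketch below shows one way to derive start and stop times in plain C++. All names are hypothetical, RefTime stands in for REFERENCE_TIME, and it assumes a fixed nominal frame rate and sample rate. A live capture filter that stamps samples from the graph's stream time would compute the start time differently.

```cpp
#include <cstdint>

using RefTime = std::int64_t;                 // stand-in for REFERENCE_TIME
constexpr RefTime kUnitsPerSecond = 10000000; // 100 ns units per second

struct SampleTimes {
    RefTime start;
    RefTime stop;
};

// Times for video frame frame_index at a rate of fps_num / fps_den.
// Computed from the index each time, so rounding errors never accumulate.
SampleTimes video_frame_times(std::int64_t frame_index, int fps_num, int fps_den) {
    const RefTime start = frame_index * kUnitsPerSecond * fps_den / fps_num;
    const RefTime stop = (frame_index + 1) * kUnitsPerSecond * fps_den / fps_num;
    return {start, stop};
}

// Times for an audio block that begins at first_sample and holds sample_count samples.
SampleTimes audio_block_times(std::int64_t first_sample, std::int64_t sample_count,
                              int sample_rate) {
    const RefTime start = first_sample * kUnitsPerSecond / sample_rate;
    const RefTime stop = (first_sample + sample_count) * kUnitsPerSecond / sample_rate;
    return {start, stop};
}
```

At 25 fps each frame spans 400000 units (40 ms), and frame 25 starts at exactly 10000000, the same value as audio sample 44100 at 44.1 kHz. Comparing the values the capture filter actually sets against numbers like these is a quick way to check whether video drifts relative to audio about once per second.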