How to extract audio from a video with ffmpeg in C++? - c++

I'm using FFmpeg to extract information about a video file.
But I want to extract the audio channels so I can read them with FMOD.
How can I do that? Is it simple?
Do you know a good tutorial about FFmpeg in C++?
Thanks

For tutorials on FFMPEG, have a look at this question: FFmpeg API books, tutorial, etc. The tutorials are in C, but so is FFMPEG. They're a bit out of date and won't compile with the most recent FFMPEG, but the required changes aren't that great. I've started updating them on my github.
Work your way through the tutorials (it will take around a week at a slow pace) and you'll know enough to see where to grab the audio from. If you just want the stream, you can probably just dump the audio packets to a file. If you want to transcode the audio, then it will require more effort.
Since the ffmpeg source is open, feel free to look around yourself. The main file to look at is ffmpeg.c. It's big, so you're better off getting your hands dirty with the tutorial first.

Related

Can I see a simple example of ogg audio decoding

I want to write my own icecast2 client because I can't seem to find one that will work with what I need. But I am so overwhelmed by everything that entails. I'm familiar with networking, and I know a thing or two about audio as well. But I've never written or used a decoder. I see a lot of things on xiph.org that could be useful if I could just see the basics.
Could someone post an answer that contains a few frames of ogg with an example of how to convert that raw ogg into audio (does it ultimately turn into wav to play)?

video/audio encoding/decoding/playback

I've always wanted to try and make a media player but I don't understand how. I found FFmpeg and GStreamer but I seem to be favoring FFmpeg despite its worse documentation even though I haven't written anything at all. That being said, I feel I would understand how things worked more if I knew what they were doing. I have no idea how video/audio streams work and the several media types so that doesn't help. At the end of the day, I'm just 'emulating' some of the code samples.
Where do I start to learn how to encode/decode/play back video/audio streams without having to read hundreds of pages of several 'standards'? Ideally I would also learn enough to play back media without relying on another API. Googling 'basic video audio decoding encoding' doesn't seem to help. :(
This seems to be a black art that nobody is out to tell anyone about.
The first part is extracting streams from the container. From there, you need to decode the streams into media. I recommend finding a small Theora video and seeing how the pieces relate there.
Do you expect us to write one answer that turns you into a master of the multimedia domain?
Anyway, that can't be done in one answer.
First of all, understand this terminology by googling:
1> container -- muxer/demuxer
2> codec -- coder/decoder
If you like ffmpeg, then start with its basic video player application. It is well documented at http://dranger.com/ffmpeg/, which shows how to demux a container and decode any elementary stream with the ffmpeg API. More about this at http://ffmpeg.org/ffplay.html
I like gstreamer more than ffmpeg. It is well documented, so it would be a good choice to start with.
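To make the container/demuxer term concrete, here is a minimal sketch in C++ of reading one Ogg page header, the framing layer that a Theora or Vorbis stream sits inside. The field layout follows the Ogg page format (RFC 3533). The struct and function names are made up for illustration, and the sketch assumes the whole page header is already in memory; it skips CRC verification and packet reassembly, which a real demuxer would need.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical struct for illustration; not part of libogg or FFmpeg.
struct OggPageHeader {
    std::uint8_t  version = 0;
    std::uint8_t  flags = 0;            // 0x01 continued, 0x02 first page, 0x04 last page
    std::uint64_t granulePosition = 0;  // codec-specific position marker
    std::uint32_t serialNumber = 0;     // identifies the logical stream
    std::uint32_t sequenceNumber = 0;   // page counter within that stream
    std::size_t   headerSize = 0;       // 27 fixed bytes + segment table
    std::size_t   payloadSize = 0;      // sum of the segment table entries
};

// Reads byteCount bytes as a little-endian unsigned integer.
static std::uint64_t readLittleEndian(const std::uint8_t* p, int byteCount) {
    std::uint64_t value = 0;
    for (int i = byteCount - 1; i >= 0; --i) {
        value = (value << 8) | p[i];
    }
    return value;
}

// Returns true and fills `out` if a complete page header starts at `offset`.
bool parseOggPageHeader(const std::vector<std::uint8_t>& buf,
                        std::size_t offset,
                        OggPageHeader& out) {
    const std::size_t kFixedSize = 27;
    if (offset > buf.size() || buf.size() - offset < kFixedSize) return false;

    const std::uint8_t* p = buf.data() + offset;
    // Every page starts with the capture pattern "OggS".
    if (p[0] != 'O' || p[1] != 'g' || p[2] != 'g' || p[3] != 'S') return false;

    const std::size_t segmentCount = p[26];
    if (buf.size() - offset < kFixedSize + segmentCount) return false;

    out.version = p[4];
    out.flags = p[5];
    out.granulePosition = readLittleEndian(p + 6, 8);
    out.serialNumber = static_cast<std::uint32_t>(readLittleEndian(p + 14, 4));
    out.sequenceNumber = static_cast<std::uint32_t>(readLittleEndian(p + 18, 4));
    // Bytes 22..25 hold the CRC, which this sketch does not verify.
    out.headerSize = kFixedSize + segmentCount;
    out.payloadSize = 0;
    for (std::size_t i = 0; i < segmentCount; ++i) {
        out.payloadSize += p[kFixedSize + i];
    }
    return true;
}
```

A demuxer would call this repeatedly, skipping `headerSize + payloadSize` bytes each time and routing each payload to the decoder for the matching `serialNumber`. The decoding itself is the codec's job, which is the second half of the terminology above.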

Analysing audio data for attributes at time intervals

I've been wanting to play around with audio parsing for a while now but I haven't really been able to find the correct library for what I want to do.
I basically just want to parse through a sound file and get amplitudes/frequencies and other relevant information at certain times during the song (like every 10 ms or so), so I can graph the data and see, for example, where the song speeds up a lot and where it gets really loud.
I've looked at OpenAL quite a bit but it doesn't look like it provides this ability. Other than that, I have not had much luck finding out where to start. If anyone has done this or used a library which can do this, a point in the right direction would be greatly appreciated. Thanks!
For parsing and decoding audio files I had good results with libsndfile, which runs on Windows/OSX/Linux and is open source (LGPL license). This library does not support mp3 (the author wants to avoid licensing issues), but it does support FLAC and Ogg/Vorbis.
If working with closed source libraries is not a problem for you, then an interesting option could be the Quicktime SDK from Apple. This SDK is available for OSX and Windows and is free for registered developers (you can register as an Apple developer for free as well). With the QT SDK you can parse all the file formats that the Quicktime Player supports, and that includes .mp3. The SDK gives you access to all the codecs installed by QuickTime, so you can read .mp3 files and have them decoded to PCM on the fly. Note that to use this SDK you have to have the free QuickTime Player installed.
As for signal processing libraries, I honestly can't recommend any, as I have written my own functions (for speech recognition, in case you are curious). There are a few open source projects that seem interesting listed on this page.
I recommend that you start simple, for example by analyzing amplitude data, which is readily available from the PCM samples without having to do any processing. Being able to visualize the data is very useful; I have found Audacity to be an excellent visualization tool, and since it is open source you can build your own tests inside it.
Good luck!
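To illustrate the "start with amplitude" suggestion, here is a minimal sketch in C++ that computes one RMS loudness value per 10 ms window. It assumes the file has already been decoded (for example with libsndfile) into mono signed 16-bit PCM samples; the function name `rmsPerWindow` is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns one RMS value (0.0 .. 1.0) per complete window of windowMs milliseconds.
// Assumes mono, signed 16-bit PCM; trailing samples that do not fill a window are dropped.
std::vector<double> rmsPerWindow(const std::vector<std::int16_t>& samples,
                                 int sampleRate,
                                 double windowMs = 10.0) {
    std::vector<double> result;
    if (sampleRate <= 0 || windowMs <= 0.0) return result;

    const std::size_t window =
        static_cast<std::size_t>(sampleRate * windowMs / 1000.0);
    if (window == 0) return result;

    for (std::size_t start = 0; start + window <= samples.size(); start += window) {
        double sumOfSquares = 0.0;
        for (std::size_t i = 0; i < window; ++i) {
            // Scale to the range [-1.0, 1.0) before squaring.
            const double s = samples[start + i] / 32768.0;
            sumOfSquares += s * s;
        }
        result.push_back(std::sqrt(sumOfSquares / static_cast<double>(window)));
    }
    return result;
}
```

Plotting the returned values against time (index × 10 ms) gives a loudness curve. Frequency content would need an FFT per window on top of this, which this sketch does not attempt.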

combining separate audio and video files into one file C++

I am working on a C++ project with OpenCV. It is a simple webcam application with basic features like capturing pictures and videos. I have already been able to save video (w/o audio). Since OpenCV does not support audio processing, I was wondering if there is any way I can record audio separately in a different file and later combine the two to get one video file.
While searching on the internet, I did hear something about using ffmpeg with OpenCV. But I just can't figure out how to do it exactly.
Can you guys help me? I would be very grateful. Thank you!
P.S. I have used openCV and QT (for GUI)
As you said, OpenCV doesn't deal with audio by itself. However, once you have separate audio and video files, you can combine them using a technique called muxing. There are many ways to do this. I use VirtualDub for most of my muxing needs, although it is Windows only (not sure if that's a problem). I know ffmpeg is also capable of muxing (via the command line interface), but I can't recall what the command is. There's also mplayer and a multitude of other programs out there to do this.
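For the ffmpeg route, the command line generally takes both files as inputs and copies their streams into one container. A hedged sketch in C++ that only builds such a command string: `buildMuxCommand` is a hypothetical helper, the file names are placeholders, and the quoting is naive (it assumes the paths contain no double quotes).

```cpp
#include <string>

// Builds an ffmpeg command that takes the video stream from the first input
// and the audio stream from the second, copying both without re-encoding.
// Assumption: the paths contain no double quotes or other shell-special characters.
std::string buildMuxCommand(const std::string& videoPath,
                            const std::string& audioPath,
                            const std::string& outputPath) {
    return "ffmpeg -i \"" + videoPath + "\""
           " -i \"" + audioPath + "\""
           " -map 0:v:0 -map 1:a:0 -c copy"
           " \"" + outputPath + "\"";
}
```

The result can be passed to `std::system` from `<cstdlib>` once recording has finished. A Matroska (.mkv) output accepts most codec combinations with `-c copy`; other containers may reject some streams, in which case the audio or video would need re-encoding instead of copying.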
As far as I know, OpenCV is good for video/image processing. To support audio processing, you can use other libraries, e.g. PortAudio or Csound.

Extracting raw audio/waveform from an MP3

This question has been in my mind for a few years and I never actually found the answer for this.
What I would like to do is extract the actual waveform/PCM of an MP3 file, so that I can play it using the soundcard (of course).
Ideally I would be experimenting with some DSP effects.
My first step was to look into LAME, but I didn't find anything relevant about MP3 decoding in a program or stuff like that.
So I'm asking where I could find something like this.
What language should I use? I was thinking C, but maybe there are programming languages out there that would do the job more efficiently.
Thanks!
Guillaume.
The question boils down to: what are you trying to accomplish?
From your description of decoding an MP3 and playing it on the sound card, it sounds as if you are trying to make a media player.
However, if your intent is to play around with DSP effects, then it sounds like the question is more about processing the sound than decoding MP3s. If that's the case, writing plug-ins for existing media players (such as Windows Media Player and Winamp) would probably be the easiest path to what you're trying to accomplish.
Frankly, learning to write your own decoder from scratch is not just a programming problem but a mathematical one, so using existing libraries is the way to go. Talking to the operating system or libraries like DirectSound to output audio seems like unnecessary work, if anything. I feel that working on plug-ins for existing players would be the way to go, unless your goal is to make your own media player.
If what you really want is to play with audio data, then decoding an MP3 to uncompressed PCM using any MP3 decoder and manipulating it in the language of your choice would probably accomplish your goal of applying effects to sound.
The language choice is going to depend on whether you are going to interact directly with MP3 decoding libraries, or whether you can just use raw audio input, which would allow you to use pretty much any language of your choice.
There was a similar question a while back, Getting started with programmatic audio, where I posted an answer on some basic ways to manipulate audio, such as amplification, changing playback speed, and doing some work with FFT.
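As a small illustration of the amplification case, here is a minimal sketch in C++. It assumes the MP3 has already been decoded to signed 16-bit PCM by some decoder; the function name `amplify` is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Multiplies every sample by `gain` in place, clipping to the 16-bit range
// so that loud passages saturate instead of wrapping around.
void amplify(std::vector<std::int16_t>& samples, double gain) {
    for (std::int16_t& sample : samples) {
        double scaled = std::round(sample * gain);
        scaled = std::max(-32768.0, std::min(32767.0, scaled));
        sample = static_cast<std::int16_t>(scaled);
    }
}
```

A gain of 2.0 roughly corresponds to +6 dB. The clipping step matters: without it, a value just above 32767 would wrap to a large negative number and produce an audible click.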
libmpg123 should do the trick.
I have been using the Windows Media SDK, not for this purpose, but I am pretty sure there are hooks that let you intercept the audio stream, or convert MP4 to uncompressed WAV. I used C++.
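Whichever decoder you end up with, wrapping its PCM output in a WAV file is the easy part. A minimal sketch in C++, assuming 16-bit interleaved PCM samples and the canonical 44-byte header; `buildWav` and its helpers are hypothetical names for illustration, not from any SDK.

```cpp
#include <cstdint>
#include <vector>

// Appends the low byteCount bytes of value in little-endian order.
static void appendLittleEndian(std::vector<std::uint8_t>& out,
                               std::uint32_t value, int byteCount) {
    for (int i = 0; i < byteCount; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Appends a four-character chunk tag such as "RIFF".
static void appendTag(std::vector<std::uint8_t>& out, const char* tag) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>(tag[i]));
    }
}

// Returns the bytes of a complete WAV file holding 16-bit PCM samples.
std::vector<std::uint8_t> buildWav(const std::vector<std::int16_t>& samples,
                                   std::uint16_t channels,
                                   std::uint32_t sampleRate) {
    const std::uint32_t dataSize = static_cast<std::uint32_t>(samples.size() * 2);
    const std::uint16_t blockAlign = static_cast<std::uint16_t>(channels * 2);

    std::vector<std::uint8_t> out;
    appendTag(out, "RIFF");
    appendLittleEndian(out, 36 + dataSize, 4);       // size of everything after this field
    appendTag(out, "WAVE");
    appendTag(out, "fmt ");
    appendLittleEndian(out, 16, 4);                  // size of the fmt chunk
    appendLittleEndian(out, 1, 2);                   // format 1 = uncompressed PCM
    appendLittleEndian(out, channels, 2);
    appendLittleEndian(out, sampleRate, 4);
    appendLittleEndian(out, sampleRate * blockAlign, 4);  // bytes per second
    appendLittleEndian(out, blockAlign, 2);          // bytes per sample frame
    appendLittleEndian(out, 16, 2);                  // bits per sample
    appendTag(out, "data");
    appendLittleEndian(out, dataSize, 4);
    for (std::int16_t s : samples) {
        appendLittleEndian(out, static_cast<std::uint16_t>(s), 2);
    }
    return out;
}
```

Writing the returned bytes to disk in binary mode gives a file that most players and editors can open, which is handy for checking what a decoder or effect actually produced.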
Lots:
http://www.mp3-tech.org/programmer/decoding.html
Pick your poison...
Also, LAME does decode MP3s (check out --decode option), so you might find something interesting in that source.
-Adam
It really depends on what platform you are programming on and what you want to do with the code. If you are on Windows you should look at the Windows Media Format SDK or DirectShow. They should both have the ability to decode MP3 files into the raw waveform. On the Mac, I would expect QuickTime to have this same ability. Others have already suggested source for Linux/open source code.
I would recommend looking at Cubase and Wavelab, as both will convert MP3 to WAV etc. and allow you to play around with the waveform.