Extracting raw audio/waveform from an MP3


This question has been on my mind for a few years, and I never actually found an answer to it.
What I would like to do is extract the actual waveform/PCM data of an MP3 file so that I can play it through the sound card.
Ideally I would then experiment with some DSP effects.
My first step was to look into LAME, but I didn't find anything relevant about decoding MP3s from within a program.
So I'm asking where I could find something like this.
What language should I use? I was thinking C, but maybe there are programming languages out there that would do the job more efficiently.
Thanks!
Guillaume.

The question boils down to: what are you trying to accomplish?
Your description of decoding an MP3 and playing it on the sound card makes it sound as if you are trying to make a media player.
However, if your intent is to play around with DSP effects, then the question is more about processing the sound than decoding MP3s. If that's the case, writing plug-ins for existing media players (such as Windows Media Player and Winamp) would probably be the easiest path to what you're trying to accomplish.
Frankly, writing your own decoder from scratch is not just a programming problem but a mathematical one, so using an existing library is the way to go. Talking to the operating system or to libraries like DirectSound to output audio seems like unnecessary work. I feel that working on plug-ins for existing players would be the way to go, unless your goal is to make your own media player.
If what you really want is to play with audio data, then decoding an MP3 to uncompressed PCM using any MP3 decoder and manipulating the result in the language of your choice would let you experiment with effects.
The language choice is going to depend on whether you are going to interact directly with MP3 decoding libraries, or whether you can just use raw audio input, which would allow you to use pretty much any language of your choice.
There was a similar question a while back, Getting started with programmatic audio, where I posted an answer on some basic ways to manipulate audio, such as amplification, changing playback speed, and doing some work with FFT.
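As a minimal sketch of the simplest manipulation mentioned above (amplification), assuming the MP3 has already been decoded by some library into 16-bit signed PCM samples held in memory. The function name apply_gain and the choice of 16-bit samples are assumptions made for illustration, not part of any particular decoder's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale 16-bit signed PCM samples in place by `gain`.
 * Results are clamped to the int16_t range so that loud passages
 * clip at full scale instead of wrapping around to the opposite sign.
 * apply_gain is a hypothetical helper name used for illustration. */
void apply_gain(int16_t *samples, size_t count, float gain)
{
    assert(samples != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        float scaled = (float)samples[i] * gain;

        if (scaled > 32767.0f) {
            scaled = 32767.0f;
        } else if (scaled < -32768.0f) {
            scaled = -32768.0f;
        }
        samples[i] = (int16_t)scaled;
    }
}
```

A gain of 2.0 roughly doubles the amplitude and 0.5 halves it. For interleaved stereo data the same loop works unchanged, since every channel is scaled by the same factor.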

libmpg123 should do the trick.

I have been using the Windows Media SDK, not for this purpose, but I am pretty sure there are hooks that let you intercept the audio stream, or convert MP4 to uncompressed WAV. I used C++.

Lots:
http://www.mp3-tech.org/programmer/decoding.html
Pick your poison...
Also, LAME does decode MP3s (check out the --decode option), so you might find something interesting in its source.
-Adam

It really depends on what platform you are programming on and what you want to do with the code. If you are on Windows, you should look at the Windows Media Format SDK or DirectShow. They should both be able to decode MP3 files into the raw waveform. On the Mac, I would expect QuickTime to have the same ability. Others have already suggested sources for Linux/open source code.

I would recommend looking at Cubase and WaveLab, as both will convert MP3 to WAV and allow you to play around with the waveform.

Related

What is the path from BITMAP[+WAVE(s)] to RTSP (Twitch) via C/C++ in Windows?

So I'm trying to get a basic tool to output video/audio(s) to Twitch. I'm new to this side (AV) of programming so I'm not even sure what to look for. I'm trying to use mainly Windows infrastructure and third party where not available.
What are the steps of getting raw bitmap and wave data into a codec, then into an RTSP client, and finally showing up on Twitch? I'm not looking for code. I'm looking for concepts I can search for, as I'm not sure what to look up. I'd rather not go through the OBS source code to figure it out; I'd use that only as a last resort.
So I capture the monitor via Output Duplication, the system sound as one wave, and the microphone as another wave. I'm trying to push this to Twitch. I know that there's Media Foundation on Windows, but I don't know how far toward streaming it can get, as I assume there is no netcode integrated in it. There is also the libav* collection in FFmpeg.
What are the basic steps of sending bitmap/wave data to Twitch via any of the above libraries, or even others, as long as they work on Windows? Please don't add code; I just need a short conceptual explanation and I'll take it from there. Try to cover also how bitrate and framerate get regulated (do I have to do it, or does the codec do it?).
Assume absolute noob level in this area (concept-wise not code-wise).

adding "read aloud" feature to book app written in Cocos2D

I created a book app and used Cocos2D and physics engine (Chipmunk) to create it. I would like to add "read aloud" feature to it.
So far I have found instructions, books and tutorials on how to add a read aloud feature when the book is created with iBooks Author using EPUB 3 and SMIL (but I couldn't use iBooks Author due to some limitations).
I also found a good tutorial from J. Shapiro on how to make a narrated book using AVSpeechSynthesizer. This helps, except that I would like to use a recorded voice rather than synthesized sound. I don't know if this approach can be modified to do so.
I also know how it can be done in Sprite Kit framework.
The only info that I couldn't find is how to add "read aloud" feature to the app written using Cocos2D. Could it be done within SimpleAudioEngine, or it can be combined with some other engine (possibly from Sprite Kit framework)?
I would appreciate very much if somebody can give me some references/pointers or tutorial links where to look for some answers how to add this feature.
Thanking you in advance.

I would like to use recorded voice, rather than synthesized sound
Good. Add your voice recording audio files (caf, wav or mp3 format) to the project. Play it back at the appropriate time using:
[[SimpleAudioEngine sharedEngine] playEffect:@"someVoiceRecordingFile.wav"];

Define what read aloud means to you because I find that a lot of terms, especially semi-vague ones like this, are used differently depending on who is using it.
When you say "read aloud book", do you essentially mean a digital storybook that reads the story to you by simply playing narration audio? I've created dozens of these, and what you are asking has multiple steps depending on what features you are going for in your book. If you mean simply playing audio and that is it, then yes, you could do that in cocos2d using SimpleAudioEngine (as one option), but I assume you already knew that, which is why this question has a tad bit of vagueness to it. Either way, you probably wouldn't want to play narration as an effect but rather stream it. To do that along with background music, you'd stream background music via the left channel and narration via the right. You can easily add a method to SimpleAudioEngine to make this nice and neat. To get you started, something similar to this can be used to access the right channel:
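// Fetch the long audio source bound to the right channel.
CDLongAudioSource* sound = [[CDAudioManager sharedManager] audioSourceForChannel:kASC_Right];
// Stop anything still playing on that channel before loading the new file.
if ([sound isPlaying])
{
    [sound stop];
}
[sound load:fileName];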
Also use the proper settings and recommended formats, such as aifc, for streaming audio (or really for all audio in general). Although I believe you can stream mp3 without it being decompressed first, the problem is with timing. If you are using highlighted text or looping audio, then aifc is the better option. Personally I've never had a reason to use mp3. Wav for narration is something I'd avoid, even if just for the file size increase. If the mp3 is decompressed even for streaming (I'm not sure off the top of my head whether it is), then you'd have a huge spike in memory that would be highly unwanted and at times downright bad.
There are many other things that can go into it but those are the basic first steps. If you want to do things like highlighted text, per-word animations, etc then that will take more work of course and you'd need to be comfortable with cocos2d, SpriteKit, or whatever you decide to use. I'll be doing a tutorial series on it one day soon so I'll cover all of that stuff.
On the other hand, if you are talking about recording someone's voice and having it play back, e.g. a mother recording herself reading the story so her child can hear her voice whenever they are using your app, then you'd simply record the audio like you would any other piece of audio, save it to the device, and play it back when the page is displayed in the proper reading mode (or whatever you personally call it). One place to look is AVAudioRecorder, which is part of the AVFoundation framework. Simply Google "iOS audio recording" for examples if you need them.

reading mp3 file for game development

I am currently creating a game. My game will use music from an mp3 file that the user sends in to make decisions on where to place things, how fast the level moves, etc. I am fairly new at this and have been reading about mp3. Currently I have found all the frames in the mp3 file that I am using, and I don't really know where to go from here. What I want to do is measure the frequencies of the sound wave of the music at certain times (like every second) and then, based on that frequency, do what I need to for the game. I don't know whether I should decode the mp3 (that looks like a lot of work, and I don't want to do it if I don't have to) or whether I can just read the bytes in the frame and convert them without decoding anything. I am developing this in C#, using the game engine FlatRedBall. I am not using any libraries. I am also planning on selling this game, so I would like to avoid using other people's code if I can. Please, someone help me; I just need a direction to go from here. I know how to parse the header and calculate the frame length, I just don't know the next step in what I want to do...
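For reference, the frame length calculation mentioned in the question can be sketched as below. This is an illustrative sketch in C rather than the asker's C# code, it handles only MPEG-1 Layer III headers, and the function name mp3_frame_length is hypothetical. Note that the bytes inside each frame are compressed data, so knowing where the frames are does not by itself give sample values or frequencies; the frames still have to be decoded to PCM.

```c
#include <assert.h>
#include <stdint.h>

/* Return the length in bytes of an MPEG-1 Layer III frame, given its
 * 32-bit header (first header byte in the most significant position),
 * or -1 if the header is not a supported MPEG-1 Layer III header.
 * mp3_frame_length is a hypothetical name used for illustration. */
int mp3_frame_length(uint32_t header)
{
    /* Bitrates in kbit/s for MPEG-1 Layer III; index 0 ("free") and
     * index 15 (invalid) are rejected by this sketch. */
    static const int bitrate_kbps[16] = {
        0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0
    };
    /* Sample rates in Hz for MPEG-1; index 3 is reserved. */
    static const int sample_rate_hz[4] = { 44100, 48000, 32000, 0 };

    if ((header >> 21) != 0x7FFu)        return -1; /* 11-bit frame sync */
    if (((header >> 19) & 0x3u) != 0x3u) return -1; /* version: MPEG-1   */
    if (((header >> 17) & 0x3u) != 0x1u) return -1; /* layer: Layer III  */

    int bitrate = bitrate_kbps[(header >> 12) & 0xFu] * 1000;
    int rate    = sample_rate_hz[(header >> 10) & 0x3u];
    int padding = (int)((header >> 9) & 0x1u);

    if (bitrate == 0 || rate == 0) {
        return -1;
    }
    /* 1152 samples per frame / 8 bits per byte = 144. */
    return 144 * bitrate / rate + padding;
}
```

For a 128 kbit/s, 44.1 kHz frame this gives 417 bytes, or 418 when the padding bit is set.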

Convert your music to the .ogg format, which is free, and use a free library to play it.

Note: I was going to post this as a comment but it quickly grew too big. :)
Writing your own MP3 encoder/decoder is probably going to take a good amount of effort; effort which would probably be better spent on your game itself. Therefore, if possible, I would by all means try to use an open source library.
That said, most good MP3 libraries are LGPL/GPL licensed. An LGPL library can be used in a commercial setting as long as you dynamically link to it; a GPL library generally cannot be linked into a closed-source product. Also, the SDL Mixer library, as of version 1.2.12, supports MP3s and is under the more permissive zlib license, but since you mention C# I don't know if stable and up-to-date bindings are available. Also, since your project isn't written with SDL to begin with, it might be hard to integrate.
Also, as @pro_metedor hinted, perhaps using a more open format could help with licensing issues. In general, OGG achieves better compression than MP3, which is a plus for things like download size, bandwidth/resource usage, etc.
Just shop around for a while, and try to be a little flexible. I'm sure you'll find something nice! :)
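Once a library has decoded the music to PCM, the "measure the frequency every second" step from the question can be approximated very crudely by counting zero crossings in each block of samples. The sketch below is in C rather than C#, assumes mono 16-bit signed PCM, and uses a hypothetical function name. It only gives a rough estimate that works best when one tone dominates; an FFT over each block would give a fuller picture of the spectrum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Crude dominant-frequency estimate for a block of mono 16-bit PCM.
 * A periodic signal crosses zero twice per cycle, so the estimate is
 * (sign changes / 2) divided by the duration of the block in seconds.
 * estimate_frequency_hz is a hypothetical name used for illustration. */
double estimate_frequency_hz(const int16_t *samples, size_t count,
                             double sample_rate)
{
    assert(samples != NULL && count >= 2 && sample_rate > 0.0);

    size_t crossings = 0;
    for (size_t i = 1; i < count; i++) {
        int prev_negative = samples[i - 1] < 0;
        int curr_negative = samples[i] < 0;
        if (prev_negative != curr_negative) {
            crossings++;
        }
    }

    /* `count` samples span `count - 1` sample intervals. */
    double seconds = (double)(count - 1) / sample_rate;
    return (double)crossings / (2.0 * seconds);
}
```

Calling this once per second of decoded audio (for example on blocks of 44100 samples at 44.1 kHz) yields one number per second that the game logic could react to.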

video/audio encoding/decoding/playback

I've always wanted to try and make a media player but I don't understand how. I found FFmpeg and GStreamer but I seem to be favoring FFmpeg despite its worse documentation even though I haven't written anything at all. That being said, I feel I would understand how things worked more if I knew what they were doing. I have no idea how video/audio streams work and the several media types so that doesn't help. At the end of the day, I'm just 'emulating' some of the code samples.
Where do I start to learn how to encode/decode/play back video/audio streams without having to read hundreds of pages of several 'standards'? Ideally I would also learn enough to play back media without relying on another API. Googling 'basic video audio decoding encoding' doesn't seem to help. :(
This seems to be a black art that nobody is out to tell anyone about.

The first part is extracting streams from the container. From there, you need to decode the streams into media. I recommend finding a small Theora video and seeing how the pieces relate there.

You want us to write one answer that you can read to become a master of the multimedia domain!
That cannot be done in one answer.
First of all, understand this terminology by googling:
1> container -- muxer/demuxer
2> codec -- coder/decoder
If you like FFmpeg, then start with its basic video player application. It is well documented at http://dranger.com/ffmpeg/, which shows how to demux a container and decode any elementary stream with the FFmpeg API. More about this at http://ffmpeg.org/ffplay.html
I like GStreamer more than FFmpeg. It is well documented, and it would be a good choice to start with.

Absolute beginners guide to working with audio in C/C++?

I've always been curious about audio conversion software, but I have never seen a proper explanation, from a beginner's point of view, of how to write a simple program that converts, for example, an MP3 file to a WAV. I'm not asking about any of the complex algorithms involved, just a small example using a simple library. Searching on SO, I came up with several names including:
Lame
The Synthesis Toolkit
OpenAL
DirectSound
But I'm unable to find a straightforward example of any of these libraries. Usually I don't mind wading through tons of code, but here I have absolutely no knowledge about the subject and so I always feel like I'm shooting in the dark.
Anyone here have a simple example / tutorial on converting a sound file using any of these libraries? My question is specifically directed towards C/C++ because those are the two languages I'm currently learning and so I'd like to continue to focus on them.
Edit: One thing I forgot to mention: I'm on a *NIX system.

Thanks everyone for the responses! I sort of cobbled them together to successfully make a small utility that converts an AIFF/WAV/etc file to an mp3 file. There seems to be some interest in this question, so here is what I did, step by step:
Step 1:
Download and install the libsndfile library as suggested by James Morris. This library is very easy to use – its only shortcoming is it won't work with mp3 files.
Step 2:
Look inside the 'examples' folder that comes with libsndfile and find generate.c. This gives a nice working example of converting any non-mp3 file to various file formats. It also gives a glimpse of the power behind libsndfile.
Step 3:
Borrowing code from generate.c, I created a c file that just converts an audio file to a .wav file. Here is my code: http://pastie.org/719546
Step 4:
Download and install the LAME encoder. This will install both the libmp3lame library and the lame command-line utility.
Step 5:
Now you can peruse LAME's API, or just fork and exec a lame process, to convert your wav file to an mp3 file.
Step 6: Bring out the champagne and caviar!
If there is a better way (I'm sure there is) to do this, please let me know. I personally have never seen a step-by-step roadmap like this so I thought I'd put it out there.
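To show what the .wav output of Step 3 amounts to, here is a minimal sketch that writes 16-bit PCM samples to a WAV file using only the C standard library. It deliberately does not use libsndfile (which is what the steps above rely on and which handles many more formats); the function name write_wav_pcm16 and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* WAV fields are little-endian, so write them byte by byte to stay
 * independent of the host's endianness. */
static void put_u16(FILE *f, uint16_t v)
{
    fputc(v & 0xFF, f);
    fputc((v >> 8) & 0xFF, f);
}

static void put_u32(FILE *f, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        fputc((int)((v >> (8 * i)) & 0xFF), f);
    }
}

/* Write interleaved 16-bit signed PCM as a canonical 44-byte-header WAV.
 * `frames` is the number of sample frames (one sample per channel each).
 * Returns 0 on success, -1 on failure. Hypothetical helper for illustration. */
int write_wav_pcm16(const char *path, const int16_t *samples,
                    uint32_t frames, uint16_t channels, uint32_t sample_rate)
{
    assert(path != NULL && samples != NULL && channels > 0);

    FILE *f = fopen(path, "wb");
    if (f == NULL) {
        return -1;
    }

    uint32_t total_samples = frames * channels;
    uint32_t data_bytes = total_samples * 2u;

    fwrite("RIFF", 1, 4, f);
    put_u32(f, 36u + data_bytes);            /* bytes following this field */
    fwrite("WAVE", 1, 4, f);

    fwrite("fmt ", 1, 4, f);
    put_u32(f, 16u);                         /* size of the fmt chunk      */
    put_u16(f, 1u);                          /* format tag 1 = linear PCM  */
    put_u16(f, channels);
    put_u32(f, sample_rate);
    put_u32(f, sample_rate * channels * 2u); /* bytes per second           */
    put_u16(f, (uint16_t)(channels * 2u));   /* bytes per sample frame     */
    put_u16(f, 16u);                         /* bits per sample            */

    fwrite("data", 1, 4, f);
    put_u32(f, data_bytes);
    for (uint32_t i = 0; i < total_samples; i++) {
        put_u16(f, (uint16_t)samples[i]);
    }

    int failed = ferror(f);
    if (fclose(f) != 0) {
        failed = 1;
    }
    return failed ? -1 : 0;
}
```

A file written this way can then be handed to the lame command-line utility from Step 5 to produce the mp3.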

For converting between various formats (except MP3) check libsndfile http://mega-nerd.com/libsndfile/
"Libsndfile is a library designed to allow the reading and writing of many different sampled sound file formats (such as MS Windows WAV and the Apple/SGI AIFF format) through one standard library interface.
During read and write operations, formats are seamlessly converted between the format the application program has requested or supplied and the file's data format. The application programmer can remain blissfully unaware of issues such as file endian-ness and data format."
It is also simple to use, with the API following the style of the Standard C library function names:
http://mega-nerd.com/libsndfile/api.html
And examples are included in the source distribution.
For actual audio output, another library will be needed; SDL, as already mentioned, might be a good place to start. While SDL can also read/write audio files, libsndfile is far superior for that.

If you're curious about DSP and computers, take a look at the Synthesis Toolkit. It's sweet. It's designed for learning. The examples and tutorials they have on their website are straightforward and thorough. Keep in mind that the guys who wrote it did so in order to create acoustic models of real instruments. As a result, they've included some instruments that are just plain wacky, but fun. It will give you a core understanding of processing PCM sound. And you'll probably be able to hack together some fun little noisemakers while you're at it.
https://ccrma.stanford.edu/software/stk/

Check libmad (http://mad.sourceforge.net), the "MPEG Audio Decoder" library; it should provide a good example.
Also, for easy cross-platform audio handling, check SDL http://www.libsdl.org/.
Hope that helps.
