How to convert from any format to PCM in Windows

I am using WASAPI to capture audio data in C++. I learned that WASAPI does not support conversion of audio data, since it only passes data to and from the Core Audio endpoints. I am working on a project that finds the exact audio frequency, which needs plain PCM data only. With WASAPI, however, the format of the data I get depends on the audio device. So, is there a simple Windows API to convert any audio data to PCM?
Note: I get audio using the method
IAudioCaptureClient::GetBuffer(&data,...);
Or is there any other API that I can use to get the data in PCM format directly, for Windows desktop and Windows Phone?

Audio Resampler DSP - the most recent stock conversion API
Audio Compression Manager (ACM) - legacy converter offering conversions between PCM formats (also available through a DirectShow wrapper over it; most likely you don't want to use it, but it is mentioned here for completeness)
Also worth mentioning: FFmpeg's libswresample, a popular alternative for the conversion; you can easily find other libraries as well
See also:
How can I resample wav file
WASAPI Resampling / Windows Media Foundation
Change Audio Samplerate through this code which currently changed Bit depth?
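For the simplest case, a hand-written conversion may already be enough. The sketch below assumes the capture buffer holds 32-bit IEEE float samples in the range [-1.0, 1.0], which is common for shared-mode WASAPI but has to be verified against the format reported by IAudioClient::GetMixFormat. The function name floatToPcm16 is made up for this example, and it only changes the sample type; sample-rate conversion still needs one of the APIs listed above.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert 32-bit float samples to signed 16-bit PCM.
// Assumption: the input really is IEEE float in [-1.0, 1.0]; check the
// device format before reinterpreting the bytes returned by GetBuffer.
std::vector<std::int16_t> floatToPcm16(const float* samples, std::size_t count)
{
    std::vector<std::int16_t> pcm(count);
    for (std::size_t i = 0; i < count; ++i) {
        // Clamp first so out-of-range input cannot overflow the 16-bit type.
        const float clamped = std::max(-1.0f, std::min(1.0f, samples[i]));
        pcm[i] = static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
    }
    return pcm;
}
```

The count argument is the number of individual samples, that is, frames multiplied by channels for an interleaved buffer.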

Related

Problem Converting WAV file with more than 2 channels to MP3

I am developing a C# application that records streaming audio to MP3.
I’m new to this but from what I’ve seen so far, the easiest way to do this is record to WAV using NAudio and then create an MP3 version using either LameMP3FileWriter or MediaFoundationEncoder.
I’m running into problems with the conversion, however, as my PC sound system is 5.1 and the MP3 conversion crashes due to the number of channels in the recorded WAV file. It works fine when I reconfigure my sound system to stereo but this is a bit of a pain; firstly, it means I cannot use my 5.1 system when recording the music but more of an issue, for some reason that I cannot figure out, if I set my speakers to stereo, they revert to quadraphonic when the PC (Windows 10) is rebooted!
Can anyone suggest how I can do this conversion without the need to configure my 5.1 sound?
One obvious solution is to do something like resampling the WAV file to 2 channels before the conversion to MP3, but that seems something of a ‘long shortcut’. I’m also unclear as to the advantage of recording to WAV in the first place – audio streams are compressed and unlikely to have more than 2 channels to start with, so playing them over 5.1, no matter how good it sounds, is really a bit illusory.
It would seem more sensible to just record the stream direct to MP3 but I cannot find any straightforward way of doing that.

The MP3 specification does not handle 5.1, so it seems your MP3 encoders fail on 5.1 input.
Perhaps you can try an encoder that supports MP3 Surround, an MP3 extension for 5.1.
Also, perhaps you should consider using AAC encoding, a codec that is friendlier to 5.1.

Full quality MP3 streaming via webRTC

I'm interested in webRTC's ability to P2P livestream an mp3 audio from user's machine. Only example, that I found is this: https://webrtc-mp3-stream.herokuapp.com/ from this article http://servicelab.org/2013/07/24/streaming-audio-between-browsers-with-webrtc-and-webaudio/
But, as you can see, the audio quality on the receiving side is pretty poor (45 kb/sec). Is there any way to get full-quality MP3 streaming plus the ability to manipulate the stream's data (like adjusting frequencies with an equalizer) on each user's side?
If impossible through webRTC, is there any other flash-plugin or pluginless options for this?
Edit: also I stumbled upon this 'shoutcast kinda' guys http://unltd.fm/ , declaring, that they are using webRTC to deliver top quality radio broadcasting including streaming mp3. If they are, then how?

WebRTC supports 2 audio codecs: Opus (max bitrate 510 kbit/s) and G.711. You should stick with Opus; it is modern and more promising, introduced in 2012.
Main files in webrtc-mp3-stream are outdated by 2 years (Jul 18, 2013). I couldn't find OPUS preference in the code, so possibly demo runs via G711.
The webrtc-mp3-stream demo does the encoding job (MP3 as a media source), then it transmits the data over UDP/TCP via WebRTC. I do not think you need to convert it back to MP3 on the receiver side; this would be overkill. Just try to enable Opus to make the code of webrtc-mp3-stream more up-to-date.
Please refer to Is there a way to choose codecs in WebRTC PeerConnection? to enable OPUS to see the difference.

I'm the founder of unltd.fm.
igorpavlov is right, but I can't comment on the answer. We also use the Opus codec (stereo / 48 kHz) over WebRTC.
Decoding MP3 (or any other audio format) using Web Audio and then encoding it in Opus is the way to go. You "just" need to force the SDP negotiation to use Opus.
You should have sent us an email; you would have saved your 50 points ;)

You can increase the quality of a stream by setting the SDP to be stereo and increasing the maxaveragebitrate:
let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);
This should output an SDP string which looks like this:
a=fmtp:111 minptime=10;useinbandfec=1; stereo=1; maxaveragebitrate=510000
This gives a potential maximum bitrate of 510 kbit/s for stereo, which is 255 kbit/s per channel. The actual bitrate depends on the speed of your network and the strength of your signal.
You can read more about the other available SDP attributes at: https://www.rfc-editor.org/rfc/rfc7587

How to stream raw synthesized PCM audio in C++/CX Windows 8 apps?

Simply put, I want my C++/CX XAML Windows 8 app to output continuous synthesized sound (not sound effects). However, I've been looking all over the Web and I cannot figure out how to feed the system buffers of PCM samples (or better, have it ask me for them through callbacks) so that they are played. I would use the old waveOut* APIs, but they are banned in Store app development.
So, what is the simplest way to do this? Please note that I am not interested in playing media files (.wav, .mp3) or web audio streaming.
Thanks in advance.

You need to use WASAPI which is enabled in Windows Store apps. This article will get you started with how to use the API to render audio. One annoyance is that WASAPI devices generally don't resample for you so you'll have to be willing to go with what the device is using (probably 44.1kHz or 48kHz) or do the resampling yourself (for which you can make use of the Resampler Media Foundation transform).
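Whatever API ends up rendering the audio, the synthesis side comes down to filling each buffer the device asks for. Here is a minimal sketch of that part only, assuming the device wants interleaved 32-bit float samples; the class name SineSource and its fill method are invented for this example and are not part of WASAPI.

```cpp
#include <cmath>
#include <cstddef>

namespace {
constexpr double kTwoPi = 6.28318530717958647692;
}

// Hypothetical helper: fills interleaved float buffers with a sine tone and
// keeps the phase between calls so consecutive buffers join without clicks.
class SineSource {
public:
    SineSource(double frequencyHz, double sampleRateHz)
        : phaseStep_(kTwoPi * frequencyHz / sampleRateHz) {}

    // Writes `frames` frames of `channels` samples each into `buffer`.
    void fill(float* buffer, std::size_t frames, std::size_t channels)
    {
        for (std::size_t f = 0; f < frames; ++f) {
            const float value = static_cast<float>(std::sin(phase_));
            for (std::size_t c = 0; c < channels; ++c)
                buffer[f * channels + c] = value; // same tone on every channel
            phase_ += phaseStep_;
            if (phase_ >= kTwoPi)
                phase_ -= kTwoPi; // keep the phase small to preserve precision
        }
    }

private:
    double phaseStep_;
    double phase_ = 0.0;
};
```

In a render loop, fill would be called once per buffer obtained from the device, with the sample rate and channel count taken from the device format rather than hard-coded.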

Portable library to play samples on individual 5.1 channels with C/C++?

I'm looking for a free, portable C or C++ library which allows me to play mono sound samples on specific channels in a 5.1 setup. For example the sound should be played with the left front speaker whereby all other speakers remain silent. Is there any library capable of doing this?
I had a look at OpenAL. However, I can only specify the position from which the sound should come, but it seems to me that I cannot say something like "use only the front left channel to play this sound".
Any hints are welcome!

I had a look at OpenAL. However, I can only specify the position from which the sound should come, but it seems to me that I cannot say something like "use only the front left channel to play this sound".
I don't think this is quite true. I think you can do it with OpenAL, although it's not trivial. OpenAL only does the positional stuff if you feed it mono format data. If you give it stereo or higher, it plays the data the way it was provided. However, you're only guaranteed stereo support. You'll need to check whether the 5.1 channel format extension is available on your system (AL_FORMAT_51CHN16). If so, I think you feed your sound to the channel you want and feed zeroes to all the other channels when you buffer the samples. Note that you need hardware support for this on the sound card. A "generic software" device won't cut it.
See this discussion from the OpenAL mailing list.
Alternatively, I think that PortAudio is open source, cross-platform, and supports multiple channel output. You do still have to interleave the data, so if you're sending a sound to a single channel, you have to send zeroes to all the others. You'll also still need to do some checking when opening a stream on a device to make sure the device supports 6 channels of output.
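The interleaving described above is independent of the playback library. As a rough sketch, assuming 16-bit samples and a 6-channel layout where index 0 is the front left speaker (the actual channel order depends on the API and device), a helper with the made-up name monoToChannel could look like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: place a mono signal on one channel of an interleaved
// multi-channel buffer and leave every other channel silent (zero).
std::vector<std::int16_t> monoToChannel(const std::vector<std::int16_t>& mono,
                                        std::size_t targetChannel,
                                        std::size_t channelCount = 6)
{
    // All samples start at zero, so untouched channels stay silent.
    std::vector<std::int16_t> out(mono.size() * channelCount, 0);
    if (targetChannel >= channelCount)
        return out; // invalid target: return pure silence
    for (std::size_t i = 0; i < mono.size(); ++i)
        out[i * channelCount + targetChannel] = mono[i];
    return out;
}
```

The resulting buffer is what would be handed to the 5.1 buffer format in OpenAL or to a 6-channel PortAudio stream.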

A long time ago I used RTAudio. I cannot say whether this lib can do what you want to achieve, but maybe it helps.

http://fmod.org could do the trick too

I use the BASS Audio Library http://www.un4seen.com for all my audio, sound and music projects. I am very happy with it.
BASS is an audio library to provide developers with powerful and efficient sample, stream (MP3, MP2, MP1, OGG, WAV, AIFF, custom generated, and more via add-ons), MOD music (XM, IT, S3M, MOD, MTM, UMX), MO3 music (MP3/OGG compressed MODs), and recording functions. All in a tiny DLL, under 100KB* in size. C/C++, Delphi, Visual Basic, MASM, .Net and other APIs are available. BASS is available for the Windows, Mac, Win64, WinCE, Linux, and iOS platforms.
I have never used it to play different samples in a 5.1 configuration. But, according to their own documentation, it should be possible.
Main features
Samples Support for WAV/AIFF/MP3/MP2/MP1/OGG and custom generated samples
Sample streams Stream any sample data in 8/16/32 bit, with both "push" and "pull" systems
File streams MP3/MP2/MP1/OGG/WAV/AIFF file streaming
Internet file streaming Stream data from HTTP and FTP servers (inc. Shoutcast, Icecast & Icecast2), with IDN and proxy server support and adjustable buffering
Custom file streaming Stream data from anywhere using any delivery method, with both "push" and "pull" systems
Multi-channel Support for more than plain stereo, including multi-channel OGG/WAV/AIFF files
...
Multiple outputs Simultaneously use multiple soundcards, and move channels between them
Speaker assignment Assign streams and MOD musics to specific speakers to take advantage of hardware capable of more than plain stereo (up to 4 separate stereo outputs with a 7.1 soundcard)
3D sound Play samples/streams/musics in any 3D position
Licensing
BASS is free for non-commercial use. If you are a non-commercial entity (eg. an individual) and you are not making any money from your product (through sales, advertising, etc), then you can use BASS in it for free. Otherwise, one of the following licences will be required.

How to encode pixels from buffer to h.264 or VP8

I have an application (Qt, C++) that reads data from a USB device and decodes that data into 24-bit RGB pixels, which are stored in a uchar array.
Framerate is ~10 FPS. Framesize is 128x4096.
Question is: How to encode these frames into VP8 or h.264 video in real time?
No external processes are allowed, everything needs to run inside my application.
ffmpeg is an option, but how do I include it in my project and use it? The documentation is rather bad, to say the least. x264 could also be an option, but the same question applies as for ffmpeg. And it's also quite expensive: $1 per unit with a minimum of 10000 units.
Simple guide would be helpful but I doubt there exists one.
Application should run in Windows and Linux.

The problem with the VP8 SDK is that the examples only encode to IVF. That codec appears to have been shut down by Microsoft due to a security flaw (buffer overflow). It's pretty hard to even get the VP8 project set up when you can't check the results. It at least uses a BSD license scheme and it's supposedly unencumbered by patents.
The VP8 SDK has some routines for converting formats, but they are buried in the source tree.
An option not mentioned is the Intel Media SDK, but that locks you to Windows.
There is also Theora and Dirac.
X264 has an encoder, but it would be expensive to get a commercial license.
GPLv2 source code is not "free". I don't care what they try to get you to believe.
There is also a project called "Revel - the Really Easy Video Encoding Library". That is a path to getting MPEG-4 part 2 files encoded. H264 is MPEG-4 part 10. H264 is also called AVC. Revel is also GPL'd.
Ffmpeg is a catch all utility that tries to create a wrapper around the various encoders/decoders. If you use the x264 encoder with it, it becomes GPLv2.
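Whichever encoder is picked, the RGB frames from the question will most likely have to be converted first, since VP8 and H.264 encoders are normally fed planar YUV 4:2:0 (I420) data. The following is only a sketch of such a conversion using the common BT.601 integer approximation; the I420Frame struct and the function name are made up for illustration, even width and height are assumed, and chroma is taken from the top-left pixel of each 2x2 block rather than averaged.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for one planar YUV 4:2:0 frame.
struct I420Frame {
    int width;
    int height;
    std::vector<std::uint8_t> y, u, v;
};

static std::uint8_t clampToByte(int value)
{
    return static_cast<std::uint8_t>(std::min(255, std::max(0, value)));
}

// Convert packed 24-bit RGB (R, G, B byte order) to I420.
// Assumptions: width and height are even; BT.601 limited-range coefficients.
I420Frame rgb24ToI420(const std::uint8_t* rgb, int width, int height)
{
    const std::size_t lumaSize = static_cast<std::size_t>(width) * height;
    const std::size_t chromaSize = static_cast<std::size_t>(width / 2) * (height / 2);
    I420Frame frame{width, height,
                    std::vector<std::uint8_t>(lumaSize),
                    std::vector<std::uint8_t>(chromaSize),
                    std::vector<std::uint8_t>(chromaSize)};
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            const std::uint8_t* p = rgb + 3 * (static_cast<std::size_t>(row) * width + col);
            const int r = p[0], g = p[1], b = p[2];
            frame.y[static_cast<std::size_t>(row) * width + col] =
                clampToByte(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
            if (row % 2 == 0 && col % 2 == 0) {
                // One chroma sample per 2x2 block (no averaging in this sketch).
                const std::size_t i = static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
                frame.u[i] = clampToByte(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
                frame.v[i] = clampToByte(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
            }
        }
    }
    return frame;
}
```

For the 128x4096 frames in the question, each call would produce a 524288-byte Y plane and two 131072-byte chroma planes.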

The VP8 SDK has documentation and even some sample code
