I'm right now reading the microsoft documentation about drivers and core audio apis. At the moment I'm still confuse which way to go to achieve what I need.
I have an audio application which is Standalone and coded with framework JUCE in C++. And I need to build a Windows solution that would capture the audio stream that is going to an audio endpoint device to use it as an input of my audio application.
This stream must have an unaltered volume: always 1.0 (no matter if the hardware volume is changed or muted).
I must be able to choose between the different endpoint devices, for exemple if I have an external soundcard that is plugged, my audio application should be able to intercept and copy the stream that is going to that external soundcard, or do the same for the stream that is going to the built-in speakers.
The idea is to capture the output streams before they are modified by hardware volume modifications, and make a copy of them routed to my application without changing the output routing and behaviour.
The microsoft documentation is very furnished, but even if the WASAPI provides a lot of ways to capture and stream from audio endpoint devices, I'm not sure it is possible to get an unaltered volume, as it will always capture what's exactly coming out of the speakers.
This is why I don't know If I can implement a feature directly in my audio application that will get the streams I want with WASAPIs or if I have to code a proper Audio Driver that would make a copy of the streams I want for my application to be able to use these streams.
The documentations I refer to:
Audio Drivers design guide
Core Audio APIs / WASAPI
Thanks for the help,
Best,
Maxime
Sometimes the volume control is implemented in software, and sometimes it is implemented in hardware. You can call IAudioEndpointVolume::QueryHardwareSupport to see if the volume control for the audio endpoint you're working with is implemented in hardware or software.
Sometimes the audio loopback is implemented in software, and sometimes it is implemented in hardware. There is no API to tell which.
If the audio loopback is implemented in software, and the volume control is implemented in hardware, then you will get back the data you want.
If the audio loopback is implemented in hardware, or the volume control is implemented in software, the the audio data you get back has already had the volume adjustment applied.
What does your application do with the audio data it receives? The primary use case for audio loopback data is echo cancelation, where you usually WANT the volume to be applied.
Related
Does anyone know how to programmatically capture the sound that is being played (that is, everything that is coming from the sound card, not the input devices such as a microphone).
Assuming that you are talking about Windows, there are essentially three ways to do this.
The first is to open the audio device's main output as a recording source. This is only possible when the driver supports it, although most do these days. Common names for the virtual device are "What You Hear" or "Wave Out". You will need to use a suitable API (see WaveIn or DirectSound in MSDN) to do the capturing.
The second way is to write a filter driver that can intercept the audio stream before it reaches the physical device. Again, this technique will only work for devices that have a suitable driver topology and it's certainly not for the faint-hearted.
This means that neither of these options will be guaranteed to work on a PC with arbitrary hardware.
The last alternative is to use a virtual audio device, such as Virtual Audio Cable. If this device is set as the defualt playback device in Windows then all well-behaved apps will play through it. You can then record from the same device to capture the summed output. As long as you have control over the device that the application you want to record uses then this option will always work.
All of these techniques have their pros and cons - it's up to you to decide which would be the most suitable for your needs.
You can use the Waveform Audio Interface, there is an MSDN article on how to access it per PInvoke.
In order to capture the sound that is being played, you just need to open the playback device instead of the microphone. Open for input, of course, not for output ;-)
If you were using OSX, Audio Hijack Pro from Rogue Amoeba probably is the easiest way to go.
Anyway, why not just looping your audio back into your line in and recording that? This is a very simple solution. Just plug a cable in your audio output jack and your line in jack and start recordung.
You have to enable device stero mix. if you do this, direct sound find this device.
I've been stuck on this problem for weeks now and Google is no help, so hopefully some here can help me.
I am programming a software sound mixer in C++, getting audio packets from the network and Windows microphones, mixing them together as PCM, and then sending them back out over the network and to speakers/USB headsets. This works. I have a working setup using the PortAudio library to handle the interface with Windows. However, my supervisors think the latency could be reduced between this software and our system, so in an attempt to lower latency (and better handle USB headset disconnects) I'm now rewriting the Windows interface layer to directly use WASAPI. I can eliminate some buffers and callbacks doing this, and theoretically use the super low latency interface if that's still not fast enough for the higher ups.
I have it only partially working now, and the partially part is what is killing me here. Our system has the speaker and headphones as three separate mono audio streams. The speaker is mono, and the headset is combined from two streams to be stereo. I'm outputting this to windows as two streams, one for a device of the user's choice for speaker, and one of another device of the user's choice for headset. For testing, they're both outputting to the default general stereo mix on my system.
I can hear the speaker perfectly fine, but I cannot hear the headset, no matter what I try. They both use the same code path, they both go through a WMF resampler to convert to 2 channel audio at the sample rate Windows wants. But I can hear the speaker, but never the headset stream.
It's not an exclusive mode problem: I'm using shared mode on all streams, and I've even specifically tried cutting down the streams to only the headset, in case one was stomping the other or something, and still the headset has no audio output.
It's not a mixer problem upstream, as I haven't modified any code from when it worked with PortAudio streams. I can see the audio passing through the mixer and to the output via my debug visualizers.
I can see the data going into the buffer I get from the system, when the system calls back to ask for audio. I should be hearing something, static even, but I'm getting nothing. (At one point, I bypassed the ring buffer entirely and put random numbers directly into the buffer in the callback and I still got no sound.)
What am I doing wrong here? It seems like Windows itself is the problem or something, but I don't have the expertise on Windows APIs to know what, and I'm apparently the most expert for this stuff in my company. I haven't even looked yet as to why the microphone input isn't working, and I've been stuck on this for weeks now. If anyone has any suggestions, it'd be much appreciated.
Check the re-sampled streams: output the stereo stream to the speaker, and output the mono stream to the handset.
Use IAudioClient::IsFormatSupported to check supported formats for the handset.
Verify your code using an mp3 file. Use two media players to play different files with different devices simultaneously.
I am trying to write a pro music/audio processing application, and I would like to be able to interact with the audio inputs/outputs at a very low level - ideally something allowing me to apply effects to the audio inputs and output this in real-time, similar to programs like Logic, Ableton etc.
I have written a pretty basic program that detects audio endpoint devices and can change their volumes using the MMDevice interface, but this is nowhere near the functionality I would like.
I have learned from the Microsoft docs that the four core-audio APIs are:
MMDevice
WASAPI
DeviceTopology
EndpointVolume
but it doesn't seem like any of these have the capabilities that I need. I'm thinking that I will need to be able to interact with the speakers at the level of setting the position of the membrane at a given time.
Is this even possible? If so, what can I use to do this?
The Windows Audio Session API (WASAPI) is the best bet for this purpose. It allows interaction with audio endpoints and setting up audio streams (which are streams of data that you can send or receive in real time). A good example is here.
I was wondering if it is possible to capture a copy of the audio output in Qt so I can process it. Here they said it's possible to monitor the playback, but I think it's only possible if you use a self made music player, which I don't want. I want to capture the signal from no matter where it is player (youtube, spotify, facebook, etc.). Is there a way to analyze this data with Qt? Is it possible to set my output of my soundcard as a QMediaSource?
Thank you in advance.
In general, no, that isn't possible, simply because your process (and therefore the Qt library that is loaded into your process) does not have access to that information. (I believe this lack of access is deliberate; since if it did have access like that, there might be security and/or privacy implications, i.e. app A could use it to spy on the audio output of app B, etc)
There may be an OS-specific mechanism that you can use; for example, if you are running your program under MacOS/X, you can install the SoundFlower audio driver that can function as a loopback device, allowing programs to read audio from its "audio input" that was previously routed to its "audio output". But without that kind of external support, it's not currently possible to record the computer's audio output via Qt.
I was curious if there is a way to read the data that is being sent to an audio output. My end goal is to capture the audio and then send it over serial for audio processing. I'm using a Windows computer.
The thing that seems to be making this more difficult is that I'm not reading the captured microphone input, but rather the streamed speaker output.
Can anybody help me out?
A more or less easy way is to take advantage of Stereo Mix device, where available. This way you have an audio capture device, which makes you available device audio output mixed down. You can read from this device as if it were a real audio input device such as Line In, or a microphone, using standard and well documented APIs or audio libraries.
Other options are more sophisticated and require both hooking into system and deeper understanding of the internals: you either hook audio APIs to intercept what applications send to audio outputs, or you install a virtual audio device the applications use and you have the data available from.