I am working on this project in which we've got to implement real time audio streaming. Think of it as more or less a phone conversation, so the audio data that is fed in needs to be played while being fed.
I want to know if there are any libraries (Linux) that would enable me to do this. If this can be done via SDL it would be great because we are already using SDL for many other purposes.
This is doable using SDL. SDL_OpenAudio provides lowish level access to the audio device, registering a callback function that is called from a separate thread that will fill the audio buffer with sound data whenever it is ready for more.
Related
I've been stuck on this problem for weeks now and Google is no help, so hopefully some here can help me.
I am programming a software sound mixer in C++, getting audio packets from the network and Windows microphones, mixing them together as PCM, and then sending them back out over the network and to speakers/USB headsets. This works. I have a working setup using the PortAudio library to handle the interface with Windows. However, my supervisors think the latency could be reduced between this software and our system, so in an attempt to lower latency (and better handle USB headset disconnects) I'm now rewriting the Windows interface layer to directly use WASAPI. I can eliminate some buffers and callbacks doing this, and theoretically use the super low latency interface if that's still not fast enough for the higher ups.
I have it only partially working now, and the partially part is what is killing me here. Our system has the speaker and headphones as three separate mono audio streams. The speaker is mono, and the headset is combined from two streams to be stereo. I'm outputting this to windows as two streams, one for a device of the user's choice for speaker, and one of another device of the user's choice for headset. For testing, they're both outputting to the default general stereo mix on my system.
I can hear the speaker perfectly fine, but I cannot hear the headset, no matter what I try. They both use the same code path, they both go through a WMF resampler to convert to 2 channel audio at the sample rate Windows wants. But I can hear the speaker, but never the headset stream.
It's not an exclusive mode problem: I'm using shared mode on all streams, and I've even specifically tried cutting down the streams to only the headset, in case one was stomping the other or something, and still the headset has no audio output.
It's not a mixer problem upstream, as I haven't modified any code from when it worked with PortAudio streams. I can see the audio passing through the mixer and to the output via my debug visualizers.
I can see the data going into the buffer I get from the system, when the system calls back to ask for audio. I should be hearing something, static even, but I'm getting nothing. (At one point, I bypassed the ring buffer entirely and put random numbers directly into the buffer in the callback and I still got no sound.)
What am I doing wrong here? It seems like Windows itself is the problem or something, but I don't have the expertise on Windows APIs to know what, and I'm apparently the most expert for this stuff in my company. I haven't even looked yet as to why the microphone input isn't working, and I've been stuck on this for weeks now. If anyone has any suggestions, it'd be much appreciated.
Check the re-sampled streams: output the stereo stream to the speaker, and output the mono stream to the handset.
Use IAudioClient::IsFormatSupported to check supported formats for the handset.
Verify your code using an mp3 file. Use two media players to play different files with different devices simultaneously.
I'm a beginner when it comes to programming and I wanted to do a personal project in C++ to develop my skills. The project I had in mind involves playing audio on my laptop (running Windows 10), analyzing it, and sending data to an arduino that will change the color and brightness of LED lights in sync with the audio that's playing. I would like it so that I can simply, for example, just play a song on Spotify or a music video on Youtube etc. and the program will get data from that audio stream as an input. Elsewhere I've seen programs use audio from recorded WAV files or streams from a microphone as input, but not what I have in mind. I want to use this program for parties, so using a microphone as a workaround wouldn't be ideal.
Is this even possible? And if so how should I approach this problem? Are there certain APIs I should look to or what? If the program gets audio as the input, would I still be able to play music on something like a bluetooth speaker as well? Or can it only send data to one place at a time?
My roommate who is much better at programming than me accomplished this on Mac using Swift, and while I don't have a Mac, would using Linux instead make this easier?
Modern windows has “Stereo Mix” recording device, just for that. Here’s how to enable: https://technicalustad.com/enable-stereo-mix-in-windows-10/
After that setup, in your C++ program use any recording API you want.
Here’s a sample that does what you ask for, opens a recording device, starts recording, and sends audio samples to the class provided in the argument: https://learn.microsoft.com/en-us/windows/win32/coreaudio/capturing-a-stream You probably want to trade CPU time for latency for your application, i.e. don’t Sleep for hnsActualDuration/REFTIMES_PER_MILLISEC/2, change into Sleep( 0 ) or Sleep( 1 )
My company is currently working on what could be called an audio analysis program which needs to process to multiple audio inputs (8 or so) in real time. This means that we need a framework that can handle multichannel audio interface devices that have up to 8 input channels. On top of this, the framework should be as portable as possible. We actually started our development using Java but it ran into issues with the sound API.
When looking for alternate ways to do what we need, I started thinking about using C++ and Qt. I have some experience with both, but I've never done anything remotely similar (in any language for that matter)
Now, the question is, can Qt/Phonon handle audio interfaces/sound cards with over 2 input channels (assuming that the OS can see the devices just fine)? Would it dependent on the backend being used?
Phonon as no input function. it's for playback only if i'm right. but if you want to process input audio you can use QAudioInput. I've used it with just one audio input but I think this constructor with the right QAudioDeviceInfo could do what you want.
I'm trying to get mpg123 audio decoder to work with QT on windows. How do i play the decoded audio data at the right speed with Qmultimedia module in push mode. Currently i'm using simple timer to get it to play audio but it's not very efficient way to do it, if I do anything else at the same time audio get all distorted. Is there any better way to send the decoded data to audio output? It would be nice if anyone could point me to any nice examples using Qmultimedia module and Qaudiooutput class. I've tried to figure out QT example project "audiooutput" but it seems that it's also using timer to send audio to output in push mode.. Hope that I'm not too confusing.
I also had to figure that out and I would also suggest using the Phonon framework to do this.
It uses Windows Media Player as host on Windows, QuickTime on Mac and some KDE stuff on Linux.
So it's pretty platform independent.
If you need more low-level functionality, you should take a look into an open-source project called portaudio. It's very easy to use and you can manipulate or even fill buffers from code.
I used it to build an oscillator.
Hope that helps!
Best,
guitarflow
I wanted to get some ideas one how some of you would approach this problem.
I've got a robot, that is running linux and uses a webcam (with a v4l2 driver) as one of its sensors. I've written a control panel with gtkmm. Both the server and client are written in C++. The server is the robot, client is the "control panel". The image analysis is happening on the robot, and I'd like to stream back the video from the camera to the control panel for two reasons:
A) for fun
B) to overlay image analysis results
So my question is, what are some good ways to stream video from the webcam to the control panel as well as giving priority to the robot code to process it? I'm not interested it writing my own video compression scheme and putting it through the existing networking port, a new network port (dedicated to video data) would be best I think. The second part of the problem is how do I display video in gtkmm? The video data arrives asynchronously and I don't have control over main() in gtkmm so I think that would be tricky.
I'm open to using things like vlc, gstreamer or any other general compression libraries I don't know about.
thanks!
EDIT:
The robot has a 1GHz processor, running a desktop like version of linux, but no X11.
Gstreamer solves nearly all of this for you, with very little effort, and also integrates nicely with the Glib event system. GStreamer includes V4L source plugins, gtk+ output widgets, various filters to resize / encode / decode the video, and best of all, network sink and sources to move the data between machines.
For prototype, you can use the 'gst-launch' tool to assemble video pipelines and test them, then it's fairly simply to create pipelines programatically in your code. Search for 'GStreamer network streaming' to see examples of people doing this with webcams and the like.
I'm not sure about the actual technologies used, but this can end up being a huge synchronization ***** if you want to avoid dropped frames. I was streaming a video to a file and network at the same time. What I eventually ended up doing was using a big circular buffer with three pointers: one write and two read. There were three control threads (and some additional encoding threads): one writing to the buffer which would pause if it reached a point in the buffer not read by both of the others, and two reader threads that would read from the buffer and write to the file/network (and pause if they got ahead of the producer). Since everything was written and read as frames, sync overhead could be kept to a minimum.
My producer was a transcoder (from another file source), but in your case, you may want the camera to produce whole frames in whatever format it normally does and only do the transcoding (with something like ffmpeg) for the server, while the robot processes the image.
Your problem is a bit more complex, though, since the robot needs real-time feedback so can't pause and wait for the streaming server to catch up. So you might want to get frames to the control system as fast as possible and buffer some up in a circular buffer separately for streaming to the "control panel". Certain codecs handle dropped frames better than others, so if the network gets behind you can start overwriting frames at the end of the buffer (taking care they're not being read).
When you say 'a new video port' and then start talking about vlc/gstreaming i'm finding it hard to work out what you want. Obviously these software packages will assist in streaming and compressing via a number of protocols but clearly you'll need a 'network port' not a 'video port' to send the stream.
If what you really mean is sending display output via wireless video/tv feed that's another matter, however you'll need advice from hardware experts rather than software experts on that.
Moving on. I've done plenty of streaming over MMS/UDP protocols and vlc handles it very well (as server and client). However it's designed for desktops and may not be as lightweight as you want. Something like gstreamer, mencoder or ffmpeg on the over hand is going to be better I think. What kind of CPU does the robot have? You'll need a bit of grunt if you're planning real-time compression.
On the client side I think you'll find a number of widgets to handle video in GTK. I would look into that before worrying about interface details.