Process audio and video independently - C++

Due to the unpopularity of my last posts here and here, I'll try something else.
I have corresponding audio (.wav) and video (.mpg) files. Let's assume the two streams were recorded synchronously. I want to process both streams online, with OpenCV for the images and with "I don't know which audio lib" (you tell me?) for the audio, while keeping them synchronized.
Note that the video is less than 2 minutes long.
Thanks for any help!

If you mean "play": WAV files require very little work to decode to a PCM signal. If you are running Linux (and probably other OSes too, with their respective audio I/O libraries), that PCM data can then be streamed to ALSA.
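If it helps, here is a minimal sketch of that idea in C++: read a canonical 16-bit PCM WAV file and push the samples straight to ALSA's default device. It assumes a plain 44-byte header with no extra chunks (a robust loader should walk the RIFF chunk list instead) and is only a starting point; build with g++ play_wav.cpp -lasound.

#include <alsa/asoundlib.h>
#include <cstdint>
#include <cstdio>
#include <vector>

// Sketch only: assumes a canonical 44-byte WAV header, 16-bit PCM samples
// and a little-endian machine. Real code should validate the chunks.
int main(int argc, char** argv) {
    if (argc < 2) { std::fprintf(stderr, "usage: %s file.wav\n", argv[0]); return 1; }

    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) { std::perror("fopen"); return 1; }

    uint8_t header[44];
    if (std::fread(header, 1, sizeof(header), f) != sizeof(header)) return 1;
    uint16_t channels   = header[22] | (header[23] << 8);
    uint32_t sampleRate = header[24] | (header[25] << 8) |
                          (header[26] << 16) | (header[27] << 24);

    snd_pcm_t* pcm = nullptr;
    if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0) return 1;
    snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       channels, sampleRate, 1 /*allow resampling*/, 500000 /*0.5 s latency*/);

    std::vector<int16_t> buf(4096 * channels);
    size_t frames;
    while ((frames = std::fread(buf.data(), sizeof(int16_t) * channels, 4096, f)) > 0)
        snd_pcm_writei(pcm, buf.data(), frames);   // writei takes a frame count

    snd_pcm_drain(pcm);
    snd_pcm_close(pcm);
    std::fclose(f);
    return 0;
}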

Related

Use ffmpeg as external tool to stream 2 or more different sources via pipeline

I have an application running on an embedded system. This application has 2 video sources (and, theoretically, 1 audio source). Focusing on the video sources, I have 2 subprocesses that compute different frame sets (unrelated to each other). I want to send these frames to 2 different streams.
I would like to avoid writing a lot of ffmpeg/libav code. I have ffmpeg compiled for the embedded system and I can use it as a tool. For example, I can write the first frame set to stdout and pipe it to ffmpeg like this:
./my_app | ffmpeg -an -i - -vcodec copy -f rtp rtp://<remote_ip>
This basically works. But now I would like to send the other frame set. How can I do that? Theoretically I need another ffmpeg instance reading from another source, which can't be the stdout of my_app, because that is already busy.
I'm thinking of using 2 video files as intermediaries. I could record the 2 frame sets into 2 video files and then run 2 ffmpeg instances reading from them. In that case I think I need a way to limit the size of the video files (like a circular buffer), because the 2 streams can grow huge over time. Is that a possibility?
This sounds "weird" even to me: I need to record a video source in real time and stream it via ffmpeg (also in real time). I don't know if it is a good idea; there are certainly real-time problems:
loop:
my_app --write_into--> video_stream1.mp4
ffmpeg <--read_from-- video_stream1.mp4
my_app --write_into--> video_stream2.mp4
ffmpeg <--read_from-- video_stream2.mp4
Do you have any suggestions for addressing this kind of situation?
Many thanks, bye.
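One note for anyone landing here: the piping trick already used for the first stream extends to a second one with a named pipe, which leaves stdout free. A rough sketch only (Linux assumed; the FIFO path and the frame loop are illustrative), with the second instance started separately as ffmpeg -an -i /tmp/video_pipe2 -vcodec copy -f rtp rtp://<remote_ip>:

#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <vector>

// Sketch: expose the second frame set through a FIFO instead of stdout.
int main() {
    const char* fifoPath = "/tmp/video_pipe2";   // hypothetical path
    if (mkfifo(fifoPath, 0666) < 0 && errno != EEXIST) {
        std::perror("mkfifo");
        return 1;
    }

    // open() blocks until the reader (the second ffmpeg instance) attaches.
    int fd = open(fifoPath, O_WRONLY);
    if (fd < 0) { std::perror("open"); return 1; }

    // Placeholder for the encoded data the real application produces.
    std::vector<unsigned char> encodedFrame(64 * 1024, 0);
    for (int i = 0; i < 100; ++i) {
        if (write(fd, encodedFrame.data(), encodedFrame.size()) < 0) break;
    }

    close(fd);
    return 0;
}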

Output Raw Data to Speakers

I'm a reasonably advanced C++ programmer, as a bit of background. At this point I want to experiment a bit with sound. Rather than use a library to load and play files, I want to figure out how to actually do that myself, for the sake of understanding. For this application I would like to read in a .wav file (I already have that part down) and then output that data to the speakers. How do I push a waveform, or the data from the file, to the speakers on my computer? I'm on Windows, by the way.
You can read this article about how to set up the audio device and stream data into it for playback on Windows. If the library it uses is too high-level for you and you'd like to go deeper, writing your own WAV decoding and feeding the samples to the sound card yourself, then you have far more research to do than is appropriate for an answer here.
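Just to show the shape of "pushing samples at the sound card" on Windows, here is a minimal sketch using the legacy waveOut API from winmm (not necessarily what the linked article uses). It plays a generated sine wave instead of decoded WAV data and skips error handling; with MinGW, link with -lwinmm.

#include <windows.h>
#include <mmsystem.h>
#include <cmath>
#include <vector>
#pragma comment(lib, "winmm.lib")   // MSVC; with MinGW link -lwinmm instead

int main() {
    const int sampleRate = 44100;
    const int seconds = 2;

    // Generate 2 seconds of a 440 Hz sine wave as stand-in PCM data.
    std::vector<short> samples(sampleRate * seconds);
    for (size_t i = 0; i < samples.size(); ++i)
        samples[i] = static_cast<short>(
            3000 * std::sin(2.0 * 3.14159265 * 440.0 * i / sampleRate));

    WAVEFORMATEX fmt = {};
    fmt.wFormatTag = WAVE_FORMAT_PCM;
    fmt.nChannels = 1;
    fmt.nSamplesPerSec = sampleRate;
    fmt.wBitsPerSample = 16;
    fmt.nBlockAlign = fmt.nChannels * fmt.wBitsPerSample / 8;
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    HWAVEOUT hwo;
    waveOutOpen(&hwo, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL);

    WAVEHDR hdr = {};
    hdr.lpData = reinterpret_cast<LPSTR>(samples.data());
    hdr.dwBufferLength = static_cast<DWORD>(samples.size() * sizeof(short));
    waveOutPrepareHeader(hwo, &hdr, sizeof(hdr));
    waveOutWrite(hwo, &hdr, sizeof(hdr));

    // Crude wait for playback to finish; a real player would use a callback.
    while (!(hdr.dwFlags & WHDR_DONE)) Sleep(50);

    waveOutUnprepareHeader(hwo, &hdr, sizeof(hdr));
    waveOutClose(hwo);
    return 0;
}

Swapping the generated sine wave for the samples you already read from the .wav file is then mostly a matter of filling WAVEFORMATEX from the file's header.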

Reading audio stream to output device

I was curious if there is a way to read the data that is being sent to an audio output. My end goal is to capture the audio and then send it over serial for audio processing. I'm using a Windows computer.
What seems to make this more difficult is that I'm not after the captured microphone input, but rather the streamed speaker output.
Can anybody help me out?
A more or less easy way is to take advantage of the Stereo Mix device, where available. That gives you an audio capture device that exposes the device's mixed-down audio output. You can read from it as if it were a real audio input device such as Line In or a microphone, using standard, well-documented APIs or audio libraries.
Other options are more sophisticated and require both hooking into the system and a deeper understanding of its internals: you either hook the audio APIs to intercept what applications send to the audio outputs, or you install a virtual audio device that applications play through and from which the data is available to you.
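As an illustration of the "standard APIs" route when no Stereo Mix device is exposed, WASAPI offers a loopback mode on the default render device (a different mechanism from Stereo Mix, shown here only as a sketch). Error checks are stripped; link against ole32.

#include <windows.h>
#include <mmdeviceapi.h>
#include <audioclient.h>

// Sketch: capture what is being rendered to the default output device
// via WASAPI shared-mode loopback. The captured data is in the mix format
// returned by GetMixFormat (usually 32-bit float).
int main() {
    CoInitialize(nullptr);

    IMMDeviceEnumerator* enumr = nullptr;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), nullptr, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), reinterpret_cast<void**>(&enumr));

    IMMDevice* device = nullptr;
    enumr->GetDefaultAudioEndpoint(eRender, eConsole, &device);   // the render device

    IAudioClient* client = nullptr;
    device->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr,
                     reinterpret_cast<void**>(&client));

    WAVEFORMATEX* fmt = nullptr;
    client->GetMixFormat(&fmt);
    client->Initialize(AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_LOOPBACK,
                       10000000 /*1 s buffer, in 100 ns units*/, 0, fmt, nullptr);

    IAudioCaptureClient* capture = nullptr;
    client->GetService(__uuidof(IAudioCaptureClient), reinterpret_cast<void**>(&capture));
    client->Start();

    for (int i = 0; i < 100; ++i) {            // grab roughly one second of audio
        Sleep(10);
        UINT32 packet = 0;
        capture->GetNextPacketSize(&packet);
        while (packet != 0) {
            BYTE* data = nullptr;
            UINT32 frames = 0;
            DWORD flags = 0;
            capture->GetBuffer(&data, &frames, &flags, nullptr, nullptr);
            // ... forward 'frames' frames of 'data' to the serial link here ...
            capture->ReleaseBuffer(frames);
            capture->GetNextPacketSize(&packet);
        }
    }

    client->Stop();
    CoTaskMemFree(fmt);
    capture->Release(); client->Release(); device->Release(); enumr->Release();
    CoUninitialize();
    return 0;
}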

How to stream live audio and video while keeping low latency

I'm writing a program similar to StreamMyGame, the difference being that the client is free and, more importantly, open source, so I can port it to other devices (in my case an OpenPandora), or even make an HTML5 or Flash client.
Because the objective of the program is to stream video games, latency should be reduced to a minimum.
Right now I can capture video from Direct3D 9 games at a fixed frame rate, encode it with libx264 and dump it to disk, and send input remotely, but I'm stuck on sending the video, and eventually the audio, over the network.
I don't want to implement an approach only to discover that it introduces several seconds of delay, and I don't care how it is done as long as it works.
Off the top of my head I can think of several ways:
My current way: encode video with libx264 and audio with LAME or as AC-3, and send them with live555 as an RTSP feed, though the library is not playing nice with MSVC and I'm still trying to understand how it works.
Have the ffmpeg library do all the grunt work, encoding and sending (I guess I'll have to look at ffserver to get an idea of how to do it).
Same, but using libvlc, perhaps hurting encoding configurability in the process.
Using several pipes with independent programs (i.e., piping data to x264.exe or ffmpeg.exe).
Use other libraries such as pjsip or JRTPLIB that might simplify the process.
The hard way: sending video and audio through a UDP channel and figuring out how to synchronize everything at the client (though the whole reason to use RTSP is to avoid this).
Your way, if I didn't think of something.
The second option seems best, as it would reduce the number of libraries (integrating swscale, libx264, the audio codec and the sender), simplify development and bring more codec variety (CELT looks promising), but I worry about latency, as it might have a longer pipeline.
100 ms would already be too much, especially when you consider that broadband may add another 150 ms on top of that.
Do any of you have experience with these libraries? Would you recommend switching to ffmpeg, keep wrestling with live555, or something else entirely (even if I didn't mention it)?
I had very good results streaming large blocks of data with low latency using the UDT4 library. But first I would suggest checking ffmpeg's network capabilities, so that you have a single native solution for all operations.
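For reference, the "hard way" UDP option from the question boils down to something like this sketch: prepend your own timestamp to each encoded frame and leave reordering and A/V sync to the client. It uses plain POSIX sockets; names such as sendEncodedFrame and port 9000 are made up for the example, and frames larger than a datagram would still need to be split.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <chrono>
#include <cstdint>
#include <cstring>
#include <vector>

static int sock = -1;
static sockaddr_in dest{};

bool openSender(const char* ip, uint16_t port) {
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) return false;
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);
    return inet_pton(AF_INET, ip, &dest.sin_addr) == 1;
}

// One datagram per (small) encoded frame: 8-byte timestamp header + payload.
void sendEncodedFrame(const uint8_t* data, size_t size) {
    using namespace std::chrono;
    uint64_t ts = duration_cast<milliseconds>(
                      steady_clock::now().time_since_epoch()).count();

    std::vector<uint8_t> packet(sizeof(ts) + size);
    std::memcpy(packet.data(), &ts, sizeof(ts));
    std::memcpy(packet.data() + sizeof(ts), data, size);
    sendto(sock, packet.data(), packet.size(), 0,
           reinterpret_cast<sockaddr*>(&dest), sizeof(dest));
}

int main() {
    if (!openSender("127.0.0.1", 9000)) return 1;
    std::vector<uint8_t> fakeNal(1200, 0);   // placeholder for a real x264 NAL unit
    sendEncodedFrame(fakeNal.data(), fakeNal.size());
    close(sock);
    return 0;
}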

streaming video to and from multiple sources

I wanted to get some ideas on how some of you would approach this problem.
I've got a robot that runs Linux and uses a webcam (with a v4l2 driver) as one of its sensors. I've written a control panel with gtkmm. Both the server and the client are written in C++. The server is the robot, the client is the "control panel". The image analysis happens on the robot, and I'd like to stream the video from the camera back to the control panel for two reasons:
A) for fun
B) to overlay image analysis results
So my question is: what are some good ways to stream video from the webcam to the control panel while still giving priority to the robot code that processes it? I'm not interested in writing my own video compression scheme and pushing it through the existing networking port; a new network port (dedicated to video data) would be best, I think. The second part of the problem is how to display the video in gtkmm. The video data arrives asynchronously and I don't have control over main() in gtkmm, so I think that would be tricky.
I'm open to using things like vlc, gstreamer or any other general compression libraries I don't know about.
thanks!
EDIT:
The robot has a 1 GHz processor and runs a desktop-like version of Linux, but no X11.
GStreamer solves nearly all of this for you with very little effort, and it also integrates nicely with the GLib event system. GStreamer includes V4L source plugins, GTK+ output widgets, various filters to resize / encode / decode the video and, best of all, network sinks and sources to move the data between machines.
For prototyping, you can use the 'gst-launch' tool to assemble video pipelines and test them; it's then fairly simple to create the same pipelines programmatically in your code (see the sketch below). Search for 'GStreamer network streaming' to see examples of people doing this with webcams and the like.
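As a rough idea of what that looks like in code (assuming GStreamer 1.0; the exact pipeline string, host and port below are only an example and depend on the plugins installed on the robot):

#include <gst/gst.h>

int main(int argc, char* argv[]) {
    gst_init(&argc, &argv);

    // Build the same pipeline you would first test with gst-launch:
    // webcam -> JPEG encode -> RTP payload -> UDP to the control panel.
    GError* error = nullptr;
    GstElement* pipeline = gst_parse_launch(
        "v4l2src device=/dev/video0 ! videoconvert ! jpegenc "
        "! rtpjpegpay ! udpsink host=192.168.1.10 port=5000",
        &error);
    if (!pipeline) {
        g_printerr("Failed to build pipeline: %s\n", error->message);
        g_clear_error(&error);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    // Block until an error or end-of-stream message arrives on the bus.
    GstBus* bus = gst_element_get_bus(pipeline);
    GstMessage* msg = gst_bus_timed_pop_filtered(
        bus, GST_CLOCK_TIME_NONE,
        static_cast<GstMessageType>(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));

    if (msg) gst_message_unref(msg);
    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}

On the control panel side a matching udpsrc ! rtpjpegdepay ! jpegdec pipeline can feed a GTK video sink, and inside a gtkmm application you would watch the bus from the GLib main loop instead of blocking as above.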
I'm not sure about the actual technologies used, but this can end up being a huge synchronization headache if you want to avoid dropped frames. I was streaming a video to a file and to the network at the same time. What I eventually ended up doing was using a big circular buffer with three pointers: one write and two read. There were three control threads (plus some additional encoding threads): one writing to the buffer, which would pause if it reached a point in the buffer not yet read by both of the others, and two reader threads that would read from the buffer and write to the file/network (pausing if they got ahead of the producer). Since everything was written and read as frames, the synchronization overhead could be kept to a minimum.
My producer was a transcoder (reading from another file), but in your case you may want the camera to produce whole frames in whatever format it normally uses, and only do the transcoding (with something like ffmpeg) for the server, while the robot processes the image.
Your problem is a bit more complex, though, since the robot needs real-time feedback and therefore can't pause and wait for the streaming server to catch up. So you might want to get frames to the control system as fast as possible and buffer some of them in a separate circular buffer for streaming to the "control panel". Certain codecs handle dropped frames better than others, so if the network falls behind you can start overwriting frames at the end of the buffer (taking care that they're not currently being read).
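A stripped-down sketch of that ring buffer (one producer, two consumers with independent read positions; Frame is a placeholder type and the fixed capacity is arbitrary):

#include <algorithm>
#include <array>
#include <condition_variable>
#include <mutex>
#include <vector>

struct Frame { std::vector<unsigned char> data; };   // placeholder frame type

class FrameRing {
public:
    static constexpr size_t kSize = 64;               // arbitrary capacity

    // Producer: blocks if the slowest reader has not yet freed the target slot.
    void push(Frame f) {
        std::unique_lock<std::mutex> lock(m_);
        notFull_.wait(lock, [this] {
            return write_ - std::min(read_[0], read_[1]) < kSize;
        });
        buf_[write_ % kSize] = std::move(f);
        ++write_;
        notEmpty_.notify_all();
    }

    // Consumer: reader is 0 (file writer) or 1 (network sender).
    Frame pop(int reader) {
        std::unique_lock<std::mutex> lock(m_);
        notEmpty_.wait(lock, [this, reader] { return read_[reader] < write_; });
        Frame f = buf_[read_[reader] % kSize];   // copy: the other reader still needs it
        ++read_[reader];
        notFull_.notify_all();
        return f;
    }

private:
    std::array<Frame, kSize> buf_;
    size_t write_ = 0;
    size_t read_[2] = {0, 0};
    std::mutex m_;
    std::condition_variable notEmpty_, notFull_;
};

Dropping frames when the network reader lags, as suggested above, would mean letting the producer advance the lagging read index instead of blocking.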
When you say 'a new video port' and then start talking about vlc/gstreamer, I'm finding it hard to work out what you want. Obviously these software packages will assist in streaming and compressing via a number of protocols, but clearly you'll need a 'network port', not a 'video port', to send the stream.
If what you really mean is sending the display output over a wireless video/TV feed, that's another matter; however, you'll need advice from hardware experts rather than software experts for that.
Moving on: I've done plenty of streaming over MMS/UDP protocols and vlc handles it very well (as both server and client). However, it's designed for desktops and may not be as lightweight as you want. Something like gstreamer, mencoder or ffmpeg, on the other hand, is going to be better, I think. What kind of CPU does the robot have? You'll need a bit of grunt if you're planning real-time compression.
On the client side I think you'll find a number of widgets to handle video in GTK. I would look into that before worrying about interface details.