Virtual audio mixer on libVLC - C++

I have no experience in audio programming and I want to start with an audio player (C++, Qt, macOS) that plays a multi-channel track on a multi-channel audio card, mixing all input channels onto all output channels.
The things I need from a framework are:
Decode popular audio formats (FLAC) and get PCM streams for each channel
Query OS about currently installed audio cards and their capabilities (channel count)
Actually mix and transfer sound between these entities
VU Meter
Cross-platform, ideally
From what I've learned, VLC is a powerful media framework, but I haven't found information on whether it is sufficient for my task, nor any good tutorial about it.
Alternatively, I'm considering Phonon (the default media framework in Qt) or Apple's Core Audio API. Which suits this task better? Are there any good tutorials on audio programming in general, and on VLC and the other frameworks in particular?
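For illustration, the device-query requirement above needs no media framework at all; a minimal sketch using Qt 5's QAudioDeviceInfo (QtMultimedia module, i.e. QT += multimedia in the .pro file) - nothing in it is libVLC-specific:

    #include <QCoreApplication>
    #include <QAudioDeviceInfo>
    #include <QDebug>

    int main(int argc, char *argv[])
    {
        QCoreApplication app(argc, argv);

        // Ask the OS for its output devices and their channel capabilities.
        const QList<QAudioDeviceInfo> devices =
            QAudioDeviceInfo::availableDevices(QAudio::AudioOutput);
        for (const QAudioDeviceInfo &dev : devices) {
            qDebug() << dev.deviceName()
                     << "supported channel counts:" << dev.supportedChannelCounts();
        }
        return 0;
    }

supportedChannelCounts() is what tells you whether a card can take, say, a 6-channel stream; the decoding and mixing requirements are where the choice of media framework actually matters.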

Related

UWP Hardware Video Decoding - DirectX12 vs Media Foundation

I would like to use DirectX 12 to load each frame of an H264 file into a texture and render it. There is, however, little to no information on doing this, and Microsoft's website has only limited, superficial documentation.
Media Foundation has plenty of examples and offers Hardware Enabled decoding. Is the Media Foundation a wrapper around DirectX or is it doing something else?
If not, how much less optimised would the Media Foundation equivalent be in comparison to a DX 12 approach?
Essentially, what are the big differences between Media Foundation and DirectX12 Video Decoding?
I am already using DirectX 12 in my engine so this is specifically regarding DX12.
Thanks in advance.
Hardware video decoding comes from the DXVA (DXVA2) API. Its DirectX 11 evolution is the D3D11 Video Device part of the D3D11 API. Microsoft provides wrappers over hardware-accelerated decoders in the form of Media Foundation primitives, such as the H.264 Video Decoder. This decoder offers the use of hardware decoding capabilities as well as a fallback to software decoding.
Note that even though Media Foundation is available for UWP development, your options are limited and you are not offered primitives like the transform mentioned above directly. However, if you use higher-level APIs (the Media Foundation Source Reader API in particular), you can leverage hardware-accelerated video decoding in your UWP application.
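To make that path concrete, here is a minimal sketch of opening a file with the Source Reader while opting in to hardware transforms (error handling and MFStartup/MFShutdown elided; for GPU-resident output you would additionally supply a device manager via MF_SOURCE_READER_D3D_MANAGER):

    #include <mfapi.h>
    #include <mfidl.h>
    #include <mfreadwrite.h>

    // Create a Source Reader that prefers hardware MFTs (e.g. the GPU
    // H.264 decoder) when one is available.
    HRESULT CreateHardwareReader(const wchar_t *url, IMFSourceReader **reader)
    {
        IMFAttributes *attrs = nullptr;
        HRESULT hr = MFCreateAttributes(&attrs, 1);
        if (FAILED(hr))
            return hr;
        attrs->SetUINT32(MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, TRUE);
        hr = MFCreateSourceReaderFromURL(url, attrs, reader);
        attrs->Release();
        return hr;
    }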
Media Foundation offers interoperability with Direct3D 11, in particular for video encoding/decoding, but not with Direct3D 12. You will not be able to use Media Foundation and DirectX 12 together out of the box; you will have to implement Direct3D 11/12 interop to transfer the data between the APIs (or, where applicable, use shared access to the same GPU data).
Alternatively, you will have to step down to the underlying ID3D12VideoDevice::CreateVideoDecoder, which is a further evolution of the DXVA2 and Direct3D 11 video decoding APIs mentioned above, with similar usage.
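For orientation, a sketch of what that step-down looks like, assuming you already have an ID3D12Device; this only creates the decoder object - a real implementation also needs a video decode command queue, a decoder heap, reference frame management and bitstream submission:

    #include <d3d12.h>
    #include <d3d12video.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Create a bare H.264 decoder on the video device interface of an
    // existing D3D12 device (error handling elided).
    ComPtr<ID3D12VideoDecoder> CreateH264Decoder(ID3D12Device *device)
    {
        ComPtr<ID3D12VideoDevice> videoDevice;
        device->QueryInterface(IID_PPV_ARGS(&videoDevice));

        D3D12_VIDEO_DECODER_DESC desc = {};
        desc.Configuration.DecodeProfile = D3D12_VIDEO_DECODE_PROFILE_H264;
        desc.Configuration.BitstreamEncryption = D3D12_BITSTREAM_ENCRYPTION_TYPE_NONE;
        desc.Configuration.InterlaceType = D3D12_VIDEO_FRAME_CODED_INTERLACE_TYPE_NONE;

        ComPtr<ID3D12VideoDecoder> decoder;
        videoDevice->CreateVideoDecoder(&desc, IID_PPV_ARGS(&decoder));
        return decoder;
    }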
Unfortunately, while Media Foundation is notorious for poor documentation and a hard start to development, Direct3D 12 video decoding has zero information available, and you will have to enjoy the feeling of being a pioneer.
Either way, all of the above are relatively thin wrappers over the same hardware-assisted video decoding implementation, with the same great performance. I would recommend taking the Media Foundation path and implementing 11/12 interop if/when it becomes necessary.
You will get a lot of D3D12 errors caused by Media Foundation if you pass a D3D12 device to IMFDXGIDeviceManager::ResetDevice.
The errors can be avoided if you call IMFSourceReader::ReadSample slowly; it does not matter whether you use the method in sync or async mode. How slowly depends on the machine running the program. I use ::Sleep(1) between ReadSample calls when playing a stream from the network in sync mode, and ::Sleep(3) when playing a local MP4 file on my machine.
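A sketch of that sync-mode workaround (the Sleep values are the figures from above and will need tuning per machine):

    #include <windows.h>
    #include <mfapi.h>
    #include <mfreadwrite.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Drain the first video stream in sync mode, throttling each call.
    void ReadLoop(IMFSourceReader *reader)
    {
        for (;;) {
            DWORD streamIndex = 0, flags = 0;
            LONGLONG timestamp = 0;
            ComPtr<IMFSample> sample;
            HRESULT hr = reader->ReadSample(
                (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, 0,
                &streamIndex, &flags, &timestamp, &sample);
            if (FAILED(hr) || (flags & MF_SOURCE_READERF_ENDOFSTREAM))
                break;
            // ... hand `sample` to the renderer ...
            ::Sleep(1); // 1 for a network stream, 3 for a local MP4 file
        }
    }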
Don't ask who I am. My name is 'the pioneer'.

Using DirectShow with Direct2D

I have a Windows-only Direct2D application and would like to implement a video playback system for cutscenes. The files are MP4, but the format can be changed if need be.
It seems like DirectShow is the advised way to render video/audio on Windows.
Now how do I get DirectShow to render the video frames to my Direct2D render target?
The VMR-9 filter looks like the best route, but I can't seem to find an elegant way of integrating it into my application.
There is no Direct2D/DirectShow interoperability layer in Windows. To fit these two technologies together you would have to copy data between the APIs in a rather inefficient way, and developing that glue will still take some time.
With H.264/HEVC MP4 video files you would be better off using Media Foundation to read and decode frames, then loading them into Direct2D bitmaps and displaying those in your application. Performance-wise, it is possible to transfer video frames into Direct2D bitmaps via the GPU at reasonable cost and with reasonable development effort; even if you take a shortcut and do the integration roughly and inefficiently, it will still be on par with DirectShow.
I recommend starting by reading and decoding video frames with the Media Foundation Source Reader API. Once you are familiar with fitting the technologies together, you can take the next step and optimize the transfer using GPU capacity and interop between Direct3D and Direct2D.
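As a sketch of the Direct2D end of that pipeline - assuming the Source Reader has been configured to output MFVideoFormat_RGB32 and the decoded frame has been copied to system memory - creating a bitmap from the frame is a single call (RGB32 corresponds to DXGI_FORMAT_B8G8R8A8_UNORM on the Direct2D side):

    #include <d2d1.h>
    #include <d2d1helper.h>

    // Wrap one decoded RGB32 frame in a Direct2D bitmap for drawing.
    HRESULT FrameToBitmap(ID2D1RenderTarget *rt, const BYTE *pixels,
                          UINT width, UINT height, UINT stride,
                          ID2D1Bitmap **bitmap)
    {
        D2D1_BITMAP_PROPERTIES props = D2D1::BitmapProperties(
            D2D1::PixelFormat(DXGI_FORMAT_B8G8R8A8_UNORM,
                              D2D1_ALPHA_MODE_IGNORE));
        return rt->CreateBitmap(D2D1::SizeU(width, height),
                                pixels, stride, props, bitmap);
    }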

C++ library for decoding MKV video (Dolby Digital / AVC)

Before you say it: I know that a free decoder for Dolby Digital audio cannot legally be offered on the market due to licensing restrictions. But I only want to decode MKV video on my iPhone for study purposes.
Which libraries should I use for the audio and video? It's important to get the channels of the 5.1 audio separated.
FFmpeg (https://ffmpeg.org) should do it, and it should build on iOS.
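One reason channel separation comes almost for free with FFmpeg: decoded AC-3 typically arrives as planar float (AV_SAMPLE_FMT_FLTP), so each of the 5.1 channels is already its own plane on the AVFrame. A sketch against the classic libavutil API (newer FFmpeg versions move the channel count into ch_layout):

    extern "C" {
    #include <libavutil/frame.h>
    #include <libavutil/samplefmt.h>
    }

    // Walk a decoded frame channel by channel. Plane order follows the
    // channel layout (front left, front right, centre, LFE, surrounds).
    void forEachChannel(const AVFrame *frame)
    {
        if (av_sample_fmt_is_planar((AVSampleFormat)frame->format)) {
            for (int ch = 0; ch < frame->channels; ++ch) {
                const float *pcm = (const float *)frame->extended_data[ch];
                // pcm[0 .. frame->nb_samples - 1] is this channel alone
                (void)pcm;
            }
        }
    }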

Getting Kinect v2 to work as a microphone in Ubuntu 14.04?

I'm using a Kinect v2 on Ubuntu 14.04 and trying to find a way to use it as a microphone, with C++ as the programming language. I already have a C++/Qt application that redirects audio streams from different audio input devices to various audio output devices.
That application can list the available audio input devices on the PC. It already lists Xbox NUI Sensor Analog 4-channel Input as one of them, as does the Ubuntu sound settings application. I have also checked whether the Kinect input audio device supports an audio format with a 44100 Hz sampling rate, a 16-bit sample size, the audio/pcm codec and 2 channels, and apparently it does.
The problem is that, in my application, I cannot hear any sound on the output when I use the Kinect as the microphone, whereas with other audio input devices I can hear the sound just fine. I'm not sure what the solution could be.
I didn't find much about the microphone on the libfreenect2 pages either. I know that libfreenect2 lists "audio transfer" as one of its missing features, but on the other hand the documentation's Issues and Future Work section also says:
Audio. There is basic access to Kinect v2's audio via ALSA (Linux). However, this is directional audio with intricate calibration, which is probably beyond the scope of this image processing library.
Does this mean that it is still possible to access the Kinect v2 audio stream via ALSA on Ubuntu 14.04, or at most only that the Ubuntu system can detect the Kinect v2 as an audio input device without it being usable for actual recording?
If it is the first case, could you suggest how I can get access to the audio stream of the Kinect microphone (I cannot find anything regarding the audio or the microphone in any of the libfreenect2 docs)? Do you know any other way to get the Kinect v2 microphone running, apart from libfreenect2?
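For what it's worth, if ALSA does expose the Kinect as a regular capture device, recording from it should look like recording from anything else. A minimal capture sketch - "default" is a placeholder; check arecord -l for the Kinect's actual card/device name (something like "hw:2,0"):

    #include <alsa/asoundlib.h>
    #include <vector>

    int main()
    {
        snd_pcm_t *pcm = nullptr;
        if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_CAPTURE, 0) < 0)
            return 1;

        // 16-bit, 2-channel, 44100 Hz - the format checked in the question.
        if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                               SND_PCM_ACCESS_RW_INTERLEAVED,
                               2, 44100, 1 /* soft resample */,
                               500000 /* 0.5 s latency */) < 0)
            return 1;

        std::vector<short> buf(4410 * 2);               // 0.1 s of stereo
        snd_pcm_sframes_t n = snd_pcm_readi(pcm, buf.data(), 4410);
        (void)n; // feed these frames into the existing routing pipeline

        snd_pcm_close(pcm);
        return 0;
    }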

Rendering Video in OpenGL

Is there a good solution for playing a compressed video in OpenGL?
It needs to
Be cross-platform (Windows and Mac OS X)
Render to a texture (preferably but not 100% needed)
Cost less than Bink
Any ideas?
Qt can be used to render widgets (including a video player) in an OpenGL scene. It has a multimedia framework called Phonon that can play video and audio.
See this demo video.
Qt is cross-platform and is now licensed under LGPL.
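For reference, playback through Phonon is only a few lines (Qt 4-era API; the file name is a placeholder):

    #include <QApplication>
    #include <Phonon/VideoPlayer>
    #include <Phonon/MediaSource>

    int main(int argc, char *argv[])
    {
        QApplication app(argc, argv);

        // The VideoPlayer widget can be embedded in a QGraphicsScene
        // that is rendered through an OpenGL viewport.
        Phonon::VideoPlayer player(Phonon::VideoCategory);
        player.play(Phonon::MediaSource("cutscene.ogv")); // placeholder file
        player.show();

        return app.exec();
    }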
I recommend the Theora video format.
Here are the benefits:
Totally open, free and patent-unencumbered specification
Free working library implementation (encoder/decoder) and source-code examples, available under a BSD-style license
Not too shabby documentation.
Portable
The decoder outputs Y'CbCr planes, which can easily be uploaded as OpenGL textures and converted to R'G'B' in a fragment shader via samplers.
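A sketch of the upload step: each decoded plane (as returned by th_decode_ycbcr_out) goes into its own single-channel texture, and the shader recombines Y, Cb and Cr:

    #include <GL/gl.h>

    // Upload one Y'CbCr plane as a single-channel texture; call once per
    // plane (Y, Cb, Cr) and combine them in the fragment shader.
    void uploadPlane(GLuint tex, int width, int height, int stride,
                     const unsigned char *plane)
    {
        glBindTexture(GL_TEXTURE_2D, tex);
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
        glPixelStorei(GL_UNPACK_ROW_LENGTH, stride); // bytes == pixels for 8-bit planes
        glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, width, height, 0,
                     GL_LUMINANCE, GL_UNSIGNED_BYTE, plane);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    }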
If by "solution" you mean something you can build/code yourself, I can suggest QuickTime (easy on Mac with Cocoa, strange on Windows, but it works), or you can check out the mplayer/VLC sources and try to integrate those. There are a lot of demos about this on the web.
Since you need cross-platform, I guess GStreamer, Video4Linux and DirectShow are nothing for you. But there are video players that support different backends on different platforms - like openFrameworks.