Open-source or low-cost cross-platform video codec library that can be used for commercial purposes and supports RGBA format - c++

I'm looking for a way to display videos in a 2D game. The videos need to support an alpha channel so they can be overlaid on top of the other game elements.
Currently I just have a series of PNG files which are decompressed and then flipped through for the animation. This works, but it is a massive memory hog; a 1024x1024 animation that is 5 seconds long at 24 frames per second takes up well over 400MB. And I'm targeting embedded systems, so this is really not good.
I've been looking for video codecs that can support these requirements, yet so far all the ones I've found that support RGBA are licensed under the GPL, so we can't use them in a commercial product.
Any such beast(s) out there?

Most codecs don't support an alpha channel - the only one I can think of is the QuickTime animation codec, which isn't very popular.
If you only need a binary alpha channel (transparent or not), then setting the top bit of one of the color channels is a common approach.
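As a rough illustration of that trick, with an illustrative packing convention (the choice of channel and of what the flag means is up to you, not a standard), you could reserve the top bit of the red channel as the opacity flag and accept losing one bit of red precision:

```cpp
#include <cstdint>

// Sketch of the "steal one bit from a color channel" idea; the convention
// (red's top bit set = opaque) is just an example.
inline uint32_t packRGBOpaque(uint8_t r, uint8_t g, uint8_t b, bool opaque) {
    uint8_t rFlagged = opaque ? (r | 0x80) : (r & 0x7F);   // 1 bit of red lost
    return (uint32_t(rFlagged) << 16) | (uint32_t(g) << 8) | b;
}

inline bool pixelIsOpaque(uint32_t rgb) {
    return (rgb & 0x800000u) != 0;   // test the flag after decoding
}
```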
If these are animation-type frames then something like MJPEG might work well, and there are lots of LGPL-licensed MJPEG libraries.

Related

Is there a direct way to render/encode Vulkan output as an ffmpeg video file?

I'm about to generate 2D and 3D music animations and render them to video using C++. I was thinking about using OpenGL, but I've read that, unfortunately, it is being discontinued in favour of Vulkan, which seems to offer higher performance using a GPU, but is also a lower-level API, making it more difficult to learn. I still have almost no knowledge of either OpenGL or Vulkan; I'm just beginning to learn them now.
My question is:
is there a way to encode the Vulkan render output (whether or not a window is shown) into a video file, preferably through FFmpeg? If so, how could I do that?
Requisites:
Speed: the performance cost should be roughly that of encoding the video alone, not much more than that (as it would be if, for example, lossless frames had to be saved as images first and a video encoded from them afterwards).
Controllable FPS and resolution: the video fps and frame resolution can be freely chosen.
Reliability, reproducibility: running code that gives the same Vulkan output twice should result in two identical videos, independently of the system, i.e. no dropped frames, no async problems (I want to sync with audio), or anything of the sort. The chosen video fps should stay fixed (e.g. 60 fps), no matter whether the computer can render at 300 or 3 fps.
What I found out so far:
An example of taking "screenshots" from Vulkan output: it writes to a ppm image at the end, which is a binary uncompressed image file.
An encoder for rendering videos from OpenGL output, which is what I want, but using OpenGL in that case.
That Khronos includes a video subset in the Vulkan API.
A video tool to decode, demux, process videos using FFMPEG and Vulkan.
That it is possible to render the output into a buffer without needing a screen to display it.
First of all, ffmpeg is a framework used for video encoding and decoding. Second, if you have no experience with any of the GPU rendering APIs you should start with OpenGL. Vulkan is very low-level and complicated. OpenGL will be here for a very long time and will not be immediately replaced by Vulkan.
The off-screen rendering option you mentioned is probably the best one. It doesn't really matter though; you can also use the image from the framebuffer. The image is just a matrix of RGBA pixels, and that data is the input for the video encoding. Please take a look at how ffmpeg works: you send the rendered frame data to the encoder, which produces video packets that are then stored in a video file. You need to choose a container (mp4, mkv, avi, ...) and a video format (h265, av1, vp9, ...). You can of course implement a frame limiter and render the scene at a constant framerate, or just pick frames at a constant timestep.
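For illustration, here is a minimal sketch of that frame-to-encoder flow with libavcodec/libswscale, assuming H.264 output written as a raw .h264 stream rather than a proper container; the resolution, frame rate and loop count are placeholders, and error handling plus muxing with libavformat are omitted:

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>
}
#include <cstdio>
#include <cstdint>
#include <vector>

int main() {
    const int W = 1920, H = 1080, FPS = 60;

    // Encoder setup: H.264 with a fixed 60 fps timebase.
    const AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_H264);
    AVCodecContext* ctx = avcodec_alloc_context3(codec);
    ctx->width = W;  ctx->height = H;
    ctx->time_base = AVRational{1, FPS};
    ctx->framerate = AVRational{FPS, 1};
    ctx->pix_fmt = AV_PIX_FMT_YUV420P;
    avcodec_open2(ctx, codec, nullptr);

    // Converts rendered RGBA pixels to the encoder's YUV420P format.
    SwsContext* sws = sws_getContext(W, H, AV_PIX_FMT_RGBA,
                                     W, H, AV_PIX_FMT_YUV420P,
                                     SWS_BILINEAR, nullptr, nullptr, nullptr);

    AVFrame* yuv = av_frame_alloc();
    yuv->format = AV_PIX_FMT_YUV420P;
    yuv->width = W;  yuv->height = H;
    av_frame_get_buffer(yuv, 0);
    AVPacket* pkt = av_packet_alloc();

    std::FILE* out = std::fopen("out.h264", "wb"); // raw stream; mux for mp4/mkv
    std::vector<uint8_t> rgba(size_t(W) * H * 4);

    for (int i = 0; i < FPS * 10; ++i) {           // e.g. 10 seconds of video
        // ... render the scene and copy the framebuffer into 'rgba' here ...
        av_frame_make_writable(yuv);
        const uint8_t* src[1]  = { rgba.data() };
        const int srcStride[1] = { 4 * W };
        sws_scale(sws, src, srcStride, 0, H, yuv->data, yuv->linesize);
        yuv->pts = i;                              // exact timestamps => fixed fps

        avcodec_send_frame(ctx, yuv);
        while (avcodec_receive_packet(ctx, pkt) == 0) {
            std::fwrite(pkt->data, 1, pkt->size, out);
            av_packet_unref(pkt);
        }
    }
    // A real program would flush the encoder and free all resources here.
    std::fclose(out);
    return 0;
}
```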
The performance problem happens when you transfer data between RAM and GPU memory, for example when downloading the rendered image from the buffer and passing it to a CPU encoder. Therefore, the optimal approach would be with Vulkan, using the new video extension and feeding the rendered frames directly to the HW-accelerated encoder without any transfers out of GPU memory. You can also run the encoder in a different thread to make it work asynchronously.
But honestly, it's not trivial. The simplest (non-realtime) way for you to create a video from a 3D render would be to:
Create a fixed FPS game loop
Make screenshots of the scene by downloading the framebuffer data in OGL or Vulkan
Process the frames with the ffmpeg binary to create a video file
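A rough sketch of that last step, assuming the ffmpeg binary is on the PATH and the frames come back from the framebuffer as raw RGBA (the resolution, frame rate and frame count are placeholders and must match the command-line flags):

```cpp
#include <cstdio>
#include <vector>

int main() {
    const int W = 1024, H = 1024, FRAMES = 300;   // must match -s and -r below

    // Pipe raw RGBA frames into ffmpeg over stdin (use _popen/_pclose on Windows).
    // -vf vflip compensates for OpenGL's bottom-up framebuffer readback.
    FILE* ff = popen(
        "ffmpeg -y -f rawvideo -pix_fmt rgba -s 1024x1024 -r 60 -i - "
        "-vf vflip -c:v libx264 -pix_fmt yuv420p out.mp4", "w");

    std::vector<unsigned char> rgba(size_t(W) * H * 4);
    for (int i = 0; i < FRAMES; ++i) {
        // ... render, then read the frame back, e.g.
        // glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, rgba.data());
        std::fwrite(rgba.data(), 1, rgba.size(), ff);
    }
    pclose(ff);
    return 0;
}
```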
Another hack would be to use screen recording software (OBS, Fraps, etc.) to create the video from your 3D app.

Using DirectShow with Direct2D

I have a windows only Direct2D application and would like to implement a video playback system for cutscenes. These files are mp4 but the format can be changed, if need be.
It seems like DirectShow is the advised way to render video/audio on windows.
Now how do I let DirectShow render the video frames to my Direct2D render target?
The VMR-9 filter looks like the best route, but I can't seem to find an elegant way of integrating it into my application.
There is no Direct2D/DirectShow interoperability layer in Windows. To fit these two technologies together you would have to copy data between the APIs in a rather inefficient way (and developing that glue will still take some time).
With H.264/HEVC MP4 video files you would be better off using Media Foundation to read and decode frames, then load them into Direct2D bitmaps and display them in your application. Performance-wise, it is possible to transfer video frames to Direct2D bitmaps via the GPU at reasonable cost and with reasonable development effort, but even if you take a shortcut and do the integration roughly and inefficiently, it will be on par with DirectShow.
I recommend starting by reading and decoding video frames with the Media Foundation Source Reader API. Once you are familiar with fitting the technologies together, you can take the next step and optimize the transfer using the GPU and the Direct3D/Direct2D interop.
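A minimal sketch of that first step with the Source Reader, asking it to decode to RGB32 so each frame can be copied into an ID2D1Bitmap (error handling mostly omitted; the function and variable names are illustrative):

```cpp
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")

void ReadFrames(const wchar_t* path)
{
    CoInitializeEx(nullptr, COINIT_MULTITHREADED);
    MFStartup(MF_VERSION);

    IMFSourceReader* reader = nullptr;
    MFCreateSourceReaderFromURL(path, nullptr, &reader);

    // Ask the reader to decode the video stream to uncompressed RGB32.
    IMFMediaType* type = nullptr;
    MFCreateMediaType(&type);
    type->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
    type->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
    reader->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, type);
    type->Release();

    for (;;) {
        DWORD flags = 0;
        LONGLONG pts = 0;
        IMFSample* sample = nullptr;
        reader->ReadSample(MF_SOURCE_READER_FIRST_VIDEO_STREAM,
                           0, nullptr, &flags, &pts, &sample);
        if (flags & MF_SOURCE_READERF_ENDOFSTREAM)
            break;
        if (!sample)
            continue;   // stream ticks/gaps deliver no sample

        IMFMediaBuffer* buffer = nullptr;
        sample->ConvertToContiguousBuffer(&buffer);
        BYTE* data = nullptr;
        DWORD length = 0;
        buffer->Lock(&data, nullptr, &length);
        // ... copy 'data' into an ID2D1Bitmap (e.g. via CopyFromMemory)
        //     and draw it on the Direct2D render target, paced by 'pts' ...
        buffer->Unlock();
        buffer->Release();
        sample->Release();
    }

    reader->Release();
    MFShutdown();
    CoUninitialize();
}
```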

custom opengl compressed texture

I am developing mobile games on Windows. Our image resources are in the PVR-TC 4 format. When we run our game in the simulator, the images are decoded by the CPU, which is really slow, as our PC graphics card doesn't support decoding them on the GPU. Is it possible to make PC OpenGL support PVR-TC or ETC hardware decoding?
You cannot force an implementation to implement a particular extension or image format.
Your best bet is to convert the images yourself offline. That is, instead of loading images of a format your hardware can't handle, load images of the format that it can.
After all, it's not like the images are originally in PVRTC format, right? They were originally authored in a regular format like PNG or whatever, then converted to PVRTC. So just add another conversion for S3TC or whatever format desktop hardware actually supports.
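For instance, if the offline pipeline emits DXT5 (S3TC) instead of PVR-TC, uploading it in the desktop build might look roughly like this (assuming a loader such as GLEW and that the driver exposes GL_EXT_texture_compression_s3tc):

```cpp
#include <GL/glew.h>   // or any loader that exposes the S3TC enums

// Sketch: upload one pre-converted DXT5 (S3TC) mip level instead of PVR-TC.
// Assumes the image was converted offline with a texture tool.
GLuint UploadDXT5(const unsigned char* data, int width, int height)
{
    if (!GLEW_EXT_texture_compression_s3tc)
        return 0;  // fall back to an uncompressed RGBA upload

    // DXT5 stores 16 bytes per 4x4 block.
    GLsizei size = ((width + 3) / 4) * ((height + 3) / 4) * 16;

    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glCompressedTexImage2D(GL_TEXTURE_2D, 0, GL_COMPRESSED_RGBA_S3TC_DXT5_EXT,
                           width, height, 0, size, data);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    return tex;
}
```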

Windows Enhanced Video Renderer (EVR): Layer multiple 1080p Videos with transparency?

I am looking for ways to layer multiple 1080p videos with transparency on Windows in C++ and DirectX or OpenGL. The videos will start at different moments in time. Ideally the videos can be blended with another render target containing other game content, so the resulting video texture should contain transparent pixels.
Can this be done with EVR and hardware acceleration? Which codecs are supported? http://en.wikipedia.org/wiki/Media_Foundation mentions transparency, but does not answer my questions. It sounds as if all the videos have to start at the same time and the resulting video texture has no transparency.
TIA
Christoph
These are my research results from around 03/14, with no definitive answer to this problem.
I did not try the possibility mentioned in Media Foundation, since it sounded as if the result would have no transparency.
I was able to use a second grayscale video to mask the RGB video inside a shader. This can be done with a separate video stream, but syncing is needed. Alternatively, it is possible to encode a video with the two frames side by side, but many HW-accelerated video codecs do not allow this, WMF being the exception. Performance is not great, but I was able to play three 1080p30 videos simultaneously.
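As an illustration of the side-by-side variant, a fragment shader (here GLSL embedded in a C++ string; the same idea works in HLSL) might reconstruct the alpha like this; the uniform and varying names are made up for the sketch:

```cpp
// Left half of each decoded frame holds the RGB, right half holds the
// grayscale alpha mask; the shader samples both and recombines them.
static const char* kMaskedVideoFragmentShader = R"(
#version 330 core
uniform sampler2D uVideoFrame;   // decoded side-by-side video frame
in  vec2 vUV;                    // 0..1 across the on-screen quad
out vec4 fragColor;
void main() {
    vec2 colorUV = vec2(vUV.x * 0.5,       vUV.y);   // left half: color
    vec2 alphaUV = vec2(vUV.x * 0.5 + 0.5, vUV.y);   // right half: mask
    vec3  rgb   = texture(uVideoFrame, colorUV).rgb;
    float alpha = texture(uVideoFrame, alphaUV).r;
    fragColor = vec4(rgb, alpha);   // blend over the other game content
}
)";
```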
On a side note, to my surprise Flash was able to play 5+ 1080p30 videos with transparency simultaneously. The Flash video codec allows alpha values, but I only managed to use them inside Flash.

Displaying a video in DirectX

What is the best/easiest way to display a video (with sound!) in an application using XAudio2 and Direct3D9/10?
At the very least it needs to be able to stream potentially large videos, and take care of the fact that the window's aspect ratio may differ from the video's (e.g. by adding letterboxing), although ideally I'd like the ability to embed the video into a 3D scene.
I could of course work out a way to load each frame into a texture, discarding/reusing the textures once rendered, and play the audio separately through XAudio2. However, as well as writing a loader for at least one format, I'd also have to deal with things like synchronising the video and audio components, so hopefully there is an easier solution available, or even a ready-made free one with a suitable license (commercial distribution in binary form; dynamic linking is fine in the case of, say, the LGPL).
In the Windows SDK, there is a DirectShow example for rendering video to texture. It handles audio output too.
But there are limitations and I can't honestly call it easy.
Have you looked at Bink Video? It's what lots of games use for video playback. It works great and you don't have to code all that video stuff yourself from scratch.