How to overlay direct3d in directshow - c++

I am looking for a tutorial or documentation on how to overlay direct3d on top of a video (webcam) feed in directshow.
I want to provide a virtual web cam (a virtual device that looks like a web cam to the system (ie. so that it be used where ever a normal webcam could be used like IM video chats)
I want to capture a video feed from a webcam attached to the computer.
I want to overlay a 3d model on top of the video feed and provide that as the output.
I had planned on doing this in directshow only because it looked possible to do this in it. If you have any ideas about possible alternatives, I am all ears.
I am writing c++ using visual studio 2008.

Use the Video Mixing Renderer Filter to render the video to a texture, then render it to the scene as a full screen quad. After that you can render the rest of the 3D stuff on top and then present the scene.

Are you after a filter that sits somewhere in the graph that renders D3D stuff over the video?
If so then you need to look at deriving a filter from CTransformFilter. Something like the EZRGB example will give you something to work from. Basically once you have this sorted your filter needs to do the Direct 3D rendering and, literally, insert the resulting image into the direct show stream. Alas you can't render the Direct3D directly to a direct show video frame so you will have to do your rendering then lock the front/back buffer and copy the 3D data out and into the direct show stream. This isn't ideal as it WILL be quite slow (compared to standard D3D rendering) but its the best you can do, to my knowledge.
Edit: In light of your update what you want is quite complicated. You need to create a source filter (You should look at the CPushSource example) to begin with. Once you've done that you will need to register it as a video capture source. Basically you need to do this by using the IFilterMapper2::RegisterFilter call in your DLLRegisterServer function and pass in a class ID of "CLSID_VideoInputDeviceCategory". Adding the Direct3D will be as I stated above.
All round you want to spend as much time reading through the DirectShow samples in the windows SDK and start modifying them to do what YOU want them to do.

Related

How to decode a video file straight to a Direct3D11 texture using Windows Media Foundation?

I'd like to decode the contents of a video file to a Direct3D11 texture and avoid the copies back and forth to CPU memory. Ideally, the library will play the audio itself and call back into my code whenever a video frame has been decoded.
On the surface, the Windows Media Foundation's IMFPMediaPlayer (ie MFPCreateMediaPlayer() and IMFPMediaPlayer::CreateMediaItemFromURL()) seem like a good match, except that the player decodes straight to the app's HWND. The documentation implies that I can add a custom video sink, but I have not been able to find documentation nor sample code on how to do that. Please point me in the right direction.
Currently, I am using libVLC to accomplish the above, but it only provides the video surface in CPU memory, which can become a bottleneck for my use-case.
Thanks.
Take a look at this source code from my project 'Stackoverflow' : MFVideoEVR
This program shows how to setup EVR (enhanced video renderer), and how to provide video samples to it, using a Source Reader.
The key is to provide video samples, so you can use them for your purpose.
This program provides samples through IMFVideoSampleAllocator. It is for DirectX9 texture. You need to change source code, and to use IMFVideoSampleAllocatorEx, instead : IMFVideoSampleAllocatorEx
About MFCreateVideoSampleAllocatorEx :
This function creates an allocator for DXGI video surfaces. The buffers created by this allocator expose the IMFDXGIBuffer interface.
So to retreive texture : IMFDXGIBuffer::GetResource
You can use this method to get a pointer to the ID3D11Texture2D interface of the surface. If the buffer is locked, the method returns MF_E_INVALIDREQUEST.
You will also have to manage sound through IMFSourceReader.
With this approach, there is no copy back to system memory.
PS : You don't talk about video format (h265, h264, mpeg2, others ??). MediaFoundation doesn't handle all video format, natively.

show tracked object in Video using OpenGL

I am extending an existing OpenGL project with new functionality.
I can play a video stream using OpenGL with FFMPEG.
Some objects are moving in the video stream. Co-ordinates of those objects are know to me.
I need to show tracking of motion for that object, like continuously drawing a point or rectangle around the object as it moves on the screen.
Any idea how to start with it?
Are you sure you want to use OpenGL for this task? Usually for computer vision algorithms, like motion tracking one uses OpenCV. In this case you could simply use the drawing functions of OpenCV as documented here.
If you are using OpenGL you might have a look at this question because in this case I guess you draw the frames as textures.

AVFoundation on OSX: OpenGL texture from video WITHOUT needing access to pixel data

I've read a lot of posts describing how people use AVAssetReader or AVPlayerItemVideoOutput to get video frames as raw pixel data from a video file, which they then use to upload to an OpenGL texture. However, this seems to create the needless step of decoding the video frames with the CPU (as opposed to the graphics card), as well as creating unnecessary copies of the pixel data.
Is there a way to let AVFoundation own all aspects of the video playback process, but somehow also provide access to an OpenGL texture ID it created, which can just be drawn into an OpenGL context as necessary? Has anyone come across anything like this?
In other words, something like this pseudo code:
initialization:
open movie file, providing an opengl context;
get opengl texture id;
every opengl loop:
draw texture id;
If you were to use the Video Decode Acceleration Framework on OS X, it will give you a CVImageBufferRef when you "display" decoded frames, which you can call CVOpenGLTextureGetName (...) on to use as a native texture handle in OpenGL software.
This of course is lower level than your question, but it is definitely possible for certain video formats. This is the only technique that I have personal experience with. However, I believe QTMovie also has similar functionality at a much higher level, and would likely provide the full range of features you are looking for.
I wish I could comment on AVFoundation, but I have not done any development work on OS X since 10.6. I imagine the process ought to be similar though, it should be layered on top of CoreVideo.

Cinder: How to get a pointer to data\frame generated but never shown on screen?

There is that grate lib I want to use called libCinder, I looked thru its docs but do not get if it is possible and how to render something with out showing it first?
Say we want to create a simple random color 640x480 canvas with 3 red white blue circles on it, and get RGB\HSL\any char * to raw image data out of it with out ever showing any windows to user. (say we have console application project type). I want to use such feaure for server side live video stream generation and for video streaming I would prefer to use ffmpeg so that is why I want a pointer to some RGB\HSV or what ever buffer with actuall image data. How to do such thing with libCInder?
You will have to use off-screen rendering. libcinder seems to be just a wrapper for OpenGL, as far as graphics go, so you can use OpenGL code to achieve this.
Since OpenGL does not have a native mechanism for off-screen rendering, you'll have to use an extension. A tutorial for using such an extension, called Framebuffer Rendering, can be found here. You will have to modify renderer.cpp to use this extension's commands.
An alternative to using such an extension is to use Mesa 3D, which is an open-source implementation of OpenGL. Mesa has a software rendering engine which allows it to render into memory without using a video card. This means you don't need a video card, but on the other hand the rendering might be slow. Mesa has an example of rendering to a memory buffer at src/osdemos/ in the Demos zip file. This solution will probably require you to write a complete Renderer class, similar to Renderer2d and RendererGl which will use Mesa's intrusctions instead of Windows's or Mac's.

Combining Direct3D, Axis to make multiple IP camera GUI

Right now, what I'm trying to do is to make a new GUI, essentially a software using directX (more exact, direct3D), that display streaming images from Axis IP cameras.
For the time being I figured that the flow for the entire program would be like this:
1. Get the Axis program to get streaming images
2. Pass the images to the Direct3D program.
3. Display the program, on the screen.
Currently I have made a somewhat basic Direct3D app that loads and display video frames from avi videos(for testing). I dunno how to load images directly from videos using DirectX, so I used OpenCV to save frames from the video and have DX upload them up. Very slow.
Right now I have some unclear things:
1. How to Get an Axis program that works in C++ (gonna look up examples later, prolly no big deal)
2. How to upload images directly from the Axis IP camera program.
So guys, do you have any recommendations or suggestions on how to make my program work more efficiently? Anything just let me know.
Well you may find it faster to use directshow and add a custom renderer at the far end that, directly, copies the decompressed video data directly to a Direct3D texture.
Its well worth double buffering that texture. ie have texture 0 displaying and texture 1 being uploaded too and then swap the 2 over when a new frame is available (ie display texture 1 while uploading to texture 0).
This way you can de-couple the video frame rate from the rendering frame rate which makes dropped frames a little easier to handle.
I use in-place update of Direct3D textures (using IDirect3DTexture9::LockRect) and it works very fast. What part of your program works slow?
For capture images from Axis cams you may use iPSi c++ library: http://sourceforge.net/projects/ipsi/
It can be used for capturing images and control camera zoom and rotation (if available).