C++ DirectShow Video and Audio capture - beginning - c++

I have finally managed to drop working with VFW after several problems I have encountered during the application development.
Thanks to StackOverflow, I am now aware that VFW is obsolete and wish to switch to DShow, to let my application work with Vista/W7.
Unfortunately, the work was already done and the application had shipped to the client, but as soon as we realized we had frame rate problems on Vista / W7, we decided to rewrite the video class and use DirectShow to build a good audio/video capture engine for webcams.
This will be tricky, as we have never coded with DShow, and right now we are looking for a few specific examples of how to:
1) Connect to a selected webcam
   similar to: capDriverConnect
2) Set the camera resolution to 640x480 and the format to RGB24 (we need to convert RGB24 to YUV420 for each frame)
   similar to: capSetVideoFormat / capCaptureSetSetup
3) Set up audio capture for this webcam
   similar to: capSetAudioFormat
4) Register two callbacks:
   a) One for the video frame (we will pass frames to the video encoder)
      similar to: capSetCallbackOnVideoStream
   b) One for the wave buffer (we will pass the wave buffer to the audio encoder)
      similar to: capSetCallbackOnWaveStream
5) Be able to show a preview window somewhere on the parent window
   similar to: capPreview
6) Perform Start/Stop operations when needed
   Start - would mean: connect and start capturing audio/video frames
   Disconnect - would mean: stop capturing audio/video frames
7) Draw onto the current frame
   similar to:
   SetBitmapBits(CameraInput.GetFrameBitmap(),w*h*3,vdhdr->lpData); // copy the frame into the bitmap
   // draw something with GDI+
   GetBitmapBits(CameraInput.GetFrameBitmap(),w*h*3,vdhdr->lpData); // copy the modified bitmap back into the frame
All of the above was already done with VFW, but as I wrote before, we unfortunately need to switch to DirectShow.
Is there anyone who could help us put together a class that would save us months of studying DirectShow?

Your best bet for examples will be the ones from Microsoft.
Your questions are still phrased in terms of VFW, so it's hard to answer them as written. For example, in DirectShow you wouldn't register a callback to encode a video frame. Instead, you'd develop an encoder filter that would receive data from the capture source.
As an alternative, if you're only targeting Vista and later, there is the Microsoft Media Foundation. I have no experience with it so I don't know how the learning curve compares to DirectShow.
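On the RGB24 to YUV420 step mentioned in the question: that conversion does not depend on the capture API, so it can be written once and called from whatever ends up delivering the frames. Below is a minimal sketch, not a tuned implementation. It assumes the common BT.601 integer approximation, bytes in B, G, R order (as in Windows RGB24 bitmaps), top-down rows with no padding, and even dimensions; a bottom-up buffer would need its rows flipped first. The names I420Frame and convertBgr24ToI420 are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar YUV 4:2:0 (I420): a full-resolution Y plane plus U and V planes
// subsampled by 2 horizontally and vertically. Hypothetical type.
struct I420Frame {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> y, u, v;
};

// Convert a packed 24-bit frame to I420 with BT.601 integer coefficients.
// Assumptions: byte order B, G, R; top-down rows; no row padding;
// width and height are even.
I420Frame convertBgr24ToI420(const std::uint8_t* bgr, int width, int height) {
    assert(bgr != nullptr && width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);

    I420Frame out;
    out.width = width;
    out.height = height;
    out.y.resize(static_cast<std::size_t>(width) * height);
    out.u.resize(out.y.size() / 4);
    out.v.resize(out.y.size() / 4);

    for (int row = 0; row < height; row += 2) {
        for (int col = 0; col < width; col += 2) {
            int sumR = 0, sumG = 0, sumB = 0;
            // Luma is computed per pixel; chroma is taken from the 2x2 average.
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t pixel =
                        static_cast<std::size_t>(row + dy) * width + (col + dx);
                    const int b = bgr[pixel * 3 + 0];
                    const int g = bgr[pixel * 3 + 1];
                    const int r = bgr[pixel * 3 + 2];
                    // The +16 offset is folded in before the shift; result is 16..235.
                    out.y[pixel] = static_cast<std::uint8_t>(
                        (66 * r + 129 * g + 25 * b + 128 + (16 << 8)) >> 8);
                    sumR += r;
                    sumG += g;
                    sumB += b;
                }
            }
            const int r = (sumR + 2) / 4;
            const int g = (sumG + 2) / 4;
            const int b = (sumB + 2) / 4;
            const std::size_t chroma =
                static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
            // The +128 offset is folded in before the shift, so the value being
            // shifted is never negative; results are 16..240.
            out.u[chroma] = static_cast<std::uint8_t>(
                (-38 * r - 74 * g + 112 * b + 128 + (128 << 8)) >> 8);
            out.v[chroma] = static_cast<std::uint8_t>(
                (112 * r - 94 * g - 18 * b + 128 + (128 << 8)) >> 8);
        }
    }
    return out;
}
```

For a 640x480 frame the input is 640*480*3 bytes and the three output planes together take 640*480*3/2 bytes. Averaging each 2x2 block for chroma is one possible choice; taking a single pixel per block is cheaper but aliases more.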

I'd suggest building a graph in GraphEdit using ffdshow filters.
GraphEdit demonstrates how a graph is built in DirectShow.
I don't think you need to build the filter class on your own. Once you have built the graph and can watch the video in GraphEdit, implementing the graph in code is a very simple task.

Related

OpenGL - Display a video stream of the desktop on Windows

So I am trying to figure out how to get a video feed (or screenshot feed if I must) of the desktop using OpenGL in Windows and display it in a 3D environment. I plan to integrate this with ARToolkit to make essentially a virtual screen. The only issue is that I have tried manually getting the pixels in OpenGL, but I have been unable to display them properly in a 3D environment.
I apologize in advance that I do not have minimum runnable code, but due to all the dependencies and whatnot trying to get an ARToolkit code running would be far from minimal. How would I capture the desktop on Windows and display it in ARToolkit?
BONUS: If you can grab each desktop from the 'virtual' desktops in Windows 10, that would be an excellent bonus!
Alternative: If you know another AR library that renders differently, or allows me to achieve the same effect, I would be grateful.
There are 2 different problems here:
a) Make an augmentation that plays video
b) Stream the desktop to somewhere else
For playing video on an augmentation you basically need a texture that gets updated on each frame. I recall that ARToolkit for Unity has an example that plays video.
Streaming the desktop to the other device is a problem of its own. There are tools that do screen recording, but you probably don't want that.
It sounds to me like what you want to do is make a VLC viewer and put that into an augmentation. If I am correct, I suggest you start by looking at existing open source VLC viewers.

Video preview image using vlc-qt (or libvlc directly)

I'd like to make a detailed video list in my Qt application using vlc-qt. Other playback engines such as QtAV or QtMultimedia are not an option; it has to be vlc-qt (libvlc). For that I need a small preview picture of each video, but I can't find anything suitable for this task except libvlc_video_take_snapshot. This method saves a picture locally, and I guess it needs a real render window to exist. That's not a good option for me. Is there a better solution?

[C++|winapi] Can you access video output of application before it is displayed?

I want to capture the video output of an application using C++ and winapi, and stream it over the network. At the moment, I am capturing this output using a DirectShow filter. The application displays its video output on the screen, and I just capture whatever is there. I want to optimize this process.
My question is: Is there a way to capture the video/audio output of an application before it is displayed on the screen?
Thanks.
Capture video before it is shown?
It depends on how the application provides the video.
Real-time rendering - You can't access what doesn't exist yet. Video games, or any dynamic rendering, display only the current state and know nothing about future frames.
There is also an anomaly called screen tearing, which occurs when frame output is not synchronized with the screen's refresh rate.
Static displaying - All the data is already available. For example, if it's a video player application playing a video on your local machine, your only task is to get the data and capture it at the appropriate position in time.
Last but not least, every piece of hardware has a reaction time, a small delay to process data.
Also, there is a similar question: Fastest method of screen capturing on Windows

Windows media foundation use raw image to encode video

I'm working on a project that requires me to record webcam, microphone, and the screen. I have webcam recording, audio is a work in progress, and I stumbled across CMonitor wrapper (which I did some minor modifications to) to grab RGB images of the desktop on a specified monitor (if there are multiple monitors).
How do I go about pushing my raw RGB frames into Windows Media Foundation to encode them into a video file? My current video encoding uses a slightly modified version of this MSDN sample, if that's easier to modify than writing a new class handler.
Or, perhaps there is some sort of media foundation route to recording the screen that I don't know of (which is possible, I'm not that great of a win32 programmer)?
Found PushSource in the Windows SDK samples, which does this.
Check the Desktop Duplication API for capturing the desktop. Media Foundation provides two solutions for encoding: the MF Sink Writer for simple encoding, and the Media Session for more flexible control of the media pipeline. Read this overview page first.

How to hook webcam capture?

I'm working on software whose current version ships a custom-made webcam device driver. We use this driver with our software, which alters the captured image before displaying it, very much like YouCam.
Basically, when any application that uses the webcam starts, our driver processes each frame before it is shown.
The problem is that there are always two webcams installed: the real one and our custom driver.
I noticed that YouCam does what we need, which is to hook some method in any installed webcam so that each frame is processed before it is shown.
Does anyone know how to do this?
We use VC++.
Thanks
As bkritzer said, OpenCV easily does what you want.
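#include <cassert>
#include <opencv/cv.h>      // legacy OpenCV C API headers
#include <opencv/highgui.h>

// stillCapturing (bool) and refreshTime (int, milliseconds) are defined by the caller
IplImage *image = 0;    // OpenCV type
CvCapture *capture = 0; // OpenCV type
// Create capture (0 = first camera)
capture = cvCaptureFromCAM (0);
assert (capture && "Can't connect webcam");
// Capture images
while (stillCapturing)
{
    // Grab image
    cvGrabFrame (capture);
    // Retrieve image
    image = cvRetrieveFrame (capture);
    // You can configure refresh time
    if (image) cvWaitKey (refreshTime);
    // Process your image here
    //...
}
// Release the camera when done
cvReleaseCapture (&capture);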
You can encapsulate these OpenCV calls in a C++ class and dedicate a specific thread to it; this will be your driver.
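The class-plus-thread idea could look roughly like the sketch below. To keep it self-contained it does not call OpenCV: the grab and process steps are passed in as std::function callbacks, so in a real build the grab callback would wrap cvGrabFrame / cvRetrieveFrame. CaptureWorker, Frame, GrabFn and ProcessFn are names invented for this example, and the design (one worker thread, an atomic stop flag) is only one possible choice.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical frame type; a real build would carry the OpenCV image instead.
using Frame = std::vector<unsigned char>;

class CaptureWorker {
public:
    using GrabFn = std::function<bool(Frame&)>;          // returns false when the source ends
    using ProcessFn = std::function<void(const Frame&)>; // called once per grabbed frame

    CaptureWorker(GrabFn grab, ProcessFn process)
        : grab_(std::move(grab)), process_(std::move(process)) {}

    ~CaptureWorker() { stop(); }

    CaptureWorker(const CaptureWorker&) = delete;
    CaptureWorker& operator=(const CaptureWorker&) = delete;

    // Launch the dedicated capture thread.
    void start() {
        assert(!thread_.joinable() && "worker already started");
        running_ = true;
        thread_ = std::thread([this] { run(); });
    }

    // Ask the loop to exit, then wait for the thread to finish.
    void stop() {
        running_ = false;
        wait();
    }

    // Wait until the source ends on its own (or stop() was requested).
    void wait() {
        if (thread_.joinable()) thread_.join();
    }

    int framesProcessed() const { return frames_.load(); }

private:
    // Runs on the worker thread: grab a frame, process it, repeat.
    void run() {
        Frame frame;
        while (running_ && grab_(frame)) {
            process_(frame);
            ++frames_;
        }
    }

    GrabFn grab_;
    ProcessFn process_;
    std::atomic<bool> running_{false};
    std::atomic<int> frames_{0};
    std::thread thread_;
};
```

Typical use would be to construct the worker with the two callbacks, call start() when capture should begin, and call stop() (or let the destructor do it) when it should end. Note that this only covers frames your own application captures; it does not intercept frames delivered to other applications.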
I think that YouCam uses a DirectShow transform filter. Is that what you need?
Check out the OpenCV libraries. It has a bunch of tutorial examples and libraries that do exactly what you're asking for. It's a bit tough to install, but I've gotten it to work before.
Well, I think there are 2 key concepts in this question that have been misunderstood:
1) How to hook webcam capture
2) ...any application that uses the webcam...
If I understood correctly, OpenCV is useful for writing your own complete application, complete meaning that it opens the camera and processes the images itself. So it wouldn't satisfy point 2), which I understand as referring to another application (not yours!) opening the camera while your application processes the images.
Point 1) seems to confirm this, because "hook" usually means intercepting some other process that is not part of your own application.
So I doubt this question has been answered. I am also interested in it.