Directshow how to control Area of Interest - c++

I'm using C++ with DirectShow to capture multiple images from a camera, much like OpenCV's camcapture does.
The camera I'm using has an AoI control that lets me change the x and y offsets to move the camera view. I tried searching the web, but I couldn't find anything.
Is there a way to control those values using DirectShow? There seems to be a way to change values such as gain, but there's no mention of AoI control.

This depends entirely on the DirectShow source filter that came with your camera. You should check its documentation or view its property page in GraphEdit.
If you are using a standard filter, try switching to the one that came with your hardware; the lack of search results probably means the standard source filter does not support this option.

Related

How can I horizontally mirror video in DirectShow?

I need to display the local webcam stream on the screen,
horizontally flipped,
so that the screen appears as a mirror.
I have a DirectShow graph which does all of this,
except for mirroring the image.
I have tried several approaches to mirror the image,
but none have worked.
Approach A: VideoControlFlag_FlipHorizontal
I tried setting the VideoControlFlag_FlipHorizontal flag
on the output pin of the webcam filter,
like so:
IAMVideoControl* pAMVidControl;
IPin* pWebcamOutputPin;
// ...
// Omitting error-handling for brevity
pAMVidControl->SetMode(pWebcamOutputPin, VideoControlFlag_FlipHorizontal);
However, this has no effect.
Indeed, the webcam filter claims to not have this capability,
or any other capabilities:
long supportedModes;
hr = pAMVidControl->GetCaps(pWebcamOutputPin, &supportedModes);
// Prints 0, i.e. no capabilities
printf("Supported modes: %ld\n", supportedModes);
Approach B: SetVideoPosition
I tried flipping the image by flipping the rectangles passed to SetVideoPosition.
(I am using an Enhanced Video Renderer filter, in windowless mode.)
There are two rectangles:
a source rectangle and a destination rectangle.
I tried both.
Here's approach B(i),
flipping the source rectangle:
MFVideoNormalizedRect srcRect;
srcRect.left = 1.0; // note flipped
srcRect.right = 0.0; // note flipped
srcRect.top = 0.0;
srcRect.bottom = 0.5;
return m_pVideoDisplay->SetVideoPosition(&srcRect, &destRect);
This results in nothing being displayed.
It works in other configurations,
but appears to dislike srcRect.left > srcRect.right.
Here's approach B(ii),
flipping the destination rectangle:
RECT destRect;
GetClientRect(hwnd, &destRect);
LONG left = destRect.left;
destRect.left = destRect.right;
destRect.right = left;
return m_pVideoDisplay->SetVideoPosition(NULL, &destRect);
This also results in nothing being displayed.
It works in other configurations,
but appears to dislike destRect.left > destRect.right.
Approach C: IMFVideoProcessorControl::SetMirror
IMFVideoProcessorControl::SetMirror(MF_VIDEO_PROCESSOR_MIRROR)
sounds like what I want.
This IMFVideoProcessorControl interface is implemented by the Video Processor MFT.
Unfortunately, this is a Media Foundation Transform,
and I can't see how I could use it in DirectShow.
Approach D: Video Resizer DSP
The Video Resizer DSP
is "a COM object that can act as a DMO",
so theoretically I could use it in DirectShow.
Unfortunately, I have no experience with DMOs,
and in any case,
the docs for the Video Resizer don't say whether it would support flipping the image.
Approach E: IVMRMixerControl9::SetOutputRect
I found
IVMRMixerControl9::SetOutputRect,
which explicitly says:
Because this rectangle exists in compositional space,
there is no such thing as an "invalid" rectangle.
For example, set left greater than right to mirror the video in the x direction.
However, IVMRMixerControl9 appears to be deprecated,
and I'm using an EVR rather than a VMR,
and there are no docs on how to obtain an IVMRMixerControl9 anyway.
Approach F: Write my own DirectShow filter
I'm reluctant to try this one unless I have to.
It will be a major investment,
and I'm not sure it will be performant enough anyway.
Approach G: start again with Media Foundation
Media Foundation would possibly allow me to solve this problem,
because it provides "Media Foundation Transforms".
But it's not even clear that Media Foundation would fit all my other requirements.
I'm very surprised that I am looking at such radical solutions
for a transform that seems so standard.
What other approaches exist?
Is there anything I've overlooked in the approaches I've tried?
How can I horizontally mirror video in DirectShow?
If Option E does not work (see comment above; neither source nor destination rectangle allows mirroring), and given that it's DirectShow, I would suggest looking into Option F.
However, writing a full filter might not be trivial if you have never done it before. There are a few shortcuts, though. You don't need to develop a full filter: similar functionality can be achieved using at least two alternative methods:
Sample Grabber Filter with an ISampleGrabberCB::SampleCB callback. You will find lots of mentions of this technique: when the filter is inserted into the graph, your code receives a callback for every processed frame. If you rearrange the pixels in the frame buffer within the callback, the image will be mirrored.
Implement a DMO and insert it into the filter graph with the help of the DMO Wrapper Filter. You will similarly get a chance to rearrange the pixels of each frame, with a bit more flexibility at the expense of more code to write.
Both will be easier to do because you don't have to use the DirectShow BaseClasses, which are notoriously obsolete in 2020.
Neither requires you to understand the data flow in a DirectShow filter. Both, and also developing a full DirectShow filter, assume that your code supports rearrangement in a limited set of pixel formats. You can go with 24-bit RGB, for example, or typical webcam formats such as NV12 (nowadays).
If your pixel rearrangement is reasonably implemented, you can ignore the performance impact without super-optimizing the code; it is negligible in most cases.
I expect integrating a Media Foundation solution to be more complicated, and much more complicated if that solution is to be really well optimized.
The complexity of the problem in the first place comes from a combination of the following factors.
First, you mixed different solutions:
1) Mirroring right in the web camera (driver), where your setup makes the video frames come out already mirrored at the very beginning.
2) Mirroring as data flows through the pipeline. Even though this sounds simple, it is not: sometimes the frames are still compressed (webcams quite often send JPEGs), sometimes frames are backed by video memory, there are multiple pixel formats, etc.
3) Mirroring as the video is presented.
Your approach A is #1 above. However, if there is no support for the respective mode, you can't mirror.
Mirroring in the EVR renderer (#3) is apparently possible in theory. EVR uses Direct3D 9 and internally renders a surface (texture) into a scene, so it is absolutely possible to set up the 3D position of the surface so that it becomes mirrored. However, the problem here is that the API design and coordinate checks prevent you from passing mirroring arguments.
Then, Direct3D 9 is pretty much deprecated, and DirectShow itself and even the DirectShow/Media Foundation EVR are in no way compatible with current Direct3D 11. Even though a capability to mirror via hardware might exist, you might have a hard time consuming it through the legacy API.
As you want a simple solution, you are limited to mirroring as the data is streamed through, that is, #2. Even though this comes with a reasonable performance impact, you don't need to rely on specific camera or video hardware support: you just swap the pixels in every frame and that's it.
As I mentioned, the easiest way is to set up a SampleCB callback using 24-bit RGB and/or the NV12 pixel format. It depends on whatever else your application is doing too, but with no such information I would say that it is sufficient to implement 24-bit RGB: having the video frame data, you would just go row by row and swap the three-byte pixel data width/2 times. If the application pipeline allows, you might want an additional code path to flip NV12, which is similar but does not require the video to be converted to RGB in the first place and so is a bit more efficient. If NV12 can't work, RGB24 would be a backup code path.
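To make the row-by-row swap concrete, here is a minimal sketch of the in-place mirroring for both formats. It is plain C++ with no DirectShow dependency; the function names (MirrorRow, MirrorRgb24, MirrorNv12) are made up for illustration, and it assumes you already have a pointer to the frame bytes (for example from the media sample inside the callback) along with the width, height and row stride taken from the negotiated media type. For NV12 it additionally assumes even dimensions and a UV plane that directly follows the Y plane with the same stride.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Reverse the order of `count` pixels of `bytesPerPixel` bytes each within one row.
// The bytes inside each pixel keep their order; only whole pixels are swapped.
inline void MirrorRow(std::uint8_t* row, int count, int bytesPerPixel) {
    for (int left = 0, right = count - 1; left < right; ++left, --right) {
        std::swap_ranges(row + left * bytesPerPixel,
                         row + (left + 1) * bytesPerPixel,
                         row + right * bytesPerPixel);
    }
}

// Mirror a 24-bit RGB frame in place. `stride` is the row pitch in bytes;
// DIB rows are padded to a multiple of 4 bytes, so it can exceed width * 3.
inline void MirrorRgb24(std::uint8_t* frame, int width, int height, int stride) {
    for (int y = 0; y < height; ++y) {
        MirrorRow(frame + static_cast<std::ptrdiff_t>(y) * stride, width, 3);
    }
}

// Mirror an NV12 frame in place: a full-resolution Y plane followed by a
// half-height plane of interleaved (U, V) byte pairs.
// Assumption: even width/height, both planes contiguous with the same stride.
inline void MirrorNv12(std::uint8_t* frame, int width, int height, int stride) {
    std::uint8_t* uvPlane = frame + static_cast<std::ptrdiff_t>(height) * stride;
    for (int y = 0; y < height; ++y) {
        MirrorRow(frame + static_cast<std::ptrdiff_t>(y) * stride, width, 1);
    }
    for (int y = 0; y < height / 2; ++y) {
        // Each chroma sample is a 2-byte (U, V) pair covering two luma columns.
        MirrorRow(uvPlane + static_cast<std::ptrdiff_t>(y) * stride, width / 2, 2);
    }
}
```

Whether the frame is stored top-down or bottom-up does not matter for a horizontal mirror, since each row is handled independently.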
See also: Mirror effect with DirectShow.NET - I seem to have already explained something similar 8 years ago.

Accessing a Projector using MATLAB

I wish to display an image through my projector via MATLAB. The projected image should be full-size, without any figure decorations (menu bar, the grey border that surrounds a figure, etc.).
Similar to a normal presentation, where the projector shows the complete slide or image, I want to do the same using MATLAB as my platform. Any thoughts or ideas? Can we access the projector using MATLAB? My first thought was to send data to the corresponding printer IP, but that doesn't seem to work :/
If you know the relevant C++ command or method to do this, please suggest a link or a library, so that I can try to import it into MATLAB.
Reason for doing this: Projector-Camera calibration for photo-metric correction of my projector display output.
Assuming your projector is set as a second display, you can do something very simple. Get the monitor position information and set the figure frame to be the monitor size
% plot figure however you want
monitorFrames = get(0,'MonitorPositions');
secondMonitor = monitorFrames(2,:);
secondMonitor(3) = secondMonitor(3)-monitorFrames(1,3);
set(gcf,'Position',secondMonitor);
This will put the figure window onto the second monitor and have it take up the whole screen.
You can then use this to do whatever calibration you need, and shift this window around as necessary.
NOTE:
In no way am I saying this is the ideal solution. It is quick and dirty, and will not use any outside libraries.
UPDATE
If the above solution does not suit your specific needs, you could always save the plot as an image and then have your MATLAB script call a C++ program that opens the image and makes it full screen.
This is non-trivial. For Windows you can use the WindowAPI submission to the MATLAB File Exchange. With the WindowAPI function installed you can do
WindowAPI(FigH, 'Position', 'full');
For Mac and Linux you can use wrappers around OpenGL to do low level plotting, but you cannot use standard MATLAB figure windows. One nice implementation is PsychToolbox.

Face detection and image preview drawing

I'm developing an application that uses DirectShow with C++.
Its main goal is to capture users' faces.
I have reached the stage where I can capture an image from my webcam.
The problem is that I need an intelligent renderer. In fact, I need the renderer to be able to detect a face inside a rectangle.
I'm wondering if there is a filter that I can use for this purpose,
or if I need to create my own customized filter.
If so, please enlighten me.
It would look like this:
First of all, I need to understand how I can draw a rectangle in my renderer. Otherwise, even if I know the algorithm, I will not be able to apply it. This is my main goal now.
I have some ideas, but I don't know if they are correct. I think I need to grab each frame separately and modify some of its pixels, like what's drawn in the live render.
Have a look at OpenCV
Quick look inside and I found this.
Making your own "filter" that works well is no easy job.
Are you talking about automatic detection of where there is something like a human face in the shot you have taken with the webcam? In this case object detection algorithms like Viola-Jones might be interesting for you.
If a commercial package is an option, you can use the Montivision Filter SDK which includes filters that should do the job out of the box. They offer a free eval which is perfect for experimentation.
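On the narrower question of how to draw a rectangle into the preview: the idea from the question (grab each frame and modify some pixels) can be sketched as below. This is only an illustration in plain C++, not part of any of the libraries mentioned above. The function name DrawRectRgb24 is made up, and it assumes you have access to an uncompressed 24-bit RGB frame buffer (for instance in a transform filter or a sample grabber callback) together with its width, height and row stride.

```cpp
#include <cstddef>
#include <cstdint>

// Draw a one-pixel-wide rectangle outline into a 24-bit frame, in place.
// left/top are inclusive, right/bottom are exclusive; anything outside the
// frame is clipped. `stride` is the row pitch in bytes (rows may be padded).
// In DirectShow RGB24 buffers each pixel is stored as blue, green, red.
inline void DrawRectRgb24(std::uint8_t* frame, int width, int height, int stride,
                          int left, int top, int right, int bottom,
                          std::uint8_t blue, std::uint8_t green, std::uint8_t red) {
    auto putPixel = [&](int x, int y) {
        if (x < 0 || y < 0 || x >= width || y >= height) return;  // clip
        std::uint8_t* p = frame + static_cast<std::ptrdiff_t>(y) * stride + x * 3;
        p[0] = blue;
        p[1] = green;
        p[2] = red;
    };
    for (int x = left; x < right; ++x) {   // top and bottom edges
        putPixel(x, top);
        putPixel(x, bottom - 1);
    }
    for (int y = top; y < bottom; ++y) {   // left and right edges
        putPixel(left, y);
        putPixel(right - 1, y);
    }
}
```

Keep in mind that uncompressed RGB frames with a positive biHeight are stored bottom-up, so row 0 of the buffer is the bottom line of the picture; flip the y coordinates accordingly if the face detector reports top-down coordinates.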

QR Codes - Camera Orientation/Projection

I am looking for a library or method to decode a QR Code (or potentially another form of 2d barcode) and to be able to actually determine the camera position and orientation. This seems like it should be doable, but I am not entirely sure.
Does anyone know what the best route for this is? Or if it is even possible?
zxing is the open-source Google-hosted Java library for 2d barcodes including QR.
see com.google.zxing.ResultMetadataType.ORIENTATION (optional metadata returned in a hashtable from com.google.zxing.Result.getResultMetadata()):
Denotes the likely approximate orientation of the barcode in the image. This value is given as degrees rotated clockwise from the normal, upright orientation. For example a 1D barcode which was found by reading top-to-bottom would be said to have orientation "90". This key maps to an Integer whose value is in the range [0,360).
Many Android apps make heavy use of QR codes - if I were you I'd do some research using Android as one of the keywords and maybe add "android" as a tag on this Q (or post an Android-specific version of it).
P.S. Since the Android code is, IIRC, open source and available from Google, if the QR logic is part of core Android you'd be able to access it.

Directdraw: Rotate video stream

Problem
Windows Mobile / Directdraw: Rotate video stream
The video preview is working, all I need now is a way to rotate the image. I think the only way to handle this is to write a custom filter based on CTransformFilter that will rotate the camera image for you. If you can help me to solve this problem, e.g. by helping me to develop this filter with my limited DirectDraw knowledge, the bounty is yours.
Background / Previous question
I'm currently developing an application for a mobile device (HTC HD2, Windows Mobile 6). One of things the program needs to do is to take pictures using the built-in camera. Previously I did this with the CameraCaptureDialog offered by the Windows Mobile 6 SDK, but our customer wants a more user-friendly solution.
The idea is to preview the camera's video stream in a control and take a high resolution picture (>= 2 megapixels) using the camera's photo function, when the control is clicked. We did some research on the topic and found out the best way to accomplish this seems to be using Direct Draw.
The downsides are that I have never really used any native Windows API and that my C++ is rather bad. In addition, I read somewhere that the Direct Draw support on HTC phones is particularly bad and that you have to use undocumented native HTC library calls to take high-quality pictures.
The good news is that a company offered to develop a control for us that meets the specifications stated above. They estimated it would take them about 10 days, which led to a discussion of whether we could develop this control ourselves within a reasonable amount of time.
It's now my job to research which alternative is better. Needless to say, there is far too little time to study the whole architecture and develop a demo, which led me to the following questions:
Questions no longer relevant!
Does any of you have experience with similar projects? What are your recommendations?
Is there a good Direct Draw source code example that deals with video preview and image capturing?
Well, if you look at the EZRGB24 sample, you get the basics of a simple video transform filter.
There are 2 things you need to do to the sample to get it to do what you want.
1) You need to copy x,y to y,x.
2) You need to tell the media sample that the sample is now Height x Width instead of Width x Height.
Bear in mind that the final image will have exactly the same number of pixels.
Solving 1 is relatively simple. You can calculate the position of a pixel as "x + (y * Width)". So you step through each x and y, calculate the source position that way, and then write the pixel to "y + (x * Height)" in the output. This will transpose the image. Of course, without step 2 this will look completely wrong.
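As a rough sketch of that copy loop for 24-bit pixels (the format the EZRGB24 sample works with): the function name TransposeRgb24 is made up for illustration, and the code assumes separate input and output buffers, with strides passed in bytes so that padded rows are handled.

```cpp
#include <cstddef>
#include <cstdint>

// Transpose a 24-bit RGB image: source pixel (x, y) is written to (y, x).
// The destination is srcHeight pixels wide and srcWidth pixels tall.
// Strides are row pitches in bytes; src and dst must not overlap.
inline void TransposeRgb24(const std::uint8_t* src, int srcWidth, int srcHeight,
                           int srcStride, std::uint8_t* dst, int dstStride) {
    for (int y = 0; y < srcHeight; ++y) {
        const std::uint8_t* srcRow = src + static_cast<std::ptrdiff_t>(y) * srcStride;
        for (int x = 0; x < srcWidth; ++x) {
            // Destination row is x, destination column is y.
            std::uint8_t* dstPixel =
                dst + static_cast<std::ptrdiff_t>(x) * dstStride + y * 3;
            dstPixel[0] = srcRow[x * 3 + 0];
            dstPixel[1] = srcRow[x * 3 + 1];
            dstPixel[2] = srcRow[x * 3 + 2];
        }
    }
}
```

Note that a transpose on its own is a reflection across the diagonal. To get a true 90-degree rotation you would additionally reverse the rows or the columns of the result, depending on the direction you need.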
To solve 2, you need to get the AM_MEDIA_TYPE of the input sample. You then need to find out what its formattype is (probably FORMAT_VideoInfo or FORMAT_VideoInfo2). You can then cast the pbFormat member of AM_MEDIA_TYPE to either a VIDEOINFOHEADER or a VIDEOINFOHEADER2 (depending on the format type). For the output media type, you now need to set VIDEOINFOHEADER[2]::bmiHeader.biWidth and biHeight to the biHeight and biWidth (respectively) of the input media type. Everything else should be the same as in the input AM_MEDIA_TYPE.
I hope that helps a bit.
This question will help you get some details about DirectDraw. I did some research about this some time ago and the best I could find was this blog post (also mentioned in the above question). The post presents an extension of the CameraCapture sample in the SDK.
However, don't have high expectations. It seems that the preview and the picture taken will only work in small resolution. Although DirectDraw does describe a way of configuring the resolution, there is no guarantee that this will be properly implemented by the driver.
So from my experience what you have read is true. The only way to do it will be to use HTC drivers. So, if you don't want to spend endless days in reverse engineering for a doubtful result, let someone else do the job for you. If you want to give it a shot, try xda-developers forum.