I'm using OpenCV 2.4.6 with C++ (and sometimes Python too, but that is irrelevant here). I would like to know if there is a simple way to get all the available frame sizes from a capture device.
For example, my webcam can provide 640x480, 320x240 and 160x120. Suppose I don't know these frame sizes a priori... Is it possible to get a vector, an iterator, or something similar that could give me these values?
In other words, I don't want to get the current frame size (which is easy to obtain) but the sizes I could set the device to.
Thanks!
When you retrieve a frame from a camera, it is the maximum size that the camera can give. If you want a smaller image, you have to specify it when you grab the frame, and OpenCV will resize it for you.
A normal camera has one sensor of one size, and it sends one kind of image to the computer. What OpenCV does with it thereafter is up to you to specify.
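A minimal sketch of that approach, assuming device index 0 and the OpenCV 2.4 C++ API: request a size with set() and read back what the driver actually applied with get().

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::VideoCapture cap(0);                    // device index 0 is an assumption
    if (!cap.isOpened())
        return 1;

    // Request a smaller size; the driver/OpenCV may or may not honor it exactly.
    cap.set(CV_CAP_PROP_FRAME_WIDTH, 320);
    cap.set(CV_CAP_PROP_FRAME_HEIGHT, 240);

    // Read back the size that was actually applied.
    std::cout << cap.get(CV_CAP_PROP_FRAME_WIDTH) << "x"
              << cap.get(CV_CAP_PROP_FRAME_HEIGHT) << std::endl;
    return 0;
}

Probing a list of candidate sizes this way and comparing each request against what get() reports afterwards is about as close as you can get to enumerating supported modes through the OpenCV API alone.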
I am attempting to use the IMFSourceReader to read and decode a .mp4 file. I have configured the source reader to decode to MFVideoFormat_NV12 by setting a partial media type and calling IMFSourceReader::SetCurrentMediaType and loaded a video with dimensions of 1266x544.
While processing I receive the MF_SOURCE_READERF_CURRENTMEDIATYPECHANGED flag with a new dimension of 1280x544 and an MF_MT_MINIMUM_DISPLAY_APERTURE of 1266x544.
I believe the expectation is to then use either the Video Resizer DSP or the Video Processor MFT. However, it is my understanding that the Video Processor MFT requires Windows 8.1 while I am on Windows 7, and the Video Resizer DSP does not support MFVideoFormat_NV12.
What is the correct way to crop out the extra data added by the source reader to display only the data within the minimum display aperture for MFVideoFormat_NV12?
The new media type says this: "the video is 1266x544 as you expected/requested, but I have to carry it in 1280x544 textures because that is how the GPU wants it".
Generally speaking, this does not require further scaling or cropping: you already have the frames you need. If you are reading them out of sample objects - which is what I believe you are trying to do - just use the increased stride (1280 bytes between consecutive rows).
If you are using this as a texture, presenting it somewhere or using it as part of rendering, you would just use the adjusted coordinates (0, 0) - (1266, 544) and ignore the remainder, as opposed to using the full texture.
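If you do need a tightly packed 1266x544 buffer (for example to hand the frame to another library), here is a minimal sketch of the copy, assuming the luma plane starts at the beginning of the buffer and the interleaved chroma plane starts right after the padded luma plane (stride x height bytes in):

#include <cstdint>
#include <cstring>
#include <vector>

// Copy the visible part of an NV12 frame out of a padded buffer.
// For the case above: width = 1266, height = 544, srcStride = 1280.
std::vector<uint8_t> CropNv12(const uint8_t* src, int srcStride, int width, int height)
{
    std::vector<uint8_t> dst(static_cast<size_t>(width) * height * 3 / 2);

    // Luma plane: 'height' rows of 'width' bytes, srcStride bytes apart.
    for (int y = 0; y < height; ++y)
        std::memcpy(&dst[static_cast<size_t>(y) * width],
                    src + static_cast<size_t>(y) * srcStride, width);

    // Interleaved UV plane: height/2 rows, also 'width' bytes per row
    // (width/2 U/V pairs, 2 bytes each), assumed to start after the padded luma plane.
    const uint8_t* srcUV = src + static_cast<size_t>(srcStride) * height;
    uint8_t* dstUV = &dst[static_cast<size_t>(width) * height];
    for (int y = 0; y < height / 2; ++y)
        std::memcpy(dstUV + static_cast<size_t>(y) * width,
                    srcUV + static_cast<size_t>(y) * srcStride, width);

    return dst;
}

The source pointer and the actual pitch would normally come from locking the sample's buffer (e.g. IMF2DBuffer::Lock2D reports the pitch), so treat the 1280 here as an example value rather than a constant.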
I'm using IPP to resize images efficiently, but one thing really makes me curious: does the IPP image resize function care about the position of the alpha value?
There are a lot of pixel types, but two are commonly used: ARGB and RGBA. According to the Intel IPP function naming documentation, the ippiResizeNearest_8u_C4R function resizes a four-channel image.
As you can see in the documentation, functions that care about the position of the alpha channel are marked with descriptors like A or A0, but ippiResizeNearest_8u_C4R has no alpha channel descriptor, so I cannot even tell which color order IPP expects when resizing.
So my question is: does the Intel IPP image resize function care about the position of the alpha value? If not, what is the default pixel type for the ippiResizeNearest_8u_C4R function?
I am no IPP expert, but a quick read of the documentation seems to confirm what I believe to be correct. Anything else wouldn't make sense to me.
Resizing a multi-channel image is usually done for each channel separately, because interpolation between channels just doesn't make sense (given that we are talking about 2D images). It would completely mess up colours and transparencies.
And if it is done for every channel separately, IPP should not care what you call your channels. It will resize ARGB the same way as RGBA. It will not change the channel order, so if you put ARGB in you'll get resized ARGB out. For resizing, it does not matter what you store in those four channels.
Just try it out. Create a test image with different values in each channel. The resized image should still have those values.
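To illustrate the argument (this is not IPP code, just a naive sketch of what a channel-agnostic nearest-neighbor resize does): every channel of an output pixel is copied from the same source pixel, so whatever order the channels are in, that order is preserved.

#include <cstdint>
#include <vector>

// Naive nearest-neighbor resize over interleaved 4-channel 8-bit pixels.
// Each channel is handled independently and identically, so ARGB, RGBA,
// BGRA, ... all come out with the same channel order they went in with.
std::vector<uint8_t> ResizeNearestC4(const std::vector<uint8_t>& src,
                                     int srcW, int srcH, int dstW, int dstH)
{
    std::vector<uint8_t> dst(static_cast<size_t>(dstW) * dstH * 4);
    for (int y = 0; y < dstH; ++y) {
        int sy = y * srcH / dstH;                 // nearest source row
        for (int x = 0; x < dstW; ++x) {
            int sx = x * srcW / dstW;             // nearest source column
            const uint8_t* s = &src[(static_cast<size_t>(sy) * srcW + sx) * 4];
            uint8_t* d = &dst[(static_cast<size_t>(y) * dstW + x) * 4];
            for (int c = 0; c < 4; ++c)           // all four channels treated alike
                d[c] = s[c];
        }
    }
    return dst;
}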
I am writing a custom video rendering filter for Directshow. My renderer assumes the incoming pixels are organized one row of pixels at a time (correct assumption?) and blits them to another DirectX display elsewhere using a DirectX texture.
This approach works with webcams as input, but when I use an analog capture board, the samples the renderer receives are not in any expected order (see left image below). When I render the capture using the stock DirectShow video renderer, it looks fine (see right image below). So the DirectShow renderer must be doing something extra that my renderer is not. Any idea what it is?
Some more details:
The capture card is NTSC; I'm not sure if that matters.
As input to the custom renderer, I am accepting only MEDIASUBTYPE_RGB24, so I do not think that this is a YUV issue (is it?).
It's a bit hard to see, but the second image below is my filter graph. My custom renderer connects to the color space converter on the far right.
I assume that the pixels coming into my renderer are all organized one row of pixels at a time. Is this a correct assumption?
Maybe the texture is padded to keep rows aligned at (a multiple of) 32 bytes per row? Mind you, I have never used DirectShow, but that's what I would expect in D3D. In other words, your input might have a different stride than you think. Unfortunately I do not know DS, so I can only assume that whatever computes the input/output coordinates should use a different stride factor, e.g. something in code that looks like offset = y * stride + x.
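As a rough sketch of what that looks like for the RGB24 case in the question (assuming you can obtain the actual stride from the upstream filter, or compute it as the row length rounded up to the alignment in use), address each row through the stride rather than width * 3:

#include <cstdint>
#include <cstring>

// Copy a possibly row-padded RGB24 frame into a tightly packed buffer.
// srcStride is the real distance in bytes between the starts of consecutive
// source rows; it may be larger than width * 3 because of alignment padding.
void UnpadRgb24(const uint8_t* src, int srcStride, uint8_t* dst, int width, int height)
{
    const int rowBytes = width * 3;  // packed RGB24 row
    for (int y = 0; y < height; ++y) {
        // offset = y * stride + x, with x = 0 because whole rows are copied
        std::memcpy(dst + static_cast<size_t>(y) * rowBytes,
                    src + static_cast<size_t>(y) * srcStride,
                    rowBytes);
    }
}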
I am using a DirectShow filtergraph to grab frames from videos. The current implementation follows this graph:
SourceFilter->SampleGrabber->NullRenderer
This works most of the time to extract images frame by frame for further processing. However, I encountered issues with some videos that do not have a PAR of 1:1. These images appear stretched in my processing steps.
The only way I have found to fix this so far is to use a VMR9 renderer in windowless mode and call GetCurrentImage() to extract a bitmap with the correct aspect ratio. But this method is not very useful for continuously grabbing thousands of frames.
My question now is: what is the best way to fix this problem? Has anyone run into this issue as well?
Sample Grabber gets you frames with the original pixels. It is not exactly a problem if an aspect ratio is attached and the pixels are not "square pixels". To convert to square pixels you simply need to stretch the image accordingly. It would be easier for you to do this scaling step outside of the DirectShow pipeline, and you have all the data you need: the pixels and the original media type. You can calculate the corresponding square-pixel resolution and resample the picture.
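A minimal sketch of that calculation, assuming you derive the pixel aspect ratio from the media type (for a VIDEOINFOHEADER2 format it can be derived from the picture aspect ratio dwPictAspectRatioX/Y together with the stored resolution):

#include <cstdint>

struct SquareSize { int width; int height; };

// Compute the resolution to resample to so that pixels become square.
// parNum/parDen is the pixel aspect ratio; from VIDEOINFOHEADER2 it can be
// derived as (dwPictAspectRatioX * height) : (dwPictAspectRatioY * width).
SquareSize ToSquarePixels(int width, int height, int parNum, int parDen)
{
    SquareSize s;
    // Stretch horizontally; scaling the height instead would work just as well.
    s.width  = static_cast<int>(static_cast<int64_t>(width) * parNum / parDen);
    s.height = height;
    return s;
}

The resampling itself can then be done with whatever image library you already use in your further processing steps.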
I am capturing images in real time using OpenCV, and I want to show these images in the OGRE window as a background. So, for each frame the background will change.
I am trying to use MemoryDataStream along with loadRawData to load the images into an OGRE window, but I am getting the following error:
OGRE EXCEPTION(2:InvalidParametersException): Stream size does not
match calculated image size in Image::loadRawData at
../../../../../OgreMain/src/OgreImage.cpp (line 283)
An image comes from OpenCV with a size of 640x480, and frame->buffer is a Mat in OpenCV 2.3. Also, the pixel format that I used in OpenCV is CV_8UC3 (i.e., each channel is 8 bits and each pixel contains 3 channels (B8G8R8)).
// Wrap the OpenCV buffer (640*480*3 bytes of B8G8R8 data) in an OGRE stream.
Ogre::MemoryDataStream* videoStream = new Ogre::MemoryDataStream((void*)frame->buffer.data, 640*480*3, true);
Ogre::DataStreamPtr ptr(videoStream, Ogre::SPFM_DELETE);
ptr->seek(0);

// Load the raw pixels and upload them to the texture.
Ogre::Image* image = new Ogre::Image();
image->loadRawData(ptr, 640, 480, Ogre::PF_B8G8R8);
texture->unload();
texture->loadImage(*image);
Why am I always getting this memory error?
Quick idea: maybe memory 4-byte alignment issues? See Link 1 and Link 2.
I'm not an Ogre expert, but does it work if you use loadDynamicImage instead?
EDIT: Just for grins, try using the Mat fields to set up the buffer:
Ogre::Image* image = new Ogre::Image();
// width/height come straight from the Mat; depth is 1 for a 2D image
image->loadDynamicImage((uchar*)frame->buffer.data, frame->buffer.cols, frame->buffer.rows, 1, Ogre::PF_B8G8R8);
This will avoid copying the image data, and should let the Mat delete its contents later.
I had similar problems getting image data into OGRE; in my case the data came from ROS (see ros.org). The thing is that your data in frame->buffer is not raw, but has a file header etc.
I think my solution was to search the data stream for the beginning of the image (by finding the appropriate indicator in the data block, e.g. 0x4D 0x00) and insert the data from that point on.
You would have to find out where in your buffer the header ends and where your data begins.