I'm using DirectShow to access a video stream, and then using the SampleGrabber filter and interface to get samples from each frame for further image processing. I'm using a callback, so it gets called after each new frame. I've basically just worked from the PlayCap sample application and added a sample filter to the graph.
The problem I'm having is that I'm trying to display the grabbed samples in a separate OpenCV window. However, when I try to cast the information in the buffer to an IplImage, I get a garbled mess of pixels. The code for the BufferCB call is below, sans any proper error handling:
STDMETHODIMP BufferCB(double Time, BYTE *pBuffer, long BufferLen)
{
    AM_MEDIA_TYPE type;
    g_pGrabber->GetConnectedMediaType(&type);
    VIDEOINFOHEADER *pVih = (VIDEOINFOHEADER *)type.pbFormat;
    BITMAPINFO* bmi = (BITMAPINFO *)&pVih->bmiHeader;
    BITMAPINFOHEADER* bmih = &(bmi->bmiHeader);
    int channels = bmih->biBitCount / 8;
    bmih->biPlanes = 1;
    bmih->biBitCount = 24;
    bmih->biCompression = BI_RGB;
    IplImage *Image = cvCreateImage(cvSize(bmih->biWidth, bmih->biHeight), IPL_DEPTH_8U, channels);
    Image->imageSize = BufferLen;
    CopyMemory(Image->imageData, pBuffer, BufferLen);
    cvFlip(Image);
    //openCV Mat creation
    Mat cvMat = Mat(Image, true);
    imshow("Display window", cvMat); // Show our image inside it.
    waitKey(2);
    return S_OK;
}
My question is, am I doing something wrong here that will make the image displayed look like this:
Am I missing header information or something?
The quoted code is only part of the solution. It creates an image object of a certain width and height with 8-bit pixel data and an unknown channel/component count, then copies data into it from another buffer of unknown format.
This can only work if all the unknowns happen to match without any effort on your part. So start by checking exactly what media type is on the Sample Grabber's input pin. If it is not what you wanted, update your code accordingly. It may also matter what the Sample Grabber is connected to downstream, in particular whether it is connected to a video renderer.
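One quick way to start that check is to compare the BufferLen passed to the callback with the size an uncompressed frame would have in the format you are assuming. This is only a sketch in plain C++: the helper names dibStride and matchesUncompressedRgb are made up for illustration, and it assumes an uncompressed RGB frame whose rows are padded to 4 bytes, as in a Windows DIB. In the real callback you would also want to compare the connected media type's subtype against the one you expect (for example MEDIASUBTYPE_RGB24), since writing new values into the header does not change the data you receive.

```cpp
#include <cassert>
#include <cstdlib>

// Bytes per row of an uncompressed DIB: rows are padded to a multiple of 4 bytes.
long dibStride(long width, int bitsPerPixel) {
    assert(width > 0 && bitsPerPixel > 0);
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// True when bufferLen is exactly what an uncompressed frame of this size needs.
// biHeight can be negative (top-down DIB), so its absolute value is used.
bool matchesUncompressedRgb(long width, long height, int bitsPerPixel, long bufferLen) {
    return dibStride(width, bitsPerPixel) * std::labs(height) == bufferLen;
}
```

For a 640x480 frame, an RGB24 buffer would be 921600 bytes. A BufferLen of 614400 would instead suggest a 16-bit format, in which case copying it into a 3-channel image cannot give a sensible picture.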
Related
I have a Mipi camera that captures frames and stores them into the struct buffer that you can see below. Once the frame is stored I want to convert it into a cv::Mat, the thing is that the Mat ends up looking like the first pic.
The var buf.index is just part of the V4L2 API, useful to understand which buffer I'm using.
//The structure where the data is stored
struct buffer{
    void *start;
    size_t length;
};
struct buffer *buffers;
//buffer->mat
cv::Mat im = cv::Mat(cv::Size(width, height), CV_8UC3, ((uint8_t*)buffers[buf.index].start));
At first I thought that the data might be corrupted but storing the image with lodepng results in a nice image without any distortion.
unsigned char* out_buf = (unsigned char*)malloc( width * height * 3);
for(int pix = 0; pix < width*height; ++pix) {
    memcpy(out_buf + pix*3, ((uint8_t*)buffers[buf.index].start)+4*pix+1, 3);
}
lodepng_encode24_file(filename, out_buf, width, height);
I bet it's something really silly.
The picture you posted has oddly colored pixels, and the patterns suggest that there is more information than simply 24 bits per pixel.
After inspecting the data, it appears that V4L gives you four bytes per pixel, and the first byte is always 0xFF (let's call that X). Further, the channel order seems to be XRGB.
Create a cv::Mat using 8UC4 to contain the data.
To use the picture in OpenCV, you need BGR order. Use cv::split to separate the received data into its four planes, which are X, R, G, B. Then use cv::merge to reassemble the B, G, R planes into a picture that OpenCV can handle, or reassemble into R, G, B to create a Mat for other purposes (that other library you seem to use).
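The same reordering can also be done by hand, without split/merge. Here is a minimal sketch in plain C++. The function name xrgbToBgr is made up for illustration, and it assumes the XRGB layout described above with no padding at the end of each row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Repack XRGB (4 bytes per pixel, first byte unused) into tightly packed BGR.
std::vector<std::uint8_t> xrgbToBgr(const std::uint8_t* src,
                                    std::size_t width, std::size_t height) {
    assert(src != nullptr);
    std::vector<std::uint8_t> bgr(width * height * 3);
    for (std::size_t pix = 0; pix < width * height; ++pix) {
        const std::uint8_t* in = src + 4 * pix;  // in[0]=X, in[1]=R, in[2]=G, in[3]=B
        std::uint8_t* out = bgr.data() + 3 * pix;
        out[0] = in[3];  // B
        out[1] = in[2];  // G
        out[2] = in[1];  // R
    }
    return bgr;
}
```

The returned buffer is 3 bytes per pixel in BGR order, so it could be wrapped in a CV_8UC3 Mat the same way the question wraps the raw V4L2 buffer, as long as the vector outlives the Mat.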
How to read an image in C++ as a 2D array? I need to write a C/C++ program that reads an image (any format) as a 2D array so that I can show the pixel values (0-255), divide the image into blocks, apply different compression methods to the pixel blocks (BTC, AMBTC, MMBTC), and save the new image by hand, without using ready-made libraries (I must not use Magick++).
thanks in advance
Here's some 'outline' code using MFC's CImage class that may help you. I've shown how to use the basic Load and Save options, and how to get a 'raw' array of pixel data (note: it's best to convert to 32-bit format, so you can be sure the DWORD pointer you get will really be to a width X height array - other BPP formats can give strange results):
First, load from file (CImage will know or guess the format from the file extension):
CImage original;
original.Load("Yourfile.jpg"); // Use actual file path, obviously
int pw = original.GetWidth(), ph = original.GetHeight(); // Dimensions
CImage working; // Use this to hold our 32-bit image
working.Create(pw, ph, 32);
// Next, copy image from original to working...
HDC hDC = working.GetDC();
original.BitBlt(hDC, 0, 0, SRCCOPY);
working.ReleaseDC();
// Get a DWORD pointer to the pixel data...
BITMAP bmp;
GetObject(working.operator HBITMAP(), sizeof(BITMAP), &bmp);
DWORD* pixbuf = static_cast<DWORD*>(bmp.bmBits);
// We can now access any pixel(x,y) data using: pixbuf[x + y * pw]
You can now do all sorts of work on your image buffer using the pixbuf array, as stated in the comment. For clarity: each DWORD (32-bit unsigned) in the buffer holds one pixel's RGBA data (where A is the 'alpha' channel, set to zero) but in reversed byte order: in memory the bytes are blue, green, red, alpha, so each DWORD reads as 0x00RRGGBB.
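To make the pixel access concrete, here is a small sketch in plain C++ (no CImage needed) that reads one pixel from such a buffer and splits it into channels. The names pixelAt, unpackPixel and toGray are made up for illustration, and the code assumes the usual Windows 32-bit DIB layout, where each value reads as 0x00RRGGBB. The grayscale formula is just one common integer approximation, in case you want a single 0-255 value per pixel for the block compression step.

```cpp
#include <cassert>
#include <cstdint>

struct Rgb {
    std::uint8_t r, g, b;
};

// Bounds-checked version of pixbuf[x + y * pw].
std::uint32_t pixelAt(const std::uint32_t* pixbuf, int pw, int ph, int x, int y) {
    assert(x >= 0 && x < pw && y >= 0 && y < ph);
    return pixbuf[x + y * pw];
}

// Split one 32-bit pixel (assumed to read as 0x00RRGGBB) into its channels.
Rgb unpackPixel(std::uint32_t px) {
    return Rgb{static_cast<std::uint8_t>((px >> 16) & 0xFF),
               static_cast<std::uint8_t>((px >> 8) & 0xFF),
               static_cast<std::uint8_t>(px & 0xFF)};
}

// Integer approximation of luma, giving one 0-255 value per pixel.
std::uint8_t toGray(Rgb c) {
    return static_cast<std::uint8_t>((299 * c.r + 587 * c.g + 114 * c.b) / 1000);
}
```

Bear in mind that a DIB created with a positive height is normally stored bottom-up, so row 0 of the buffer may be the bottom row of the picture. Check this on a test image before relying on the y coordinate.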
When you're done, you can save your modified image as follows:
CImage savepic;
savepic.Create(pw, ph, 24); // Change 24 to whatever BPP you require
hDC = savepic.GetDC();
working.BitBlt(hDC, 0, 0, SRCCOPY); // Copies modified image to output
savepic.ReleaseDC();
savepic.Save("NewFile.jpg"); // CImage works out which format to use based on the extension
Of course, in a real-world program, there are error checks that you will need to make (most CImage methods return a status indicator, and GetLastError() can be used), and you would probably be safer copying the 'pixbuf' data to a separate memory zone - but, hopefully, this brief outline will help you get started.
Feel free to ask for further clarification and/or explanation.
I've been working on a Webcam video recorder and I got interested in trying everything when it comes to this topic but there's this problem that I can't solve.
Everything that you might wonder about can be found here
https://msdn.microsoft.com/en-us/library/windows/desktop/dd757677%28v=vs.85%29.aspx and here
https://msdn.microsoft.com/en-us/library/windows/desktop/dd757694%28v=vs.85%29.aspx
Now, in this code
if (capSetCallbackOnVideoStream(hCapWnd, capVideoStreamCallback))
{
capCaptureSequenceNoFile(hCapWnd); //Capture
}
I make sure that every frame that gets captured is sent to capVideoStreamCallback.
Now what I'm trying to do is transform a frame to an image and save it somewhere, this might be useless but it's interesting and it is surely possible.
Here is my capVideoStreamCallback function (it's commented):
LRESULT CALLBACK capVideoStreamCallback(HWND hWnd, LPVIDEOHDR lpVHdr)
{
    BYTE *Image;
    BITMAPINFO * TempBitmapInfo = new BITMAPINFO;
    ULONG Size;
    // First we need to get the full size of the image
    Size = capGetVideoFormat(hWnd, TempBitmapInfo, sizeof(BITMAPINFO)); //header size
    Size += lpVHdr->dwBytesUsed; //bytes used
    Image = new BYTE[Size];
    memcpy(Image, TempBitmapInfo, sizeof(BITMAPINFO)); //copy the header to Image
    // lpVHdr is the LPVIDEOHDR passed into the callback function.
    memcpy(Image + sizeof(BITMAPINFO), lpVHdr->lpData, lpVHdr->dwBytesUsed); //copy the data to Image
    //write the image
    ofstream output("image.dib", ios::binary);
    for (int i = 0; i < Size; i++)
    {
        output << (BYTE)Image[i];
    }
    output.close();
    return (LRESULT)TRUE;
}
So, the information about every frame that gets sent to capVideoStreamCallback can be found in lpVHdr which is a structure (https://msdn.microsoft.com/en-us/library/windows/desktop/dd757688%28v=vs.85%29.aspx) and what I'm trying to do here is to take that information and transform it to an image.
I first start by getting the full size of the image by retrieving the size of the header and the size of the data and then I dynamically declared a BYTE Array called Image and copied the header and the data to Image using memcpy. I finally used ofstream to write the bytes to a file and that's pretty much it.
The problem is that everything works just fine but the image is somehow corrupted because it cannot be opened.
What is wrong in what I'm doing? It seems so logical but it's not working.
Please share your ideas and thanks for reading.
Here's the answer, thanks to Frankie-C from http://codeproject.com, who reminded me that I needed a BITMAPFILEHEADER structure at the top of the bitmap file.
There are also a few extra things you need to do to get the image to show up the way it should, such as flipping bytes to get BGR instead of RGB. Here's a nice tutorial explaining that: http://tipsandtricks.runicsoft.com/Cpp/BitmapTutorial.htm
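For reference, here is roughly what a complete file layout looks like when written out by hand. This is a sketch in plain C++ rather than the code I ended up with: buildBmp24, put16 and put32 are made-up helper names, and it assumes the frame is uncompressed 24-bit BGR, stored bottom-up, with each row padded to 4 bytes. The capture driver may well deliver a different format, so check what capGetVideoFormat reports first. Note that for a 24-bit image there is no color table, so the pixel data starts right after the 14-byte file header and the 40-byte info header.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append little-endian 16- and 32-bit values to a byte vector.
static void put16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
}
static void put32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    put16(out, static_cast<std::uint16_t>(v & 0xFFFF));
    put16(out, static_cast<std::uint16_t>(v >> 16));
}

// Build the bytes of a 24-bit .bmp file from already padded, bottom-up BGR rows.
std::vector<std::uint8_t> buildBmp24(std::int32_t width, std::int32_t height,
                                     const std::vector<std::uint8_t>& pixels) {
    assert(width > 0 && height > 0);
    const std::uint32_t stride = ((static_cast<std::uint32_t>(width) * 24 + 31) / 32) * 4;
    const std::uint32_t imageSize = stride * static_cast<std::uint32_t>(height);
    assert(pixels.size() == imageSize);
    const std::uint32_t headerSize = 14 + 40;  // file header + info header

    std::vector<std::uint8_t> out;
    out.reserve(headerSize + imageSize);
    // BITMAPFILEHEADER (14 bytes)
    out.push_back('B');
    out.push_back('M');
    put32(out, headerSize + imageSize);  // bfSize: total file size
    put16(out, 0);                       // bfReserved1
    put16(out, 0);                       // bfReserved2
    put32(out, headerSize);              // bfOffBits: where the pixel data starts
    // BITMAPINFOHEADER (40 bytes)
    put32(out, 40);                                   // biSize
    put32(out, static_cast<std::uint32_t>(width));    // biWidth
    put32(out, static_cast<std::uint32_t>(height));   // biHeight (positive: bottom-up)
    put16(out, 1);                                    // biPlanes
    put16(out, 24);                                   // biBitCount
    put32(out, 0);                                    // biCompression = BI_RGB
    put32(out, imageSize);                            // biSizeImage
    put32(out, 0);                                    // biXPelsPerMeter
    put32(out, 0);                                    // biYPelsPerMeter
    put32(out, 0);                                    // biClrUsed
    put32(out, 0);                                    // biClrImportant
    out.insert(out.end(), pixels.begin(), pixels.end());
    return out;
}
```

The result can be written in one call with ofstream::write on a stream opened with ios::binary, instead of streaming it byte by byte.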
I'm building a skin-detection algorithm that takes a constant, real-time feed from a webcam, converts it to a binary image (based on the skin color of the person's face), and filters out the noise by showing only the largest blobs (using CvBlobsLib). The output of my code, however, shows a lot of lag, and I'm not sure what to change to make it faster.
Here's (the important part of) my code:
Mat frame;
IplImage ipl, *res = new IplImage;
CBlobResult blobs;
CBlob *currentBlob;
cvNamedWindow("output");
for(;;){
    cap >> frame; //get a new frame from camera
    cvtColor(frame, lab, CV_BGR2Lab);//frame now in L*a*b*
    inRange(lab, BW_MIN, BW_MAX, bw);//frame now only shows "skin values"...BW_MIN/BW_MAX determined earlier
    ipl = bw; //IplImage header
    blobs = CBlobResult(&ipl, NULL, 0);
    blobs.Filter(blobs, B_EXCLUDE, CBlobGetArea(), B_LESS, 10000);
    res = cvCreateImage(cvGetSize(&ipl), IPL_DEPTH_8U, 3);
    cvMerge(&ipl, &ipl, &ipl, NULL, res);
    cvShowImage("output", res);
    if(waitKey(5) >= 0) break;
}
cvDestroyWindow("output");
I convert Mat to IplImage because CvBlobsLib only works with the IplImage type.
Does anyone see a way that I could make this faster? I've just recently heard other blob detection libraries do a better job with real-time video, but I'd be interested to see if there's something I'm simply overlooking in my code.
You can decrease the resolution of the camera capture using the set method of your capture object (cap in your code):
set(CV_CAP_PROP_FRAME_WIDTH , double width)
and
set(CV_CAP_PROP_FRAME_HEIGHT , double height)
If your default capture resolution is too high, this can increase the detection speed considerably.
I have a CGImage (core graphics, C/C++). It's grayscale. Well, originally it was B/W, but the CGImage may be RGB. That shouldn't matter. I want to create a CCITT-Group 4 TIFF.
I can create an LZW TIFF (grayscale or color) via creating a destination with the correct dictionary and adding the image in. No problem.
However, there doesn't seem to be an equivalent kCGImagePropertyTIFFCompression value to represent CCITT-4. It should be 4, but that produces uncompressed.
I have a manual CCITT compression routine, so if I can get the binary (1 bit per pixel) data, I'm set. But I can't seem to get 1 BPP data out of a CGImage. I have code that is supposed to put the CGImage into a CGBitmapContext and then give me the data, but it seems to be giving me all black.
I've asked a couple of questions today trying to get at this, but I just figured, let's ask the question I REALLY want answered and see if someone can answer it.
There's GOT to be a way to do this. I've got to be missing something dumb. What is it?
This seems to work and produce not-all-black output. There may be a way to do it that doesn't involve a manual conversion to grayscale first, but at least it works!
static void WriteCCITTTiffWithCGImage_URL_(CGImageRef im, CFURLRef url) {
    // produce grayscale image
    CGImageRef grayscaleImage;
    {
        CGColorSpaceRef colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceGenericGray);
        CGContextRef bitmapCtx = CGBitmapContextCreate(NULL, CGImageGetWidth(im), CGImageGetHeight(im), 8, 0, colorSpace, kCGImageAlphaNone);
        CGContextDrawImage(bitmapCtx, CGRectMake(0,0,CGImageGetWidth(im), CGImageGetHeight(im)), im);
        grayscaleImage = CGBitmapContextCreateImage(bitmapCtx);
        CFRelease(bitmapCtx);
        CFRelease(colorSpace);
    }
    // generate options for ImageIO. Man this sucks in C.
    CFMutableDictionaryRef options = CFDictionaryCreateMutable(kCFAllocatorDefault, 2, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
    {
        {
            CFMutableDictionaryRef tiffOptions = CFDictionaryCreateMutable(kCFAllocatorDefault, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
            int fourInt = 4;
            CFNumberRef fourNumber = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fourInt);
            CFDictionarySetValue(tiffOptions, kCGImagePropertyTIFFCompression, fourNumber);
            CFRelease(fourNumber);
            CFDictionarySetValue(options, kCGImagePropertyTIFFDictionary, tiffOptions);
            CFRelease(tiffOptions);
        }
        {
            int oneInt = 1;
            CFNumberRef oneNumber = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &oneInt);
            CFDictionarySetValue(options, kCGImagePropertyDepth, oneNumber);
            CFRelease(oneNumber);
        }
    }
    // write file
    CGImageDestinationRef idst = CGImageDestinationCreateWithURL(url, kUTTypeTIFF, 1, NULL);
    CGImageDestinationAddImage(idst, grayscaleImage, options);
    CGImageDestinationFinalize(idst);
    // clean up
    CFRelease(idst);
    CFRelease(options);
    CFRelease(grayscaleImage);
}
Nepheli:tmp ken$ tiffutil -info /tmp/output.tiff
Directory at 0x1200
Image Width: 842 Image Length: 562
Bits/Sample: 1
Sample Format: unsigned integer
Compression Scheme: CCITT Group 4 facsimile encoding
Photometric Interpretation: "min-is-black"
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Number of Strips: 1
Planar Configuration: Not planar
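If you would rather feed your own CCITT routine, the 8-bit grayscale bitmap can be thresholded and packed down to 1 bit per pixel by hand. The following is only a sketch in plain C++, not something ImageIO does for you: packTo1Bpp is a made-up name, and the threshold and the bit polarity (a set bit means "at or above threshold") are assumptions you would adjust to whatever your encoder expects. The input would be the grayscale context's data and row stride, for example from CGBitmapContextGetData and CGBitmapContextGetBytesPerRow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Threshold 8-bit grayscale rows into 1 bit per pixel, most significant bit
// first, with each output row padded to a whole byte.
std::vector<std::uint8_t> packTo1Bpp(const std::uint8_t* gray,
                                     std::size_t width, std::size_t height,
                                     std::size_t bytesPerRow,
                                     std::uint8_t threshold) {
    assert(gray != nullptr && bytesPerRow >= width);
    const std::size_t outStride = (width + 7) / 8;
    std::vector<std::uint8_t> bits(outStride * height, 0);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            if (gray[y * bytesPerRow + x] >= threshold) {
                bits[y * outStride + x / 8] |=
                    static_cast<std::uint8_t>(0x80u >> (x % 8));
            }
        }
    }
    return bits;
}
```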
ImageMagick can convert from and to almost any image format. As it is open source you can go and read the source code to find the answer to your question.
You can even use the ImageMagick API in your app if you use C++.
Edit:
If you can get the data from CGImage in any format (and it sounded like you can) you can use ImageMagick to convert it from whatever the format is that you get from CGImage to any other format supported by ImageMagick (your desired TIFF format).
Edit:
Technical Q&A QA1509, "Getting the pixel data from a CGImage object", states:
On Mac OS X 10.5 or later, a new call has been added that allows you to obtain the actual pixel data from a CGImage object. This call, CGDataProviderCopyData, returns a CFData object that contains the pixel data from the image in question.
Once you have the pixel data you can use ImageMagick to convert it.
NSBitmapImageRep claims to be able to generate a CCITT FAX Group 4 compressed TIFF. So something like this might do the trick (untested):
CFDataRef tiffFaxG4DataForCGImage(CGImageRef cgImage) {
    NSBitmapImageRep *imageRep =
        [[[NSBitmapImageRep alloc] initWithCGImage:cgImage] autorelease];
    NSData *tiffData =
        [imageRep TIFFRepresentationUsingCompression:NSTIFFCompressionCCITTFAX4
                                              factor:0.0f];
    return (CFDataRef) tiffData;
}
This function should return the data you seek.