I know the formula to convert YUY2 to RGB, as described here:
Convert yuy2 to bitmap
My problem is that I don't know how to apply it in a DirectShow filter:
In DirectShow I have a buffer and a header, but how do I convert these into RGB?
The formula is:
int C = luma - 16;
int D = cb - 128;
int E = cr - 128;
r = (298*C+409*E+128)/256;
g = (298*C-100*D-208*E+128)/256;
b = (298*C+516*D+128)/256;
How do I get these values, and how do I write them into the output buffer?
This is how I copy the buffer at the moment:
long lSizeSample = sample->GetSize();
long lSizeOutSample = outsample->GetSize();
outsample->GetPointer(&newBuffer);
sample->GetPointer(&sampleBuffer);
memcpy((void *)newBuffer, (void *)sampleBuffer, lSizeSample);
So I just copy the buffer as-is. But how do I modify it?
Instead of memcpy you are expected to convert pixel by pixel, taking into consideration strides, planar/packed formatting, etc. In most cases this needs to be well optimized, such as using SIMD, for decent performance.
You can do the math yourself, of course, but you can also have the conversion done for you by Color Converter DSP, if Vista+ is OK for you.
The DSP is available as DMO, or you can use DMO Wrapper Filter and use it as a readily available DirectShow filter.
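For illustration, here is a rough, unoptimized sketch of such a per-pixel loop, assuming YUY2 input (packed Y0 U Y1 V, two pixels per four bytes) and RGB24 output. The function and parameter names are mine, width is assumed even, and DirectShow's RGB24 is normally stored bottom-up, which this sketch ignores:
#include <algorithm>
#include <cstdint>

static inline uint8_t Clamp(int v) { return (uint8_t)std::min(std::max(v, 0), 255); }

// One YUY2 macropixel (4 bytes) yields two RGB pixels sharing U and V.
void Yuy2ToRgb24(const uint8_t* src, int srcStride,
                 uint8_t* dst, int dstStride, int width, int height)
{
    for (int y = 0; y < height; ++y) {
        const uint8_t* s = src + y * srcStride;
        uint8_t* d = dst + y * dstStride;
        for (int x = 0; x < width; x += 2, s += 4, d += 6) {
            int C0 = s[0] - 16, C1 = s[2] - 16; // two luma samples
            int D = s[1] - 128;                 // Cb (U), shared
            int E = s[3] - 128;                 // Cr (V), shared
            // RGB24 memory order is B, G, R
            d[0] = Clamp((298 * C0 + 516 * D + 128) / 256);           // B
            d[1] = Clamp((298 * C0 - 100 * D - 208 * E + 128) / 256); // G
            d[2] = Clamp((298 * C0 + 409 * E + 128) / 256);           // R
            d[3] = Clamp((298 * C1 + 516 * D + 128) / 256);
            d[4] = Clamp((298 * C1 - 100 * D - 208 * E + 128) / 256);
            d[5] = Clamp((298 * C1 + 409 * E + 128) / 256);
        }
    }
}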
I finally managed to convert, with the aid of libyuv, a sample of type MFVideoFormat_P010 (read with Media Foundation) and I got a buffer of 10-bit values in a 32-bit structure like this:
struct ar30
{
unsigned b : 10;
unsigned g : 10;
unsigned r : 10;
unsigned a : 2;
};
Now I want to convert these RGB values for display in my HDR Direct2D context, which has a DXGI_FORMAT_R16G16B16A16_FLOAT format and accepts WIC bitmaps created in the GUID_WICPixelFormat128bppPRGBAFloat format.
My problem is how to scale these values. Scaling to [0..1] misses the point of HDR anyway, but scaling to [0..4] creates a far too bright image, which makes me think that the mapping between these 10-bit values and floating point cannot be linear. Looking at the images suggests that something about the luma is wrong.
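For reference, the linear scaling described above amounts to something like this (a minimal sketch; the function name and the single scale knob are mine):
#include <cstddef>

void ToFloatRGBA(const ar30* src, float* dst, size_t pixelCount, float scale)
{
    // Naive linear mapping of the 10-bit channels: scale = 1.0f gives
    // [0..1], scale = 4.0f gives [0..4]. Both give wrong brightness,
    // which suggests a (non-linear) transfer function is missing here.
    for (size_t i = 0; i < pixelCount; ++i) {
        dst[4 * i + 0] = src[i].r / 1023.0f * scale;
        dst[4 * i + 1] = src[i].g / 1023.0f * scale;
        dst[4 * i + 2] = src[i].b / 1023.0f * scale;
        dst[4 * i + 3] = src[i].a / 3.0f;
    }
}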
Any clue on the proper conversion?
Thanks a lot.
I am receiving an image in the form of raw data from a sensor inside ROS. The following fields are known:
Width: 1920
Height: 1080
Encoding: BGRA8
Bytes Per Pixel: 4
Image Data: 1-D byte array having 8294400 elements //1920 * 1080 * 4 = 8294400
I need to visualize this image in ROS, hence I am composing a ROS-supported image as follows:
std::basic_string<char> color_pixels = color_frame.data();
//Step 1: Create OpenCV image
cv::Mat image(cv::Size(color_frame.width(), color_frame.height()), CV_8UC4, const_cast<char*>(color_pixels.data()), cv::Mat::AUTO_STEP);
//Step 2: Convert OpenCV image to CvBridge image
cv_bridge::CvImage cv_image;
cv_image.header.frame_id = frame_id;
cv_image.encoding = "bgra8";
cv_image.image = image;
//Step 3: Convert CvBridge image to ROS image
sensor_msgs::Image ros_image;
cv_image.toImageMsg(ros_image);
This looks fine. However, during visualization, I noticed that the image is flipped. Hence, in order to restore it, I have to use cv::flip before step 2 in the following way:
cv::flip(image, image, 1);
Since I have all the raw values and the above process seems long, I decided to compose the sensor_msgs::Image directly. However, I am then not able to perform the cv::flip operation.
Below are my queries:
Why is flip required? Is the raw data captured incorrectly?
Is it possible to perform the operation similar to cv::flip directly in a byte array?
My approach
I tried to reverse the byte array, but it didn't work. Please see the code snippet below:
std::basic_string<char> color_pixels = color_data.data();
std::vector<unsigned char> color_values(color_pixels.begin(), color_pixels.end());
std::reverse(color_values.begin(), color_values.end());
sensor_msgs::Image ros_image;
//width, height, encoding etc. are set but not shown here
ros_image.data = color_values;
Probably you can set up your camera to flip the image. That is the proper place to specify this. (Sometimes our cameras are mounted upside down.)
Since your byte array is {b0, g0, r0, a0, b1, g1, r1, a1, ...}, simply reversing it will result in {aN, rN, gN, bN, ...}, and your format becomes ARGB. cv::flip already accounts for this. Just saying "the above process seems long" is not enough reason to do this yourself: it will complicate your code and result in a poor replication of what the OpenCV guys already provide.
If you really want to write that yourself, you can reinterpret the buffer as an array of pixel structs and reverse those:
struct BGRA { char b, g, r, a; };
std::reverse(
    reinterpret_cast<BGRA*>( &data.front() ),
    reinterpret_cast<BGRA*>( &data.back() ) + 1 );
-- EDIT
The fact that this answer was accepted without additional comments on the suggested code proves the point that it's hard to provide better than a 'poor replication': the snippet above reverses the rows as well as the pixels within them, turning the image upside down (viewed 'only' 40 times, but still...). So it can only work for single-line images.
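For completeness, a per-row reversal (a minimal sketch, assuming a tightly packed BGRA buffer with no row padding; the function name is mine) mirrors the image like cv::flip(image, image, 1) without the upside-down side effect:
#include <algorithm>
#include <cstdint>
#include <vector>

struct BGRA { uint8_t b, g, r, a; };

// Reverse the pixels of each row independently (horizontal mirror).
void flipHorizontal(std::vector<uint8_t>& data, int width, int height)
{
    BGRA* pixels = reinterpret_cast<BGRA*>(data.data());
    for (int y = 0; y < height; ++y)
        std::reverse(pixels + y * width, pixels + (y + 1) * width);
}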
I investigated and stripped down my previous question (Is there a way to avoid conversion from YUV to BGR?). I want to overlay a few images (in YUV format) on a resulting, bigger image (think of it as a canvas) and send it forward via a network library (OPAL) without converting it to BGR.
Here is the code:
Mat tYUV;
Mat tClonedYUV;
Mat tBGR;
Mat tMergedFrame;
int tMergedFrameWidth = 1000;
int tMergedFrameHeight = 800;
int tMergedFrameHalfWidth = tMergedFrameWidth / 2;
tYUV = Mat(tHeader->height * 1.5f, tHeader->width, CV_8UC1, OPAL_VIDEO_FRAME_DATA_PTR(tHeader));
tClonedYUV = tYUV.clone();
tMergedFrame = Mat(Size(tMergedFrameWidth, tMergedFrameHeight), tYUV.type(), cv::Scalar(0, 0, 0));
tYUV.copyTo(tMergedFrame(cv::Rect(0, 0, std::min(tYUV.cols, tMergedFrameWidth), std::min(tYUV.rows, tMergedFrameHeight))));
tClonedYUV.copyTo(tMergedFrame(cv::Rect(tMergedFrameHalfWidth, 0, std::min(tYUV.cols, tMergedFrameHalfWidth), std::min(tYUV.rows, tMergedFrameHeight))));
namedWindow("merged frame", 1);
imshow("merged frame", tMergedFrame);
waitKey(10);
The result of the above code looks like this:
I guess the image is not correctly interpreted, so the pictures stay black/white (the Y component) and below them we can see the U and V components. There are images which describe the problem well (http://en.wikipedia.org/wiki/YUV):
and: http://upload.wikimedia.org/wikipedia/en/0/0d/Yuv420.svg
Is there a way for these values to be correctly read? I guess I should not copy the whole images (their Y, U, V components) straight to the calculated positions. The U and V components should go below the Y component and in the proper order, am I right?
First, there are several YUV formats, so you need to be clear about which one you are using.
According to your image, it seems your YUV format is Y'UV420p.
Regardless, it is a lot simpler to convert to BGR, work there, and then convert back.
If that is not an option, you pretty much have to manage the ROIs yourself. YUV is commonly a planar format where the channels are not (completely) multiplexed, and some are of different sizes and depths. If you do not use the internal color conversions, then you will have to know the exact YUV format and manage the pixel-copying ROIs yourself.
With a YUV image, the CV_8UC* format specifier does not mean much beyond the actual memory requirements. It certainly does not specify the pixel/channel muxing.
For example, if you wanted to use only the Y component, then Y is often the first plane in the image, so the first "half" of the whole image can just be treated as a monochrome 8UC1 image. In this case using ROIs is easy.
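As a sketch of what "managing the ROIs yourself" means for Y'UV420p (I420): the Y, U and V planes have to be copied separately, with the chroma offsets and sizes halved. The function names and the no-row-padding assumption are mine:
#include <cstdint>
#include <cstring>

static void copyPlane(const uint8_t* src, int srcStride, int w, int h,
                      uint8_t* dst, int dstStride, int x, int y)
{
    for (int row = 0; row < h; ++row)
        std::memcpy(dst + (y + row) * dstStride + x, src + row * srcStride, w);
}

// Overlay an I420 frame (Y plane of w*h bytes, then U and V planes of
// (w/2)*(h/2) bytes each) onto a bigger I420 canvas at even offsets (x, y).
void overlayI420(const uint8_t* src, int sw, int sh,
                 uint8_t* dst, int dw, int dh, int x, int y)
{
    copyPlane(src, sw, sw, sh, dst, dw, x, y);                            // Y
    copyPlane(src + sw * sh, sw / 2, sw / 2, sh / 2,
              dst + dw * dh, dw / 2, x / 2, y / 2);                       // U
    copyPlane(src + sw * sh + (sw / 2) * (sh / 2), sw / 2, sw / 2, sh / 2,
              dst + dw * dh + (dw / 2) * (dh / 2), dw / 2, x / 2, y / 2); // V
}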
After reading an image of unknown depth and channel number, I want to access its pixels one by one.
On OpenCV 1.x the code goes:
IplImage * I = cvLoadImage( "myimage.tif" );
CvScalar pixel = cvGet2D( I, y, x );
But on OpenCV 2.x the cv::Mat::at() method demands that I know the image's type:
cv::Mat I = cv::imread( "myimage.tif" );
if( I.depth() == CV_8U && I.channels() == 3 )
cv::Vec3b pixel = I.at<cv::Vec3b>( y, x );
else if( I.depth() == CV_32F && I.channels() == 1 )
float pixel = I.at<float>( y, x );
Is there a function resembling cvGet2D that can receive a cv::Mat and return a cv::Scalar without knowing the image's type at compile time?
For someone who is really a beginner in C++ ...
... and/or a hacker who just needs to save mere seconds of typing to finish off the last project:
cv::Mat mat = ...; // something
cv::Scalar value = cv::mean(mat(cv::Rect(x, y, 1, 1)));
(Disclaimer: This code is only slightly less wasteful than a young man dying for a revolutionary cause.)
The short answer is no. There's no such function in the C++ API.
The rationale behind this is performance. cv::Scalar (and CvScalar) is the same thing as cv::Vec<double,4>. So, for any Mat type other than CV_64FC4, you'll need a conversion to obtain a cv::Scalar. Moreover, this method would be a giant switch, like in your example (which has only 2 branches).
But I suppose quite often this function would be convenient, so why not have it? My guess is that people would tend to overuse it, resulting in really bad performance of their algorithms. So, OpenCV makes it just a tiny bit less convenient to access individual pixels, in order to force client code to use statically typed methods. This isn't such a big deal convenience-wise, since more often than not you actually know the type statically, and it's a really big deal performance-wise. So, I consider it a good trade-off.
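To see why, here is a sketch of what such a helper would have to look like: a hypothetical get2D (not an OpenCV API) covering only a few of the many depth/channel combinations:
#include <opencv2/opencv.hpp>

cv::Scalar get2D(const cv::Mat& m, int row, int col)
{
    // Every supported depth/channel pairing needs its own branch.
    switch (m.type()) {
        case CV_8UC1:  return cv::Scalar(m.at<uchar>(row, col));
        case CV_8UC3:  { cv::Vec3b v = m.at<cv::Vec3b>(row, col);
                         return cv::Scalar(v[0], v[1], v[2]); }
        case CV_32FC1: return cv::Scalar(m.at<float>(row, col));
        case CV_64FC4: { cv::Vec4d v = m.at<cv::Vec4d>(row, col); // the only no-conversion case
                         return cv::Scalar(v[0], v[1], v[2], v[3]); }
        default: CV_Error(CV_StsUnsupportedFormat, "unhandled Mat type");
    }
    return cv::Scalar(); // not reached
}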
I had the same issue: I just wanted to test something quickly, and performance was not a concern. But all parts of the code use cv::Mat. What I did was the following:
Mat img; // My input mat, initialized elsewhere
// Pretty fast operation: this only creates an IplImage header pointing to the data in the Mat.
// No data is copied and no memory is allocated.
// The header resides on the stack (note its type is "IplImage", not "IplImage*").
IplImage iplImg = (IplImage)img;
// Then you may use the old (slow converting) legacy-functions if you like
CvScalar s = cvGet2D( &iplImg, y, x );
Just a warning: you are using cvLoadImage and imread with default flags. This means that any image you read will be an 8-bit 3-channel image. Use the appropriate flags (IMREAD_ANYDEPTH / IMREAD_ANYCOLOR) if you want to read the image as-is (which seems to be your intention).
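For example, the flags combine with bitwise OR:
cv::Mat I = cv::imread( "myimage.tif", cv::IMREAD_ANYDEPTH | cv::IMREAD_ANYCOLOR );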
I have a CGImage (Core Graphics, C/C++). It's grayscale. Well, originally it was B/W, but the CGImage may be RGB. That shouldn't matter. I want to create a CCITT Group 4 TIFF.
I can create an LZW TIFF (grayscale or color) by creating a destination with the correct dictionary and adding the image to it. No problem.
However, there doesn't seem to be an equivalent kCGImagePropertyTIFFCompression value to represent CCITT-4. It should be 4, but that produces uncompressed output.
I have a manual CCITT compression routine, so if I can get the binary (1 bit per pixel) data, I'm set. But I can't seem to get 1 BPP data out of a CGImage. I have code that is supposed to put the CGImage into a CGBitmapContext and then give me the data, but it seems to be giving me all black.
I've asked a couple of questions today trying to get at this, but I just figured: let's ask the question I REALLY want answered and see if someone can answer it.
There's GOT to be a way to do this. I've got to be missing something dumb. What is it?
This seems to work and produces not-all-black output. There may be a way to do it that doesn't involve a manual conversion to grayscale first, but at least it works!
static void WriteCCITTTiffWithCGImage_URL_(CGImageRef im, CFURLRef url) {
// produce grayscale image
CGImageRef grayscaleImage;
{
CGColorSpaceRef colorSpace = CGColorSpaceCreateWithName(kCGColorSpaceGenericGray);
CGContextRef bitmapCtx = CGBitmapContextCreate(NULL, CGImageGetWidth(im), CGImageGetHeight(im), 8, 0, colorSpace, kCGImageAlphaNone);
CGContextDrawImage(bitmapCtx, CGRectMake(0,0,CGImageGetWidth(im), CGImageGetHeight(im)), im);
grayscaleImage = CGBitmapContextCreateImage(bitmapCtx);
CFRelease(bitmapCtx);
CFRelease(colorSpace);
}
// generate options for ImageIO. Man this sucks in C.
CFMutableDictionaryRef options = CFDictionaryCreateMutable(kCFAllocatorDefault, 2, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
{
{
CFMutableDictionaryRef tiffOptions = CFDictionaryCreateMutable(kCFAllocatorDefault, 1, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
int fourInt = 4;
CFNumberRef fourNumber = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &fourInt);
CFDictionarySetValue(tiffOptions, kCGImagePropertyTIFFCompression, fourNumber);
CFRelease(fourNumber);
CFDictionarySetValue(options, kCGImagePropertyTIFFDictionary, tiffOptions);
CFRelease(tiffOptions);
}
{
int oneInt = 1;
CFNumberRef oneNumber = CFNumberCreate(kCFAllocatorDefault, kCFNumberIntType, &oneInt);
CFDictionarySetValue(options, kCGImagePropertyDepth, oneNumber);
CFRelease(oneNumber);
}
}
// write file
CGImageDestinationRef idst = CGImageDestinationCreateWithURL(url, kUTTypeTIFF, 1, NULL);
CGImageDestinationAddImage(idst, grayscaleImage, options);
CGImageDestinationFinalize(idst);
// clean up
CFRelease(idst);
CFRelease(options);
CFRelease(grayscaleImage);
}
Nepheli:tmp ken$ tiffutil -info /tmp/output.tiff
Directory at 0x1200
Image Width: 842 Image Length: 562
Bits/Sample: 1
Sample Format: unsigned integer
Compression Scheme: CCITT Group 4 facsimile encoding
Photometric Interpretation: "min-is-black"
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Number of Strips: 1
Planar Configuration: Not planar
ImageMagick can convert from and to almost any image format. As it is open source, you can go and read the source code to find the answer to your question.
You can even use the ImageMagick API in your app if you use C++.
Edit:
If you can get the data from the CGImage in any format (and it sounded like you can), you can use ImageMagick to convert it from whatever format CGImage gives you to any other format ImageMagick supports (your desired TIFF format).
Edit:
Technical Q&A QA1509
Getting the pixel data from a CGImage object states:
On Mac OS X 10.5 or later, a new call has been added that allows you to obtain the actual pixel data from a CGImage object. This call, CGDataProviderCopyData, returns a CFData object that contains the pixel data from the image in question.
Once you have the pixel data you can use ImageMagick to convert it.
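A minimal sketch of that call (CGDataProviderCopyData and the CFData accessors are real Core Graphics/CoreFoundation APIs; it assumes image is your CGImageRef, and the final use of the bytes is just illustrative):
CGDataProviderRef provider = CGImageGetDataProvider(image); // not owned; do not release
CFDataRef pixelData = CGDataProviderCopyData(provider);
const UInt8 *bytes = CFDataGetBytePtr(pixelData);
CFIndex length = CFDataGetLength(pixelData);
// ... feed bytes/length to ImageMagick or a manual CCITT-4 encoder ...
CFRelease(pixelData);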
NSBitmapImageRep claims to be able to generate a CCITT FAX Group 4 compressed TIFF. So something like this might do the trick (untested):
CFDataRef tiffFaxG4DataForCGImage(CGImageRef cgImage) {
NSBitmapImageRep *imageRep =
[[[NSBitmapImageRep alloc] initWithCGImage:cgImage] autorelease];
NSData *tiffData =
[imageRep TIFFRepresentationUsingCompression:NSTIFFCompressionCCITTFAX4
factor:0.0f];
return (CFDataRef) tiffData;
}
This function should return the data you seek.