I am working on a spectral camera and am using opencv to do the processing. I just started using opencv, so this might not be the best way to do this.
Basically this code grabs frames from two video streams then does a matrix multiplication. captureF and captureM are both video streams and eigen is a 6x7 matrix where the last row is an offset that needs subtracted from the image.
I could not figure out how to combine the two frames into one six channel imageII looked at merge and mixchannels but couldn't get either to work), so I wound up doing the matrix multiplication manually and saving the data out to two three channel images, but ideally this would be one 6 channel matrix. My question is that this code currently runs very slow (20sec per frame) and I wondering if there is a way to do this that runs faster and or do this using a 6channel image?
IplImage imgF = cvQueryFrame(captureF);
IplImage dst2 = cvQueryFrame(captureM);
IplImage *OutImg1 = cvCreateImage(cvSize(imgF->width, imgF->height), IPL_DEPTH_32F, 3);
IplImage *OutImg2 = cvCreateImage(cvSize(imgF->width, imgF->height), IPL_DEPTH_32F, 3);
// iterates through each frame in the image.
for(int i=0; i<(imgF->imageSize)/3;i+=3){
((float*)OutImg1->imageData)[i] = cvmGet(eigen,0,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,0,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,0,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,0,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,0,4)*(dst2->imageData[i+1]-cvmGet(eigen,6,4)) + cvmGet(eigen,0,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
((float*)OutImg1->imageData)[i+1] = cvmGet(eigen,1,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,1,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,1,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,1,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,1,4)*(dst2->imageData[i+1]-cvmGet(eigen,0,4)) + cvmGet(eigen,1,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
((float*)OutImg1->imageData)[i+2] = cvmGet(eigen,2,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,2,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,2,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,2,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,2,4)*(dst2->imageData[i+1]-cvmGet(eigen,0,4)) + cvmGet(eigen,2,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
((float*)OutImg2->imageData)[i] = cvmGet(eigen,3,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,3,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,3,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,3,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,3,4)*(dst2->imageData[i+1]-cvmGet(eigen,0,4)) + cvmGet(eigen,3,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
((float*)OutImg2->imageData)[i+1] = cvmGet(eigen,4,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,4,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,4,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,4,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,4,4)*(dst2->imageData[i+1]-cvmGet(eigen,0,4)) + cvmGet(eigen,4,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
((float*)OutImg2->imageData)[i+2] = cvmGet(eigen,5,2)*(imgF->imageData[i]-cvmGet(eigen,6,2)) + cvmGet(eigen,5,1)*(imgF->imageData[i+1]-cvmGet(eigen,6,1)) + cvmGet(eigen,5,0)*(imgF->imageData[i+2]-cvmGet(eigen,6,0)) + cvmGet(eigen,5,5)*(dst2->imageData[i]-cvmGet(eigen,6,5)) + cvmGet(eigen,5,4)*(dst2->imageData[i+1]-cvmGet(eigen,0,4)) + cvmGet(eigen,5,3)*(dst2->imageData[i+2]-cvmGet(eigen,6,3));
}
I use opencv2, and am a rank novice so maybe there are better ways. I imagine you can transpose to the old cv if you need too.
First, it looks annoyingly like there are not 6 channel Scalars. So convert your data to a NX6 array (N = rows*cols), and use matrix multiply.
Mat twoIm[2];
Mat eigen(6,6,CV_32F);
Mat bigGuy,newGuy;
merge(twoIm,2,bigGuy); // load your two images into twoIm[0] & twoIm[1]
bigGuy.convertTo(bigGuy, CV_32F); // mat multiply wants everything the same type
Mat bigGal = bigGuy.reshape(1, 6); // this makes 6 channels into 6 rows
newGuy = bigGal.t() * eigen; // and voila!
Related
How do I create a TensorFloat16Bit when manually doing a tensorization of the data?
We tensorized our data based on this Microsoft example, where we are converting 255-0 to 1-0, and changing the RGBA order.
...
std::vector<int64_t> shape = { 1, channels, height , width };
float* pCPUTensor;
uint32_t uCapacity;
// The channels of image stored in buffer is in order of BGRA-BGRA-BGRA-BGRA.
// Then we transform it to the order of BBBBB....GGGGG....RRRR....AAAA(dropped)
TensorFloat tf = TensorFloat::Create(shape);
com_ptr<ITensorNative> itn = tf.as<ITensorNative>();
CHECK_HRESULT(itn->GetBuffer(reinterpret_cast<BYTE**>(&pCPUTensor), &uCapacity));
// 2. Transform the data in buffer to a vector of float
if (BitmapPixelFormat::Bgra8 == pixelFormat)
{
for (UINT32 i = 0; i < size; i += 4)
{
// suppose the model expects BGR image.
// index 0 is B, 1 is G, 2 is R, 3 is alpha(dropped).
UINT32 pixelInd = i / 4;
pCPUTensor[pixelInd] = (float)pData[i];
pCPUTensor[(height * width) + pixelInd] = (float)pData[i + 1];
pCPUTensor[(height * width * 2) + pixelInd] = (float)pData[i + 2];
}
}
ref: https://github.com/microsoft/Windows-Machine-Learning/blob/2179a1dd5af24dff4cc2ec0fc4232b9bd3722721/Samples/CustomTensorization/CustomTensorization/TensorConvertor.cpp#L59-L77
I just converted our .onnx model to float16 to verify if that would provide some performance improvements on the inference when the available hardware provides support for float16. However, the binding is failing and the suggestion here is to pass a TensorFloat16Bit.
So if I swap the TensorFloat for TensorFloat16Bit I get an access violation exception at pCPUTensor[(height * width * 2) + pixelInd] = (float)pData[i + 2]; because pCPUTensor is half of the size of what it was. It seems like I should be reinterpreting_cast to uint16_t** or something among those lines, so pCPUTensor will have the same size as when it was a TensorFloat, but then I get further errors that it can only be uint8_t** or BYTE**.
Any ideas on how I can modify this code so I can get a custom TensorFloat16Bit?
Try the factory methods on TensorFloat16Bit.
However, you will need to convert you data to float16:
https://stackoverflow.com/a/60047308/11998382
Also, I might recommend you instead do the conversion within the onnx model.
I am working with OpenCV and inside there is the function goodFeaturesToTrack to apply ShiTomasi method to find corners.
We know that Shi-Tomasi is based on finding eigenvalues so there is even a function in OpenCV to calculate the minimal eigenvalue of gradient matrices for corner detection called cornerMinEigenVal in case you want to do your own implementation:
void cv::cornerMinEigenVal ( InputArray src,
OutputArray dst,
int blockSize,
int ksize = 3,
int borderType = BORDER_DEFAULT
)
However this function finds the minimum eigenvalue for ALL points in the image (and store the results in dst).
My question is:
Is there a function (in OpenCV or if not any other C++ library) to find the eigenvalues (or its minimum) for a particular point (X,Y) of an image (and not evaluated in the whole image) (with a blocksize) ?
Short answer: There is no such function in OpenCV that calculate MinEigenVals for sparse points. However, you can implement one from HarrisResponses() with just small modifications.
The HarrisResponses() function is used to calculate Harris score for sparse points (it's static in OpenCV, so you can't call it directly).
Look through the code of calcMinEigenVal() and calcHarris(), and you will find that the only difference between them is how they use values from the cov matrix:
// MinEigenVal
float a = cov[j*3]*0.5f;
float b = cov[j*3+1];
float c = cov[j*3+2]*0.5f;
dst[j] = (float)((a + c) - std::sqrt((a - c)*(a - c) + b*b));
// Harris
float a = cov[j*3];
float b = cov[j*3+1];
float c = cov[j*3+2];
dst[j] = (float)(a*c - b*b - k*(a + c)*(a + c));
Just change this line to:
// scale_sq = scale * scale
pts[ptidx].response = (float)((a + b)*0.5f - stb::sqrt((a - b)*(a - b)*0.25f + c*c))*scale_sq;
and you will get what you need.
I am accessing the image like so:
pDoc = GetDocument();
int iBitPerPixel = pDoc->_bmp->bitsperpixel; // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point; // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r; // red pixel value
int g; // green pixel value
int b; // blue pixel value
int gray; // gray pixel value
BYTE *pImgGS = pImg; // grayscale image pixel array
and attempting to change the rgb image to gray like so:
// convert RGB values to grayscale at each pixel, then put in grayscale array
for (int i = 0; i<iHeight; i++)
for (int j = 0; j<iWidth; j++)
{
r = pImg[i*iWidth * 3 + j * 3 + 2];
g = pImg[i*iWidth * 3 + j * 3 + 1];
b = pImg[i*Wp + j * 3];
r * 0.299;
g * 0.587;
b * 0.144;
gray = std::round(r + g + b);
pImgGS[i*Wp + j] = gray;
}
finally, this is how I try to draw the image:
//draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
for (int j = 0; j < iWidth; j++) {
// this should set every corresponding grayscale picture to the current picture as grayscale
pImg[i*Wp + j] = pImgGS[i*Wp + j];
}
}
}
original image:
and the resulting image that I get is this:
First check if image type is 24 bits per pixels.
Second, allocate memory to pImgGS;
BYTE* pImgGS = (BTYE*)malloc(sizeof(BYTE)*iWidth *iHeight);
Please refer this article to see how bmp data is saved. bmp images are saved upside down. Also, first 54 byte of information is BITMAPFILEHEADER.
Hence you should access values in following way,
double r,g,b;
unsigned char gray;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[(i*iWidth + j)*3 + 2];
g = (double)pImg[(i*iWidth + j)*3 + 1];
b = (double)pImg[(i*iWidth + j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.144;
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
}
If there is padding present, then first determine padding and access in different way. Refer this to understand pitch and padding.
double r,g,b;
unsigned char gray;
long index=0;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[index+ (j)*3 + 2];
g = (double)pImg[index+ (j)*3 + 1];
b = (double)pImg[index+ (j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.144;
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
index =index +pitch;
}
While drawing image,
as pImg is 24bpp, you need to copy gray values thrice to each R,G,B channel. If you ultimately want to save grayscale image in bmp format, then again you have to write bmp data upside down or you can simply skip that step in converting to gray here:
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
tl; dr:
Make one common path. Convert everything to 32-bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion ( = grey value calculation) into a separate function, this is easier to read and runs at exactly the same speed.
The lengthy stuff
First, you seem to have been confused with strides and offsets. The artefact that you see is because you accidentially wrote out one value (and in total only one third of the data) when you should have written three values.
One can get confused with this easily, but here it happened because you do useless stuff that you needed not do in the first place. You are iterating coordinates left to right, top-to-bottom and painstakingly calculate the correct byte offset in the data for each location.
However, you're doing a full-screen effect, so what you really want is iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know the length. One loop over the complete blob will do the same, only faster, with less obscure code, and fewer opportunities of getting something wrong.
Next, 24-bit bitmaps are common as files, but they are rather unusual for in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap will require a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice. It is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit. Iterate over the complete bitmap data and write out a complete 32-bit word for each 3 byte-tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
Also, there is no such thing as a 8-bit greyscale bitmap. This simply doesn't exist. Although there exist bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps where (incidentially) the bmiColors member contains all greyscale values.
Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot just rely that e.g. the values 5 and 73 correspond to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but it is in general a wrong assumption.
In order to be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really indices) in the palette. Otherwise, you could be loading a greyscale image where the palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
So... you want to convert 24-bit and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on-screen is always a 32-bit bitmap where the topmost byte is ignored, and the lower three are all the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.
Note that if you do a BT.601 style YCbCr conversion (as indicated by your use of the constants 0.299, 0.587, and 0.144), and if your 8-bit greyscale images are perceptive (this is something you must know, there is no way of telling from the file!), then for 100% correctness, you need to to the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look like almost right, but not quite. If your 8-bit greycales are linear, i.e. were created without using the above constants (again, you must know, you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).
About the RGB-to-greyscale conversion, you do not need an extra greyscale bitmap just to hold the values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely useless round-trip to memory which is not necessary.
Something like this:
uint32_t* out = (uint32_t*) output_bitmap_data;
for(int i = 0; i < inputSize; i+= 3)
{
uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
*out++ = (Y<<16) | (Y<<8) | Y;
}
Alternatively, you can also do the from-whatever-to-32 conversion, and then do the to-greyscale conversion in-place there. This, in turn, introduces an extra round-trip to memory, but the code becomes much, much easier overall.
I am trying to implement unsharp masking like it's done in Adobe Photoshop. I gathered a lot of information on the interent but I'm not sure if I'm missing something. Here's the code:
void unsharpMask( cv::Mat* img, double amount, double radius, double threshold ) {
// create blurred img
cv::Mat img32F, imgBlur32F, imgHighContrast32F, imgDiff32F, unsharpMas32F, colDelta32F, compRes, compRes32F, prod;
double r = 1.5;
img->convertTo( img32F, CV_32F );
cv::GaussianBlur( img32F, imgBlur32F, cv::Size(0,0), radius );
cv::subtract( img32F, imgBlur32F, unsharpMas32F );
// increase contrast( original, amount percent )
imgHighContrast32F = img32F * amount / 100.0f;
cv::subtract( imgHighContrast32F, img32F, imgDiff32F );
unsharpMas32F /= 255.0f;
cv::multiply( unsharpMas32F, imgDiff32F, colDelta32F );
cv::compare( cv::abs( colDelta32F ), threshold, compRes, cv::CMP_GT );
compRes.convertTo( compRes32F, CV_32F );
cv::multiply( compRes32F, colDelta32F, prod );
cv::add( img32F, prod, img32F );
img32F.convertTo( *img, CV_8U );
}
At the moment I am testing with a grayscale image. If i try the exact same parameters in Photoshop I get much better result. My own code leads to noisy images. What am I doing wrong.
The 2nd question is, how i can apply unsharp masking on RGB images? Do I have to unsharp mask each of the 3 channels or would it be better in another color space? How are these things done in Photoshop?
Thanks for your help!
I'm trying to replicate Photoshop's Unsharp Mask as well.
Let's ignore the Threshold for a second.
I will show you how to replicate Photoshop's Unsharp Mask using its Gaussian Blur.
Assuming O is the original image layer.
Create a new layer GB which is a Gaussian Blur applied on O.
Create a new layer which is O - GB (Using Apply Image).
Create a new layer by inverting GB - invGB.
Create a new layer which is O + invGB using Image Apply.
Create a new layer which is inversion of the previous layer, namely inv(O + invGB).
Create a new layer which is O + (O - GB) - inv(O + invGB).
When you do that in Photoshop you'll get a perfect reproduction of the Unsharp Mask.
If you do the math recalling that inv(L) = 1 - L you will get that the Unsharp Mask is
USM(O) = 3O - 2B.
Yet when I do that directly in MATLAB I don't get Photoshop's results.
Hopefully someone will know the exact math.
Update
OK,
I figured it out.
In Photoshop USM(O) = O + (2 * (Amount / 100) * (O - GB))
Where GB is a Gaussian Blurred version of O.
Yet, in order to replicate Photoshop's results you must do the steps above and clip the result of each step into [0, 1] as done in Photoshop.
According to docs:
C++: void GaussianBlur(InputArray src, OutputArray dst, Size ksize,
double sigmaX, double sigmaY=0, int borderType=BORDER_DEFAULT )
4th parameter is not "radius" it is "sigma" - gaussian kernel standard deviation. Radius is rather "ksize". Anyway Photoshop is not open source, hence we can not be sure they use the same way as OpenCV to calculate radius from sigma.
Channels
Yes you should apply sharp to any or to all channels, it depends on your purpose. Sure you can use any space: if you want sharp only brightness-component and don't want to increase color noise you can covert it to HSL or Lab-space and sharp L-channel only (Photoshop has all this options too).
In response to #Royi, the 2x multiplier results from assuming no clamping in this formula:
USM(Original) = Original + Amount / 100 * ((Original - GB) - (1 - (Original + (1 - GB))))
Ignoring clamping this incorrectly reduces to:
USM(Original) = Original + 2 * Amount / 100 * (Original - GB)
However, as you also point out, (Original - GB) and (Original + inv(GB)) are clamped to [0, 1]:
USM(Original) = Original + Amount / 100 *
(Max(0, Min(1, Original - GB)) - (1 - (Max(0, Min(1, Original + (1 - GB))))))
This correctly reduces to:
USM(Original) = Original + Amount / 100 * (Original - GB)
Here is an example illustrating why:
https://legacy.imagemagick.org/discourse-server/viewtopic.php?p=133597#p133597
Here's the code what I have done.
I am using this code to implement Unsharp Mask and it is working well for me.
Hope it is useful for you.
void USM(cv::Mat &O, int d, int amp, int threshold)
{
cv::Mat GB;
cv::Mat O_GB;
cv::subtract(O, GB, O_GB);
cv::Mat invGB = cv::Scalar(255) - GB;
cv::add(O, invGB, invGB);
invGB = cv::Scalar(255) - invGB;
for (int i = 0; i < O.rows; i++)
{
for (int j = 0; j < O.cols; j++)
{
unsigned char o_rgb = O.at<unsigned char>(i, j);
unsigned char d_rgb = O_GB.at<unsigned char>(i, j);
unsigned char inv_rgb = invGB.at<unsigned char>(i, j);
int newVal = o_rgb;
if (d_rgb >= threshold)
{
newVal = o_rgb + (d_rgb - inv_rgb) * amp;
if (newVal < 0) newVal = 0;
if (newVal > 255) newVal = 255;
}
O.at<unsigned char>(i, j) = unsigned char(newVal);
}
}
}
I have a TV capture card that has a feed coming in as a YUV format. I've seen other posts here similar to this question and attempted to try every possible method stated, but neither of them provided a clear image. At the moment the best results were with the OpenCV cvCvtColor(scr, dst, CV_YUV2BGR) function call.
I am currently unaware of the YUV format and to be honest confuses me a little bit as it looks like it stores 4 channels, but is only 3? I have included an image from the capture card to hope that someone can understand what is possibly going on that I could use to fill in the blanks.
The feed is coming in through a DeckLink Intensity Pro card and being accessed in a C++ application in using OpenCV in a Windows 7 environment.
Update
I have looked at a wikipedia article regarding this information and attempted to use the formula in my application. Below is the code block with the output received from it. Any advice is greatly appreciated.
BYTE* pData;
videoFrame->GetBytes((void**)&pData);
m_nFrames++;
printf("Num Frames executed: %d\n", m_nFrames);
for(int i = 0; i < 1280 * 720 * 3; i=i+3)
{
m_RGB->imageData[i] = pData[i] + pData[i+2]*((1 - 0.299)/0.615);
m_RGB->imageData[i+1] = pData[i] - pData[i+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[i+2]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+2] = pData[i] + pData[i+1]*((1 - 0.114)/0.436);
}
In newer version of OPENCV there is a built in function can be used to do YUV to RGB conversion
cvtColor(src,dst,CV_YUV2BGR_YUY2);
specify the YUV format after the underscore, like this CV_YUYV2BGR_xxxx
It looks to me like you're decoding a YUV422 stream as YUV444. Try this modification to the code you provided:
for(int i = 0, j=0; i < 1280 * 720 * 3; i+=6, j+=4)
{
m_RGB->imageData[i] = pData[j] + pData[j+3]*((1 - 0.299)/0.615);
m_RGB->imageData[i+1] = pData[j] - pData[j+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[j+3]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+2] = pData[j] + pData[j+1]*((1 - 0.114)/0.436);
m_RGB->imageData[i+3] = pData[j+2] + pData[j+3]*((1 - 0.299)/0.615);
m_RGB->imageData[i+4] = pData[j+2] - pData[j+1]*((0.114*(1-0.114))/(0.436*0.587)) - pData[j+3]*((0.299*(1 - 0.299))/(0.615*0.587));
m_RGB->imageData[i+5] = pData[j+2] + pData[j+1]*((1 - 0.114)/0.436);
}
I'm not sure you've got your constants correct, but at worst your colors will be off - the image should be recognizable.
I use the following C++ code using OpenCV to convert yuv data (YUV_NV21) to rgb image (BGR in OpenCV)
int main()
{
const int width = 1280;
const int height = 800;
std::ifstream file_in;
file_in.open("../image_yuv_nv21_1280_800_01.raw", std::ios::binary);
std::filebuf *p_filebuf = file_in.rdbuf();
size_t size = p_filebuf->pubseekoff(0, std::ios::end, std::ios::in);
p_filebuf->pubseekpos(0, std::ios::in);
char *buf_src = new char[size];
p_filebuf->sgetn(buf_src, size);
cv::Mat mat_src = cv::Mat(height*1.5, width, CV_8UC1, buf_src);
cv::Mat mat_dst = cv::Mat(height, width, CV_8UC3);
cv::cvtColor(mat_src, mat_dst, cv::COLOR_YUV2BGR_NV21);
cv::imwrite("yuv.png", mat_dst);
file_in.close();
delete []buf_src;
return 0;
}
and the converted result is like the image yuv.png.
you can find the testing raw image from here and the whole project from my Github Project
It may be the wrong path, but many people (I mean, engineers) do mix YUV with YCbCr.
Try to
cvCvtColor(src, dsc, CV_YCbCr2RGB)
or CV_YCrCb2RGB or maybe a more exotic type.
The BlackMagic Intensity software return YUVY' format in bmdFormat8BitYUV, so 2 sources pixels are compressed into 4bytes - I don't think openCV's cvtColor can handle this.
You can either do it yourself, or just call the Intensity software ConvertFrame() function
edit: Y U V is normally stored as
There is a Y (brightness) for each pixel but only a U and V (colour) for every alternate pixel in the row.
So if data is an unsigned char pointing to the start of the memory as shown above.
pixel 1, Y = data[0] U = data[+1] V = data[+3]
pixel 2, Y = data[+2] U = data[+1] V = data[+3]
Then use the YUV->RGB coefficients you used in your sample code.
Maybe someone is confused by color models YCbCr and YUV.
Opencv does not handle YCbCr. Instead it has YCrCb, and it implemented the same way as YUV in opencv.
From the opencv sources https://github.com/Itseez/opencv/blob/2.4/modules/imgproc/src/color.cpp#L3830:
case CV_BGR2YCrCb: case CV_RGB2YCrCb:
case CV_BGR2YUV: case CV_RGB2YUV:
// ...
// 1 if it is BGR, 0 if it is RGB
bidx = code == CV_BGR2YCrCb || code == CV_BGR2YUV ? 0 : 2;
//... converting to YUV with the only difference that brings
// order of Blue and Red channels (variable bidx)
But there is one more thing to say.
There is currently a bug in conversion CV_BGR2YUV and CV_RGB2YUV in OpenCV branch 2.4.* .
At present, this formula is used in implementation:
Y = 0.299B + 0.587G + 0.114R
U = 0.492(R-Y)
V = 0.877(B-Y)
What it should be (according to wikipedia):
Y = 0.299R + 0.587G + 0.114B
U = 0.492(B-Y)
V = 0.877(R-Y)
The channels Red and Blue are misplaced in the implemented formula.
Possible workaround to convert BGR->YUV while the bug is not fixed :
cv::Mat source = cv::imread(filename, CV_LOAD_IMAGE_COLOR);
cv::Mat yuvSource;
cvtColor(source, yuvSource, cv::COLOR_BGR2RGB); // rearranges B and R in the appropriate order
cvtColor(yuvSource, yuvSource, cv::COLOR_BGR2YUV);
// yuvSource will contain here correct image in YUV color space