YUV420p to RGB conversion has shifted U and V values - iOS Unity3D - c++

I have a native plugin in Unity that decodes H264 frames to YUV420p using FFMPEG.
To display the output image, I rearrange the YUV values into an RGBA texture and convert YUV to RGB using a Unity shader (just to make it faster).
The following is the rearrangement code in my native plugin:
unsigned char* yStartLocation = (unsigned char*)m_pFrame->data[0];
unsigned char* uStartLocation = (unsigned char*)m_pFrame->data[1];
unsigned char* vStartLocation = (unsigned char*)m_pFrame->data[2];
for (int y = 0; y < height; y++)
{
    for (int x = 0; x < width; x++)
    {
        // Pointers are named pY/pU/pV so they do not shadow the loop variable y
        unsigned char* pY = yStartLocation + ((y * width) + x);
        unsigned char* pU = uStartLocation + ((y * (width / 4)) + (x / 2));
        unsigned char* pV = vStartLocation + ((y * (width / 4)) + (x / 2));
        //REF: https://en.wikipedia.org/wiki/YUV
        // Write the texture pixel
        dst[0] = pY[0]; //R
        dst[1] = pU[0]; //G
        dst[2] = pV[0]; //B
        dst[3] = 255; //A
        // To next pixel
        dst += 4;
        // dst is the pointer to target texture RGBA data
    }
}
The shader that converts YUV to RGB works perfectly and I've used it in multiple projects.
Now, I'm using the same code to decode on the iOS platform. But for some reason the U and V values are now shifted:
[Image: Y texture]
[Image: U texture]
Is there anything that I'm missing for iOS or OpenGL specifically?
Any help greatly appreciated.
Thank You!
Please note that I filled R = G = B = Y for the first screenshot and U for the second (if that makes sense).
Edit:
Here's the output that I'm getting:
Edit2:
Based on some research I think it may have something to do with Interlacing.
ref: Link
For now I've shifted to CPU based YUV-RGB conversion using sws_scale and it works fine.

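The problem was on this line:
uStartLocation + ((y * (width / 4)) + (x / 2));
It should be:
uStartLocation + (((y / 2) * (width / 2)) + (x / 2));
In YUV420p the U and V planes have half the width and half the height of the Y plane, so the chroma row is y / 2 and each chroma row is width / 2 samples long. With integer arithmetic, y * (width / 4) is not equal to (y / 2) * (width / 2), and that is what shifted the whole frame. The same correction applies to the vStartLocation line. A very silly error, made while trying to optimize the calculation.
Hope it helps someone.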

Why my bitmap image have another color overlay after converting 32-bit to 8-bit

I'm working on resizing a bitmap image and converting it to 8-bit (grayscale). My problem is that when I convert a 32-bit image to an 8-bit image, the result has another color overlay, while the same code works perfectly on 24-bit images. I guess the cause is the alpha channel, but I don't know exactly where the problem is.
This is my code to generate 8-bit palette color and write it after DIB part:
char* palette = new char[1024];
for (int i = 0; i < 256; i++) {
palette[i * 4] = palette[i * 4 + 1] = palette[i * 4 + 2] = (char)i;
palette[i * 4 + 3] = 255;
}
fout.write(palette, 1024);
delete[] palette;
As I said, my code works perfectly on 24-bit. In 32-bit the color is still kept after resizing, but when converting to 8-bit, it will look like this:
[Image: expected image (when converted from 24-bit)]
[Image: unexpected image (when converted from 32-bit)]
This is how I get the colors and save it to srcPixel[]:
int i = 0;
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
int index = getIndex(width, x, y);
srcPixel[index].A = srcBMP.pImageData[i];
i += alpha;
srcPixel[index].B = srcBMP.pImageData[i++];
srcPixel[index].G = srcBMP.pImageData[i++];
srcPixel[index].R = srcBMP.pImageData[i++];
}
i += padding;
}
And this is the code that converts it, by taking the average of the four values A, B, G and R from that srcPixel[]:
int i = 0;
for (int y = 0; y < dstHeight; y++) {
for (int x = 0; x < dstWidth; x++) {
int index = getIndex(dstWidth, x, y);
dstBMP.pImageData[i++] = (srcPixel[index].A + srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 4;
}
i += dstPadding;
}
If I remove and skip all alpha bytes in my code, the converted image still looks the same, and I get another problem: after resizing, the image has the same kind of color overlay as in the 8-bit conversion (image: resizing without alpha channel).
If I skip the alpha channel while taking the average (changing the line into dstBMP.pImageData[i++] = (srcPixel[index].B + srcPixel[index].G + srcPixel[index].R) / 3), almost nothing changes; the overlay still exists.
If I remove palette[i * 4 + 3] = 255; or change it in any way, the result is not affected.
Thank you very much.
You add the alpha channel to the color, and that's why it becomes brighter. Opaque alpha is 255 and transparent is 0, so you are adding another channel that is set to 'white' to your result.
Remove the alpha channel from your equation and see if I'm right.
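A minimal sketch of a conversion that leaves alpha out of the average is shown here. It assumes the usual uncompressed 32-bit BMP byte order of blue, green, red, then the fourth byte per pixel, so it may also be worth checking which byte the reading loop in the question treats as A. The name bgraToGray8 and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a 32-bit pixel buffer (assumed byte order per pixel: B, G, R, A)
// to 8-bit gray by averaging B, G and R only. srcStride is bytes per source row.
// Each output row is padded to a multiple of 4 bytes, as BMP rows are.
std::vector<uint8_t> bgraToGray8(const uint8_t* src, int width, int height, int srcStride)
{
    const int dstStride = (width + 3) & ~3;
    std::vector<uint8_t> dst(static_cast<std::size_t>(dstStride) * height, 0);
    for (int y = 0; y < height; ++y) {
        const uint8_t* srcRow = src + static_cast<std::size_t>(y) * srcStride;
        uint8_t* dstRow = dst.data() + static_cast<std::size_t>(y) * dstStride;
        for (int x = 0; x < width; ++x) {
            const uint8_t* px = srcRow + x * 4;
            // px[3] (alpha) is deliberately ignored.
            dstRow[x] = static_cast<uint8_t>((px[0] + px[1] + px[2]) / 3);
        }
    }
    return dst;
}
```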

What is the highest bit depth greyscale image I can export from FreeImage?

As context, I'm working on building a topographic program that needs relatively extreme detail. I do not expect the files to be small, and they do not formally need to be viewed on a monitor; they just need to have very high resolution.
I know that most image formats are limited to 8 bpp, on account of the standard limits on both monitors (at a reasonable price) and on human perception. However, 2⁸ is just 256 possible values, which induces plateauing artifacts in a reconstructed displacement. 2¹⁶ may be close enough at 65,536 possible values, which I have achieved.
I'm using FreeImage and DLang to construct the data, currently on a Linux Mint machine.
However, when I went on to 2³², software support seemed to fade on me. I tried a TIFF of this form and nothing seemed to be able to interpret it, either showing a completely (or mostly) transparent image (remembering that I didn't expect any monitor to really support 2³² shades of a channel) or complaining about being unable to decode the RGB data. I imagine that it's because it was assumed to be an RGB or RGBA image.
FreeImage is reasonably well documented for most purposes, but I'm now wondering, what is the highest-precision single-channel format I can export, and how would I do it? Can anyone provide an example? Am I really limited, in any typical and not-home-rolled image format, to 16-bit? I know that's high enough for, say, medical imaging, but I'm sure I'm not the first person to try to aim higher and we science-types can be pretty ambitious about our precision-level…
Did I make a glaring mistake in my code? Is there something else I should try instead for this kind of precision?
Here's my code.
The 16-bit TIFF that worked
void writeGrayscaleMonochromeBitmap(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ushort v = cast(ushort)((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(cast(ushort)(x/width * 0xFFFF));
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test.tif", TIFF_DEFAULT);
FreeImage_Unload(bitmap);
}
The 32-bit TIFF that didn't really work
void writeGrayscaleMonochromeBitmap32(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
writeln(width, ", ", height);
writeln("Width: ", FreeImage_GetWidth(bitmap));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
writeln(y, ": ", scanline);
for(int x = 0; x < width; x++) {
//writeln(x, " < ", width);
uint v = cast(uint)((x/width) * 0xFFFFFFFF);
writeln("V: ", v);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test32.tif", TIFF_NONE);
FreeImage_Unload(bitmap);
}
Thanks for any pointers.
For a single channel, the highest available from FreeImage is 32-bit, as FIT_UINT32. However, the file format must be capable of this, and as of the moment, only TIFF appears to be up to the task (see page 104 of the Stanford Documentation). Additionally, most monitors are incapable of representing more than 8 bits per sample, 12 in extreme cases, so it is very difficult to read data back out and have it render properly.
A unit test that compares the bytes before marshaling to the bitmap with bytes sampled from the same bitmap afterward shows that the data is in fact being encoded.
To imprint data to a 16-bit gray scale (currently supported by J2K, JP2, PGM, PGMRAW, PNG and TIF), you would do something like this:
void toFreeImageUINT16PNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ushort v = cast(ushort)(data[cast(ulong)(((height - 1) - y) * width + x)] * 0xFFFF); //((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(v);
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Of course you would want to make adjustments for your target file type. To export as 48-bit RGB16, you would do this.
void toFreeImageColorPNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_RGB16, cast(int)width, cast(int)height);
uint pitch = FreeImage_GetPitch(bitmap);
uint bpp = FreeImage_GetBPP(bitmap);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ulong offset = cast(ulong)((((height - 1) - y) * width + x) * 3);
ushort r = cast(ushort)(data[(offset + 0)] * 0xFFFF);
ushort g = cast(ushort)(data[(offset + 1)] * 0xFFFF);
ushort b = cast(ushort)(data[(offset + 2)] * 0xFFFF);
ubyte[6] bytes = nativeToLittleEndian(r) ~ nativeToLittleEndian(g) ~ nativeToLittleEndian(b);
scanline[(x * 3 * ushort.sizeof) + 0] = bytes[0];
scanline[(x * 3 * ushort.sizeof) + 1] = bytes[1];
scanline[(x * 3 * ushort.sizeof) + 2] = bytes[2];
scanline[(x * 3 * ushort.sizeof) + 3] = bytes[3];
scanline[(x * 3 * ushort.sizeof) + 4] = bytes[4];
scanline[(x * 3 * ushort.sizeof) + 5] = bytes[5];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Lastly, to encode a UINT32 greyscale image (limited purely to TIFF at the moment), you would do this.
void toFreeImageTIF32(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
//DEBUG
int xtest = cast(int)(width/2);
int ytest = cast(int)(height/2);
uint comp1a = cast(uint)(data[cast(ulong)(((height - 1) - ytest) * width + xtest)] * 0xFFFFFFFF);
writeln("initial: ", nativeToLittleEndian(comp1a));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ulong i = cast(ulong)(((height - 1) - y) * width + x);
uint v = cast(uint)(data[i] * 0xFFFFFFFF);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
//DEBUG
ulong index = cast(ulong)(xtest * uint.sizeof);
writeln("Final: ", FreeImage_GetScanLine(bitmap, ytest)
[index .. index + uint.sizeof]);
FreeImage_Save(FIF_TIFF, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
I've yet to find a program, built by anyone else, which will readily render a 32-bit gray-scale image on a monitor's available palette. However, I left my checking code in which will consistently write out the same array both at the top DEBUG and the bottom one, and that's consistent enough for me.
Hopefully this will help someone else out in the future.
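The functions above all scale a normalized sample by 0xFFFF or 0xFFFFFFFF and flip the row index. As a small illustration of just those two steps, here is a C++ sketch (the examples above are in D). The helpers quantize and flippedIndex are hypothetical names, and the clamping and rounding are additions of this sketch, not something the code above does.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Maps a normalized sample in [0, 1] to the full range of an unsigned
// integer type T. Out-of-range input is clamped; the result is rounded
// to the nearest level instead of truncated.
template <typename T>
T quantize(double v)
{
    const double maxValue = static_cast<double>(std::numeric_limits<T>::max());
    const double clamped = std::min(std::max(v, 0.0), 1.0);
    return static_cast<T>(std::llround(clamped * maxValue));
}

// Index into a top-down, row-major data array for the pixel that a
// bottom-up scanline y and column x refer to.
std::size_t flippedIndex(int x, int y, int width, int height)
{
    return static_cast<std::size_t>(height - 1 - y) * width + x;
}
```

For example, quantize<uint16_t>(1.0) gives 65535 and quantize<uint32_t>(1.0) gives 4294967295, which is the difference in the number of distinct levels discussed above.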

OpenCV: Random alpha channel artifacts when overlaying images with transparency in iOS

In my iOS project I am adding small PNG images with an alpha channel as overlays on a JPEG picture. The result on my device in DEBUG mode is as expected: the tears are drawn correctly.
When I run the same code on the Simulator, or when I archive and export the app in RELEASE mode, I get random artifacts in the alpha channel.
The underlying cv::Mat objects all contain header info and a valid data section. The error is reproducible even on a green background.
The behaviour seems to be totally random, as from time to time no artifacts are drawn (image 3: right tear, image 4: left tear).
Ideas, anybody?
const char *cpath1 = [@"" cStringUsingEncoding:NSUTF8StringEncoding]; // overlay image path; pass your image path (an NSString) inside @""
const char *cpath = [@"" cStringUsingEncoding:NSUTF8StringEncoding]; // underlay image path
cv::Mat overlay = cv::imread(cpath1,-1);//-1 is for read .png images
cv::Mat underlay = cv::imread(cpath,-1);
//convert mat image in to RGB channel
cv::Mat overlayAlpha;
std::vector<Mat> channels1;
split(overlay, channels1);
channels1[3].copyTo(overlayAlpha);
cv::Mat underlayAlpha;
std::vector<Mat> channels2;
split(underlay, channels2);
channels2[3].copyTo(underlayAlpha);
overlayImage(&underlay, &overlay, cv::Point(10,10));
// convert final image to RGB channel
cv::split(underlay,channels1);
std::swap(channels1[0],channels1[2]);// swap B and R channels.
cv::merge(channels1,underlay);//merge channels
MatToUIImage(underlay); // display your final image; MatToUIImage takes the cv::Mat and returns a UIImage
The overlay function is shown below.
It is referenced from: http://answers.opencv.org/question/73016/how-to-overlay-an-png-image-with-alpha-channel-to-another-png/
void overlayImage(Mat* src, Mat* overlay, const cv::Point& location){
for (int y = max(location.y, 0); y < src->rows; ++y)
{
int fY = y - location.y;
if (fY >= overlay->rows)
break;
for (int x = max(location.x, 0); x < src->cols; ++x)
{
int fX = x - location.x;
if (fX >= overlay->cols)
break;
double opacity = ((double)overlay->data[fY * overlay->step + fX * overlay->channels() + 3]) / 255;
for (int c = 0; opacity > 0 && c < src->channels(); ++c)
{
unsigned char overlayPx = overlay->data[fY * overlay->step + fX * overlay->channels() + c];
unsigned char srcPx = src->data[y * src->step + x * src->channels() + c];
src->data[y * src->step + src->channels() * x + c] = srcPx * (1. - opacity) + overlayPx * opacity;
}
}
}
}
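The per-pixel step inside overlayImage is a standard alpha blend: result = src * (1 - opacity) + overlay * opacity. A minimal sketch of that step without OpenCV is shown here, assuming interleaved 8-bit pixels. blendPixel is a hypothetical helper; unlike the function above it blends only the first three channels and refuses overlays with fewer than four channels, because reading channel index 3 of a three-channel pixel would read past the pixel. Whether that relates to the artifacts described is not established here.

```cpp
#include <cstdint>

// Blends one overlay pixel (alpha expected in channel 3) onto one destination
// pixel in place. Returns false without touching dstPx when the overlay
// pixel has no alpha channel.
bool blendPixel(uint8_t* dstPx, int dstChannels,
                const uint8_t* overlayPx, int overlayChannels)
{
    if (overlayChannels < 4) {
        return false;
    }
    const double opacity = overlayPx[3] / 255.0;
    // Blend the colour channels only; the destination alpha (if any) is kept.
    const int colourChannels = dstChannels < 3 ? dstChannels : 3;
    for (int c = 0; c < colourChannels; ++c) {
        const double mixed = dstPx[c] * (1.0 - opacity) + overlayPx[c] * opacity;
        dstPx[c] = static_cast<uint8_t>(mixed + 0.5); // round to nearest
    }
    return true;
}
```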

Fill C++ Cinder textures with RGBA values

I use the Cinder library and want to create a texture filled with RGBA values that I have saved in an array. There is no helpful explanation on the internet.
I've not used Cinder before, but a quick perusal of the documentation seems to suggest you can load a texture either from a file or from a Surface.
So looking at the docs it would seem you create a surface as follows:
cinder::Surface8u surf( 128, 128, SurfaceChannelOrder::RGBA );
You can then fill it using the getData function as follows:
uint8_t* pCols = surf.getData();
for( int y = 0; y < 128; y++ )
{
    for( int x = 0; x < 128; x++ )
    {
        // Fill each pixel with red.
        const int idx = (y * (128 * 4)) + (x * 4);
        pCols[idx + 0] = 0xff;
        pCols[idx + 1] = 0x00;
        pCols[idx + 2] = 0x00;
        pCols[idx + 3] = 0xff;
    }
}
You would then load the texture from the surface as follows:
cinder::gl::Texture texture( surf );

Quick code to resize DIB image and maintain good img quality

There are many algorithms for image resizing: Lanczos, bicubic, bilinear, etc. But most of them are pretty complex and therefore consume too much CPU.
What I need is fast relatively simple C++ code to resize images with acceptable quality.
Here is an example of what I'm currently doing:
for (int y = 0; y < height; y ++)
{
int srcY1Coord = int((double)(y * srcHeight) / height);
int srcY2Coord = min(srcHeight - 1, max(srcY1Coord, int((double)((y + 1) * srcHeight) / height) - 1));
for (int x = 0; x < width; x ++)
{
int srcX1Coord = int((double)(x * srcWidth) / width);
int srcX2Coord = min(srcWidth - 1, max(srcX1Coord, int((double)((x + 1) * srcWidth) / width) - 1));
int srcPixelsCount = (srcX2Coord - srcX1Coord + 1) * (srcY2Coord - srcY1Coord + 1);
RGB32 color32;
UINT32 r(0), g(0), b(0), a(0);
for (int xSrc = srcX1Coord; xSrc <= srcX2Coord; xSrc ++)
for (int ySrc = srcY1Coord; ySrc <= srcY2Coord; ySrc ++)
{
RGB32 curSrcColor32 = pSrcDIB->GetDIBPixel(xSrc, ySrc);
r += curSrcColor32.r; g += curSrcColor32.g; b += curSrcColor32.b; a += curSrcColor32.alpha;
}
color32.r = BYTE(r / srcPixelsCount); color32.g = BYTE(g / srcPixelsCount); color32.b = BYTE(b / srcPixelsCount); color32.alpha = BYTE(a / srcPixelsCount);
SetDIBPixel(x, y, color32);
}
}
The code above is fast enough, but the quality is not OK when scaling pictures up.
So, does someone perhaps already have a fast and good C++ code sample for scaling DIBs?
Note: I was using StretchDIBits before. It was super slow when I needed to downsize a 10000x10000 picture to 100x100; my code is much, much faster, I just want a bit higher quality.
P.S. I'm using my own SetPixel/GetPixel functions that work directly on the data array and are fast; this is not a device context!
Why are you doing it on the CPU? Using GDI, there's a good chance of some hardware acceleration. Use StretchBlt and SetStretchBltMode.
In pseudocode:
create source dc and destination dc using CreateCompatibleDC
create source and destination bitmaps
SelectObject source bitmap into source DC and dest bitmap into dest DC
SetStretchBltMode
StretchBlt
release DCs
All right, here is the answer; I had to do it myself... It works perfectly well for scaling pictures up (for scaling down, my initial code works perfectly well too). I hope someone will find a good use for it; it's fast enough and produces very good picture quality.
for (int y = 0; y < height; y ++)
{
double srcY1Coord = (y * srcHeight) / (double)height;
int srcY1CoordInt = (int)(srcY1Coord);
double srcY2Coord = ((y + 1) * srcHeight) / (double)height - 0.00000000001;
int srcY2CoordInt = min(maxSrcYcoord, (int)(srcY2Coord));
double yMultiplierForFirstCoord = (0.5 * (1 - (srcY1Coord - srcY1CoordInt)));
double yMultiplierForLastCoord = (0.5 * (srcY2Coord - srcY2CoordInt));
for (int x = 0; x < width; x ++)
{
double srcX1Coord = (x * srcWidth) / (double)width;
int srcX1CoordInt = (int)(srcX1Coord);
double srcX2Coord = ((x + 1) * srcWidth) / (double)width - 0.00000000001;
int srcX2CoordInt = min(maxSrcXcoord, (int)(srcX2Coord));
RGB32 color32;
ASSERT(srcX1Coord < srcWidth && srcY1Coord < srcHeight);
double r(0), g(0), b(0), a(0), multiplier(0);
for (int xSrc = srcX1CoordInt; xSrc <= srcX2CoordInt; xSrc ++)
for (int ySrc = srcY1CoordInt; ySrc <= srcY2CoordInt; ySrc ++)
{
RGB32 curSrcColor32 = pSrcDIB->GetDIBPixel(xSrc, ySrc);
double xMultiplier = xSrc < srcX1Coord ? (0.5 * (1 - (srcX1Coord - srcX1CoordInt))) : (xSrc >= srcX2Coord ? (0.5 * (srcX2Coord - srcX2CoordInt)) : 0.5);
double yMultiplier = ySrc < srcY1Coord ? yMultiplierForFirstCoord : (ySrc >= srcY2Coord ? yMultiplierForLastCoord : 0.5);
double curPixelMultiplier = xMultiplier + yMultiplier;
if (curPixelMultiplier > 0)
{
r += (curSrcColor32.r * curPixelMultiplier); g += (curSrcColor32.g * curPixelMultiplier); b += (curSrcColor32.b * curPixelMultiplier); a += (curSrcColor32.alpha * curPixelMultiplier);
multiplier += curPixelMultiplier;
}
}
color32.r = BYTE(r / multiplier); color32.g = BYTE(g / multiplier); color32.b = BYTE(b / multiplier); color32.alpha = BYTE(a / multiplier);
SetDIBPixel(x, y, color32);
}
}
P.S. Please don't ask why I'm not using StretchDIBits; leave comments for those who understand that a system API is not always available or acceptable.
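For comparison, bilinear interpolation (one of the methods named in the question) is another simple CPU option for scaling up. The following is a minimal sketch, not the method used in the answer above. It assumes a tightly packed, interleaved 8-bit buffer instead of the GetDIBPixel/SetDIBPixel accessors, and resizeBilinear and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Resizes an interleaved 8-bit image (channels bytes per pixel, no row padding)
// using bilinear interpolation between the four nearest source pixels.
std::vector<uint8_t> resizeBilinear(const std::vector<uint8_t>& src,
                                    int srcW, int srcH,
                                    int dstW, int dstH, int channels)
{
    std::vector<uint8_t> dst(static_cast<std::size_t>(dstW) * dstH * channels);
    for (int y = 0; y < dstH; ++y) {
        // Map the destination pixel centre back into source coordinates.
        double sy = (y + 0.5) * srcH / dstH - 0.5;
        sy = std::min(std::max(sy, 0.0), srcH - 1.0);
        const int y0 = static_cast<int>(sy);
        const int y1 = std::min(y0 + 1, srcH - 1);
        const double fy = sy - y0;
        for (int x = 0; x < dstW; ++x) {
            double sx = (x + 0.5) * srcW / dstW - 0.5;
            sx = std::min(std::max(sx, 0.0), srcW - 1.0);
            const int x0 = static_cast<int>(sx);
            const int x1 = std::min(x0 + 1, srcW - 1);
            const double fx = sx - x0;
            for (int c = 0; c < channels; ++c) {
                const double p00 = src[(static_cast<std::size_t>(y0) * srcW + x0) * channels + c];
                const double p01 = src[(static_cast<std::size_t>(y0) * srcW + x1) * channels + c];
                const double p10 = src[(static_cast<std::size_t>(y1) * srcW + x0) * channels + c];
                const double p11 = src[(static_cast<std::size_t>(y1) * srcW + x1) * channels + c];
                const double top = p00 * (1.0 - fx) + p01 * fx;
                const double bottom = p10 * (1.0 - fx) + p11 * fx;
                const double value = top * (1.0 - fy) + bottom * fy;
                dst[(static_cast<std::size_t>(y) * dstW + x) * channels + c] =
                    static_cast<uint8_t>(value + 0.5); // round to nearest
            }
        }
    }
    return dst;
}
```

Each output pixel reads at most four source pixels, so the cost per pixel does not grow with the scale factor. When scaling far down it skips source pixels, which is where an averaging approach like the original code remains the better fit.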
Again, why do it on the CPU? Why not use OpenGL / DirectX and fragment shaders? In pseudocode:
upload source texture (cache it if it's to be reused)
create destination texture
use shader program
render quad
download output texture
where shader program is the filtering method you're using. The GPU is much better at processing pixels than CPU/GetPixel/SetPixel.
You could probably find fragment shaders for lots of different filtering methods on the web - GPU Gems is a good place to start.