My requirements: Overlay graphics (with alpha/antialiasing) onto a UYVY image as fast as possible. The rendering should preferably take place in UYVY because I need to both render and encode (H.264 with ffmpeg).
What framework (preferably cross-platform, but Windows-only is OK) should I use to render the image so that I can encode it afterwards?
I looked at OpenCV, and it seems the drawing happens in BGR, which would require converting each frame from UYVY (2-channel) to BGR (3-channel), and then back again.
I looked at SDL, which uses hardware acceleration. It supports multiple textures with different color spaces. However, the method SDL_RenderReadPixels, which I would need to get the resulting composited image, mentions in the documentation "warning This is a very slow operation, and should not be used frequently."
Is there a framework that can draw onto a BYTE array of YUV, possibly with alpha blending/anti-aliasing?
You can also convert YUV to BGRA and then perform the drawing operations in that format. BGRA is more convenient than BGR for drawing because each pixel is exactly one 32-bit integer. Naturally, after drawing you have to convert back from BGRA to YUV.
There is a fast cross-platform C++ library which can perform these manipulations.
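To make the conversion step concrete, here is a minimal sketch of the UYVY to BGRA direction. It assumes BT.601 limited-range video (Y in 16..235), a tightly packed buffer and an even width; the function names are made up for illustration and are not taken from any particular library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp an intermediate value to the 0..255 range of one byte.
static std::uint8_t clampToByte(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Write one opaque BGRA pixel from YUV, assuming BT.601 limited range.
static void yuvToBgra(int y, int u, int v, std::uint8_t* out) {
    const int c = y - 16, d = u - 128, e = v - 128;
    out[0] = clampToByte((298 * c + 516 * d + 128) / 256);            // B
    out[1] = clampToByte((298 * c - 100 * d - 208 * e + 128) / 256);  // G
    out[2] = clampToByte((298 * c + 409 * e + 128) / 256);            // R
    out[3] = 255;                                                     // A
}

// Convert a tightly packed UYVY frame (bytes U0 Y0 V0 Y1 per pixel pair)
// to BGRA. Hypothetical helper; width must be even.
std::vector<std::uint8_t> uyvyToBgra(const std::uint8_t* uyvy, int width, int height) {
    std::vector<std::uint8_t> bgra(static_cast<std::size_t>(width) * height * 4);
    const std::size_t pairs = static_cast<std::size_t>(width) * height / 2;
    for (std::size_t i = 0; i < pairs; ++i) {
        const std::uint8_t* p = uyvy + i * 4;
        yuvToBgra(p[1], p[0], p[2], &bgra[i * 8]);      // left pixel uses Y0
        yuvToBgra(p[3], p[0], p[2], &bgra[i * 8 + 4]);  // right pixel uses Y1
    }
    return bgra;
}
```

Both pixels of a pair share the same U and V, which is why the way back (BGRA to UYVY) has to average or drop chroma for every second pixel.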
What's the best data path from a USB camera to an OpenGL texture?
The only way I know is usb-camera -> (cv.capture()) cv_image -> glGenTexture(image.bytes)
Since the CPU has to process the image for every frame, the frame rate is low.
Is there any better way?
I'm using an NVIDIA Jetson TX2; is there an approach specific to that environment?
Since USB frames must be reassembled anyway by the USB driver and UVC protocol handler, the data is passing through the CPU anyway. The biggest worry is having redundant copy operations.
If the frames are transmitted in M-JPEG format (which almost all UVC compliant cameras do support), then you must decode it on the CPU anyway, since GPU video decoding acceleration HW usually doesn't cover JPEG (also JPEG is super easy to decode).
For YUV color formats it is advisable to create two textures, one for the Y channel, one for the UV channels. Usually YUV formats are planar (i.e. images of a single component per pixel each), so you'd make the UV texture a 2D array with two layers. Since chroma components may be subsampled you need the separate textures to support the different resolutions.
RGB data goes into a regular 2D texture.
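As a rough illustration of why the luma and chroma textures end up with different resolutions, here is a sketch that computes the plane sizes and byte offsets for one frame. It assumes the camera delivers tightly packed I420 (planar 4:2:0), which the question does not state; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Size and byte offset of one plane inside a tightly packed frame buffer.
struct Plane {
    int width;
    int height;
    std::size_t offset;
};

// Layout of the three planes of an I420 frame (assumed format).
struct I420Layout {
    Plane y, u, v;
    std::size_t totalBytes;
};

I420Layout describeI420(int width, int height) {
    // Chroma is subsampled by two in both directions; round up for odd sizes.
    const int cw = (width + 1) / 2;
    const int ch = (height + 1) / 2;
    const std::size_t ySize = static_cast<std::size_t>(width) * height;
    const std::size_t cSize = static_cast<std::size_t>(cw) * ch;
    I420Layout layout;
    layout.y = {width, height, 0};
    layout.u = {cw, ch, ySize};
    layout.v = {cw, ch, ySize + cSize};
    layout.totalBytes = ySize + 2 * cSize;
    return layout;
}
```

With this layout the Y plane would feed the full-resolution single-channel texture, and the U and V planes would feed the two layers of the half-resolution array texture.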
Use a pixel buffer object (PBO) for transfer. By mapping the PBO into host memory (glMapBuffer) you can decode the images coming from the camera directly into that staging PBO. After unmapping a call to glTexSubImage2D will then transfer the image to the GPU memory – in the case of a unified memory architecture this "transfer" might be as simple as shuffling around a few internal buffer references.
Since you didn't mention the exact API used to access the video device, it's difficult to give more detailed information.
I'm using WIC (Windows Imaging Component) to decode image files and get access to the pixel data. I'm trying to figure out the pixel order (i.e., bottom-up or top-down).
I use IWICImagingFactory::CreateDecoderFromFileName to create the decoder from which I grab the (first) frame (IWICBitmapFrameDecode). With the frame, I use GetPixelFormat and GetSize to compute a buffer size, and finally I use CopyPixels to get the decoded pixel data into my buffer.
This works fine with a variety of JPEG files, giving me pixel rows in top-down sequence, and the pixels are in BGRX order (GUID_WICPixelFormat32bppBGR).
When I try with GIF files, however, the pixel rows come in bottom-up sequence. The reported pixel format is RGBA (GUID_WICPixelFormat32bppRGBA), but the ground truth shows the channel order is BGRA (with the blue in the low byte of each 32-bit pixel, just like JPEG).
My primary question: Is there a way for me to query the top-down/bottom-up orientation of the pixel data?
I found a similar question that asked about rotation when using JPEG sources, and the answer was to query the EXIF data to know whether the image was rotated. But EXIF isn't used with GIF. So I'm wondering whether I'm supposed to assume that pixels are always bottom-up, except for ones that do have an EXIF orientation that says otherwise.
Update 6/25/2020: Nope, the JPEG orientation is neutral and the GIF has no orientation information, yet MS Paint and other programs can open the files in the correct orientation.
My secondary question: What's up with the incorrect channel order (RGB/BGR) from the GIF decoder?
Not only that, the WIC documentation says that the GIF decoder should return indexes into a color table (GUID_WICPixelFormat8bppIndexed) rather than actual pixel values. Is it possible some software on my machine installed its own buggy GIF decoder that supersedes the one that comes with Windows 10?
To query the photo orientation for formats that support it, you should use the System.Photo.Orientation photo metadata policy (or one of the file-format-specific metadata query paths) through the IWICMetadataQueryReader interface.
As for GetPixelFormat() reporting "incorrect" pixel format, it is right there in the Remarks section:
The pixel format returned by this method is not necessarily the pixel format the image is stored as. The codec may perform a format conversion from the storage pixel format to an output pixel format.
The native byte order of image bitmaps under Windows is BGRA, so that is what you are getting from the decoder. If you want the image in a different format, you need to use IWICImagingFactory::CreateFormatConverter() to create a format converter and convert the image data before copying.
Finally, GIF doesn't have orientation metadata because it is always encoded from top to bottom. The most likely reason you are getting a vertically inverted image is that you are reading it directly from the decoder -- try calling CopyPixels() on the converter instead.
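If some consumer still needs the opposite row order after the pixels have been copied out, reversing the rows in the buffer is cheap. This is only a generic sketch and has nothing to do with WIC itself; the function name is hypothetical, and stride is assumed to be the number of bytes per row including any padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reverse the row order of an image buffer in place.
// stride is the number of bytes per row, padding included.
void flipRowsInPlace(std::uint8_t* pixels, std::size_t stride, std::size_t height) {
    for (std::size_t top = 0, bottom = height; top + 1 < bottom; ++top) {
        --bottom;
        // Swap the current top row with the current bottom row.
        std::swap_ranges(pixels + top * stride,
                         pixels + (top + 1) * stride,
                         pixels + bottom * stride);
    }
}
```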
I have extracted the depth maps of 2 images and stored them as .tif files.
Now I would like to use OpenGL to join these two images depending on their depth.
So I want to read the depth for each image from the .tif file and then use that depth to draw the pixel with the higher depth.
To make it clearer, the depth maps are two images like this:
link
So say I have the previous image and I want to join it with this image:
link
My question is how to read this depth from the .tif file.
Ok, I'll have a go ;-)
I see the images are just grayscale, so if the "depth" information is just the intensity of the pixel, "joining" them may be just a matter of combining the pixels. This is generally referred to as "blending", but I don't know what else you could mean.
So, you need to:
1. Read the 2 images into memory
2. For each pixel (assuming both images are the same size):
   - read the intensity from image A[row,col]
   - read the intensity from image B[row,col]
   - write max(A[row,col],B[row,col]) to C[row,col]
3. Save image C - this is your new "joined" image.
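The per-pixel part of those steps could look roughly like this. The sketch assumes both images have already been decoded into 8-bit grayscale buffers of equal size (loading and saving the TIFF is left to whatever image library you pick); the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Combine two equally sized 8-bit grayscale images by keeping,
// for every pixel, the larger of the two intensities.
std::vector<std::uint8_t> joinByMax(const std::vector<std::uint8_t>& a,
                                    const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("images must have the same size");
    }
    std::vector<std::uint8_t> c(a.size());
    std::transform(a.begin(), a.end(), b.begin(), c.begin(),
                   [](std::uint8_t x, std::uint8_t y) { return std::max(x, y); });
    return c;
}
```

Because the operation is independent of the row and column position, the images can be treated as flat arrays here.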
Now OpenGL doesn't have any built-in support for loading/saving images, so you'll need to find a 3rd party library, like FreeImage or similar.
So, that's a lot of work. I wonder if you really want an OpenGL solution or are just assuming OpenGL would be good for graphics work. If the algorithm above is really what you want, you could do it in something like C# in a matter of minutes. It has built-in support for loading image files (in some formats) and for accessing pixels using the Bitmap class. And since you created these images yourself, you may not be bound to the TIFF format.
What is the easiest format to read a texture into OpenGL? Are there any good tutorials for loading image formats like jpg, png, or raw into an array which can be used for texture mapping (preferably without the use of a library like libpng)?
OpenGL itself knows nothing about common image formats (other than the natively supported S3TC/DXT compressed formats and the like, but they are a different story). You need to expand your source images into RGBA arrays. A number of formats and combinations are supported; choose the one that suits you, e.g. GL_ALPHA4 for masks, GL_RGB5_A1 for 1-bit transparency, GL_BGRA/GL_RGBA for full color, etc.
For me the easiest (not the fastest) way is PNG, for its lossless compression and full alpha support. I read the PNG and write the RGBA values into an array, which I then hand over to OpenGL texture creation. If you don't need alpha you may as well accept JPG or BMP. The pipeline is always: source -> expanded RGBA array -> OpenGL texture.
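For sources without alpha, the "expanded RGBA array" step is just padding every pixel with an opaque alpha byte. A minimal sketch, assuming the decoder already produced tightly packed 8-bit RGB; the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand tightly packed 8-bit RGB pixels to RGBA, using a constant alpha.
// Trailing bytes that do not form a whole pixel are ignored.
std::vector<std::uint8_t> expandRgbToRgba(const std::vector<std::uint8_t>& rgb,
                                          std::uint8_t alpha = 255) {
    std::vector<std::uint8_t> rgba;
    rgba.reserve(rgb.size() / 3 * 4);
    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3) {
        rgba.push_back(rgb[i]);      // R
        rgba.push_back(rgb[i + 1]);  // G
        rgba.push_back(rgb[i + 2]);  // B
        rgba.push_back(alpha);       // A
    }
    return rgba;
}
```

The resulting array is what you would pass as the pixel data to glTexImage2D with GL_RGBA and GL_UNSIGNED_BYTE.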
There is a handy OpenGL texture tutorial available at the link: http://www.nullterminator.net/gltexture.html
I'm having a few issues regarding how to render a PVR.
I'm confused about how to get the data from the PVR to screen. I have a window that is ready to draw a picture and I'm a bit stuck. What do I need to get from the PVR as parameters to then be able to draw a texture? With jpeg and pngs locally you can just load the image from a directory but how would the same occur for a PVR?
Depends what format the data inside the PVR is in. If it's a supported standard then just copy it to a texture with glTexSubImage2D(), otherwise you will need to decompress it into something OpenGL understands - like RGB or RGBA.
edit - OpenGL is a display library (well much much more than that), it doesn't read images, decode movies or do sound.
TGA files generally contain very simple uncompressed RGB or RGBA image data, so it should be trivial to decode the file, extract the image data and copy it directly to an OpenGL texture.
Since you tagged the question Qt, you can use QImage to load the TGA; see "Using QImage with OpenGL".
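To show how little there is to decode, here is a sketch that parses the 18-byte TGA header. It only accepts the simple case described above (uncompressed true-color, image type 2, 24 or 32 bits per pixel) and rejects everything else; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct TgaInfo {
    int width = 0;
    int height = 0;
    int bitsPerPixel = 0;
    bool topDown = false;        // bit 5 of the image descriptor byte
    std::size_t dataOffset = 0;  // index of the first pixel byte
};

// Parse the header of an uncompressed true-color TGA held in memory.
// Returns false for any other TGA variant or for a truncated file.
bool parseTgaHeader(const std::vector<std::uint8_t>& file, TgaInfo& info) {
    if (file.size() < 18) return false;
    const std::uint8_t idLength = file[0];
    const std::uint8_t colorMapType = file[1];
    const std::uint8_t imageType = file[2];
    if (colorMapType != 0 || imageType != 2) return false;
    info.width = file[12] | (file[13] << 8);   // little-endian 16-bit
    info.height = file[14] | (file[15] << 8);
    info.bitsPerPixel = file[16];
    if (info.bitsPerPixel != 24 && info.bitsPerPixel != 32) return false;
    info.topDown = (file[17] & 0x20) != 0;
    info.dataOffset = 18 + static_cast<std::size_t>(idLength);
    const std::size_t needed = static_cast<std::size_t>(info.width) *
                               info.height * (info.bitsPerPixel / 8);
    return file.size() >= info.dataOffset + needed;
}
```

Note that the pixel bytes in the file are stored in blue, green, red(, alpha) order, so on desktop OpenGL the data starting at dataOffset matches GL_BGR or GL_BGRA rather than GL_RGB or GL_RGBA.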