I am working on OpenGL shaders. My current shaders take textures in RGB24 format and display them. I want to take Y420 (YUV 4:2:0) as input and convert it to RGB24 in the fragment shader. How should I proceed?
The capability to render YUV via OpenGL depends on the platform. Most platforms expose YUV streaming texture capability via extensions, for example GL_OES_EGL_image_external at http://www.khronos.org/registry/gles/extensions/OES/OES_EGL_image_external.txt
For eglImage-based streaming, you can refer to TEST16 in the sgxperf codebase at:
https://github.com/prabindh/sgxperf/blob/master/sgxperf_gles20_vg.cpp
Additionally, when you use these extensions, it is NOT necessary to do any conversions in the shader. The sampler (HW) already does the conversion to RGB before you process it in the shader.
For a complete application, though, you typically need additional mechanisms such as synchronisation with the display. If your vendor provides a gstreamer sink that integrates GL streaming functionality, that would be the best option.
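If the streaming extension is not available on your platform and you do have to convert yourself, the per-pixel arithmetic is small. Below is a minimal CPU-side sketch in Python of what a fragment shader would compute. It assumes planar I420 data (a full-resolution Y plane followed by quarter-size U and V planes) and BT.601 limited-range coefficients; both are assumptions, since the question does not say which layout or colour matrix applies, and the function names are made up for illustration.

```python
def clamp8(x):
    """Round to the nearest integer and clamp to the 0..255 byte range."""
    return max(0, min(255, int(round(x))))


def yuv_to_rgb(y, u, v):
    """Convert one BT.601 limited-range YUV sample to 8-bit RGB (assumed matrix)."""
    c = y - 16
    d = u - 128
    e = v - 128
    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.392 * d - 0.813 * e
    b = 1.164 * c + 2.017 * d
    return clamp8(r), clamp8(g), clamp8(b)


def i420_to_rgb24(data, width, height):
    """Convert a planar I420 frame (Y, then U, then V) to packed RGB24 bytes."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    u_offset = width * height
    v_offset = u_offset + chroma_w * chroma_h
    out = bytearray(width * height * 3)
    for row in range(height):
        for col in range(width):
            y = data[row * width + col]
            # Each chroma sample is shared by a 2x2 block of luma samples.
            chroma_index = (row // 2) * chroma_w + (col // 2)
            u = data[u_offset + chroma_index]
            v = data[v_offset + chroma_index]
            i = (row * width + col) * 3
            out[i:i + 3] = bytes(yuv_to_rgb(y, u, v))
    return bytes(out)
```

In a shader the same thing is usually done by sampling three single-channel textures (Y, U and V) and applying the matrix once per fragment.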
I'm trying to use a depth texture in a compute shader.
The depth texture is created with the format VK_FORMAT_D32_SFLOAT and with the usage VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT | VK_IMAGE_USAGE_STORAGE_BIT.
The problem is that this combination of parameters does not seem to be supported. I get this warning: vkCreateImageView(): pCreateInfo->format VK_FORMAT_D32_SFLOAT with tiling VK_IMAGE_TILING_OPTIMAL does not support usage that includes VK_IMAGE_USAGE_STORAGE_BIT.
Apart from this message, the program works well and the compute shader successfully reads the depth texture.
Is it possible to read a depth texture in a compute shader?
Yes, it's possible to read a 32-bit floating-point depth image in a compute shader. Just not as a storage image on your implementation.
Vulkan permits an implementation to refuse certain combinations of image formats and usages. They can refuse some formats entirely, while restricting other formats to only specific usages. As such, unless the format+usage combination you intend to use is on the Vulkan specification's list of required functionality, you must query support for it.
Vulkan doesn't require that implementations allow you to use D32 images as storage images. Therefore, you must check to see if a particular implementation provides this functionality.
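To make the check concrete: the feature flags for a format come from vkGetPhysicalDeviceFormatProperties, and for an optimally tiled image you compare the intended usage against optimalTilingFeatures. The Python sketch below only shows that bitmask comparison. The numeric constants are the values from the Vulkan headers, but the helper name and the example feature mask are hypothetical, not taken from any real device.

```python
# Image usage bits (values as defined in the Vulkan headers).
VK_IMAGE_USAGE_SAMPLED_BIT = 0x00000004
VK_IMAGE_USAGE_STORAGE_BIT = 0x00000008
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT = 0x00000010
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x00000020

# Format feature bits (values as defined in the Vulkan headers).
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT = 0x00000001
VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT = 0x00000002
VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT = 0x00000080
VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x00000200

# Each usage bit needs the matching format feature bit.
USAGE_TO_FEATURE = [
    ("VK_IMAGE_USAGE_SAMPLED_BIT",
     VK_IMAGE_USAGE_SAMPLED_BIT, VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT),
    ("VK_IMAGE_USAGE_STORAGE_BIT",
     VK_IMAGE_USAGE_STORAGE_BIT, VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT),
    ("VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT",
     VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT),
    ("VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT",
     VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT,
     VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT),
]


def unsupported_usages(usage, tiling_features):
    """Return the names of requested usage bits the format does not support.

    `tiling_features` stands in for VkFormatProperties.optimalTilingFeatures
    (or linearTilingFeatures) as reported by the driver.
    """
    return [name for name, usage_bit, feature_bit in USAGE_TO_FEATURE
            if usage & usage_bit and not tiling_features & feature_bit]


# Hypothetical device: D32_SFLOAT can be sampled and used as a depth
# attachment, but not as a storage image.
d32_features = (VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT
                | VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT)
wanted = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT | VK_IMAGE_USAGE_STORAGE_BIT
print(unsupported_usages(wanted, d32_features))
```

With the hypothetical mask above, only the storage usage is reported as unsupported, which matches the validation warning in the question.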
I am working on developing some FxPlug plugins for Motion and FCP X. Ultimately, I'd like to have them render in Metal as Apple is deprecating OpenGL.
I'm currently using CoreImage, and while I've been able to use the CoreImage functionality to do Metal processing outside of the FxPlug SDK, FxPlug only provides me the frame as an OpenGL texture. I've tried just passing this into the CoreImage filter, but I end up getting this error:
Cannot render image (with an input GL texture) using a metal-DG context.
After a bit of research, I found that I can supposedly use CVPixelBuffers to share textures between the two, but after trying to write code utilizing this method for a while, I've come to the belief that this was intended as a way to WRITE (as in, create from scratch) to a shared buffer, but not convert between. While this may be incorrect, I cannot find a way to get the existing GL texture to exist in a CVPixelBuffer.
TL;DR: I've found ways to get a resulting Metal or OpenGL texture FROM a CVPixelBuffer, but I cannot find a way to create a CVPixelBuffer from an existing OpenGL texture. My heart is not set on this method, as my ultimate goal is to simply convert from OpenGL to Metal, then back to OpenGL (ideally in an efficient way).
Has anyone else found a way to work with FxPlug with Metal? Is there a good way to convert from an OpenGL texture to Metal/CVPixelBuffer?
I have written an FxPlug that uses both OpenGL textures and Metal textures. The thing you're looking for is an IOSurface. They are textures that can be used with either Metal or OpenGL, though they have some limitations. As such, if you already have a Metal or OpenGL texture, you must copy it into an IOSurface to use it with the other system.
To create an IOSurface you can either use CVPixelBuffers (by including the kCVPixelBufferIOSurfacePropertiesKey) or you can directly create one using the IOSurface class defined in <IOSurface/IOSurfaceObjC.h>.
Once you have an IOSurface, you can copy your OpenGL texture into it by getting an OpenGL texture from the IOSurface via CGLTexImageIOSurface2D() (defined in <OpenGL/CGLIOSurface.h>). You then take that texture and use it as the backing texture for an FBO. You can, for example, draw a textured quad into it using the input FxTexture as the texture. Be sure to call glFlush() when done!
Next, take the IOSurface and create a MTLTexture from it via -[MTLDevice newTextureWithDescriptor:iosurface:plane:]. You'll want to create an output IOSurface to draw into and also create a MTLTexture from it. Do your Metal rendering into the output MTLTexture. Next, take the output IOSurface and create an OpenGL texture out of it via CGLTexImageIOSurface2D(). Now copy that OpenGL texture into the output FxTexture, either by using it as the backing of a texture-backed FBO or by whatever other method you prefer.
As you can see, the downside of this is that each render requires 2 copies - 1 of the input into an IOSurface and 1 of the output IOSurface into the output texture the app gives you. The other downside is that this is probably all moot, as with Apple having announced publicly that they're ending support for OpenGL, they're probably working on a Metal-based solution already. It may be extra work to do it all yourself. (Though the upside is that you can use that same code in other host applications that only support OpenGL.)
My question concerns the most efficient way of performing geometric image transformations on the GPU. The goal is essentially to remove lens distortion from acquired images in real time. I can think of several ways to do it, e.g. as a CUDA kernel (which would be preferable) doing an inverse transform lookup + interpolation, or the same in an OpenGL shader, or rendering a forward transformed mesh with the image texture mapped to it. It seems to me the last option could be the fastest because the mesh can be subsampled, i.e. not every pixel offset needs to be stored but can be interpolated in the vertex shader. Also the graphics pipeline really should be optimized for this. However, the rest of the image processing is probably going to be done with CUDA. If I want to use the OpenGL pipeline, do I need to start an OpenGL context and bring up a window to do the rendering, or can this be achieved anyway through the CUDA/OpenGL interop somehow? The aim is not to display the image, the processing will take place on a server, potentially with no display attached. I've heard this could crash OpenGL if bringing up a window.
I'm quite new to GPU programming, any insights would be much appreciated.
Using the forward transformed mesh method is the more flexible approach and the easier one to implement. Performance-wise, however, there's no big difference: the effective limit you're running into is memory bandwidth, and the amount of memory bandwidth consumed depends only on the size of your input image. Whether the transfer is caused by a fragment shader fed by vertices or by a CUDA texture access doesn't matter.
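For reference, the "inverse transform lookup + interpolation" variant from the question looks roughly like this when written out on the CPU. This is only a sketch in plain Python: it assumes a single-coefficient radial distortion model (the `k1` parameter is an assumption, a real calibration usually has more terms), a grayscale image stored as a list of rows, and clamping at the image border. A CUDA kernel or fragment shader would run the body of the inner loop once per output pixel.

```python
def bilinear(img, x, y):
    """Sample img (list of rows) at fractional (x, y) with bilinear interpolation."""
    h, w = len(img), len(img[0])
    # Clamp to the valid range, like GL_CLAMP_TO_EDGE would.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def undistort(img, k1):
    """For every output pixel, look up where it came from in the distorted image."""
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    norm = max(cx, cy) or 1.0  # normalise so r is about 1 at the image border
    out = []
    for v in range(h):
        row = []
        for u in range(w):
            dx, dy = (u - cx) / norm, (v - cy) / norm
            # Assumed radial model: source = centre + offset * (1 + k1 * r^2)
            scale = 1 + k1 * (dx * dx + dy * dy)
            row.append(bilinear(img, cx + dx * scale * norm, cy + dy * scale * norm))
        out.append(row)
    return out
```

The mesh approach computes the same mapping only at the mesh vertices and lets the rasteriser interpolate in between, which is why the per-pixel offsets do not need to be stored.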
If I want to use the OpenGL pipeline, do I need to start an OpenGL context and bring up a window to do the rendering,
On Windows: Yes, but the window can be an invisible one.
On GLX/X11 you need an X server running, but you can use a PBuffer instead of a window to get an OpenGL context.
In either case use a Framebuffer Object as the actual drawing destination. PBuffers may corrupt their primary framebuffer contents at any time. A Framebuffer Object is safe.
or can this be achieved anyway through the CUDA/OpenGL interop somehow?
No, because CUDA/OpenGL interop is for making OpenGL and CUDA interoperate, not for making OpenGL work from CUDA. CUDA/OpenGL interop helps you with the part you mentioned here:
However, the rest of the image processing is probably going to be done with CUDA.
BTW, maybe OpenGL compute shaders (available since OpenGL 4.3) would work for you as well.
I've heard this could crash OpenGL if bringing up a window.
OpenGL actually has no say in those things. It's just an API for drawing stuff on a canvas (canvas = window or PBuffer or Framebuffer Object), but it doesn't deal with actually getting a canvas on the scaffolding, so to speak.
Technically OpenGL doesn't care if there's a window or not. It's the graphics system on which the OpenGL context is created that cares. And unfortunately none of the currently existing GPU graphics systems supports true headless operation. NVidia's latest Linux drivers may allow for some crude hacks to set up a truly headless system, but I haven't tried that so far.
I noticed that in the new features listed for OpenGL 4.0 the following is included:
Drawing of data generated by OpenGL or external APIs such as OpenCL,
without CPU intervention.
What functionality exactly is this referring to?
It's talking about ARB_draw_indirect. That functionality, core in 4.0, allows the GL implementation to read the drawing parameters directly from the buffer object. So the parameters you would pass to glDrawArrays or glDrawElements come from the buffer, not from your Draw call.
This way, OpenCL or other GPGPU code can just write that struct into the buffer. And therefore, they can determine how many vertices to draw.
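As a rough illustration of "that struct": the command glDrawArraysIndirect reads from the buffer bound to GL_DRAW_INDIRECT_BUFFER is four tightly packed 32-bit unsigned integers. The sketch below packs them with Python's struct module just to show the layout. The helper name is hypothetical, and in practice the bytes would be written by an OpenCL kernel or another GPU pass rather than by the CPU.

```python
import struct

# Layout of the command read by glDrawArraysIndirect:
#   GLuint count;          number of vertices to draw
#   GLuint instanceCount;  number of instances
#   GLuint first;          index of the first vertex
#   GLuint baseInstance;   reserved (must be zero) in OpenGL 4.0
DRAW_ARRAYS_INDIRECT_FORMAT = "=4I"  # native byte order, no padding


def pack_draw_arrays_indirect(count, instance_count=1, first=0, base_instance=0):
    """Build the 16-byte command block for one indirect array draw."""
    return struct.pack(DRAW_ARRAYS_INDIRECT_FORMAT,
                       count, instance_count, first, base_instance)


command = pack_draw_arrays_indirect(count=36)
print(len(command), struct.unpack(DRAW_ARRAYS_INDIRECT_FORMAT, command))
```

The draw call itself then only carries the primitive mode and a byte offset into that buffer, so the vertex count never has to travel back to the CPU.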
AMD has a pretty nifty variation of this that allows for multi-draw functionality.
I've read that FBOs can be used for fast image manipulation using the OpenGL drawing actions. Does anyone know the basics of how to do this? or has some very simple example code illustrating it?
Before you can use FBOs for image manipulation you need to know how to handle OpenGL, as an FBO is simply a render target (an output buffer for rendering operations). Once you're fluent with OpenGL and know how to do shader programming, you can do virtually everything with images in an FBO, and do it extremely fast.
A simpler approach might be to employ CUDA (NVidia) or Stream Computing (ATI) to harness a GPU's power for image manipulation, because these APIs are much closer to regular array-based C++ programming. Image manipulation may be somewhat slower that way than with OpenGL, but still way faster than with traditional CPU driven code.
Framebuffer Objects (FBO) are just a basic tool that cannot be used to manipulate images directly. If you know how to render your image manipulations in OpenGL to the screen, you can then use FBOs to render them off-screen. So they are in fact useful for this task, since you are not limited by the resolution of your screen and don't have to distract the user with thousands of flashing images. However, the manipulation itself happens in OpenGL, probably in the fragment shader.
Visit the OpenGL forum for advice on how to start with OpenGL basics. They also have quite a few links to sample code.