Render to swap chain using compute shader - c++

I'm trying to use a compute shader to render directly to the swap chain.
For this, I need to create the swapchain with the usage VK_IMAGE_USAGE_STORAGE_BIT.
The problem is that the swapchain needs to be created with the format VK_FORMAT_B8G8R8A8_UNORM or VK_FORMAT_B8G8R8A8_SRGB and neither of the 2 allows the format feature VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT with the physical device I use.
Do I said something wrong or it's impossible to render to the swapchain with compute shader using my configuration?

Vulkan imposes no requirement on the implementation that it permit direct usage of a swapchain image in a compute shader operation (FYI: "rendering" typically refers to a very specific operation; it's not something that happens in a compute shader). Therefore, it is entirely possible that the implementation will forbid you from using a swapchain image in a CS through various means.
If you cannot create a swapchain image in your preferred format, then your next best bet is to see if you can find a compatible format for an image view of the format you can use as a storage image. This however requires that the implementation support the KHR extension swapchain_mutable_format, and the creation flags for the swapchain must include VK_SWAPCHAIN_CREATE_MUTABLE_FORMAT_BIT_KHR as well as a VkImageFormatListCreateInfoKHR list of formats that you intend to create views for.
Also, given support, this would mean that your CS will have to swap the ordering of the data. And don't forget that, when you create the swapchain, you have to ask it if you can use its images as storage images (imageUsage). It may directly forbid this.

Related

Dealing with the catch 22 of object lifetimes in vulkan's device, surface, and swapchain in C++?

Background:
In order to even display to the screen you need to enable a "KHR" (khronos group extension) extension for presentation surfaces.
A surface, as far as I understand, is an abstraction of the windows/places images are displayed returned by your window software.
In vulkan you have a VkSurface (returned by your window software, ie GLFW), which has certain properties
These properties are needed in order to know if a Device is compatible with it. In other words, before a VkDevice is created (the actual logical view of the GPU which you can actually use to submit commands to), it needs to know about the Surface if you are going to use it, specifically in order to create a device with presentation queues that support that surface with the properties it has.
Once the device is created, you can create the swapchain, which is basically a series of buffers/attachments you actually use to render to.
Swapchain's however have a 1:1 relationship with surfaces. There can only ever be a single swapchain per surface at max.
Problem:
This is where I start running into issues. In my code-base, I codify this relationship in a member variable. A surface has a swapchain, which guarantees that you as the programmer can't accidentally create multiple swapchains per surface if you use my wrapper.
But, if we use this abstraction the following happens:
my::Surface surface = window.create_surface(...); //VkSurface wrapper
auto queue_family = physical_device.find_queue_family_that_matches(surface,...);
auto queue_create_list = {{queue_family, priority},...};
my::Device device = physical_device.create_device(...,queue_create_list,...);
my::swapchain_builder.swapchain_builder(device);
swapchain_builder.builder_pattern_x(...).builder_pattern_x(...)....;
surface.create_swapchain(swapchain_builder);
...
render loop{
}
...
//end of program
return 0;
//ERROR! device no longer exists to destroy swapchain!
}
Because the surface is created before the device, and because the swapchain is a member of the surface, on destruction the device is destroyed before the swapchain.
The "solution" I came up with in the mean time was this:
my::Device device; //device default constructible, but becomes a VK_NULL_HANDLE underneath
my::Surface surface = ...;
...
device = physical_device.create_device(...,queue_create_list,...);
...
surface.create_swapchain(swapchain_builder);
And this certainly works. The surface is destroyed before the device is, and thus so is the swapchain. But it leaves a bad taste in my mouth.
The whole reason I made swapchain a member was to eliminate bugs caused by multiple swapchains being created, my eliminating the option for the bug to exist in the first place, and remove the need for the user to think about the Vulkan Spec by encoding that requirement into my wrapper itself.
But Now the user has to remember to default initialize the device first... or they will get an esoteric error (not as good as the one I show here) unless they use validation layers.
Question:
Is there some way to encode this object relationship at compile time with out runtime declaration order issues?, is there maybe a better way to codify a 1:1 relationship in this scenario, such that the surface object could exist on it's own and RAII order would handle this?
Swapchain's however have a 1:1 relationship with surfaces. There can only ever be a single swapchain per surface at max.
That is not true. From the standard:
A native window cannot be associated with more than one non-retired swapchain at a time.
You can create multiple swapchains for a surface. However, when you create a new one, you have to provide the old one, and the old one becomes "retired". Images you have previously acquired from the retired swapchain can still be presented, but you cannot acquire images more from the swapchain.
This moves nicely into the next point: the user needs to be able to recreate the swapchain for a surface.
Swapchains can become invalid, perhaps due to user rescaling of a window or other things. When this happens, the user needs to recreate them. Whether you retire the old one or not, you're going to have to call the function to create one.
So if you want your surface class to store a swapchain, your API needs a way for the user to create a swapchain.
In short, your goal is wrong; users need the function you're trying to get rid of.

How do we display pixel data calculated in an OpenCL kernel to the screen using OpenGL?

I am interested in writing a real-time ray tracing application in c++ and I heard that using OpenCL-OpenGL interoperability is a good way to do this (to make good use of the GPU), so I have started writing a c++ project using this interoperability and using GLFW for window management. I should mention that although I have some coding experience, I do not have so much in c++ and have not worked with OpenCL or OpenGL before attempting this project, so I would appreciate it if answers are given with this in mind (that is, beginner-friendly terminology is preferred).
So far I have been able to get OpenCL-OpenGL interoperability working with an example using a vertex buffer object. I have also demonstrated that I can create image data with an RGBA array (at least on the CPU), send this to an OpenGL texture with glTexImage2D() and display it using glBlitFramebuffer().
My problem is that I don't know how to create an OpenCL kernel that is able to calculate pixel data such that it can be given as the data parameter in glTexImage2D(). I understand that to use the interoperability, we must first create OpenGL objects and then create OpenCL objects from these to write the data on as these objects share memory, so I am assuming I must first create an empty OpenGL array object then create an OpenCL array object from this to apply an appropriate kernel to which would write the pixel data before using the OpenGL array object as the data parameter in glTexImage2D(), but I am not sure what kind of object to use and have not seen any examples demonstrating this. A simple example showing how OpenCL can create pixel data for an OpenGL texture image (assuming a valid OpenCL-OpenGL context) would be much appreciated. Please do not leave any line out as I might not be able to fill in the blanks!
It's also very possible that the method I described above for implementing a ray tracer is not possible or at least not recommended, so if this is the case please outline an advised alternate method for sending OpenCL kernel calculated pixel data to OpenGL and subsequently drawing this to the screen. The answer to this similar question does not go into enough detail for me and the CL/GL interop link is not working. The answer mentions that this can be achieved using a renderbuffer rather than a texture, but it says at the bottom of the Khronos OpenGL wiki for Renderbuffer Objects that the only way to send pixel data to them is via pixel transfer operations but I can not find any straightforward explanation for how to initialize data this way.
Note that I am using OpenCL c (no c++ bindings).
From your second para you are creating an OpenCL context with a platform specific combination of GLX_DISPLAY / WGL_HDC and GL_CONTEXT properties to interoperate with OpenGL, and you can create a vertex buffer object that can be read/written as necessary by both OpenGL and OpenCL.
That's most of the work. In OpenGL you can copy any VBO into a texture with
glBindBuffer(GL_PIXEL_UNPACK_BUFER, myVBO);
glTexSubImage2D(GL_TEXTURE_2D, level, x, y, width, height, format, size, NULL);
with the NULL at the end meaning to copy from GPU memory (the unpack buffer) rather than CPU memory.
As with copying from regular CPU memory, you might also need to change the pixel alignment if it isn't 32 bit.

Oculus Rift / Vulkan : Write to swapchain with a compute shader

I would like to write to the swapchain generated by OVR with a compute shader.
The problem is that the images don't have the usage VK_IMAGE_USAGE_STORAGE_BIT.
The creation of the swapchain is done with ovr_CreateTextureSwapChainVk which ask for a flag BindFlags. I added the flag ovrTextureBind_DX_UnorderedAccess but still the images don't have the correct usage.
The problem is that the images don't have the usage VK_IMAGE_USAGE_STORAGE_BIT.
Then you cannot write to a swapchain image directly with a compute shader.
The display engine that provides the swapchain images has the right to decide how you may and may not use them. The only method of interaction which is required is the ability to use them as a color render target; everything else is optional.
So you will have to do this another way, perhaps by writing to an intermediate image and copying/rendering it to the swapchain image.

VTK OpenGL objects (3D texture) access from CUDA‏

Is there any proper way to access the low level OpenGL objects of VTK in order to modify them from a CUDA/OpenCL kernel using the openGL-CUDA/OpenCL interoperability feature?
Specifically, I would want to get the GLuint (or unsigned int) member from vtkOpenGLGPUVolumeRayCastMapper that points to the Opengl 3D Texture object where the dataset is stored, in order to bind it to a CUDA Surface to be able to access and modify its values from a CUDA kernel implemented by me.
For further information, the process that I need to follow is explained here:
http://rauwendaal.net/2011/12/02/writing-to-3d-opengl-textures-in-cuda-4-1-with-3d-surface-writes/
where the texID object used there (in Steps 1 and 2) is the equivalent to what I want to retrieve from VTK.
At a first look at the vtkOpenGLGPUVolumeRayCastMapper functions, I don't find an easy way to do this, rather than maybe creating a vtkGPUVolumeRayCastMapper subclass, but even in that case I am not sure what should I modify exactly, since I guess that some other members depend on the 3D Texture values, and should be also updated after modifying it.
So, do you know some way to do this?
Lots of thanks.
Subclassing might work, but you could probably avoid it if you wanted. The important thing is that you get the order of the GL/CUDA API calls in the right order.
First, you have to register the texture with CUDA. This is done using:
cudaGraphicsGLRegisterImage(&cuda_graphics_resource, texture_handle,
GL_TEXTURE_3D, cudaGraphicsRegisterFlagsSurfaceLoadStore);
with the stipulation that texture_handle is a GLuint written to by a call to glGenTextures(...)
Once you have registered the texture with CUDA, you can create the surface which can be read or written to in your kernel.
The only thing you have to worry about from here is that vtk does not use the texture in between a call to cudaGraphicsMapResources(...) and cudaGraphicsUnmapResources(...). Everything else should just be standard CUDA.
Also once you map the texture to CUDA and write to it within a kernel, there is no additional work besides unmapping the texture. GL will get the modified texture the next time it is used.

OpenGL: Reusing the same texture with different parameters

In my program I have a texture which is used several times in different situations. In each situation I need to apply a certain set of parameters.
I want to avoid having to create an additional buffer and essentially creating a copy of the texture for every time I need to use it for something else, so I'd like to know if there's a better way?
This is what sampler objects are for (available in core since version 3.3, or using ARB_sampler_objects). Sampler objects separate the texture image from its parameters, so you can use one texture with several parameter sets. That functionality was created with exactly your problem in mind.
Quote from ARB_sampler_objects extension spec:
In unextended OpenGL textures are considered to be sets of image data (mip-chains, arrays, cube-map face sets, etc.) and sampling state (sampling mode, mip-mapping state, coordinate wrapping and clamping rules, etc.) combined into a single object. It is typical for an application to use many textures with a limited set of sampling states that are the same between them. In order to use textures in this way, an application must generate and configure many texture names, adding overhead both to applications and to implementations. Furthermore, should an application wish to sample from a texture in more than one way (with and without mip-mapping, for example) it must either modify the state of the texture or create two textures, each with a copy of the same image data. This can introduce runtime and memory costs to the application.