Difference between the glTexSubImage and glTexImage functions in OpenGL

What is the difference between the two functions?
Any performance difference?
Thanks.

You create a texture with glTexImage, and then update its contents with glTexSubImage. An update can cover the entire texture or just a sub-rectangle of it.
It is far more efficient to create a texture once and update its contents than to repeatedly create and delete it, so if you have a texture you want to update, always use glTexSubImage after the initial creation.
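For illustration, a minimal sketch of this pattern (the size, format, offsets, and the pixels pointer are placeholder assumptions):

    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    // Allocate storage once; passing NULL leaves the contents undefined for now.
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);

    // Per update: replace just a 64x64 region at (16, 16),
    // without reallocating the texture.
    glTexSubImage2D(GL_TEXTURE_2D, 0, 16, 16, 64, 64,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels);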
Other techniques may be applicable for texture updates. For example, see this article on texture streaming for further information.
(Originally, this post suggested using glMapBuffer for texture updates - see discussion below.)

The GL functions with "sub" in the name aren't limited to power-of-two dimensions. As gavinb points out, you need to use the non-sub variant once to set the overall dimensions, but I don't agree that calling the non-sub variant repeatedly is any slower than using "sub" for updates -- the GPU is still free to overwrite the existing texture in place as long as you use the same texture id.

Related

Read and Write in one Texture (OpenGL)

I want to store and update information in a texture. The idea is that I create a new texture with the current information. While storing it during the render pass, I actually want to read the information out of the same pixel and store a weighted average of both values: the value that was rendered to that pixel and the value that was already there.
Now, I have read very often that you cannot read and write the same texture. Might it be possible after all? And if not, should I copy the texture before the rendering step and pass the copy to the shader? If so, how can I copy the texture? Or should I do an extra rendering step for the copying?
I see two possible options here, depending on the mixing equation:
Alpha Blending: If the mixing equation can be mapped onto one of the glBlendFunc configurations, then this is the way to go. If you want linear factors for the stored and the new value, this should be possible; see the blending sketch below. This is also the option where I would expect the best performance.
Image Load Store: With this method one can read and write to the same texture at the same time (see here). Performance will usually be very bad here, and you will have to use image atomic operations to ensure that multiple fragments at the same location always read the correct value.
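For the blending option, a constant-weight average, result = w * incoming + (1 - w) * stored, can be expressed directly with the fixed-function blender. A minimal sketch (the weight 0.25 is an arbitrary placeholder):

    glEnable(GL_BLEND);
    glBlendEquation(GL_FUNC_ADD);
    // result = w * incoming + (1 - w) * stored, with w in the constant blend colour
    glBlendColor(0.25f, 0.25f, 0.25f, 0.25f);
    glBlendFunc(GL_CONSTANT_COLOR, GL_ONE_MINUS_CONSTANT_COLOR);

For the image load/store option, the host-side setup might look like the following (the image unit, texture name, format, and drawScene() are assumptions; the shader would access the texture through a matching image uniform using imageLoad/imageStore or imageAtomic* calls):

    // Bind level 0 of 'tex' to image unit 0 for read/write access from shaders.
    glBindImageTexture(0, tex, 0, GL_FALSE, 0, GL_READ_WRITE, GL_R32UI);
    drawScene();  // hypothetical draw call whose shaders read and write the image
    // Make the image writes visible to subsequent reads.
    glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);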
Copying the texture would, in my opinion, only work if you render an image and then perform one weighted-average computation on it afterwards (otherwise you would have to copy the texture after each store operation). But if that is the case, one could simply render the result of the average computation to a different texture and completely avoid the trouble of copying the input data.
If resorting to an extension is an option, you can use NV_texture_barrier, which allows writing to and reading from the same texture.

VTK OpenGL objects (3D texture) access from CUDA

Is there any proper way to access the low-level OpenGL objects of VTK in order to modify them from a CUDA/OpenCL kernel using the OpenGL-CUDA/OpenCL interoperability features?
Specifically, I want to get the GLuint (or unsigned int) member of vtkOpenGLGPUVolumeRayCastMapper that points to the OpenGL 3D texture object where the dataset is stored, in order to bind it to a CUDA surface and be able to access and modify its values from a CUDA kernel implemented by me.
For further information, the process that I need to follow is explained here:
http://rauwendaal.net/2011/12/02/writing-to-3d-opengl-textures-in-cuda-4-1-with-3d-surface-writes/
where the texID object used there (in Steps 1 and 2) is the equivalent to what I want to retrieve from VTK.
From a first look at the vtkOpenGLGPUVolumeRayCastMapper functions, I don't see an easy way to do this other than perhaps creating a vtkGPUVolumeRayCastMapper subclass, but even in that case I am not sure exactly what I should modify, since I guess that some other members depend on the 3D texture's values and would also have to be updated after modifying it.
So, do you know some way to do this?
Many thanks.
Subclassing might work, but you could probably avoid it if you wanted. The important thing is to get the GL/CUDA API calls in the right order.
First, you have to register the texture with CUDA. This is done using:
    cudaGraphicsGLRegisterImage(&cuda_graphics_resource, texture_handle,
                                GL_TEXTURE_3D,
                                cudaGraphicsRegisterFlagsSurfaceLoadStore);
with the stipulation that texture_handle is a GLuint obtained from a call to glGenTextures(...).
Once you have registered the texture with CUDA, you can create the surface which can be read or written to in your kernel.
The only thing you have to worry about from here is that VTK does not use the texture between the calls to cudaGraphicsMapResources(...) and cudaGraphicsUnmapResources(...). Everything else should just be standard CUDA.
Also, once you have mapped the texture into CUDA and written to it from a kernel, there is no additional work besides unmapping it; GL will see the modified texture the next time it is used.
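Putting the steps together, a per-update sketch in CUDA C++ (the kernel, the launch configuration, and the resource/texture names are placeholders; error checking omitted):

    #include <cuda_runtime.h>
    #include <cuda_gl_interop.h>

    __global__ void modifyVolume(cudaSurfaceObject_t surf) {
        // Write 1.0f to this thread's voxel; assumes a float 3D texture.
        // Note that surf3Dwrite takes the x coordinate in bytes.
        surf3Dwrite(1.0f, surf, threadIdx.x * sizeof(float), threadIdx.y, threadIdx.z);
    }

    void updateTexture(cudaGraphicsResource* cuda_graphics_resource) {
        // GL must not use the texture between map and unmap.
        cudaGraphicsMapResources(1, &cuda_graphics_resource, 0);

        cudaArray_t array;
        cudaGraphicsSubResourceGetMappedArray(&array, cuda_graphics_resource, 0, 0);

        // Wrap the mapped array in a surface object the kernel can write to.
        cudaResourceDesc desc = {};
        desc.resType = cudaResourceTypeArray;
        desc.res.array.array = array;
        cudaSurfaceObject_t surf;
        cudaCreateSurfaceObject(&surf, &desc);

        modifyVolume<<<1, dim3(8, 8, 8)>>>(surf);

        cudaDestroySurfaceObject(&surf);
        // After unmapping, GL can use the updated texture again.
        cudaGraphicsUnmapResources(1, &cuda_graphics_resource, 0);
    }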

Most performant way to clear RWTexture2D

I was wondering what would be the quickest / most performant way to clear a 2D texture using DirectX 11?
Context: I am using an RWTexture object as the head pointer to implement per-pixel linked lists on the GPU (essentially, to implement Order-Independent Transparency as known from the AMD tech demo), and I need to reset this buffer to a fixed value every frame.
The following ideas come to my mind:
Declare it as a render target and use ClearRenderTargetView to set it. This seems unnatural to me since I don't actually render to it directly; also, I am not sure whether it actually works with a uint datatype.
Actually bind it as a render target and render a fullscreen quad, using the pixel shader to set the value.
Use a compute shader to set the fixed value
Is there some obvious way I am missing or an API for this that I am not aware of?
As pointed out by user galop1n, ClearUnorderedAccessViewUint and ClearUnorderedAccessViewFloat are the way to go here.
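For reference, the call on the immediate context might look like this (context and uav, the ID3D11UnorderedAccessView of the head-pointer texture, are assumptions; 0xFFFFFFFF is a typical end-of-list marker):

    // Reset every texel of the head-pointer texture in a single API call.
    const UINT clearValue[4] = { 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF, 0xFFFFFFFF };
    context->ClearUnorderedAccessViewUint(uav, clearValue);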

How to do 3D selection/picking using OpenGL

I have some objects in the scene, some of which may occlude others. When I click the mouse or drag-select to get a selection rectangle, I want to select/pick only the objects that I can see from this perspective. The application currently uses the GL_SELECT render mode, but as we know, this selects occluded objects too. Also, I read that this is deprecated in OpenGL 3.
There are two methods that currently appeal to me. The first is object selection using the back buffer (Red Book, chapter 14): setting the colour of each object to its object id and reading the colour of pixels from the back frame buffer. The second is occlusion queries (OpenGL SuperBible, 4th ed., chapter 13).
Other approaches I have ruled out are looking at the min/max z values in the selection buffer and doing custom ray/object detection outside of GL.
I have some questions:
1) If GL_SELECT is deprecated in recent OpenGL, what alternatives are developers supposed to be using?
2) I've only ever read about occlusion queries being employed to speed up rendering. Can they be used for selection/picking, and are there drawbacks?
3) The existing application has a handful of glColorXXX calls. If I go the back-buffer route and use glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE), will this effectively turn the glColorXXX calls into calls that have no effect, thereby letting me control the colour in a single place when rendering in select mode?
4) Which route is best/canonical?
I decided to implement the selection using the back buffer. Here's my attempt to answer my questions:
If GL_SELECT is deprecated in recent OpenGL, what alternatives are developers supposed to be using?
I think it's best to not employ OpenGL to do this task but to use spatial acceleration structures as user chamber85 suggested in comments to the original question.
I've only ever read about occlusion queries being employed to speed up rendering. Can they be used for selection/picking, and are there drawbacks?
I'm sure they could be, but one would need to know all the objects to query for occlusion before the draw. Using the back buffer and colour selection, one can just see what is under the cursor or within a rectangular region and filter from there.
The existing application has a handful of glColorXXX calls. If I go the back-buffer route and use glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE), will this effectively turn the glColorXXX calls into calls that have no effect, thereby letting me control the colour in a single place when rendering in select mode?
The answer is no. Calling glColorMask() with all GL_FALSE parameters does not stop glColor3ub() calls (for example) from being honoured; it simply specifies a mask applied to colours just before they are written to the colour buffer. The original thought was to set the colour to the object id and then call glColorMask() to ignore all subsequent glColorXXX() calls. This strategy is doomed, as the colour representing the object id would also be masked out.
Which route is best/canonical?
I would say the back buffer colour selection is generally best as it doesn't require setting up the occlusion queries before/during the draw.
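A minimal sketch of the back-buffer colour-id approach, in the legacy GL style the question uses (the object list, draw call, mouse coordinates, and viewport height are assumptions; ids must fit in 24 bits here):

    // Draw each pickable object in a flat, unique colour encoding its id.
    glDisable(GL_LIGHTING);   // anything that alters the final colour must be off
    glDisable(GL_DITHER);
    for (const Object& obj : objects) {
        GLuint id = obj.id;
        glColor3ub(id & 0xFF, (id >> 8) & 0xFF, (id >> 16) & 0xFF);
        obj.draw();
    }

    // Read back the pixel under the cursor (GL's origin is the bottom-left corner).
    GLubyte pixel[3];
    glReadBuffer(GL_BACK);
    glReadPixels(mouseX, viewportHeight - mouseY - 1, 1, 1,
                 GL_RGB, GL_UNSIGNED_BYTE, pixel);
    GLuint pickedId = pixel[0] | (pixel[1] << 8) | (pixel[2] << 16);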

Read a framebuffer texture like a 1D array

I am doing some GPGPU calculations with GL and want to read my results from the framebuffer.
My framebuffer texture is logically a 1D array, but I made it 2D to get a bigger area. Now I want to read a run of any given length starting at an arbitrary pixel of the framebuffer texture.
That means all calculations are already done on the GPU side, and I only need to transfer certain data to the CPU, data which may wrap around the edge of the texture.
Is this possible? If yes is it slower/faster than glReadPixels on the whole image and then cutting out what I need?
EDIT
Of course I know about OpenCL/CUDA, but they are not desirable because I want my program to run out of the box on (almost) any platform.
Also, I know that glReadPixels is very slow, and one reason might be that it offers functionality that I do not need (operating in 2D). Therefore I asked for a more basic function that might be faster.
Reading the whole framebuffer with glReadPixels just to discard all but a few pixels/lines would be grossly inefficient. But glReadPixels lets you specify a rectangle within the framebuffer, so why not restrict it to fetching the few rows of interest? You may end up fetching some extra data at the start and end of the first and last lines, but I suspect the overhead of that is minimal compared with making multiple calls.
Possibly writing your data to the framebuffer in tiles and/or using Morton order might help structure it so that a tighter bounding box can be found and the amount of extra data retrieved is minimised.
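If Morton order is used, the index for a texel could be computed with standard bit interleaving; a sketch for 16-bit coordinates:

    #include <cstdint>

    // Spread the low 16 bits of v so they occupy the even bit positions.
    uint32_t part1By1(uint32_t v) {
        v &= 0x0000FFFF;
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    }

    // Z-order index: bits of x and y interleaved, with x in the even positions.
    uint32_t morton2D(uint16_t x, uint16_t y) {
        return part1By1(x) | (part1By1(y) << 1);
    }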
You can use a pixel buffer object (PBO) to transfer pixel data from the framebuffer to the PBO, then use glMapBufferARB to read the data directly:
http://www.songho.ca/opengl/gl_pbo.html
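A minimal sketch of that approach (width, height, and the format are placeholders; note that with a PACK buffer bound, the last glReadPixels argument is a byte offset into the PBO rather than a client pointer):

    GLuint pbo;
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4, NULL, GL_STREAM_READ);

    // Starts an asynchronous copy into the PBO and returns immediately.
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);

    // Later (ideally a frame later, so the copy has finished), map and read it.
    if (const void* data = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY)) {
        // ... consume the pixel data on the CPU ...
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);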