Editable Texture with OpenGL

I'm trying to take advantage of a GPU's parallelism to build an image processing application. I have a shader that takes two textures and, based on some uniform variables, computes an output texture. But instead of a transparency (alpha) value, each texture pixel needs an extra metadata byte that is mandatory in the computation.
So I consider running the shader twice each frame, once to compute the Dynamic Metadata as a single byte texture, and once to calculate the resulting Paint Texture, which I need to be 3 bytes (to limit memory usage, as there might be quite some such textures loaded at once).
I find the above problem a bit complicated. I've used OpenGL to paint to
the screen, but I need to paint to two different textures this time,
which I do not know how to do. Besides, gl_FragColor built-in variable's
type is vec4, but I need different output values.
So, to sum it up a little, is it possible for the fragment shader to output
anything other than a vec4?
Is it possible to save to two different textures with a single call?
Is it possible to make an editable texture to store changes, until the editing ends and the data have to be passed back to the cpu?
What OpenGL calls would be most useful for the above?
Paint texture should also be able to be retrieved to be shown on the screen.
The above could very easily be done via blitting textures on the cpu.
I could keep all the relevant data on the cpu, do all the work 60 times/sec,
and update the relevant texture by passing the data from the cpu to the gpu.
For changing relatively small regions of a texture each frame
(about 20% of the area of textures roughly 512x512 in size), would you consider the above approach worth the trouble?

It depends on which version of OpenGL you use.
The latest OpenGL 4+ does not have a gl_FragColor variable, and instead lets you write any number (up to supported maximum) of output colors from the fragment shader, each sent to the corresponding framebuffer color attachment:
layout(location = 0) out vec4 OUT0;
layout(location = 1) out float OUT1;
That will write OUT0 to GL_COLOR_ATTACHMENT0 and OUT1 to GL_COLOR_ATTACHMENT1 of the currently bound framebuffer, provided glDrawBuffers has been set up so that draw buffers 0 and 1 point at those attachments.
However, since you use gl_FragColor, you are on an older version of OpenGL. I'm not proficient in legacy OpenGL, but you can check whether your implementation supports the GL_ARB_draw_buffers extension and/or the gl_FragData[] output variable.
Also, as stated, it's unclear why you can't use a single RGBA texture and use its alpha channel for that metadata.
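To make the memory side of that last point concrete, here is a small C sketch of the arithmetic. It assumes tightly packed, uncompressed storage, and the function names are made up for illustration. Real drivers may pad 3-byte formats internally, which this does not model.

```c
#include <stddef.h>

/* Bytes for one uncompressed texture level, assuming tightly packed
   storage (any driver-side padding is NOT modeled here). */
size_t texture_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Layout from the question: a 3-byte paint texture plus a separate
   1-byte metadata texture. */
size_t split_layout_bytes(size_t width, size_t height)
{
    return texture_bytes(width, height, 3) + texture_bytes(width, height, 1);
}

/* Layout suggested in the answer: one RGBA texture, metadata in alpha. */
size_t packed_layout_bytes(size_t width, size_t height)
{
    return texture_bytes(width, height, 4);
}
```

Under these assumptions both layouts come to 4 bytes per pixel, so for a 512x512 texture they need the same 1,048,576 bytes. Splitting the metadata into its own texture does not save memory by itself.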


Detect single channel texture in pixel shader

Is it possible to detect when a format has a single channel in HLSL or GLSL? Or just as good, is it possible to extract a greyscale color from such a texture without knowing if it has a single channel or 4?
When sampling from texture formats such as DXGI_FORMAT_R8_*/GL_R8 or DXGI_FORMAT_BC4_UNORM, I am getting pure red RGBA values (g,0,0,1). This would not be a problem if I knew (within the shader) that the texture only had the single channel, as I could then flood the other channels with that red value. But doing anything of this nature would break the logic for color textures, requiring a separate compiled version for the grey sampling (for every texture slot).
Is it not possible to make efficient use of grey textures in modern shaders without specializing the shader for them?
The only solution I can come up with at the moment would be to detect the grey texture on the CPU side and generate a macro on the GPU side that selects a different compiled version of the shader for every texture slot. Doing this with 8 texture slots would add up to 8x8=64 compiled versions of every shader that wants to support grey inputs. That's not counting the other macro-like switches that actually make sense being there.
Just to be clear, I do know that I can load these textures into GPU memory as 4-channel greyscale textures, and go from there. But doing that uses 4X the memory, and I would rather load in 3 more textures.
In OpenGL there are two ways to achieve what you're looking for:
Legacy: The INTENSITY and LUMINANCE texture formats, when sampled, result in vec4(I,I,I,I) and vec4(L,L,L,1) respectively.
Modern: Use a swizzle mask to apply user defined channel swizzling per texture: glTexParameteriv(GL_TEXTURE_2D, GL_TEXTURE_SWIZZLE_RGBA, {GL_RED,GL_RED,GL_RED,GL_ONE});
In DirectX 12 you can use component mapping during the creation of a ShaderResourceView.
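As a rough illustration of what the swizzle mask does to a sampled texel, here is a CPU-side emulation in C. The enum and function names are hypothetical stand-ins, not the GL API. In a real program the mask is passed to glTexParameteriv as a pointer to an array of four GLint values rather than the brace shorthand used above.

```c
/* Hypothetical selectors playing the roles of GL_RED, GL_GREEN, GL_BLUE,
   GL_ALPHA, GL_ZERO and GL_ONE in a swizzle mask. */
typedef enum {
    SWZ_RED, SWZ_GREEN, SWZ_BLUE, SWZ_ALPHA, SWZ_ZERO, SWZ_ONE
} Swizzle;

typedef struct { float r, g, b, a; } Texel;

/* Pick the source value for one output channel. */
static float pick(Texel t, Swizzle s)
{
    switch (s) {
    case SWZ_RED:   return t.r;
    case SWZ_GREEN: return t.g;
    case SWZ_BLUE:  return t.b;
    case SWZ_ALPHA: return t.a;
    case SWZ_ZERO:  return 0.0f;
    default:        return 1.0f; /* SWZ_ONE */
    }
}

/* Emulates what the shader sees after the per-texture swizzle is applied.
   mask[0..3] select the sources for the r, g, b and a outputs. */
Texel apply_swizzle(Texel sampled, const Swizzle mask[4])
{
    Texel out;
    out.r = pick(sampled, mask[0]);
    out.g = pick(sampled, mask[1]);
    out.b = pick(sampled, mask[2]);
    out.a = pick(sampled, mask[3]);
    return out;
}
```

With the mask {RED, RED, RED, ONE}, a single-channel sample (g,0,0,1) reaches the shader as (g,g,g,1). A color texture keeps the default mask and is unaffected, so the shader itself needs no specialization.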

how to retrieve z depth and color of a rendered pixel

I would like to retrieve the z height of each pixel of a rendered object in a scene.
I will need to retrieve the rendered color too.
What OpenGL techniques should I use?
glReadPixels and CPU side code
Use glReadPixels to obtain both the RGB and depth buffers. Here are examples for both:
depth buffer got by glReadPixels is always 1
OpenGL Scale Single Pixel Line
That will read the buffers into CPU accessible memory. This way is slow (due to sync) but should work on any platform.
FBO render to texture and GPU shader
A faster method is to use an FBO, render to a texture, and use that output as the input texture of the next rendering pass, computing your stuff inside shaders. This, however, may not run properly on Intel hardware and might need additional tweaking of the code between nVidia and AMD.
If you have per-pixel output, use a single QUAD covering your screen as the second rendering pass.
If you have a single output for the whole screen, render a single POINT instead and compute everything in the fragment shader (scanning the whole texture inside), something like this:
How to implement 2D raycasting light effect in GLSL
The difference is that by using shaders and an FBO you are not transferring data between GPU and CPU, so it's much faster.
The content of the target textures can still be read by the CPU using texture-related GL functions.
compute GPU shaders
There are also compute shaders, but I have not used them yet, so I am just guessing. With them it might be possible to do your stuff in a single pass, and the form of the result and the computation should be less limiting.
My bet is that you are doing some post processing similar to Deferred Shading so googling such topic/tutorials might help.

Fast line drawing in OpenGL

I am working on a project that requires drawing a lot of data as it is acquired by an ADC...something like 50,000 lines per frame on a monitor 1600 pixels wide. It runs great on a system with a 2007-ish Quadro FX 570, but basically can't keep up with the data on machines with Intel HD 4000 class chips. The data load is 32 channels of 200 Hz data received in batches of 5 samples per channel 40 times per second. So, in other words, the card only needs to achieve 40 frames per second or better.
I am using a single VBO for all 32 channels with space for 10,000 vertices each. The VBO is essentially treated like a series of ring buffers for each channel. When the data comes in, I decimate it based on the time scale being used. So, basically, it tracks the min/max for each channel. When enough data has been received for a single pixel column, it sets the next two vertices in the VBO for each channel and renders a new frame.
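The per-column min/max decimation described above could look roughly like the following C sketch. The struct and function names are hypothetical, and it assumes a fixed number of samples per pixel column. The original code is not shown, so this is only one way to do it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Per-channel accumulator (hypothetical layout). */
typedef struct {
    size_t samples_per_column; /* samples that map onto one pixel column */
    size_t count;              /* samples accumulated for this column */
    float min, max;            /* running extremes for this column */
} Decimator;

void decimator_init(Decimator *d, size_t samples_per_column)
{
    d->samples_per_column = samples_per_column;
    d->count = 0;
    d->min = 0.0f;
    d->max = 0.0f;
}

/* Feed one sample. Returns true once a pixel column is complete; then
   *out_min and *out_max hold the y values for that column's two
   vertices, and the accumulator restarts for the next column. */
bool decimator_push(Decimator *d, float sample, float *out_min, float *out_max)
{
    if (d->count == 0) {
        d->min = d->max = sample;
    } else {
        if (sample < d->min) d->min = sample;
        if (sample > d->max) d->max = sample;
    }
    d->count++;
    if (d->count < d->samples_per_column)
        return false;
    *out_min = d->min;
    *out_max = d->max;
    d->count = 0;
    return true;
}
```

Each completed column yields exactly two vertices per channel, which is what keeps the vertex count tied to the screen width rather than to the raw sample rate.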
I use glMapBuffer() to access the data once, update all of the channels, use glUnmapBuffer, and then render as necessary.
I manually calculate the transform matrix ahead of time (using an orthographic transform calculated in a non-generic way to reduce multiplications), and the vertex shader looks like:
#version 120
varying vec4 _outColor;
uniform vec4 _lBound=vec4(-1.0);
uniform vec4 _uBound=vec4(1.0);
uniform mat4 _xform=mat4(1.0);
attribute vec2 _inPos;
attribute vec4 _inColor;
void main()
{
    // pass the per-vertex color on to the fragment shader
    _outColor=_inColor;
    gl_Position=clamp(_xform*vec4(_inPos, 0.0, 1.0), _lBound, _uBound);
}
The _lBound, _uBound, and _xform uniforms are updated once per channel. So, 32 times per frame. The clamp is used to limit certain channels to a range of y-coordinates on the screen.
The fragment shader is simply:
#version 120
varying vec4 _outColor;
void main()
{
    gl_FragColor=_outColor;
}
There is other stuff being rendered to the screen (channel labels, for example, using quads and a texture atlas), but profiling in gDEBugger seems to indicate that the line rendering takes the overwhelming majority of time per frame.
Still, 50,000 lines does not seem like a horrendously large number to me.
So, after all of that, the question is: are there any tricks to speeding up line drawing? I tried rendering them to the stencil buffer and then clipping a single quad, but that was slower. I thought about drawing the lines to a texture, then drawing a quad with the texture. But that does not seem scalable or even faster, due to uploading large textures constantly. I saw a technique that stores the y values in a single-row texture, but that seems more like a memory optimization than a speed optimization.
Mapping a VBO might slow you down because the driver might need to sync the GPU with the CPU. A more performant way is to just throw your data onto the GPU, so the CPU and GPU can run more independently.
Recreate the VBO every time, and create it with STATIC_DRAW.
If you need to map your data, map it as write-only (GL_WRITE_ONLY), NOT as readable.
Thanks, everyone. I finally settled on blitting between framebuffers backed by renderbuffers. Works well enough. Many suggested using textures, and I may go that route in the future if I eventually need to draw behind the data.
If you're just scrolling a line graph (GDI style), just draw the new column on the CPU and use glTexSubImage2D to update a single column in the texture. Draw it as a pair of quads and update the st coordinates to handle scrolling/wrapping.
If you need to update all the lines all the time, use a VBO created with GL_DYNAMIC_DRAW and use glBufferSubData to update the buffer.
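For the scrolling-texture suggestion, the bookkeeping for the column to overwrite and for the two quads might be sketched in C as follows. All names are hypothetical. It assumes the texture is used as a ring of columns, with screen x normalized to [0, 1] and the oldest data drawn on the left.

```c
/* One quad: a range of texture s coordinates mapped to a range of
   screen x, both normalized to [0, 1]. */
typedef struct {
    float s0, s1; /* texture s range */
    float x0, x1; /* screen x range */
} QuadSpan;

/* Index of the column the next glTexSubImage2D call would overwrite. */
unsigned next_column(unsigned long columns_written, unsigned tex_width)
{
    return (unsigned)(columns_written % tex_width);
}

/* Split the ring at `head` (the oldest column) into two quads so that
   data appears oldest-to-newest from left to right. When head == 0 the
   right quad has zero width and can be skipped. */
void scroll_quads(unsigned head, unsigned tex_width,
                  QuadSpan *left, QuadSpan *right)
{
    float split = (float)head / (float)tex_width;

    left->s0 = split;  left->s1 = 1.0f;
    left->x0 = 0.0f;   left->x1 = 1.0f - split;

    right->s0 = 0.0f;          right->s1 = split;
    right->x0 = 1.0f - split;  right->x1 = 1.0f;
}
```

Only one texture column is uploaded per new data column. The scrolling itself comes purely from changing the s coordinates, so no texels are moved.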

I need my GLSL fragment shader to return the distance calculation

I'm using some standard GLSL (version 120) vertex and fragment shaders to simulate LIDAR. In other words, instead of just returning a color at each x,y position (each pixel, via the fragment shader), it should return color and distance.
I suppose I don't actually need all of the color bits, since I really only want the intensity; so I could store the distance in gl_FragColor.b, for example, and use .rg for the intensity. But then I'm not entirely clear on how I get the value back out again.
Is there a simple way to return values from the fragment shader? I've tried varying, but it seems like the fragment shader can't write variables other than gl_FragColor.
I understand that some people use the GLSL pipeline for general-purpose (non-graphics) GPU processing, and that might be an option — except I still do want to render my objects normally.
OpenGL already returns this "distance calculation" via the depth buffer, although it's not linear. You can simply create a frame buffer object (FBO), attach colour and depth buffers, render to it, and you have the result sitting in the depth buffer (although you'll have to undo the depth transformation). This is the easiest option to program provided you are familiar with the depth calculations.
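Undoing the depth transformation can be sketched in C as below. It assumes a standard perspective projection (as built by glFrustum or gluPerspective) and the default glDepthRange(0, 1). The function name is made up for illustration.

```c
/* Convert a depth-buffer value in [0, 1] back to the positive eye-space
   distance along the view axis.
   Assumptions: standard perspective projection matrix, default
   glDepthRange(0, 1), and 0 < z_near < z_far. */
double linearize_depth(double depth, double z_near, double z_far)
{
    /* window-space depth [0, 1] -> normalized device coordinate [-1, 1] */
    double z_ndc = 2.0 * depth - 1.0;

    /* invert the projection's mapping of eye-space z to NDC z */
    return (2.0 * z_near * z_far) /
           (z_far + z_near - z_ndc * (z_far - z_near));
}
```

A depth of 0 maps back to the near plane and a depth of 1 to the far plane. Note that this is the distance along the view axis, not the Euclidean range to the point. For a LIDAR-style range you would still need to account for the pixel's direction.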
Another method, as you suggest, is storing the value in a colour buffer. You don't have to use the main colour buffer because then you'd lose your colour or have to render twice. Instead, attach a second render target (texture) to your FBO (GL_COLOR_ATTACHMENT1) and use gl_FragData[0] for normal colour and gl_FragData[1] for your distance (for newer GL versions you should be declaring out variables in the fragment shader). It depends on the precision you need, but you'll probably want to make the distance texture 32 bit float (GL_R32F and write to gl_FragData[1].r).
- This is a decent place to start: http://www.opengl.org/wiki/Framebuffer_Object
Yes, GLSL can be used for compute purposes, especially with ARB_image_load_store and nvidia's bindless graphics. You even have access to shared memory via compute shaders (though I've never got one faster than 5 times slower). As @Jherico says, fragment shaders generally output to a single place in a framebuffer attachment/render target, and recent features such as image units (ARB_image_load_store) allow you to write to arbitrary locations from a shader. It's probably overkill and slower, but you could also write your distances to a buffer via image units.
Finally, if you want the data back on the host (CPU accessible) side, use glGetTexImage with your distance texture (or glMapBuffer if you decided to use image units).
Fragment shaders output to a rendering buffer. If you want to use the GPU for computing and fetch the data back into host memory, you have a few options:
Create a framebuffer and attach a texture to it to hold your data. Once the image has been rendered you can read back information from the texture into host memory.
Use CUDA, OpenCL or an OpenGL compute shader to write the data into an arbitrary bound buffer, and read back the buffer contents.

How to create textures within GPU

Can anyone please tell me how to use hardware memory to create textures in OpenGL? Currently I'm running my game in windowed mode; do I need to switch to fullscreen to make use of the hardware?
If I can create textures in hardware, is there a limit on the number of textures (other than the hardware memory)? And how can I then cache my textures in hardware? Thanks.
This should be covered by almost all texture tutorials for OpenGL.
For every texture you first need a texture name. A texture name is like a unique index for a single texture. Every name points to a texture object that can have its own parameters, data, etc. glGenTextures is used to get new names. I don't know if there is any limit besides the uint range (2^32). If there is then you will probably get 0 for all new texture names (and a gl error).
The next step is to bind your texture (see glBindTexture). After that all operations that use or affect textures will use the texture specified by the texture name you used as parameter for glBindTexture. You can now set parameters for the texture (glTexParameter) and upload the texture data with glTexImage2D (for 2D textures). After calling glTexImage you can also free the system memory with your texture data.
For static textures all this has to be done only once. If you want to use the texture you just need to bind it again and enable texturing (glEnable(GL_TEXTURE_2D)).
The size (width/height) of a single texture is limited by GL_MAX_TEXTURE_SIZE. This is normally 4096, 8192 or 16384. It is also limited by the available graphics memory, because the texture has to fit into it together with other resources like the framebuffer or vertex buffers. All textures together can be bigger than the available memory, but then they will be swapped.
In most cases the graphics driver should decide which textures are stored in system memory and which in graphics memory. You can however give certain textures a higher priority with either glPrioritizeTextures or with glTexParameter.
I wouldn't worry too much about where textures are stored because the driver normally does a very good job with that. Textures that are used often are also more likely to be stored in graphics memory. If you set a priority, that's just a "hint" for the driver on how important it is for the texture to stay on the graphics card. It's also possible that the priority is completely ignored. You can also check where textures currently are with glAreTexturesResident.
Usually when you talk about generating a texture on the GPU, you're not actually creating texture images and applying them like normal textures. The simpler and more common approach is to use fragment shaders to procedurally calculate the color of each pixel in real time, from scratch, for every single frame.
The canonical example for this is to generate a Mandelbrot pattern on the surface of an object, say a teapot. The teapot is rendered with its polygons and texture coordinates by the application. At some stage of the rendering pipeline every pixel of the teapot passes through the fragment shader which is a small program sent to the GPU by the application. The fragment shader reads the 2D texture coordinates and calculates the Mandelbrot set color of the 2D coordinates and applies it to the pixel.
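The per-pixel computation such a fragment shader performs can be sketched on the CPU in C. This is the usual escape-time iteration, with the texture coordinate mapped to the complex number c = cx + i*cy. The function name and the choice to return an iteration count (which a shader would then map to a color) are assumptions for illustration.

```c
/* Escape-time iteration for one point c = cx + i*cy.
   Returns the number of iterations before |z| exceeds 2, or max_iter
   if the point never escapes (i.e. it is treated as inside the set). */
int mandelbrot_iterations(double cx, double cy, int max_iter)
{
    double zx = 0.0, zy = 0.0;
    int i = 0;

    /* iterate z = z*z + c while |z|^2 <= 4 */
    while (i < max_iter && zx * zx + zy * zy <= 4.0) {
        double next_zx = zx * zx - zy * zy + cx;
        zy = 2.0 * zx * zy + cy;
        zx = next_zx;
        i++;
    }
    return i;
}
```

A fragment shader would run the same loop once per pixel, using the interpolated texture coordinates as (cx, cy). That is why nothing needs to be stored in texture memory.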
Fullscreen mode has nothing to do with it. You can use shaders and generate textures even if you're in window mode. As I mentioned, the textures you create never actually occupy space in the texture memory, they are created on the fly. One could probably think of a way to capture and cache the generated texture but this can be somewhat complex and require multiple rendering passes.
You can learn more about it if you look up "GLSL" in google - the OpenGL shading language.
This somewhat dated tutorial shows how to create a simple fragment shader which draws the Mandelbrot set (page 4).
If you can get your hands on the book "OpenGL Shading Language, 2nd Edition", you'll find it contains a number of simple examples on generating sky, fire and wood textures with the help of an external 3D Perlin noise texture from the application.
To create a texture on GPU look into "render to texture" tutorials. There are two common methods: Binding a PBuffer context as texture, or using Frame Buffer Objects. PBuffer render to textures are the older method, and have the wider support. Frame Buffer Objects are easier to use.
Also, you don't have to switch to "fullscreen" mode for OpenGL to be HW accelerated. In fact, OpenGL doesn't know about windows at all. A fullscreen OpenGL window is just that: a toplevel window on top of all other windows, with no decorations and the input focus grabbed. Some drivers bypass window masking and clipping code and employ a simpler, faster buffer swap method if the window with the active OpenGL context covers the whole screen, thus gaining a little performance, but with current hardware and software the effect is very small compared to other influences.