Mixing Audio Using OpenGL - opengl

I want to mix two (or more) 16-bit audio streams using OpenGL and I need a bit of help.
Basically, what I want to do is put the audio data into a texture, draw it to a frame buffer object, and then read it back. That part is not a problem; drawing the data in a way that gives correct results is a bit more problematic.
I have basically two questions.
In order to mix the data by drawing, I need to use blending (alpha = 0.5), but the result should not have any alpha channel. So if I render to, e.g., a frame buffer with the format RGB, will alpha blending still work as I expect, with the resulting alpha not written to the FBO? (I want to avoid having to read back the FBO for each render pass.)
texture              |sR|sG|sB|
framebuffer (before) |dR|dG|dB|
framebuffer (after)  |dR*0.5+sR*0.5|dG*0.5+sG*0.5|dB*0.5+sB*0.5|
The audio samples are signed 16-bit integer values. Is it possible to do signed calculations this way? Or will I need to first convert the values to unsigned on the CPU, draw them, and then make them signed again on the CPU?
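As a rough CPU-side reference for the second question, the sketch below biases signed 16-bit samples into the unsigned range, averages them with the 0.5/0.5 weights from the table above, and removes the bias again. The helper names (to_unsigned, to_signed, blend_half) are made up for illustration, and plain integer arithmetic stands in for whatever the GPU does with normalized values, so it models the arithmetic only, not the blending hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: shift a signed 16-bit sample into 0..65535 and back.
inline std::uint16_t to_unsigned(std::int16_t s) {
    return static_cast<std::uint16_t>(static_cast<std::int32_t>(s) + 32768);
}

inline std::int16_t to_signed(std::uint16_t u) {
    return static_cast<std::int16_t>(static_cast<std::int32_t>(u) - 32768);
}

// dst*0.5 + src*0.5 on biased values; the sum is widened so it cannot overflow.
inline std::uint16_t blend_half(std::uint16_t dst, std::uint16_t src) {
    return static_cast<std::uint16_t>((static_cast<std::uint32_t>(dst) + src) / 2);
}
```

Because both inputs carry the same +32768 bias, their average carries it exactly once, so to_signed(blend_half(to_unsigned(a), to_unsigned(b))) yields the average of a and b (rounded down when the sum is odd).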
EDIT:
I was a bit unclear. My hardware is limited to OpenGL 3.3. I would prefer not to use CUDA or OpenCL, since I'm already using OpenGL for other stuff.
Each audio sample will be rendered in a separate pass, which means that it has to "mix" with what's already been rendered to the frame buffer. The problem is how the output from the pixel shader is written to the framebuffer (this blending is not accessible through programmable shaders, as far as I know, and one has to use glBlendFunc).
EDIT2:
Each audio sample will be rendered in different passes, so only one audio sample will be available in the shader at a time, which means that they need to be accumulated in the FBO.
foreach(var audio_sample in audio_samples)
    draw(audio_sample);
and not
for(int n = 0; n < audio_samples.size(); ++n)
{
    glActiveTexture(GL_TEXTURE0 + n);
    // texture target assumed to be GL_TEXTURE_2D
    glBindTexture(GL_TEXTURE_2D, audio_samples[n]);
}
draw_everything();

Frankly, why wouldn't you just use programmable pixel shaders for that?
Do you have to use OpenGL 1 Fixed Functionality Pipeline?
I'd just go with a programmable shader operating on signed 16bit grayscale linear textures.
Edit:
foreach(var audio_sample in audio_samples)
{
    blend FBO1 + audio_sample => FBO2
    swap FBO2, FBO1
}
It ought to be just as fast, if not faster (thanks to streaming pipelines).
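To make the ping-pong idea concrete, here is a CPU stand-in in which two float buffers play the roles of FBO1 and FBO2. The function name mix_ping_pong and the per-pass weight of 1/(k+1) are my own assumptions, not something stated in the thread. The varying weight is there because a fixed 0.5 does not weight the streams equally: if the first stream is copied in and each later one is blended at 0.5, three streams a, b, c end up as 0.25a + 0.25b + 0.5c.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// CPU stand-in for the ping-pong loop: `front` plays the role of FBO1 and
// `back` the role of FBO2. All streams are assumed to have the same length.
std::vector<float> mix_ping_pong(const std::vector<std::vector<float>>& streams) {
    if (streams.empty()) return {};
    std::vector<float> front(streams[0].size(), 0.0f);
    std::vector<float> back(front.size(), 0.0f);
    for (std::size_t k = 0; k < streams.size(); ++k) {
        // Weight 1/(k+1) keeps a running mean, so every stream ends up with
        // the same share. A fixed 0.5 would favour the streams drawn last.
        const float alpha = 1.0f / static_cast<float>(k + 1);
        for (std::size_t i = 0; i < front.size(); ++i) {
            back[i] = front[i] * (1.0f - alpha) + streams[k][i] * alpha;
        }
        std::swap(front, back);  // "swap FBO2, FBO1"
    }
    return front;
}
```

On the GPU, the inner loop would correspond to one draw call per stream, with alpha supplied as a uniform or as a constant blend color.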

I agree with QDot. Could you, however, tell us a bit about the hardware restrictions you are facing? If you have reasonably modern hardware, I might even suggest going the CUDA or OpenCL route instead of going through OpenGL.

You should be able to do blending even if the destination buffer does not have alpha. That said, rendering to non-power-of-two sizes (rgb16 = 6bytes/pixel) usually incurs performance penalties.
Signed is not your typical render target format, but it does exist in the OpenGL 4.0 specification (Table 3.12, called RGB16_SNORM or RGB16I, depending on whether you want a normalized representation or not).
As a side note, you also have glBlendFunc(GL_CONSTANT_ALPHA,GL_ONE_MINUS_CONSTANT_ALPHA) to not even have to specify an alpha per-pixel. That may not be available on all GL implementations though.

Related

Editable Texture with OpenGL

I'm trying to take advantage of a GPU's parallelism to make an image processing application. I have a shader which takes two textures and, based on some uniform variables, computes an output texture. But instead of a transparency (alpha) value, each texture pixel needs an extra metadata byte that is required for the computation.
So I consider running the shader twice each frame, once to compute the Dynamic Metadata as a single byte texture, and once to calculate the resulting Paint Texture, which I need to be 3 bytes (to limit memory usage, as there might be quite some such textures loaded at once).
I find the above problem a bit complicated. I've used OpenGL to paint to the screen, but this time I need to paint to two different textures, which I do not know how to do. Besides, the built-in gl_FragColor variable's type is vec4, but I need different output values.
So, to sum it up a little:
Is it possible for the fragment shader to output anything other than a vec4?
Is it possible to save to two different textures with a single call?
Is it possible to make an editable texture to store changes, until the editing ends and the data have to be passed back to the CPU?
What OpenGL calls would be most useful for the above?
The paint texture should also be retrievable so that it can be shown on the screen.
The above could very easily be done by blitting textures on the CPU. I could keep all the relevant data on the CPU, do all the work 60 times/sec, and update the relevant texture by passing the data from the CPU to the GPU. For changing relatively small regions of a texture each frame (about 20% of the total area of textures about 512x512 in size), would you consider the above approach worth the trouble?
It depends on which version of OpenGL you use.
The latest OpenGL 4+ does not have a gl_FragColor variable, and instead lets you write any number (up to supported maximum) of output colors from the fragment shader, each sent to the corresponding framebuffer color attachment:
layout(location = 0) out vec4 OUT0;
layout(location = 1) out float OUT1;
That will write OUT0 to GL_COLOR_ATTACHMENT0 and OUT1 to GL_COLOR_ATTACHMENT1 of the currently bound framebuffer.
However, since you use gl_FragColor, you must be using an older version of OpenGL. I'm not proficient in the legacy OpenGL versions, but you can check whether your implementation supports the GL_ARB_draw_buffers extension and/or the gl_FragData[] output variable.
Also, as stated, it's unclear why you can't use a single RGBA texture and use its alpha channel for that metadata.
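For illustration, this is roughly what the single-RGBA-texture layout looks like on the CPU side, with the metadata byte stored where alpha would normally go. The Texel struct, the function names, and the R, G, B, A byte order are assumptions made for this sketch, not taken from the question.

```cpp
#include <cassert>
#include <cstdint>

// One RGBA8 texel with the metadata byte stored where alpha would normally go.
struct Texel {
    std::uint8_t r, g, b, meta;
};

// Pack in R, G, B, A byte order (lowest byte = R); the order is an assumption.
inline std::uint32_t pack_texel(Texel t) {
    return static_cast<std::uint32_t>(t.r) |
           (static_cast<std::uint32_t>(t.g) << 8) |
           (static_cast<std::uint32_t>(t.b) << 16) |
           (static_cast<std::uint32_t>(t.meta) << 24);
}

inline Texel unpack_texel(std::uint32_t v) {
    return Texel{static_cast<std::uint8_t>(v & 0xFFu),
                 static_cast<std::uint8_t>((v >> 8) & 0xFFu),
                 static_cast<std::uint8_t>((v >> 16) & 0xFFu),
                 static_cast<std::uint8_t>((v >> 24) & 0xFFu)};
}
```

In terms of raw storage, a 3-byte paint texture plus a 1-byte metadata texture holds the same 4 bytes per pixel as one RGBA8 texture (512 x 512 x 4 = 1,048,576 bytes), ignoring any padding the driver may add.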

OpenGL ES - How to improve performance, render to texture, blending

I am here because I'm working on an OpenGL program and I have some issues with performance. I work with OpenGL ES 3.0 on iMX6 soc.
Here is my algorithm :
I get an image from the camera, which is directly mapped to a texture.
Using an FBO, I render to texture to map the image onto a specific shape.
I do the same thing (with a second FBO) for another image which is sent via shared memory by another application. This step is performed only if the image is updated, which happens only once per second.
I blend these two textures in the default frame buffer to render the result to the screen.
If I perform these steps separately, it works well and the screen is updated at 30 FPS. But when I include all the steps in one program, the rendering is very slow and I get only 0.5 FPS.
I am wondering whether the GPU on the iMX6 is powerful enough, but I don't think it is a complex algorithm. I think I am doing something the wrong way, but what?
I use 3 different frame buffers, so is that a good way or should I use only one?
Can someone give me answer, clues, anything that can help me? :-)
My images dimensions are 1280x1024 x RGBA. Then I am doing some conversion from floating-point texture to integer and back to float, this is done to perform bitwise operation on pixels.
Thanks to @Columbo: the problem came from all the conversions. I now work with floating-point textures and convert only for the bitwise operations, which improves the performance of the algorithm a lot.
Another thing that hurt performance was the texture format. For the first step, the image was 1280x1024 but with only one component (a grayscale image). To keep only the grayscale component and not use too much memory, I worked with a GL_RED texture, but this wasn't a good idea: when I changed it to GL_RGB, the render framerate doubled as well.

Writing to depth buffer from opengl compute shader

Generally on modern desktop OpenGL hardware what is the best way to fill a depth buffer from a compute shader and then use that depth buffer for graphics pipeline rendering with triangles etc?
Specifically, I am wondering about concerns regarding HiZ. I also wonder whether it's better to do the compute shader modifications to the depth buffer before or after the graphics rendering.
If the compute shader is run after the graphics rendering, I assume the depth buffer will typically be decompressed behind the scenes. But I worry that, done the other way around, the depth buffer may be in a decompressed/non-optimal state for the graphics pipeline.
As far as I know, you cannot bind textures with any of the depth formats as images, and thus cannot write to depth-format textures in compute shaders. See the glBindImageTexture documentation; it lists the formats that your texture format must be compatible with. Depth formats are not among them, and the specification says the depth formats are not compatible with the normal formats.
Texture copying functions have the same compatibility restrictions, so you can't even, e.g., write to a normal texture in the compute shader and then copy to a depth texture. glCopyImageSubData does not explicitly have that restriction, but I haven't tried it and it's not part of the core profile anymore.
What might work is writing to a normal texture, then rendering a fullscreen triangle and setting gl_FragDepth to values read from the texture, but that's an additional fullscreen pass.
I don't quite understand your second question - if your compute shader stuff modifies the depth buffer, the result will most likely be different depending on whether you do it before or after regular rendering because different parts will be visible or occluded.
But maybe that question is moot since it seems you cannot manually write into depth buffers at all - which might also answer your third question - by not writing into depth buffers you cannot mess with the compression of it :)
Please note that I'm no expert in this; I had a similar problem and looked at the docs/spec myself, so this all might be wrong :) Please let me know if you manage to write to depth buffers with compute shaders!

Get results of GPU calculations back to the CPU program in OpenGL

Is there a way to get results from a shader running on a GPU back to the program running on the CPU?
I want to generate a polygon mesh from simple voxel data using a computationally costly algorithm on the GPU, but I need the result on the CPU for physics calculations.
Define "the results"?
In general, if you're doing GPGPU-style computations with OpenGL, you are going to need to structure your shaders around the needs of a rendering system. Rendering systems are designed to be one-way: data goes into them and an image is produced. Going backwards, having the rendering system produce data, is not generally how rendering systems are structured.
That doesn't mean you can't do it, of course. But you need to architect everything around the limitations of OpenGL.
OpenGL offers a number of hooks where you can write data from certain shader stages. Most of these require specialized hardware.
Fragment shader outputs
Any hardware capable of fragment shaders will obviously allow you to write to the current framebuffer you're rendering. Through the use of framebuffer objects and textures with floating-point or integer image formats, you can write pretty much any data you want to a variety of images. Once in a texture, you can simply call glGetTexImage to get the rendered pixel data. Or you can just do glReadPixels to get it if the FBO is still bound. Either way works.
The primary limitations of this method are:
The number of images you can attach to the framebuffer; this limits the amount of data you can write. On pre-GL 3.x hardware, FBOs were typically limited to only 4 images plus a depth/stencil buffer. In 3.x and better hardware, you can expect a minimum of 8 images.
The fact that you're rendering. This means that you need to set up your vertex data to position a triangle exactly where you want it to modify data. This is not a trivial undertaking. It's also difficult to get useful input data, since you typically want each texel to be fairly independent of the other. Structuring your fragment shader around these limitations is difficult. Not impossible, but non-trivial in many cases.
Transform Feedback
This OpenGL 3.0 feature allows the output from the Vertex Processing stage of OpenGL (vertex shader and optional geometry shader) to be captured in one or more buffer objects.
This is much more natural for capturing vertex data that you want to play with or render again. In your case, you'll need to read it back after rendering it, perhaps with a glGetBufferSubData call, or by using glMapBufferRange for reading.
The limitations here are that you generally only can capture 4 output values, where each value is a vec4. There are also some strict layout restrictions. Some OpenGL 3.x and 4.x hardware offers the ability to write data to multiple feedback streams, which can all be written into different buffers.
Image Load/Store
This GL 4.2 feature represents the pinnacle of writing: you can bind an image (a buffer texture, if you want to write to a buffer), and just write to it. There are memory ordering constraints that you need to work within.
It's very flexible, but very complex. Besides the difficulty in using it properly, there are a number of limitations. The number of images you can write to will be fairly limited, perhaps 8 or so. And implementations may have total write limits, so that 8 images to write to may have to be shared by the fragment shader's outputs.
What's more, image outputs are only guaranteed for the fragment shader (and 4.3's compute shaders). That is, hardware is allowed to forbid you from using image load/store on non-FS/CS shader stages.

When to call glEnable(GL_FRAMEBUFFER_SRGB)?

I have a rendering system where I draw to an FBO with a multisampled renderbuffer, then blit it to another FBO with a texture to resolve the samples. I then read from that texture to perform post-processing shading while drawing to the backbuffer (FBO index 0).
Now I'd like to get some correct sRGB output... The problem is that the behavior of the program is rather inconsistent between OS X and Windows, and it also changes depending on the machine: on Windows with the Intel HD 3000 it will not apply the sRGB nonlinearity, but on my other machine with an Nvidia GTX 670 it does. On the Intel HD 3000 in OS X it will also apply it.
So this probably means that I'm not setting my GL_FRAMEBUFFER_SRGB enable state at the right points in the program. However I can't seem to find any tutorials that actually tell me when I ought to enable it, they only ever mention that it's dead easy and comes at no performance cost.
I am currently not loading in any textures so I haven't had a need to deal with linearizing their colors yet.
To force the program to not simply spit back out the linear color values, what I have tried is simply commenting out my glDisable(GL_FRAMEBUFFER_SRGB) line, which effectively means this setting is enabled for the entire pipeline, and I actually redundantly force it back on every frame.
I don't know if this is correct or not. It certainly does apply a nonlinearization to the colors but I can't tell if this is getting applied twice (which would be bad). It could apply the gamma as I render to my first FBO. It could do it when I blit the first FBO to the second FBO. Why not?
I've gone so far as to take screen shots of my final frame and compare raw pixel color values to the colors I set them to in the program:
I set the input color to RGB(1,2,3) and the output is RGB(13,22,28).
That seems like quite a lot of color compression at the low end and leads me to question if the gamma is getting applied multiple times.
I have just now gone through the sRGB equation and I can verify that the conversion seems to be applied only once, as linear 1/255, 2/255, and 3/255 do indeed map to sRGB 13/255, 22/255, and 28/255 using the equation 1.055*C^(1/2.4)-0.055. Given that the expansion is so large for these low color values, it really should be obvious if the sRGB color transform is getting applied more than once.
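The check described above can be reproduced with a few lines of code. This sketch uses the standard piecewise sRGB encoding (12.92*C at or below 0.0031308, and 1.055*C^(1/2.4) - 0.055 above it); the function names are my own, and rounding to the nearest 8-bit value is an assumption about how the final quantization happens.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Linear [0,1] -> sRGB [0,1], using the standard piecewise sRGB encoding.
inline double srgb_encode(double c) {
    return (c <= 0.0031308) ? 12.92 * c
                            : 1.055 * std::pow(c, 1.0 / 2.4) - 0.055;
}

// Same conversion on 8-bit values, rounding to the nearest integer.
inline std::uint8_t srgb_encode8(std::uint8_t linear) {
    const double encoded = srgb_encode(linear / 255.0);
    return static_cast<std::uint8_t>(std::lround(encoded * 255.0));
}
```

Calling srgb_encode8 on 1, 2, and 3 gives 13, 22, and 28, matching the screenshot values. Applying it twice to 1 gives 64, so a double conversion would be hard to miss.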
So, I still haven't determined what the right thing to do is. Does glEnable(GL_FRAMEBUFFER_SRGB) only apply to the final framebuffer values, in which case I can just set it during my GL init routine and forget about it thereafter?
When GL_FRAMEBUFFER_SRGB is enabled, all writes to an image with an sRGB image format will assume that the input colors (the colors being written) are in a linear colorspace. Therefore, it will convert them to the sRGB colorspace.
Any writes to images that are not in the sRGB format should not be affected. So if you're writing to a floating-point image, nothing should happen. Thus, you should be able to just turn it on and leave it that way; OpenGL will know when you're rendering to an sRGB framebuffer.
In general, you want to work in a linear colorspace for as long as possible. Only your final render, after post-processing, should involve the sRGB colorspace. So your multisampled framebuffer should probably remain linear, though you should give it higher resolution for its colors to preserve accuracy: use GL_RGB10_A2, GL_R11F_G11F_B10F, or GL_RGBA16F as a last resort.
As for this:
"On Windows with the Intel HD 3000 it will not apply the sRGB nonlinearity"
That is almost certainly due to Intel sucking at writing OpenGL drivers. If it's not doing the right thing when you enable GL_FRAMEBUFFER_SRGB, that's because of Intel, not your code.
Of course, it may also be that Intel's drivers didn't give you an sRGB image to begin with (if you're rendering to the default framebuffer).