Can we render to screen while also extracting the points? - opengl

I have two streams, one of point clouds and one of fullscreen textures. I want to colour the points using the texture and then render them, but I also want to store the resulting coloured point clouds for later analysis and use. Can I do this somehow without sending the data to the unit, colouring the points on the GPU, extracting the result to the host and then sending it back to the unit again just to render it? What I mean is: the colouring can be done with GPU computing and the result stored in memory allocated on the unit, which you later extract, but sending the data to the unit for processing and then later sending the same data to the unit again for rendering seems redundant. I have never done anything other than rendering to screen before, so I am not sure.
If it is possible, do we need CUDA, or can we do without it? How could it be done?
EDIT:
An example of what I hope to achieve. Arrows indicate moving data and | symbolizes the wall between the two sides.
unit                          | host
-------------------------------------------------------------------------------------------
              ← PointCloud and Texture             //Input data
Colour points                 | Wait               //Colour the points using the GPU
              Coloured points →                    //Extract the coloured points
Render points                 | Do whatever with the points.   // Render to screen and use the coloured points on the CPU
An example of what I hope to avoid. Arrows indicate moving data and | symbolizes the wall between the two sides.
unit                          | host
-------------------------------------------------------------------------------------------
              ← PointCloud and Texture             //Input data
Colour points                 | Wait               //Colour the points using the GPU
              Coloured points →                    //Extract the coloured points
              ← Coloured points                    //Send the exact same data back again
Render points                 | Do whatever with the points.   // Render to screen and use the coloured points on the CPU
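For reference, one standard way to get the first diagram in plain OpenGL (no CUDA required) is transform feedback: the same draw call that rasterizes the points can also capture the per-vertex outputs of the vertex shader into a buffer object on the unit, which you read back to the host once. A minimal sketch, assuming an existing GL context and loader, a point count numPoints, a hypothetical buildProgram() helper, placeholder shader file names, and a vertex shader out variable colouredPoint assumed to be 7 floats (xyz + rgba) per point:

#include <vector>   // a GL loader header (e.g. GLEW) is assumed to be included already

// Tell GL which vertex-shader output to capture; this must happen before linking.
GLuint prog = buildProgram("colour_points.vert", "points.frag");   // hypothetical helper
const char* varyings[] = { "colouredPoint" };
glTransformFeedbackVaryings(prog, 1, varyings, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(prog);

// Buffer that will receive the coloured points, allocated on the unit.
GLuint tfBuffer;
glGenBuffers(1, &tfBuffer);
glBindBuffer(GL_TRANSFORM_FEEDBACK_BUFFER, tfBuffer);
glBufferData(GL_TRANSFORM_FEEDBACK_BUFFER, numPoints * 7 * sizeof(float), nullptr, GL_DYNAMIC_READ);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, tfBuffer);

// One pass: the points are rasterized to the screen *and* captured into tfBuffer.
glUseProgram(prog);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, numPoints);
glEndTransformFeedback();

// Extract the coloured points to the host when you actually need them on the CPU.
std::vector<float> colouredPoints(numPoints * 7);
glGetBufferSubData(GL_TRANSFORM_FEEDBACK_BUFFER, 0,
                   colouredPoints.size() * sizeof(float), colouredPoints.data());

If you ever want the capture without drawing to the screen, wrap the draw call in glEnable/glDisable(GL_RASTERIZER_DISCARD); leaving rasterization enabled gives exactly the one-pass "colour, render and extract" flow of the first diagram.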

Related

Interactive mouse picking/selecting of curves with OpenGL

This is NOT about how to create and render curves (e.g. Bezier, NURBS, etc.) BUT about how to interactively 'pick' such curves with a mouse click when the mouse cursor is hovering over ANY part of such a curve.
After doing the required computing I render a curve, in either 2D or 3D, by subdividing it into lots of individual smaller line segments. The more such line segments there are, the smoother and more realistic the curve looks.
I deliberately create each of these GL_LINE segments as a separate entity (each being assigned its own unique ID number). These line segments belong to the Curve 'object' (which also has its own unique ID). Then, using ray casting, I can do mouse-line collision detection and know when an individual line segment has been 'hit', and highlight it (e.g. temporarily change its color).
BUT at the same time also highlight all the other line segments that belong to the Curve - and so give appearance of the whole curve being selected.
THE PROBLEM is that because each curve is made up not just of the 'core' control points, which effectively define the curve, but also of the thousands of points that effectively draw it, there is quickly a very noticeable slowdown in graphics performance.
I am aware that I could instead compute all the subdivision points more efficiently and use GL_LINE_STRIP to render the curve as one graphical object. BUT then that will NOT allow me to use the ray-casting technique described above to hover the mouse cursor over any part of the curve and 'select' the curve object.
So....how can I more efficiently 'pick' curves in OpenGL?
You have 2 obvious options for this:
use ID buffer
when you render your curve you assign a color to the RGBA frame buffer. So simply also assign the ID of the rendered curve to a separate buffer (of the same resolution as the view) and then simply read the pixel under the mouse from this buffer to see exactly which curve or object you selected.
This is pixel perfect and O(1) super fast ... However, if your objects are too thin you might have problems picking from a single pixel, so test up to some distance from the mouse (you can glReadPixels a rectangle around the mouse) and return either all the IDs present or just the most frequent one.
See this:
OpenGL 3D-raypicking with high poly meshes
Do not forget to clear the ID buffer before rendering, and if you have too many objects to fit into an 8-bit stencil buffer, use a different buffer... In the 2D case you can use the depth buffer, or you can render to the RGBA framebuffer in 2 passes: first the IDs (read into CPU-side memory) and then the normal render. You can also render to a texture ...
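A minimal read-back sketch for the ID-buffer variant, assuming the IDs were rendered into a GL_R32UI colour attachment of an off-screen FBO named idFBO, and that mouseX/mouseY are window coordinates (the names and the 5x5 pick rectangle are just illustrative):

#include <map>   // a GL loader header is assumed to be included already

glBindFramebuffer(GL_READ_FRAMEBUFFER, idFBO);
glReadBuffer(GL_COLOR_ATTACHMENT0);                     // attachment holding the curve IDs

// Read a small rectangle around the mouse instead of a single pixel.
const int r = 2;                                        // 5x5 neighbourhood
GLuint ids[5 * 5] = { 0 };
glReadPixels(mouseX - r, viewportHeight - mouseY - r,   // GL's window origin is bottom-left
             5, 5, GL_RED_INTEGER, GL_UNSIGNED_INT, ids);

// Pick the most frequent non-zero ID (0 = cleared background).
std::map<GLuint, int> votes;
for (GLuint id : ids) if (id != 0) ++votes[id];
GLuint picked = 0; int best = 0;
for (const auto& v : votes) if (v.second > best) { best = v.second; picked = v.first; }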
compute distance to curve
it's doable, for example see:
Is it possible to express “t” variable from Cubic Bezier Curve equation?
as you can see, it's possible to use this also for the rendering itself (no line approximation, just the "perfect" curve) and even with speed ...
So simply compute the distance of mouse to each of your curves and remember the closest one. If the distance is bigger than some threshold/distance no selection occurs.
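If a closed-form distance is more than you need, a sampled approximation is usually enough for picking. A small self-contained sketch for one cubic Bezier segment (the sample count and the squared-distance shortcut are arbitrary choices, not part of the linked answer):

#include <algorithm>
#include <cmath>
#include <limits>

struct Vec2 { float x, y; };

// Point on a cubic Bezier with control points p[0..3] at parameter t in [0,1].
static Vec2 bezier(const Vec2 p[4], float t)
{
    float u = 1.0f - t;
    float b0 = u*u*u, b1 = 3.0f*u*u*t, b2 = 3.0f*u*t*t, b3 = t*t*t;
    return { b0*p[0].x + b1*p[1].x + b2*p[2].x + b3*p[3].x,
             b0*p[0].y + b1*p[1].y + b2*p[2].y + b3*p[3].y };
}

// Approximate distance from the mouse to the curve by sampling it.
static float distanceToCurve(const Vec2 p[4], Vec2 mouse, int samples = 64)
{
    float best = std::numeric_limits<float>::max();
    for (int i = 0; i <= samples; ++i) {
        Vec2 c = bezier(p, float(i) / float(samples));
        float dx = c.x - mouse.x, dy = c.y - mouse.y;
        best = std::min(best, dx*dx + dy*dy);
    }
    return std::sqrt(best);   // compare against your pick threshold
}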

How to fix a point on the surface of a 3D model created by texture mapping in OpenGL?

Let's see an image first:
The model in the image is created by texture mapping. I want to click the mouse on the screen and then place a fixed point on the surface of the model. What's more, as the model rotates, the fixed point should stay on the surface of the model.
My question is:
How can I place the fixed point on the surface of the model?
How can I get the coordinate (x, y, z) of the fixed point?
My thought is as follows:
use the gluUnProject function to get two points when the mouse is clicked on the screen. One point is on the near clip plane and the other is on the far one.
connect the two points to form a line.
iterate over points on the line of step 2 and use glReadPixels to get the pixel value of the iterated points. If the values jump from zero to nonzero or from nonzero to zero (the pixel value of the background is zero), the surface points are found.
This is my thought, but it seems that it does not work! Can anyone give me some advice? Thank you!
The model in the image is created by texture mapping.
No, it's not. First and foremost, there is no model at all. What you have there is a 3D dataset of voxels, and a volume rasterizer that "shoots" rays through the dataset, integrates along them, and for each ray produces a color and opacity value.
This process is not(!!!) texture mapping. Texture mapping is when you draw a "solid" primitive and for each fragment (a fragment is what eventually becomes a pixel) determine a single location in the texture data set and sample it. But a volume raycaster like the one you have there performs a whole ray integration, effectively sampling many voxels from the whole dataset into a single pixel. That's a completely different way of producing a color-opacity value.
My question is:
How can I place the fixed point on the surface of the model?
You can't, because the dataset you have there does not have a "fixed" surface point. You have to define some segmentation operation that decides which position along the ray counts as "this is the surface". The simplest method would be a threshold cutoff function.
How can I get the coordinate (x, y, z) of the fixed point?
Your best bet would be modifying the volume raycasting code, changing it from an integrator into a segmentizer. Assume that you want to use the threshold method.
Your typical volume rasterizer works like this (usually implemented in a shader):
vec4 accum = vec4(0.0);                     // "output" is a reserved word in GLSL, so use another name
for( vec3 pos = start
   ; length(pos - start) <= length(end - start)
   ; pos += voxel_grid_increment )
{
    vec4 t = texture3D(voxeldata, pos);     // texture(voxeldata, pos) in modern GLSL
    /* integrate t into accum */
}
The integration step merges the incoming color and opacity of the texture voxel t into the accumulated output color and opacity. There are several methods to do this.
You'd change this into a shader that simply stops that loop at a given cutoff threshold and emits the position of that voxel:
vec3 surface_pos = vec3(-1.0);              // again avoiding the reserved word "output"
for( vec3 pos = start
   ; length(pos - start) <= length(end - start)
   ; pos += voxel_grid_increment )
{
    float t = texture3D(voxeldata, pos).r;
    if( t > threshold ){
        surface_pos = pos;                  // first voxel above the threshold counts as the surface
        break;
    }
}
// write surface_pos to the fragment's colour output afterwards
The result of that would be a picture encoding the determined voxel position in its pixels' RGB values. Use a 16-bit-per-channel texture format (or single- or half-precision float) and you've got enough resolution to address within the limits of what typical GPUs can address in a 3D texture.
You'll want to do this off-screen using an FBO.
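A minimal sketch of such an off-screen target, assuming a 16-bit float colour attachment (GL_RGB16F) so the written voxel positions keep enough precision; fbo, posTex, width, height, mouseX and mouseY are placeholder names:

GLuint fbo, posTex;
glGenTextures(1, &posTex);
glBindTexture(GL_TEXTURE_2D, posTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB16F, width, height, 0, GL_RGB, GL_FLOAT, nullptr);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, posTex, 0);

// render the modified (position-emitting) raycaster into this FBO here

// Then read the voxel position back under the mouse (GL's window origin is bottom-left).
float voxelPos[3] = { 0.0f, 0.0f, 0.0f };
glReadPixels(mouseX, height - mouseY, 1, 1, GL_RGB, GL_FLOAT, voxelPos);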
Another viable approach is to take the regular volume raycaster and, at the threshold position, modify the depth value output for that particular fragment. The drawback of this method is that modifying the depth output trashes performance, so you'll not want to do this if framerates matter. The benefit is that you could then in fact use glReadPixels on the depth buffer and gluUnProject the depth value at your mouse pointer's position.
My thought is as follows:
use the gluUnProject function to get two points when the mouse is clicked on the screen. One point is on the near clip plane and the other is on the far one. Connect the two points to form a line.
Iterate over points on the line of step 2 and use glReadPixels to get the pixel value of the iterated points. If the values jump from zero to nonzero or from nonzero to zero (the pixel value of the background is zero), the surface points are found.
That's not going to work, for the simple reason that glReadPixels sees exactly the same thing you see. You can not "select" the depth at which glReadPixels reads the pixels, because there's no depth left in the picture. glReadPixels just sees what you see: a flat image, as it's shown in the window. You'll have to iterate over the voxel data, and you can't do this post hoc. You'll have to implement or modify a volume rasterizer to extract the information you need.
I am not going to write a full implementation of what you need here. Also, you could just search the web and find quite a lot of info on this subject. But what you are looking for is called "Decals". Nvidia also presented a technique called "Texture bombing". In a nutshell, you draw a planar (or enclosing volume) geometry to project the decal texture onto the model. The actual process is a little bit more complex, as you can see from the examples.

How do I get started with a GPU voxelizer?

I've been reading various articles about how to write a GPU voxelizer. From my understanding the process goes like this:
Inspect the triangles individually and decide the axis that displays the triangle in the largest way. Call this the dominant axis.
Render the triangle on its dominant axis and sample the texels that come out.
Write that texel data onto a 3D texture and then do what you will with the data
Disregarding conservative rasterization, I have a lot of questions regarding this process.
I've gotten as far as rendering each triangle, choosing a dominant axis and orthogonally projecting it. What should the values of the orthogonal projection be? Should it be some value based around the size of the voxels or how large of an area the map should cover?
What am I supposed to do in the fragment shader? How do I write to my 3D texture such that it stores the voxel data? From my understanding, due to choosing the dominant axis we can't have more than a depth of 1 voxel for each fragment. However, since we projected orthogonally I don't see how that would reflect onto the 3D texture.
Finally, I am wondering where to store the texture data. I know it's a bad idea to store data CPU-side since you have to pass it all in to use it on the GPU; however, the source code I am loosely following chooses to store all its textures on the CPU side, such as those for a light map. My assumption is that data that will only be used on the GPU should be stored there, and data used on both should be stored on the CPU side of things. So, from this, I store my data on the CPU side. Is that correct?
My main sources have been: https://www.seas.upenn.edu/~pcozzi/OpenGLInsights/OpenGLInsights-SparseVoxelization.pdf OpenGL Insights
https://github.com/otaku690/sparsevoxeloctree An SVO using a voxelizer. The issue is that the shader code is not on GitHub.
In my own implementation, the whole scene is positioned and scaled into one unit cube centered on the world origin. The modelview-projection matrices are then straightforward, and the viewport is simply the desired voxel resolution.
I use a 2-pass approach to output those voxel fragments: the 1st pass calculates the number of output voxel fragments by accumulating a single variable with an atomic counter. Then I use that count to allocate a linear buffer.
In the 2nd pass the rasterized voxel fragments are stored into the allocated linear buffer, again using the atomic counter to avoid write conflicts.
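A host-side sketch of that two-pass counting scheme, assuming the voxelizing fragment shader declares layout(binding = 0) uniform atomic_uint voxelCount; and calls atomicCounterIncrement(voxelCount) once per voxel fragment; drawSceneDominantAxis() and the per-fragment record size are placeholders:

// Pass 1: count the voxel fragments with an atomic counter (no colour output needed).
GLuint counter;
glGenBuffers(1, &counter);
glBindBuffer(GL_ATOMIC_COUNTER_BUFFER, counter);
GLuint zero = 0;
glBufferData(GL_ATOMIC_COUNTER_BUFFER, sizeof(GLuint), &zero, GL_DYNAMIC_DRAW);
glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, 0, counter);

glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);   // restore it later for normal rendering
drawSceneDominantAxis();                               // hypothetical: renders each triangle along its dominant axis
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);

GLuint voxelCount = 0;
glGetBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), &voxelCount);

// Allocate the linear voxel-fragment buffer with exactly that many entries.
GLuint voxelBuffer;
glGenBuffers(1, &voxelBuffer);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, voxelBuffer);
glBufferData(GL_SHADER_STORAGE_BUFFER, voxelCount * 4 * sizeof(GLuint), nullptr, GL_DYNAMIC_COPY);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, voxelBuffer);

// Pass 2: reset the counter; the shader now also writes each voxel fragment into
// voxelBuffer at the index returned by atomicCounterIncrement(), avoiding write conflicts.
glBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), &zero);
drawSceneDominantAxis();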

Pixel manipulation in OpenGL

Let's say I have this image and in it is an object (a cube). That object is being tracked (with labels) and I manage to render a virtual cube onto it (augmented reality). Now that I can render a virtual cube onto it, I want to be able to make the object 'disappear' with a really basic diminished-reality technique called "inpainting". The inpainting in question is pretty simple (it has to be or the FPS will suffer) and it requires me to do some operations on pixels and their neighbors (like with Gaussian blur or other basic image processing).
To do that I first need:
A mask: black background with a white cube in it.
Access each pixel of the initial image (at coordinates x and y) as well as its neighborhood and do stuff based on the pixel value of the mask at the same x and y coordinates. So basically the mask serves as a way to say ignore this pixel or use this pixel.
How do I do this using OpenGL? I want to be able to access pixel values 1 by 1 preferably in 2D because of the neighbors.
Do I use FBOs or PBOs? I've read many things about buffers and methods like glDrawPixels() but I'm having trouble putting them all together. The paper I saw this method in used the GL_BACK buffer, but mine is already used. Some sample code (C++) would be really appreciated, with all the formalities (OpenGL calls), since I'm still a beginner in OpenGL.
I'm even thinking of using OpenCV if pixel manipulation is too hard in OpenGL since my AR library (Aruco) works on top of OpenCV. In that case I will still need to get the mask (white cube on black background), convert it to a cv::Mat and then do my processing.
I know this approach is inefficient (going back and forth from the GPU/CPU) but my goal (for now) is to at least make the basics work.
Set up a framebuffer object to render your original image + virtual cube. Here's a tutorial.
Next you can attach that framebuffer texture as an input (sampler) texture of your next stage and render a quad (two triangles) that covers your mask.
In the fragment shader you can sample your "screen coordinate" by reading the variable gl_FragCoord. With the texture filter functions set to GL_NEAREST you can access exact texels, and the neighboring pixels are one texel away (deltaX = 1/Width, deltaY = 1/Height in normalized texture coordinates, or an integer offset of ±1 if you sample with texelFetch).
Using a previous framebuffer texture as source is mandatory, as the currently active framebuffer is write only.
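A sketch of the fragment shader for that second stage (kept here as a C++ string literal for completeness), reading both the FBO colour texture and the mask with texelFetch so you get exact per-pixel addressing and easy neighbour access; the 4-neighbour average is only a stand-in for your actual inpainting rule, and the uniform names are illustrative:

// GLSL source for the inpainting pass, compiled with your usual shader-loading code.
const char* inpaintFragSrc = R"GLSL(
#version 330 core
uniform sampler2D sceneTex;   // FBO texture: original image + virtual cube
uniform sampler2D maskTex;    // black background, white cube
out vec4 fragColor;
void main()
{
    ivec2 p = ivec2(gl_FragCoord.xy);                       // exact pixel coordinate
    float m = texelFetch(maskTex, p, 0).r;
    if (m < 0.5) {
        fragColor = texelFetch(sceneTex, p, 0);             // outside the mask: keep the pixel
    } else {
        vec4 sum = texelFetch(sceneTex, p + ivec2( 1, 0), 0)
                 + texelFetch(sceneTex, p + ivec2(-1, 0), 0)
                 + texelFetch(sceneTex, p + ivec2( 0, 1), 0)
                 + texelFetch(sceneTex, p + ivec2( 0,-1), 0);
        fragColor = sum * 0.25;                             // inside the mask: blend neighbours
    }
}
)GLSL";

Drawing a fullscreen quad with this shader into a second FBO (or the default framebuffer) keeps everything on the GPU; if you do want the result in OpenCV, a single glReadPixels of that output into a cv::Mat is enough.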

labels in an opengl map application

Short Version
How can I draw short text labels in an OpenGL mapping application without having to manually recompute coordinates as the user zooms in and out?
Long Version
I have an OpenGL-based mapping application where I need to be able to draw data sets with up to about 250k points. Each point can have a short text label, usually about 4 or 5 characters long.
Currently, I do this using a single texture containing all the characters. For each point, I define a quad for each character in its label. So a point with the label "Fred" would have four quads associated with it, and each quad uses texture coordinates into that single texture to draw its corresponding character.
When I draw the map, I draw the map points themselves in map coordinates (e.g., longitude/latitude). Then I compute the position of each point in screen coordinates and update the four corner points for each of that point's label quads, again in screen coordinates. (For instance, if I determine the point is drawn at screen point 100, 150, I could set the quad for the first character in the point's label to be the rectangle starting with left-top point of 105, 155 and having a width of 6 pixels and a height of 12 pixels, as appropriate for the particular character. Then the second character might start at 120, 155, and so on.) Then once all these label character quads are positioned correctly, I draw them using an orthogonal screen projection.
The problem is that the process of updating all of those character quad coordinates is slow, taking about half a second for a particular test data set with 150k points (meaning that, since each label is about four characters long, there are about 150k * [4 characters per point] * [4 coordinate pairs per character] coordinate pairs that need to be set on each update).
If the map application didn't involve zooming, I would not need to recompute all these coordinates on each refresh. I could just compute the label coordinates once and then simply shift my viewing rectangle to show the right area. But with zooming, I can't see how to make it work without doing coordinate computation, because otherwise the characters will grow huge as you zoom in and tiny as you zoom out.
What I want (and what I understand OpenGL doesn't provide) is a way to tell OpenGL that a quad should be drawn in a fixed screen-coordinate rectangle, but that the top-left position of that rectangle should be a fixed distance from a given point in map coordinate space. So I want both a primitive hierarchy (a given map point is the parent of its label character quads) and the ability to mix two different coordinate systems within this hierarchy.
I'm trying to understand whether there is some magic transformation matrix I can set that will do all this for me, but I can't see how to do it.
The other alternative I've considered is using a shader on each point to handle computing the label character quad coordinates for that point. I haven't worked with shaders before, and I'm just trying to understand (a) if it's possible to use shaders to do this, and (b) whether computing all those points in shader code actually buys me anything over computing them myself. (By the way, I have confirmed that the big bottleneck is computing the quad coordinates, not in uploading the updated coordinates to the GPU. The latter takes a bit of time, but it's the computation, the sheer number of coordinates being updated, that takes up the bulk of that half second.)
(Of course, the other other alternative is to be smarter about which labels need to be drawn in a given view in the first place. But for now I'd like to concentrate on the solution assuming all labels need to be drawn.)
So the basic problem ("because otherwise the characters will grow huge as you zoom in and tiny as you zoom out") is that you are doing calculations in map coordinates rather than screen coordinates? And if you did it in screen coords, this would require more computations? Obviously, any rendering needs to translate from map coordinates to screen coordinates. The problem seems to be that you are translating from map to screen too late. Therefore, rather than doing a single map-to-screen for each point, and then working in screen coords, you are working mostly in map coords, and then translating per-character to screen coords at the very end. And the slow part is that you are working in screen coords, then having to manually translate back to map coords just to tell OpenGL the map coords, and it will convert those back to screen coords! Is that a fair assessment of your problem?
The solution therefore is to push that transformation earlier in your pipeline. However, I can see why it is tricky, because at first glance, OpenGL seems to want to do everything in "world coordinates" (for you, map coords), but not in screen coords.
Firstly, I am wondering why you are doing separate coordinate calculations for each character. What font rendering system are you using? Something like FreeType will automatically generate a bitmap image of an entire string, and doesn't require you to work per-character [edit: this isn't quite true; see comments]. You definitely shouldn't need to calculate the map coordinate (or even screen coordinate) for every character. Calculate the screen coordinate for the top-left corner of the label, and have your font rendering system produce the bitmap of the entire label in one go. That should speed things up about fourfold (since you assume 4 characters per label).
Now as for working in screen coords, it may be helpful to learn a bit about shaders. The more you learn about OpenGL, the more you learn that really it isn't a 3D rendering engine at all. It's just a 2D graphics library with some very fast matrix primitives built-in. OpenGL actually works, at the lowest level, in screen coordinates (not pixel coordinates -- it works in normalized screen space, I think from memory from -1 to 1 in both the X and Y axis). The only reason it "feels" like you're working in world coordinates is because of these matrices you have set up.
So I think the reason why you are working in map coords all the way until the end is because it's easiest: OpenGL naturally does the map-to-screen transform for you (using the matrices). You have to change that, because you want to work in screen coords yourself, and therefore you need to make the transformation a long time before OpenGL gets its hands on your data. So when you go to draw a label, you should manually apply the map-to-screen transformation matrix on each point, as follows:
You have a particular point (which needs a label drawn) in map coords.
Apply the map-to-screen matrix to convert the point to screen coords. This probably means multiplying the point by the MODELVIEW and PROJECTION matrices, using the same algorithm that OpenGL does when it's rendering a vertex. So you could either glGet the GL_MODELVIEW_MATRIX and GL_PROJECTION_MATRIX to extract OpenGL's current matrices, or you could manually keep around a copy of the matrix yourself.
Now that you have the map label in screen coords, compute the position of the label's text. This is simply adding 5 pixels in the X and Y axis, as you said above. However, remember that you aren't in pixel space but normalised screen space, so you are working in fractions of the screen (adding 0.05 units moves you 2.5% of the way across the screen, since the space spans 2 units from -1 to 1). It's probably better not to think in pixels, because then your application will scale to match the resolution. But if you really want to think in pixels, you will have to calculate the pixels-to-units ratio based on the resolution.
Use glPushMatrix to save the current matrix, then glLoadIdentity to set the current matrix to the identity -- tell OpenGL not to transform your vertices. (I think you will have to do this for both the PROJECTION and MODELVIEW matrices.)
Draw your label, in screen coordinates.
So you don't really need to write a shader. You could certainly do this in a shader, and it would certainly make step 2 faster (no need to write your own software matrix multiply code; multiplying matrices on the GPU is extremely fast). But that would be a later optimisation, and a lot of work. I think the above steps will help you work in screen coordinates and avoid having to waste a lot of time just to give OpenGL map coordinates.
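A sketch of steps 1-5 in legacy fixed-function OpenGL; this variant works in pixel units via gluProject and glOrtho rather than the -1..1 normalized space mentioned above, since the question talks about 5-pixel offsets. The point's lon/lat fields and drawLabelQuad() are hypothetical:

GLdouble model[16], proj[16];
GLint viewport[4];
glGetDoublev(GL_MODELVIEW_MATRIX, model);      // step 2: grab OpenGL's current matrices
glGetDoublev(GL_PROJECTION_MATRIX, proj);
glGetIntegerv(GL_VIEWPORT, viewport);

// Steps 1-2: project the map point (lon/lat) to window (pixel) coordinates.
GLdouble sx, sy, sz;
gluProject(point.lon, point.lat, 0.0, model, proj, viewport, &sx, &sy, &sz);

// Step 4: switch both matrices to a plain pixel-space orthographic projection.
glMatrixMode(GL_PROJECTION); glPushMatrix(); glLoadIdentity();
glOrtho(0, viewport[2], 0, viewport[3], -1, 1);
glMatrixMode(GL_MODELVIEW);  glPushMatrix(); glLoadIdentity();

// Steps 3 and 5: offset by a fixed number of pixels and draw the label quad(s).
drawLabelQuad(sx + 5.0, sy + 5.0);             // hypothetical helper, draws in pixel coords

// Restore the map-coordinate matrices for the rest of the scene.
glMatrixMode(GL_PROJECTION); glPopMatrix();
glMatrixMode(GL_MODELVIEW);  glPopMatrix();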
Side comment on:
"""
generate a bitmap image of an entire string, and doesn't require you to work per-character
...
Calculate the screen coordinate for the top-left corner of the label, and have your font rendering system produce the bitmap of the entire label in one go. That should speed things up about fourfold (since you assume 4 characters per label).
"""
Freetype or no, you could certainly compute a bitmap image for each label, rather than each character, but that would require one of:
storing thousands of different textures, one for each label
It seems like a bad idea to store that many textures, but maybe it's not.
or
rendering each label, for each point, at each screen update.
this would certainly be too slow.
Just to follow up on the resolution:
I didn't really solve this problem, but I ended up being smarter about when I draw labels in the first place. I was able to quickly determine whether I was about to draw too many characters (i.e., so many characters that, on a typical screen with a typical density of points, the labels would be too close together to read in a useful way), and in that case I simply don't draw labels at all. When drawing up to about 5000 characters at a time, there isn't a noticeable slowdown recomputing the character coordinates as described above.