How do you store voxel data? - c++

I've been looking online and I'm impressed by the capabilities of using voxel data, especially for terrain building and manipulation. The problem is that voxels are never clearly explained on any site that i visited or how to use/implement them. All i find is that voxels are volumetric data. Please provide a more complete answer; what is volumetric data. It may seem like a simple question but I'm still unsure.
Also, how would you implement voxel data? (I aim to implement this into a c++ program.) What sort of data type would you use to store the voxel data to enable me to modify the contents at run time as fast as possible. I have looked online and i couldn't find anything which explained how to store the data. Lists of objects, arrays, ect...
How do you use voxels?
EDIT:
Since I'm just beginning with voxels, I'll probably start by using it to only model simple objects but I will eventually be using it for rendering terrain and world objects.

In essence, voxels are a three-dimensional extension of pixels ("volumetric pixels"), and they can indeed be used to represent volumetric data.
What is volumetric data
Mathematically, volumetric data can be seen as a three-dimensional function F(x,y,z). In many applications this function is a scalar function, i.e., it has one scalar value at each point (x,y,z) in space. For instance, in medical applications this could be the density of certain tissues. To represent this digitally, one common approach is to simply make slices of the data: imagine images in the (X,Y)-plane, and shifting the z-value to have a number of images. If the slices are close to eachother, the images can be displayed in a video sequence as for instance seen on the wiki-page for MRI-scans (https://upload.wikimedia.org/wikipedia/commons/transcoded/4/44/Structural_MRI_animation.ogv/Structural_MRI_animation.ogv.360p.webm). As you can see, each point in space has one scalar value which is represented as a grayscale.
Instead of slices or a video, one can also represent this data using voxels. Instead of dividing a 2D plane in a regular grid of pixels, we now divide a 3D area in a regular grid of voxels. Again, a scalar value can be given to each voxel. However, visualizing this is not as trivial: whereas we could just give a gray value to pixels, this does not work for voxels (we would only see the colors of the box itself, not of its interior). In fact, this problem is caused by the fact that we live in a 3D world: we can look at a 2D image from a third dimension and completely observe it; but we cannot look at a 3D voxel space and observe it completely as we have no 4th dimension to look from (unless you count time as a 4th dimension, i.e., creating a video).
So we can only look at parts of the data. One way, as indicated above, is to make slices. Another way is to look at so-called "iso-surfaces": we create surfaces in the 3D space for which each point has the same scalar value. For a medical scan, this allows to extract for instance the brain-part from the volumetric data (not just as a slice, but as a 3D model).
Finally, note that surfaces (meshes, terrains, ...) are not volumetric, they are 2D-shapes bent, twisted, stretched and deformed to be embedded in the 3D space. Ideally they represent the border of a volumetric object, but not necessarily (e.g., terrain data will probably not be a closed mesh). A way to represent surfaces using volumetric data, is by making sure the surface is again an iso-surface of some function. As an example: F(x,y,z) = x^2 + y^2 + z^2 - R^2 can represent a sphere with radius R, centered around the origin. For all points (x',y',z') of the sphere, F(x',y',z') = 0. Even more, for points inside the sphere, F < 0, and for points outside of the sphere, F > 0.
A way to "construct" such a function is by creating a distance map, i.e., creating volumetric data such that every point F(x,y,z) indicates the distance to the surface. Of course, the surface is the collection of all the points for which the distance is 0 (so, again, the iso-surface with value 0 just as with the sphere above).
How to implement
As mentioned by others, this indeed depends on the usage. In essence, the data can be given in a 3D matrix. However, this is huge! If you want the resolution doubled, you need 8x as much storage, so in general this is not an efficient solution. This will work for smaller examples, but does not scale very well.
An octree structure is, afaik, the most common structure to store this. Many implementations and optimizations for octrees exist, so have a look at what can be (re)used. As pointed out by Andreas Kahler, sparse voxel octrees are a recent approach.
Octrees allow easier navigating to neighbouring cells, parent cells, child cells, ... (I am assuming now that the concept of octrees (or quadtrees in 2D) are known?) However, if many leaf cells are located at the finest resolutions, this data structure will come with a huge overhead! So, is this better than a 3D array: it somewhat depends on what volumetric data you want to work with, and what operations you want to perform.
If the data is used to represent surfaces, octrees will in general be much better: as stated before, surfaces are not really volumetric, hence will not require many voxels to have relevant data (hence: "sparse" octrees). Refering back to the distance maps, the only relevant data are the points having value 0. The other points can also have any value, but these do not matter (in some cases, the sign is still considered, to denote "interior" and "exterior", but the value itself is not required if only the surface is needed).
How to use
If by "use", you are wondering how to render them, then you can have a look at "marching cubes" and its optimizations. MC will create a triangle mesh from volumetric data, to be rendered in any classical way. Instead of translating to triangles, you can also look at volume rendering to render a "3D sampled data set" (i.e., voxels) as such (https://en.wikipedia.org/wiki/Volume_rendering). I have to admit that I am not that familiar with volume rendering, so I'll leave it at just the wiki-link for now.

Voxels are just 3D pixels, i.e. 3D space regularly subdivided into blocks.
How do you use them? It really depends on what you are trying to do. A ray casting terrain game engine? A medical volume renderer? Something completely different?
Plain 3D arrays might be the best for you, but it is memory intensive. As BWG pointed out, octree is another popular alternative. Search for Sparse Voxel Octrees for a more recent approach.

In popular usage during the 90's and 00's, 'voxel' could mean somewhat different things, which is probably one reason you have been finding it hard to find consistent information. In technical imaging literature, it means 3D volume element. Oftentimes, though, it is used to describe what is somewhat-more-clearly termed a high-detail raycasting engine (as opposed to the low-detail raycasting engine in Doom or Wolfenstein). A popular multi-part tutorial lives in the Flipcode archives. Also check out this brief one by Jacco.
There are many old demos you can find out there that should run under emulation. They are good for inspiration and dissection, but tend to use a lot of assembly code.
You should think carefully about what you want to support with your engine: car-racing, flying, 3D objects, planets, etc., as these constraints can change the implementation of your engine. Oftentimes, there is not a data structure, per se, but the terrain heightfield is represented procedurally by functions. Otherwise, you can use an image as a heightfield. For performance, when rendering to the screen, think about level-of-detail, in other words, how many actual pixels will be taken up by the rendered element. This will determine how much sampling you do of the heightfield. Once you get something working, you can think about ways you can blend pixels over time and screen space to make them look better, while doing as little rendering as possible.

Related

Tree to optimize OpenGL Pointcloud

I would like to optimize my OpenGL program.
It consists in loading a vector of 3D points then applying shaders to them.
But I have billions of points, and my FPS drop to 2 when I try to see the points.
Actually I'm sending every points, and I believe this is what is too much for my computer.
Is making a KD-tree (for example) to store my points and then send to my shaders only the points contained in the viewing frustum an efficient way to optimize my program?
And, since my goal isn't to do research of points, but only use points in the viewing frustum, which tree would be better? Octree? KD-tree?
using trees is definitely a good way to deal with large point clouds. i worked on a point cloud rendering software for a while and we used kd-trees for rendering, and regular voxel grids for analysis.
i don't exactly remember the reasons for/against using an octree, but i guess it depends on the density distribution of your clouds: if you have large point clouds with some small high-density areas, you would have lots of empty cells in the octree, whether for evenly-distributed point clouds, octrees might be simpler. We also had 2.5D maps (from aerial scans: several square kilometers of terrain but only little devation in height) where we used quad trees for some tasks.
also we did not render all the points that were in the frustum, because that degenerates e.g. when you zoom all the way out so the whole point cloud is in the frustum again.
instead, all the inner (non-leaf) nodes in the kd-tree contained a "representative" selection of the points in their children, and we rendered the tree only up to a depth that seemed appropriate depending on the distance from the camera to the bounding volume of each node. this way, for areas that are far away from the camera, you render a thinned out version of the point cloud, an LOD of sorts.
if you want to go fancy: we actually maintained a "front-line" of nodes, that is a line or cut from left to right through the tree up to which all nodes should be rendered. this way we did not need to check each node but only the ones in the cut whether their status ("rendered" or "not rendered") should change. additionally, we had out-of-core point clouds which where larger that the (V)RAM, where we allowed the front to only move farther down the tree if the parent node had been loaded from disk.
kd-trees are a bit harder to build because you need to determine where the split plane is located. for this we used a first pass where we read the locations of all the points in the node, determining the split plane, and then a second pass doing the actual split.
i think we had 4096 points per node (i think we experimented with more and 8k or 16k were fine as well), and did one draw call per node. as long as your point cloud fits in VRAM you can simply put all of it in one large buffer and do draw calls with offsets into that buffer.

GLSL truncated signed distance representation (TSDF) implementation

I am looking forward to implement a model reconstruction of RGB-D images. Preferred on mobile phones. For that I read, it is all done with an TSDF-representation. I read a lot of papers now over hierarchical structures and other ideas to speed this up, but my problem is, that I still hat no clue how to actually implement this representation.
If I have a volume grid of size n, so n x n x n and I want to store in each voxel the signed distance, weight and color information. My only guess is, that I have to build a discrete set of points, for each voxel position. And with GLSL "paint" all these points and calculate the nearest distance. But that don't seem quite good or efficient to calculate this n^3 times.
How can I imagine to implement such a TSDF-representation?
The problem is, my only idea is to render the voxel grid to store in the data of signed distances. But for each depth map I have to render again all voxels and calculate all distances. Is there any way to render it the other way around?
So can't I render the points of the depth map and store informations in the voxel grid?
How is the actual state of art to render such a signed distance representation in an efficient way?
You are on the right track, it's an ambitious project but very cool if you can do it.
First, it's worth getting a good idea how these things work. The original paper identifying a TSDF was by Curless and Levoy and is reasonably approachable - a copy is here . There are many later variations but this is the starting point.
Second, you will need to create nxnxn storage as you have said. This very quickly gets big. For example, if you want 400x400x400 voxels with RGB data and floating point values for distance and weight then that will be 768MB of GPU memory - you may want to check how much GPU memory you have available on a mobile device. Yup, I said GPU because...
While you can implement toy solutions on CPU, you really need to get serious about GPU programming if you want to have any kind of performance. I built an early version on an Intel i7 CPU laptop. Admittedly I spent no time optimising it but it was taking tens of seconds to integrate a single depth image. If you want to get real time (30Hz) then you'll need some GPU programming.
Now you have your TSFD data representation, each frame you need to do this:
1. Work out where the camera is with respect to the TSDF in world coordinates.
Usually you assume that you are the origin at time t=0 and then measure your relative translation and rotation with regard to the previous frame. The most common way to do this is with an algorithm called iterative closest point (ICP) You could implement this yourself or use a library like PCL though I'm not sure if they have a mobile version. I'd suggest you get started without this by just keeping your camera and scene stationary and build up to movement later.
2. Integrate the depth image you have into the TSDF This means updating the TSDF with the next depth image. You don't throw away the information you have to date, you merge the new information with the old.
You do this by iterating over every voxel in your TSDF and :
a) Work out the distance from the voxel centre to your camera
b) Project the point into your depth camera's image plane to get a pixel coordinate (using the extrinsic camera position obtained above and the camera intrinsic parameters which are easily searchable for Kinect)
c) Lookup up the depth in your depth map at that pixel coordinate
d) Project this point back into space using the pixel x and y coords plus depth and your camera properties to get a 3D point corresponding to that depth
e) Update the value for the current voxel's distance with the value distance_from_step_d - distance_from_step_a (update is usually a weighted average of the existing value plus the new value).
You can use a similar approach for the voxel colour.
Once you have integrated all of your depth maps into the TSDF, you can visualise the result by either raytracing it or extracting the iso surface (3D mesh) and viewing it in another package.
A really helpful paper that will get you there is here. This is a lab report by some students who actually implemented Kinect Fusion for themselves on a PC. It's pretty much a step by step guide though you'll still have to learn CUDA or similar to implement it
You can also check out my source code on GitHub for ideas though all normal disclaimers about fitness for purpose apply.
Good luck!
After I posted my other answer I thought of another approach which seems to match the second part of your question but it definitely is not reconstruction and doesn't involve using a TSDF. It's actually a visualisation but it is a lot simpler :)
Each frame you get an RGB and a Depth image. Assuming that these images are registered, that is the pixel at (x,y) in the RGB image represents the same pixel as that at (x,y) in the depth image then you can create a dense point cloud coloured using the RGB data. To do this you would:
For every pixel in the depth map
a) Use the camera's intrinsic matrix (K), the pixel coordinates and the depth value in the map at that point to project the point into a 3D point in camera coordinates
b) Associate the RGB value at the same pixel with that point in space
So now you have an array of (probably 640x480) structures like {x,y,z,r,g,b}
You can render these using on GLES just by creating a set of vertices and rendering points. There's a discussion on how to do this here
With this approach you throw away the data every frame and redo from scratch. Importantly, you don't get a reconstructed surface, and you don't use a TSDF. You can get pretty results but it's not reconstruction.

Clarification about octrees and how they work in a Voxel world

I read about octrees and I didn't fully understand how they world work/be implemented in a voxel world where the octree's purpose is to lower the amount of voxels you would render by connecting repeating voxels to one big "voxel".
Here are the questions I want clarification about:
What type of data structure would you use? How could turn a 3-D array of voxels into and array that has different sized voxels that take multiple locations in the array?
What are the nodes and what are they used for?
Does the octree connect the voxels so there are ONLY square shapes or could it be a rectangle or a L shape or an entire Y column of voxels or what?
Do the octrees really improve performance of a voxel game? If so usually by how much?
Quick answers:
A tree:Each node has 8 children, top-back-left, top-back-right, etc. down to a certain levelThe code for this can get quite complex, especially if the voxels can change at runtime.
The type of voxel (colour, material, a list of items)
yep. Cubes onlyMore specifically 1x1, 2x2, 4x4, 8x8 etc. It must be an entire node.If you really want to you could define some sort of patterns, but its no longer a octdtree.
yeah, but it depends on your data. Imagine describing 256 identical blocks individually, or describing it once (like air in Minecraft)
I'd start with trying to understand quadtrees first. You can do that on paper, or make a test program with it. You'll answer these questions yourself if you experiment
An octree done correctly can also help you with neighbour searches which enable you to determine if a face is considered to be "visible" (ie so you end up with a hull of voxels visible). Once you've established your octree you then use this to store your XYZ coords which you then extract into a single array. You then feed this array into your VERTEX Buffer (GL solutions require this) which you can then render in chunk forms as needed (as the camera moves forward etc).
Octree's also by there very nature collapse Cubes into bigger ones if there are ones of the same type... much like Tetris does when you have colors/shapes that "fit" one another.. this in turn can reduce your vertex count and at render you're really drawing a combination of squares and rectangles
If done correctly you will end up with a lot of chunks that only have the outfacing "faces" visible in the vertex buffers. Now you then have to also build your own Occlusion Culling algorithm which then reduces the visibility ontop of this resulting in less rendering required.
I did an example here:
https://vimeo.com/71330826
notice how the outside is only being rendered but the chunks themselves go all the way down to the bottom even though the chunks depth faces should cancel each other out? (needs more optimisation). Also note how the camera turns around and the faces are removed from the rendering buffers?

Voxels... Honestly, I need to know where to begin

Okay, i understand that voxels are just basically a volumetric version of a pixel.
After that, I have no idea what to even look for.
Googling doesn't show any tutorials, I can't find a book on it anywhere, I can't find anything even having to do with the basic idea of what a voxel really is.
I know much of the C++ library, and have the basics of OpenGL down.
Can someone point me in the right direction?
EDIT: I guess I'm just confused on how to implement them? Sorry for being a pain, it's just that I can't really find anything that I can easily correlate to... I think I was imagining a voxel being relevant to a vector in which you can actually store data.
a voxel can be represented as ANY 3D shape? For example, say I wanted the shape to be a cylinder. Is this possible, or do they have to link like cubes?
Minecraft is a good example of using voxels. In Minecraft each voxel is a cube.
To see a C++ example you can look at the Minecraft clone Minetest-c55. This is open source so you can read all of the source code to see how its done.
Being cubes is not a requirement of voxels. They could be pyramids or any other shape that can fit together.
I suspect that you are looking for information on Volume Rendering techniques (since you mention voxels and OpenGL). You can find plenty of simple rendering code in C++, and more advanced OpenGL shaders as well with a little searching on that term.
In the simplest possible implementation, a voxel space is just a 3 dimensional Array. For solids you could use a single bit per voxel: 1 == filled and 0 == empty. You use implicit formulas to make shapes, e.g. A sphere is all the voxels within a radius from the center voxel.
Voxels are not really compatible with polygon-based 3d rendering, but they are widely used in image analysis, medical imaging, computer vision...
Let's Make a Voxel Engine on Google Sites might help one to get started creating a voxel based engine:
https://sites.google.com/site/letsmakeavoxelengine/
In addition to that there are presentations of the results on Youtube worth checking:
http://www.youtube.com/watch?v=nH_bHqury9Q&list=PL3899B2CEE4CD4687
Typically a voxel is a position in some 3D space that has a volume (analogous to the area that a pixel contains.
Just like in an image, where a pixel contains some scalar value (grayscale) or vector of values (like in a color image where the vector is either the red, green, and blue components, or hue, saturation, and value components) the entries for a voxel can have some scale or vector of values.
A couple natural examples of volumetric images that contains voxels are 3D medical imagines such as CT, MRI, 3D ultrasound etc.
Mathematically speaking a 3D image is a function from some voxel space to some set of numbers.
look for voxlap or try this http://www.html5code.com/gallery/voxel-rain/ or write your own code. Yes, a voxel can be reduced to a 3d coordinate (which can be implied by it's position in the file structure) and a graphical representation which can be anything (cube, sphere, picture, color ...). Just like a pixel is a 2D coordinate with a color index.
You only need to parse your file and render the corresponding voxels. Sadly, there is no 'right' file format although voxlaps file formats seem pretty neat.
good luck

Implementing Marching Cube Algorithm?

From My last question: Marching Cube Question
However, i am still unclear as in:
how to create imaginary cube/voxel to check if a vertex is below the isosurface?
how do i know which vertex is below the isosurface?
how does each cube/voxel determines which cubeindex/surface to use?
how draw surface using the data in triTable?
Let's say i have a point cloud data of an apple.
how do i proceed?
can anybody that are familiar with Marching Cube help me?
i only know C++ and opengl.(c is a little bit out of my hand)
First of all, the isosurface can be represented in two ways. One way is to have the isovalue and per-point scalars as a dataset from an external source. That's how MRI scans work. The second approach is to make an implicit function F() which takes a point/vertex as its parameter and returns a new scalar. Consider this function:
float computeScalar(const Vector3<float>& v)
{
return std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
}
Which would compute the distance from the point and to the origin for every point in your scalar field. If the isovalue is the radius, you just figured a way to represent a sphere.
This is because |v| <= R is true for all points inside a sphere, or which lives on its interior. Just figure out which vertices are inside the sphere and which ones are on the outside. You want to use the less or greater-than operators because a volume divides the space in two. When you know which points in your cube are classified as inside and outside, you also know which edges the isosurface intersects. You can end up with everything from no triangles to five triangles. The position of the mesh vertices can be computed by interpolating across the intersected edges to find the actual intersection point.
If you want to represent say an apple with scalar fields, you would either need to get the source data set to plug in to your application, or use a pretty complex implicit function. I recommend getting simple geometric primitives like spheres and tori to work first, and then expand from there.
1) It depends on yoru implementation. You'll need to have a data structure where you can lookup the values at each corner (vertex) of the voxel or cube. This can be a 3d image (ie: an 3D texture in OpenGL), or it can be a customized array data structure, or any other format you wish.
2) You need to check the vertices of the cube. There are different optimizations on this, but in general, start with the first corner, and just check the values of all 8 corners of the cube.
3) Most (fast) algorithms create a bitmask to use as a lookup table into a static array of options. There are only so many possible options for this.
4) Once you've made the triangles from the triTable, you can use OpenGL to render them.
Let's say i have a point cloud data of an apple. how do i proceed?
This isn't going to work with marching cubes. Marching cubes requires voxel data, so you'd need to use some algorithm to put the point cloud of data into a cubic volume. Gaussian Splatting is an option here.
Normally, if you are working from a point cloud, and want to see the surface, you should look at surface reconstruction algorithms instead of marching cubes.
If you want to learn more, I'd highly recommend reading some books on visualization techniques. A good one is from the Kitware folks - The Visualization Toolkit.
You might want to take a look at VTK. It has a C++ implementation of Marching Cubes, and is fully open sourced.
As requested, here is some sample code implementing the Marching Cubes algorithm (using JavaScript/Three.js for the graphics):
http://stemkoski.github.com/Three.js/Marching-Cubes.html
For more details on the theory, you should check out the article at
http://paulbourke.net/geometry/polygonise/