Some total noob questions on volume rendering - xtk

I have been asked to provide a 3D visualisation of data from a prototype of a new type of scanner.
The data will be provided to me as a cube of voxels. Each voxel will be a data structure whose exact contents are yet to be determined.
It seems like xtk might be a good base on which to tackle this problem, but as I'm a total noob in this area, I have some pretty fundamental questions...
1) I'm having trouble finding simple explanations of the various file formats that xtk supports - which (if any) represent cubes of voxels?
2) For those, do the file formats also specify the data structure for each voxel? How would you tackle rendering a file that had an arbitrary data structure? (I.e. if, say, each voxel contained a numerical measurement of "foo" at that location - how would you go about getting xtk to render a visualization?)
Apologies for the noob questions - any pointers in the right direction would be very gratefully received.

here is a list of supported file formats: https://github.com/xtk/X/wiki/X:Fileformats
the formats under "DICOM/Volume Files" are the ones you want. we support different data types for the voxels (sint, uint, float, double etc.).
XTK converts the numerical values of each voxel to grayscale and renders them.
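That grayscale conversion can be pictured as a simple linear rescale of each voxel's scalar value into the 0-255 range. A minimal C++ sketch of the idea (illustrative only; XTK's actual internal pipeline and names are not reproduced here):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Linearly map raw scalar voxel values onto the 0-255 grayscale range,
// the kind of normalisation a viewer performs before display.
std::vector<std::uint8_t> toGrayscale(const std::vector<float>& raw) {
    std::vector<std::uint8_t> out(raw.size(), 0);
    if (raw.empty()) return out;
    auto [lo, hi] = std::minmax_element(raw.begin(), raw.end());
    float range = *hi - *lo;
    if (range <= 0.0f) return out;  // constant volume -> all black
    for (std::size_t i = 0; i < raw.size(); ++i)
        out[i] = static_cast<std::uint8_t>(255.0f * (raw[i] - *lo) / range);
    return out;
}
```

The same rescale works whatever the voxel's numeric type (sint, uint, float, double) once it is read into a common type.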

Related

Find similar (large) patterns in images

I have images with repeated patterns in them. I would like to find similar images based on having similar patterns.
The patterns are made of crosses, triangles, squares which are combined to form more complicated structures made of those "primitive shapes". For example, imagine a cross made of triangles or hexagons etc.
These decorations are found in wallpapers, carpets, kilims, wool sweaters and even some paintings.
An example image from wikipedia is here
My typical application is to find, say, sweaters with similar patterns in them, ignoring colour for the moment.
I have tried to extract SIFT descriptors (using C++ and OpenCV) and match these between two images. However, they match tiny areas e.g. the vertex of a hexagon and a triangle but ideally I would like to match the actual shapes of the triangle and rectangle.
It works a bit better if I scale down the images, but I still sense I need a different approach than SIFT and friends.
Can anyone suggest other methods for this kind of problem?
If you know the patterns you're looking for a priori, you can do old-school template matching. It may not be as trendy as deep learning techniques, but for constrained problems it can be effective.
1. Load in your N templates.
2. Load in the image you want to test against.
3. Normalise the image and templates (possibly including conversion to grayscale and some histogram and white-balance equalisation).
4. Create some M perturbations of each template (i.e., different scales, rotations, and perspective transforms).
5. Do template matching between your NxM templates and your image.
You can optimise the above slightly by doing the template matching in the Fourier domain, since you only need to do the 2D FFT of the image once. You can also precalculate and store the perturbations - or better yet: store their Fourier transforms.
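As a concrete sketch of the matching step, here is a naive sum-of-squared-differences search in plain C++ (a real implementation would use OpenCV's cv::matchTemplate or the FFT trick above; the row-major image layout and the names are my own assumptions):

```cpp
#include <cstddef>
#include <vector>

// Best match of a small template inside a larger grayscale image,
// both stored row-major, scored by sum of squared differences (SSD).
// Lower score = better match; an exact match scores 0.
struct Match { std::size_t x = 0, y = 0; double score = 0.0; };

Match bestMatch(const std::vector<double>& img, std::size_t iw, std::size_t ih,
                const std::vector<double>& tpl, std::size_t tw, std::size_t th) {
    Match best{0, 0, 1e300};
    for (std::size_t y = 0; y + th <= ih; ++y)
        for (std::size_t x = 0; x + tw <= iw; ++x) {
            double ssd = 0.0;
            for (std::size_t ty = 0; ty < th; ++ty)
                for (std::size_t tx = 0; tx < tw; ++tx) {
                    double d = img[(y + ty) * iw + (x + tx)] - tpl[ty * tw + tx];
                    ssd += d * d;
                }
            if (ssd < best.score) best = {x, y, ssd};
        }
    return best;
}
```

Running this once per perturbed template gives step 5 above; the Fourier-domain version replaces the inner two loops with a multiplication of precomputed transforms.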

How do you store voxel data?

I've been looking online and I'm impressed by the capabilities of using voxel data, especially for terrain building and manipulation. The problem is that voxels are never clearly explained on any site that I visited, nor is how to use/implement them. All I find is that voxels are volumetric data. Please provide a more complete answer: what is volumetric data? It may seem like a simple question but I'm still unsure.
Also, how would you implement voxel data? (I aim to implement this in a C++ program.) What sort of data type would you use to store the voxel data so that I can modify the contents at run time as fast as possible? I have looked online and I couldn't find anything which explained how to store the data. Lists of objects, arrays, etc.
How do you use voxels?
EDIT:
Since I'm just beginning with voxels, I'll probably start by using it to only model simple objects but I will eventually be using it for rendering terrain and world objects.
In essence, voxels are a three-dimensional extension of pixels ("volumetric pixels"), and they can indeed be used to represent volumetric data.
What is volumetric data
Mathematically, volumetric data can be seen as a three-dimensional function F(x,y,z). In many applications this function is a scalar function, i.e., it has one scalar value at each point (x,y,z) in space. For instance, in medical applications this could be the density of certain tissues. To represent this digitally, one common approach is to simply make slices of the data: imagine images in the (X,Y)-plane, and shift the z-value to obtain a stack of images. If the slices are close to each other, the images can be displayed in a video sequence, as for instance seen on the wiki-page for MRI-scans (https://upload.wikimedia.org/wikipedia/commons/transcoded/4/44/Structural_MRI_animation.ogv/Structural_MRI_animation.ogv.360p.webm). As you can see, each point in space has one scalar value which is represented as a grayscale.
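That slice-stack view can be sketched in C++ as pulling a fixed-z plane out of a volume stored as one flat array (the row-major layout and the names are illustrative assumptions):

```cpp
#include <cstddef>
#include <vector>

// Extract the z-th axial slice from a volume stored as a flat,
// row-major array (x fastest, then y, then z). This is the
// "stack of images" view of volumetric data described above.
std::vector<float> axialSlice(const std::vector<float>& vol,
                              std::size_t nx, std::size_t ny, std::size_t nz,
                              std::size_t z) {
    std::vector<float> slice(nx * ny);
    for (std::size_t y = 0; y < ny; ++y)
        for (std::size_t x = 0; x < nx; ++x)
            slice[y * nx + x] = vol[z * nx * ny + y * nx + x];
    return slice;
}
```

Stepping z from 0 to nz-1 and displaying each returned slice reproduces the video-sequence effect mentioned above.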
Instead of slices or a video, one can also represent this data using voxels. Instead of dividing a 2D plane in a regular grid of pixels, we now divide a 3D area in a regular grid of voxels. Again, a scalar value can be given to each voxel. However, visualizing this is not as trivial: whereas we could just give a gray value to pixels, this does not work for voxels (we would only see the colors of the box itself, not of its interior). In fact, this problem is caused by the fact that we live in a 3D world: we can look at a 2D image from a third dimension and completely observe it; but we cannot look at a 3D voxel space and observe it completely as we have no 4th dimension to look from (unless you count time as a 4th dimension, i.e., creating a video).
So we can only look at parts of the data. One way, as indicated above, is to make slices. Another way is to look at so-called "iso-surfaces": we create surfaces in the 3D space for which each point has the same scalar value. For a medical scan, this allows to extract for instance the brain-part from the volumetric data (not just as a slice, but as a 3D model).
Finally, note that surfaces (meshes, terrains, ...) are not volumetric, they are 2D-shapes bent, twisted, stretched and deformed to be embedded in the 3D space. Ideally they represent the border of a volumetric object, but not necessarily (e.g., terrain data will probably not be a closed mesh). A way to represent surfaces using volumetric data, is by making sure the surface is again an iso-surface of some function. As an example: F(x,y,z) = x^2 + y^2 + z^2 - R^2 can represent a sphere with radius R, centered around the origin. For all points (x',y',z') of the sphere, F(x',y',z') = 0. Even more, for points inside the sphere, F < 0, and for points outside of the sphere, F > 0.
A way to "construct" such a function is by creating a distance map, i.e., creating volumetric data such that every point F(x,y,z) indicates the distance to the surface. Of course, the surface is the collection of all the points for which the distance is 0 (so, again, the iso-surface with value 0 just as with the sphere above).
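The sphere example and its distance-map variant can be written down directly; a tiny sketch (note that F itself is not a true distance, only its zero set matches the surface, so the distance-map version takes the square root first):

```cpp
#include <cmath>

// The implicit sphere F(x,y,z) = x^2 + y^2 + z^2 - R^2 from the text:
// F == 0 on the surface, F < 0 inside, F > 0 outside.
double sphereF(double x, double y, double z, double R) {
    return x * x + y * y + z * z - R * R;
}

// The corresponding signed distance to the surface (negative inside),
// which is what a distance map would store at each voxel.
double sphereDistance(double x, double y, double z, double R) {
    return std::sqrt(x * x + y * y + z * z) - R;
}
```

Sampling either function on a regular grid gives volumetric data whose iso-surface at value 0 is the sphere.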
How to implement
As mentioned by others, this indeed depends on the usage. In essence, the data can be given in a 3D matrix. However, this is huge! If you want the resolution doubled, you need 8x as much storage, so in general this is not an efficient solution. This will work for smaller examples, but does not scale very well.
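A minimal dense-grid sketch in C++, mostly to make the 8x growth concrete (the class and names are my own):

```cpp
#include <cstddef>
#include <vector>

// Dense voxel grid in a single flat allocation: trivial and fast to
// index, but storage grows cubically -- doubling the resolution in
// each axis needs 8x the memory, as noted above.
class VoxelGrid {
public:
    explicit VoxelGrid(std::size_t n) : n_(n), data_(n * n * n, 0.0f) {}
    float get(std::size_t x, std::size_t y, std::size_t z) const {
        return data_[(z * n_ + y) * n_ + x];
    }
    void set(std::size_t x, std::size_t y, std::size_t z, float v) {
        data_[(z * n_ + y) * n_ + x] = v;
    }
    std::size_t cells() const { return data_.size(); }
private:
    std::size_t n_;
    std::vector<float> data_;
};

// Quick self-check helper: write one voxel and read it back.
float writeReadDemo(std::size_t n) {
    VoxelGrid g(n);
    g.set(1, 2, 3, 5.0f);
    return g.get(1, 2, 3);
}
```

Comparing cells() at side 4 versus side 8 shows the eightfold jump directly.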
An octree structure is, afaik, the most common structure to store this. Many implementations and optimizations for octrees exist, so have a look at what can be (re)used. As pointed out by Andreas Kahler, sparse voxel octrees are a recent approach.
Octrees allow easier navigation to neighbouring cells, parent cells, child cells, ... (I am assuming here that the concept of octrees (or quadtrees in 2D) is known?) However, if many leaf cells are located at the finest resolutions, this data structure will come with a huge overhead! So, is this better than a 3D array? It somewhat depends on what volumetric data you want to work with, and what operations you want to perform.
If the data is used to represent surfaces, octrees will in general be much better: as stated before, surfaces are not really volumetric, hence will not require many voxels to have relevant data (hence: "sparse" octrees). Referring back to the distance maps, the only relevant data are the points having value 0. The other points can have any value, but these do not matter (in some cases, the sign is still considered, to denote "interior" and "exterior", but the value itself is not required if only the surface is needed).
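A minimal sparse octree along these lines might look as follows (a sketch under the assumption of integer voxel coordinates in a cube of side 2^depth; real implementations add node pooling, iteration, and neighbour queries):

```cpp
#include <array>
#include <cstddef>
#include <memory>

// Minimal sparse octree: children are allocated only along the paths
// that actually contain data, so empty regions cost nothing.
struct OctreeNode {
    std::array<std::unique_ptr<OctreeNode>, 8> child;
    bool filled = false;  // meaningful at leaf level only
};

// Mark the voxel (x,y,z) as filled; one bit of each coordinate picks
// the child octant at every level of the descent.
void insert(OctreeNode& node, std::size_t x, std::size_t y, std::size_t z,
            unsigned depth) {
    if (depth == 0) { node.filled = true; return; }
    unsigned bit = depth - 1;
    std::size_t i = ((x >> bit) & 1) | (((y >> bit) & 1) << 1)
                  | (((z >> bit) & 1) << 2);
    if (!node.child[i]) node.child[i] = std::make_unique<OctreeNode>();
    insert(*node.child[i], x, y, z, bit);
}

bool contains(const OctreeNode& node, std::size_t x, std::size_t y,
              std::size_t z, unsigned depth) {
    if (depth == 0) return node.filled;
    unsigned bit = depth - 1;
    std::size_t i = ((x >> bit) & 1) | (((y >> bit) & 1) << 1)
                  | (((z >> bit) & 1) << 2);
    return node.child[i] && contains(*node.child[i], x, y, z, bit);
}

// Demo: insert one voxel in an 8x8x8 space (depth 3) and query it.
bool insertContainsDemo() {
    OctreeNode root;
    insert(root, 3, 1, 2, 3);
    return contains(root, 3, 1, 2, 3) && !contains(root, 0, 0, 0, 3);
}
```

Note how a single filled voxel allocates only depth-many nodes instead of the full 8x8x8 grid, which is exactly the "sparse" payoff described above.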
How to use
If by "use", you are wondering how to render them, then you can have a look at "marching cubes" and its optimizations. MC will create a triangle mesh from volumetric data, to be rendered in any classical way. Instead of translating to triangles, you can also look at volume rendering to render a "3D sampled data set" (i.e., voxels) as such (https://en.wikipedia.org/wiki/Volume_rendering). I have to admit that I am not that familiar with volume rendering, so I'll leave it at just the wiki-link for now.
Voxels are just 3D pixels, i.e. 3D space regularly subdivided into blocks.
How do you use them? It really depends on what you are trying to do. A ray casting terrain game engine? A medical volume renderer? Something completely different?
Plain 3D arrays might be the best for you, but it is memory intensive. As BWG pointed out, octree is another popular alternative. Search for Sparse Voxel Octrees for a more recent approach.
In popular usage during the 90's and 00's, 'voxel' could mean somewhat different things, which is probably one reason you have been finding it hard to find consistent information. In technical imaging literature, it means 3D volume element. Oftentimes, though, it is used to describe what is somewhat-more-clearly termed a high-detail raycasting engine (as opposed to the low-detail raycasting engine in Doom or Wolfenstein). A popular multi-part tutorial lives in the Flipcode archives. Also check out this brief one by Jacco.
There are many old demos you can find out there that should run under emulation. They are good for inspiration and dissection, but tend to use a lot of assembly code.
You should think carefully about what you want to support with your engine: car-racing, flying, 3D objects, planets, etc., as these constraints can change the implementation of your engine. Oftentimes, there is not a data structure, per se, but the terrain heightfield is represented procedurally by functions. Otherwise, you can use an image as a heightfield. For performance, when rendering to the screen, think about level-of-detail, in other words, how many actual pixels will be taken up by the rendered element. This will determine how much sampling you do of the heightfield. Once you get something working, you can think about ways you can blend pixels over time and screen space to make them look better, while doing as little rendering as possible.
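The idea of a procedural heightfield, i.e. terrain defined by a function rather than stored data, can be as simple as the following (the particular sum-of-sines formula is purely illustrative):

```cpp
#include <cmath>

// Procedural heightfield: terrain height computed on demand from (x,z)
// rather than read from an image or array. Amplitudes and frequencies
// here are arbitrary; real engines layer noise functions instead.
double terrainHeight(double x, double z) {
    return 4.0 * std::sin(0.1 * x) + 2.0 * std::cos(0.15 * z)
         + 0.5 * std::sin(0.7 * x + 0.3 * z);
}
```

Level-of-detail then becomes a question of how densely you sample this function per screen pixel, as the answer above suggests.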

Interpolating the color points throughout the image

I have been trying to solve this problem for several days now, and I am officially stuck; I need to draw the topological plot of an EEG signal on the brain, and I didn't find any C++ libraries that already do so. There is such a library in Matlab, but that is considered a last resort; for now it is preferred to do all the processing in C++.
Basically what I need is a way to interpolate the color points in image 1 in order to produce image 2. They belong to different EEG diagrams, which is why they do not match.
My question is: is there any commonly known algorithm that will allow me to interpolate the points in image 1 in order to produce image 2?
I like the "Irregular grid (scattered data)" methods suggested by @Pavel in a comment.
To implement a simple but fast rendering solution where each output color is based on only three source colors, you could do a Delaunay triangulation and then use Gouraud shading to render the triangles using the known vertex colors.
Your sample image 2 is "softer" than that so I suspect it uses a higher-order interpolation scheme.
Since the interpolation method influences interpretation of the data, be careful to select one that reduces incorrect interpretations.
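The per-triangle interpolation in the Gouraud approach above boils down to barycentric weights. A small C++ sketch (2D points, colors as RGB triples in [0,1]; all names are my own):

```cpp
#include <array>

// Barycentric interpolation of per-vertex colors inside one triangle:
// the core of Gouraud-shading a Delaunay triangulation of the
// electrode positions.
struct P2 { double x, y; };

// Weights (l1, l2, l3) of point p with respect to triangle (a, b, c);
// they sum to 1, and each equals 1 at its own vertex.
std::array<double, 3> barycentric(P2 a, P2 b, P2 c, P2 p) {
    double det = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
    double l1 = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / det;
    double l2 = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / det;
    return {l1, l2, 1.0 - l1 - l2};
}

// Blend the three vertex colors ca, cb, cc at point p.
std::array<double, 3> shade(P2 a, P2 b, P2 c, P2 p,
                            const std::array<double, 3>& ca,
                            const std::array<double, 3>& cb,
                            const std::array<double, 3>& cc) {
    auto w = barycentric(a, b, c, p);
    std::array<double, 3> out{};
    for (int i = 0; i < 3; ++i)
        out[i] = w[0] * ca[i] + w[1] * cb[i] + w[2] * cc[i];
    return out;
}
```

Rendering every Delaunay triangle this way gives the piecewise-linear result; the softer look of image 2 would need a higher-order scheme, as noted above.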

Voxels... Honestly, I need to know where to begin

Okay, I understand that voxels are basically a volumetric version of a pixel.
After that, I have no idea what to even look for.
Googling doesn't show any tutorials, I can't find a book on it anywhere, I can't find anything even having to do with the basic idea of what a voxel really is.
I know much of the C++ library, and have the basics of OpenGL down.
Can someone point me in the right direction?
EDIT: I guess I'm just confused on how to implement them? Sorry for being a pain, it's just that I can't really find anything that I can easily correlate to... I think I was imagining a voxel being relevant to a vector in which you can actually store data.
So a voxel can be represented as ANY 3D shape? For example, say I wanted the shape to be a cylinder. Is this possible, or do they have to fit together like cubes?
Minecraft is a good example of using voxels. In Minecraft each voxel is a cube.
To see a C++ example you can look at the Minecraft clone Minetest-c55. This is open source, so you can read all of the source code to see how it's done.
Being cubes is not a requirement of voxels. They could be pyramids or any other shape that can fit together.
I suspect that you are looking for information on Volume Rendering techniques (since you mention voxels and OpenGL). You can find plenty of simple rendering code in C++, and more advanced OpenGL shaders as well with a little searching on that term.
In the simplest possible implementation, a voxel space is just a 3-dimensional array. For solids you could use a single bit per voxel: 1 == filled and 0 == empty. You use implicit formulas to make shapes, e.g. a sphere is all the voxels within a given radius from the center voxel.
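That bit-per-voxel idea with an implicit sphere formula can be sketched as follows (the grid layout and names are my own choices):

```cpp
#include <cstddef>
#include <vector>

// One bit per voxel: a solid is an occupancy grid, and shapes are
// filled from an implicit formula. Here a sphere of radius r voxels
// centered in an n x n x n grid.
std::vector<bool> solidSphere(std::size_t n, double r) {
    std::vector<bool> v(n * n * n, false);
    double c = (n - 1) / 2.0;  // grid center
    for (std::size_t z = 0; z < n; ++z)
        for (std::size_t y = 0; y < n; ++y)
            for (std::size_t x = 0; x < n; ++x) {
                double dx = x - c, dy = y - c, dz = z - c;
                if (dx * dx + dy * dy + dz * dz <= r * r)
                    v[(z * n + y) * n + x] = true;
            }
    return v;
}

// Count filled voxels (std::vector<bool> packs bits internally).
std::size_t countFilled(const std::vector<bool>& v) {
    std::size_t c = 0;
    for (bool b : v) c += b ? 1 : 0;
    return c;
}
```

In a 3x3x3 grid with radius 1 this fills the center voxel and its six face neighbours, the smallest discrete "sphere".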
Voxels are not really compatible with polygon-based 3d rendering, but they are widely used in image analysis, medical imaging, computer vision...
Let's Make a Voxel Engine on Google Sites might help one to get started creating a voxel based engine:
https://sites.google.com/site/letsmakeavoxelengine/
In addition to that there are presentations of the results on Youtube worth checking:
http://www.youtube.com/watch?v=nH_bHqury9Q&list=PL3899B2CEE4CD4687
Typically a voxel is a position in some 3D space that has a volume (analogous to the area that a pixel covers).
Just like a pixel in an image contains some scalar value (grayscale) or vector of values (as in a color image, where the vector holds either the red, green, and blue components, or the hue, saturation, and value components), the entries for a voxel can hold some scalar or vector of values.
A couple of natural examples of volumetric images that contain voxels are 3D medical images such as CT, MRI, 3D ultrasound, etc.
Mathematically speaking a 3D image is a function from some voxel space to some set of numbers.
Look for voxlap or try this http://www.html5code.com/gallery/voxel-rain/ or write your own code. Yes, a voxel can be reduced to a 3D coordinate (which can be implied by its position in the file structure) and a graphical representation which can be anything (cube, sphere, picture, color, ...). Just like a pixel is a 2D coordinate with a color index.
You only need to parse your file and render the corresponding voxels. Sadly, there is no 'right' file format, although voxlap's file formats seem pretty neat.
good luck

Reconstruction of stereo image from single view images

How can I reconstruct an image from the stereo image pairs using OpenCV?
This is not necessarily an easy-to-solve problem. The thing is that both images store almost the same information, but from a slightly different perspective (angle and distance), so you have one perspective for each of the two stereo optics. The only way to restore this is if you knew what these perspectives were, e.g. the relative position vector between the two viewpoints and the angle of each; then you could create a mapping from a pixel in one of the images to the other.
The color of this (mapped) pixel ought to be the same, but as older stereo-optic systems mapped to blue and red, you might have different values and thus have gained information by doing this. Still, without these perspectives, you will need to correlate both pictures with each other and do quite complex image processing. I would suggest using scholar.google.com; unfortunately I failed to find anything useful there. If you also can't find it, start a PhD ;)
Anyone who does know an algorithm or method to somehow restore such images, please let me know :) I am very curious about this as well.