OpenGL pixel precise rasterizing on different archs

For an application I'm developing I need to be able to:
draw lines of different widths and colours
draw solid color filled triangles
draw textured (no alpha) quads
Very easy...but...
All coordinates are integers in pixel space and, very importantly, reading back all the pixels from the framebuffer (glReadPixels)
on two different machines, with two different graphics cards, running two different OSes (Linux and FreeBSD),
must result in exactly the same sequence of bits (given an appropriate constant format conversion).
I think this is impossible to achieve safely using OpenGL and hardware acceleration, since I bet different graphics
cards (from different vendors) may implement different algorithms for rasterization.
(The OpenGL specs are clear about this: they propose an algorithm, but they also state that implementations may differ
under certain circumstances.)
Also, I don't really need hardware acceleration, since I will be rendering very simple graphics at very low speed.
Do you think I can achieve this just by disabling hardware acceleration? What happens in that case under Linux? Will I fall back to
the Mesa software rasterizer? And in that case, can I be sure it will always work, or am I missing something?

That you're reading back rendered pixels and strongly depend on their mathematical exactness/reproducibility sounds like a design flaw. What's the purpose of this action? If you, for example, need to extract some information from the image, why don't you try to extract this information from the abstract, vectorized information prior to rendering?
Anyhow, if you depend on external rendering code and there's no way to make your reading code more robust to small errors, you're signing up for lots of pain and maintenance work. Other people could break your code with every tiny patch, because that kind of pixel exactness to the bit-level is usually a non-issue when they're doing their unit tests etc. Let alone the infinite permutations of hard- and software layers that are possible, and all might have influence on the exact pixel bits.
If you only need those operations: lines (with different widths and colors) and quads (with/without texture), I recommend writing your own rendering/rasterizer code which operates on an 8-bit uint array representing the image pixels (R8G8B8). The operations you're proposing aren't too nasty, so if performance is unimportant, this might actually be the better way to go in the long run.
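For illustration, here is a minimal sketch of what the line-drawing part of such a rasterizer could look like. The Image struct, buffer layout and Bresenham routine are just one possible choice, not an existing API:

#include <cstdint>
#include <cstdlib>
#include <vector>

// Tightly packed R8G8B8 image, origin at the top-left corner.
struct Image {
    int width, height;
    std::vector<uint8_t> data;                    // width * height * 3 bytes
    Image(int w, int h) : width(w), height(h), data(w * h * 3, 0) {}

    void setPixel(int x, int y, uint8_t r, uint8_t g, uint8_t b) {
        if (x < 0 || y < 0 || x >= width || y >= height) return;
        uint8_t* p = &data[(y * width + x) * 3];
        p[0] = r; p[1] = g; p[2] = b;
    }
};

// Integer Bresenham line, 1 pixel wide.
void drawLine(Image& img, int x0, int y0, int x1, int y1,
              uint8_t r, uint8_t g, uint8_t b) {
    int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    for (;;) {
        img.setPixel(x0, y0, r, g, b);
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

Filled triangles and textured quads can then be built on top of setPixel with ordinary integer scanline loops, and since everything here is plain integer arithmetic the output should be bit-identical on every machine.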

Related

Render 1000+ shapes in opengl

How can I render a bunch of hand-drawn shapes in OpenGL 1.x? I know about instancing, but how is it possible in old OpenGL? Could I get examples of some sort? This is for a game; I'm expecting a thousand or so shapes, all of which will need to be updated every frame.
Assuming that (at least most of) the shapes remain unchanged from one frame to the next, so most of the update is just moving them around, you could at least consider building a display list for each shape, then rendering the display lists during an update.
The amount of good you'll get from this varies widely depending on the hardware (and possibly driver) in use though. Some hardware supports display lists directly, and gains a lot from it. With other hardware, you'll be hard put to find any difference at all.
The good points are that at worst this won't do any harm, and building/using display lists is pretty quick and easy. So, in the worst case you don't lose much, and in the best case you might gain quite a bit.
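As a rough sketch of the display-list approach (assuming a valid OpenGL 1.x context; the Shape struct is a placeholder for your own per-instance data):

#include <GL/gl.h>
#include <vector>

struct Shape { float x, y; };              // placeholder for your own shape data

// Record a shape's geometry once, at load time.
GLuint buildShapeList() {
    GLuint list = glGenLists(1);
    glNewList(list, GL_COMPILE);
    glBegin(GL_TRIANGLES);                 // whatever geometry the shape needs
        glVertex2f(0.0f, 0.0f);
        glVertex2f(1.0f, 0.0f);
        glVertex2f(0.5f, 1.0f);
    glEnd();
    glEndList();
    return list;
}

// Each frame: position every instance and replay the recorded list.
void drawShapes(GLuint list, const std::vector<Shape>& shapes) {
    for (const Shape& s : shapes) {
        glPushMatrix();
        glTranslatef(s.x, s.y, 0.0f);
        glCallList(list);
        glPopMatrix();
    }
}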

dxt4 texture compression

When performance is kept in mind, we (unwillingly) use texture compression. The artefacts introduced by the compression may be more or less acceptable. What are the different possibilities and workarounds that can be applied at the original image level to minimize the artefacts introduced by the compression algorithm? In my current situation, most of the artefacts are seen when using gradients.
DXT compression is all about interpolation, or gradients if you will. However, you must understand well what exactly it does. DXT compression is a compromise, it offers pretty bad quality at pretty bad compression, but it does offer some compression and is almost trivial to implement in hardware and runs at practically zero cost. That's why it is used.
There are a few means to improve quality, but if the quality issues are not acceptable, the only solution is to not use DXT. (Besides, DXT4, which you have in your question's title, is not very widely used; it is essentially DXT5 with premultiplied alpha.)
First of all, note that:
DXT encodes the colors of 4x4 blocks of texels by storing two 5:6:5 endpoint colors and interpolating along the line between them in RGB space. The interpolation is quantized to 2 bits, so you only have 4 values to choose from per texel (see the decoding sketch after this list).
The alpha channel in DXT4/5 stores two 8-bit alpha values and uses 3-bit interpolators.
DXT2/3 uses an explicit 4 bit per texel alpha channel (i.e. not interpolating between some chosen 8-bit values).
DXT1 can fake a 1-bit alpha channel, but this is a sorting/encoding trick and a different story.
It is impossible to get pure greys in DXT without converting to another color space (due to 5:6:5 storage of endpoints).
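To make the block layout concrete, here is a rough DXT1 color-block decoding sketch. The struct and function names are mine, the c0 <= c1 / 1-bit alpha mode is ignored for brevity, and the exact 5:6:5 expansion varies slightly between decoders:

#include <cstdint>

struct RGB { uint8_t r, g, b; };

// Expand a 5:6:5 endpoint to 8 bits per channel (simple bit replication).
static RGB expand565(uint16_t c) {
    uint8_t r = (c >> 11) & 31, g = (c >> 5) & 63, b = c & 31;
    return { uint8_t((r << 3) | (r >> 2)),
             uint8_t((g << 2) | (g >> 4)),
             uint8_t((b << 3) | (b >> 2)) };
}

// Decode one 8-byte DXT1 block into 16 texels (4x4, row-major).
void decodeDXT1Block(const uint8_t block[8], RGB out[16]) {
    uint16_t c0 = block[0] | (block[1] << 8);
    uint16_t c1 = block[2] | (block[3] << 8);
    RGB p[4] = { expand565(c0), expand565(c1) };
    // The other two palette entries lie on the line between the endpoints.
    p[2] = { uint8_t((2 * p[0].r + p[1].r) / 3),
             uint8_t((2 * p[0].g + p[1].g) / 3),
             uint8_t((2 * p[0].b + p[1].b) / 3) };
    p[3] = { uint8_t((p[0].r + 2 * p[1].r) / 3),
             uint8_t((p[0].g + 2 * p[1].g) / 3),
             uint8_t((p[0].b + 2 * p[1].b) / 3) };
    // 32 bits of indices: 2 bits per texel select one of the 4 palette colors.
    uint32_t idx = block[4] | (block[5] << 8) | (block[6] << 16) | (uint32_t(block[7]) << 24);
    for (int i = 0; i < 16; ++i)
        out[i] = p[(idx >> (2 * i)) & 3];
}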
All of this means that DXT can in principle (more or less) perfectly reproduce many horizontal, vertical, or diagonal 1D gradients that do not have too harsh changes, but it is entirely unable to reproduce most other patterns (though it can usually reproduce something close).
For example, if you have a 2D gradient or a rotated gradient, there is no way (except by sheer coincidence!) that there exists a pair of endpoint colors which will allow the entire 4x4 block to interpolate nicely. Also, since the interpolation is quantized to only 4 choices, the vast majority of "odd rotations" simply cannot be encoded, nor can many combinations of colors. However, for most "kind of natural" textures, this is acceptable.
The DXT compressor will usually make an attempt at finding a best possible fit within the 4x4 cell (though some compressors will/may do something else). This can lead to stepping in gradients even if the gradient inside the cell is represented well.
What you can do about DXT is:
Use different compressors and choose the best result. There are at least 3 different (non-brute force) strategies used by different compressors, giving quite different results.
Give the crunch library a try. There is a Windows GUI compressor for it around somewhere too. While crunch primarily aims to produce smaller (after zip compression) files, and it is ultimately bound by the technical limitations of the DXT block format, it uses an unusual search technique that takes much greater ranges into account. It might happen that it manages to give one or the other of your gradients a better look.
Avoid gradients that are 2D or that are not aligned to u/v or diagonal because you know that it is impossible to encode these.
Convert natural images to a non-RGB color space before compressing. YCoCg and YCbCr may be candidates, or the JPEG-LS transform. The human eye is not equally sensitive in every respect. Some errors are less obvious. Using a different color space exploits this (see the conversion sketch after this list).
Use the alpha channel for the most important channel if you don't need alpha, since you know that it has 8 bit endpoint resolution (instead of 5/6) and 3 bit interpolators instead of 2 bit. Otherwise, use green for the most important channel since it has one more bit.
For hand-drawn textures, you may try to use colors that hit 5:6:5 values exactly, which will at least give perfect hits for these. Though for "drawn" textures (something that looks like a comic strip), DXT is a very unwise choice in general.
For cases where the quality just doesn't cut it, don't use DXT (yes, this sounds like a stupid advice, but it is really that... if it doesn't fit, don't use it).
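Regarding the color-space suggestion above, a minimal sketch of the plain (lossy) RGB-to-YCoCg transform might look like this; you would typically put Y in the channel with the most precision and undo the transform in your shader:

#include <algorithm>
#include <cstdint>

// RGB -> YCoCg for one 8-bit pixel. Co and Cg are biased by 128 so the result
// still fits an unsigned texture. This is the plain (lossy) form of the
// transform, not the exactly reversible YCoCg-R variant.
void rgbToYCoCg(uint8_t r, uint8_t g, uint8_t b,
                uint8_t& y, uint8_t& co, uint8_t& cg) {
    float R = r, G = g, B = b;
    float Y  =  0.25f * R + 0.5f * G + 0.25f * B;
    float Co =  0.5f  * R             - 0.5f  * B + 128.0f;
    float Cg = -0.25f * R + 0.5f * G - 0.25f * B + 128.0f;
    y  = (uint8_t)std::clamp(Y,  0.0f, 255.0f);
    co = (uint8_t)std::clamp(Co, 0.0f, 255.0f);
    cg = (uint8_t)std::clamp(Cg, 0.0f, 255.0f);
}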

OpenGL deterministic rendering between GPU vendor

I'm currently programming a scientific imaging application using OpenGL.
I would like to know if OpenGL rendering (in terms of the pixels retrieved from an FBO) is supposed to be fully deterministic when my code (C++ / OpenGL and simple GLSL) is executed on different hardware (ATI vs NVIDIA, various NVIDIA generations, and various OSes)?
More precisely, I'd need the exact same pixel buffer every time I run my code on any hardware (that can run basic GLSL and OpenGL 3.0)...
Is that possible? Is there some advice I should consider?
If it's not possible, is there a specific brand of video card (perhaps Quadro?) that could do it while varying the host OS?
From the OpenGL spec (version 2.1 appendix A):
The OpenGL specification is not pixel exact. It therefore does not guarantee an exact match between images produced by different GL implementations. However, the specification does specify exact matches, in some cases, for images produced by the same implementation.
If you disable all anti-aliasing and texturing, you stand a good chance of getting consistent results across platforms. However, if you need antialiasing or texturing or a 100% pixel-perfect guarantee, use software rendering only: http://www.mesa3d.org/
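As a rough sketch, the kind of state you would want to lock down looks like the following (all standard OpenGL calls); it improves your odds, but the spec still does not guarantee bit-exact output:

// Turn off every feature whose results are implementation-defined
// or sample-pattern dependent.
glDisable(GL_MULTISAMPLE);
glDisable(GL_DITHER);          // dithering is enabled by default
glDisable(GL_LINE_SMOOTH);
glDisable(GL_POLYGON_SMOOTH);
glDisable(GL_BLEND);
glDisable(GL_TEXTURE_2D);      // no texture filtering differences
glShadeModel(GL_FLAT);         // avoid interpolation differences (legacy GL)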
By "Deterministic", I'm going to assume you mean what you said (rather than what the word actually means): that you can get pixel identical results cross-platform.
No. Not a chance.
You can change the pixel results you get from rendering just by playing with settings in your graphics driver's control application. Driver revisions for the same hardware can change what you get.
The OpenGL specification has never required pixel-perfect results. Antialiasing and texture filtering especially are nebulous parts.
If you read through the OpenGL specification, there are a number of deterministic conditions that must be met in order for the implementation to comply with the standard, but there are also a significant number of implementation details that are left entirely up to the hardware vendor / driver developer. Unless you render with incredibly basic techniques that fall under the deterministic / invariant categories (which I believe will keep you from using filtered texturing, antialiasing, lighting, shaders, etc), the standard allows for pretty significant differences between different hardware and even different drivers on the same hardware.

Modifying an image with OpenGL?

I have a device to acquire X-ray images. Due to some technical constraints, the detector is made of heterogeneous pixel sizes and multiple tilted and partially overlapping tiles. The image is thus distorted. The detector geometry is known precisely.
I need a function converting these distorted images into a flat image with homogeneous pixel size. I have already done this on the CPU, but I would like to give it a try with OpenGL to use the GPU in a portable way.
I have no experience with OpenGL programming, and most of the information I could find on the web was useless for this use case. How should I proceed? How do I do this?
Image size is 560x860 pixels and we have batches of 720 images to process. I'm on Ubuntu.
OpenGL is for rendering polygons. You might be able to do multiple passes and use shaders to get what you want, but you are better off re-writing the algorithm in OpenCL. The bonus then is that you have something portable that will even use multi-core CPUs if no graphics accelerator card is available.
Rather than OpenGL, this sounds like a CUDA, or more generally GPGPU problem.
If you have C or C++ code to do it already, CUDA should be little more than figuring out the types you want to use on the GPU and how the algorithm can be tiled.
If you want to do this with OpenGL, you'd normally do it by supplying the current data as a texture, writing a fragment shader that processes that data, and setting it up to render to a texture. Once the output texture is fully rendered, you can retrieve it back to the CPU and write it out as a file.
I'm afraid it's hard to do much more than a very general sketch of the overall flow without knowing more about what you're doing -- but if (as you said) you've already done this on the CPU, you apparently already have a pretty fair idea of most of the details.
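To make that flow a bit more concrete, here is a hedged sketch: the fragment shader below is hypothetical (the undistort() helper stands in for whatever mapping your detector geometry defines), and the host-side steps are only outlined in comments:

// Fragment shader run once per output pixel (GLSL 1.20, fixed-function vertex stage).
const char* fragmentSrc = R"(
    #version 120
    uniform sampler2D rawImage;        // the distorted input, uploaded as a texture
    vec2 undistort(vec2 uv) {
        // evaluate your known detector geometry here (placeholder)
        return uv;
    }
    void main() {
        vec2 uv = undistort(gl_TexCoord[0].st);
        gl_FragColor = texture2D(rawImage, uv);
    }
)";

// Outline of the surrounding host-side steps (details omitted):
//  1. glGenFramebuffers / glFramebufferTexture2D -> an FBO holding the output texture
//  2. upload each 560x860 raw frame with glTexImage2D
//  3. draw one full-screen quad (glTexCoord2f / glVertex2f) so the shader
//     runs once per output pixel
//  4. glReadPixels (or glGetTexImage) to pull the corrected image back to the CPU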
At heart what you are asking here is "how can I use a GPU to solve this problem?"
Modern GPUs are essentially linear algebra engines, so your first step would be to define your problem as a matrix that transforms an input coordinate < x, y > to its output position in homogeneous space.
For example, a transform that scales x by ½, scales y by 1.2, and translates up and left by two units (taking y as increasing upward, so "up" is +y and "left" is -x) would be:

    | x' |   | 0.5  0.0  -2.0 |   | x |
    | y' | = | 0.0  1.2   2.0 | * | y |
    | 1  |   | 0.0  0.0   1.0 |   | 1 |

and you can work out analogous transforms for rotation, shear, etc. as well.
Once you've got your transform represented as a matrix-vector multiplication, all you need to do is load your source data into a texture, specify your transform as the projection matrix, and render it to the result. The GPU performs the multiplication per pixel. (You can also write shaders, etc, that do more complicated math, factor in multiple vectors and matrices and what-not, but this is the basic idea.)
That said, once you have got your problem expressed as a linear transform, you can make it run a lot faster on the CPU too by leveraging eg SIMD or one of the many linear algebra libraries out there. Unless you need real-time performance or have a truly immense amount of data to process, using CUDA/GL/shaders etc may be more trouble than it's strictly worth, as there's a bit of clumsy machinery involved in initializing the libraries, setting up render targets, learning the details of graphics development, etc.
Simply converting your inner loop from ad-hoc math to a well-optimized linear algebra subroutine may give you enough of a performance boost on the CPU that you're done right there.
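As an illustration of the CPU route, a minimal (unoptimized) inverse-mapping resampler driven by a single 3x3 matrix could look like this; the Image type and the nearest-neighbour fetch are placeholders, not an existing library:

#include <cstdint>
#include <vector>

// Flat, row-major grayscale buffer; out-of-range reads return 0.
struct Image {
    int w, h;
    std::vector<uint8_t> px;               // w * h values
    uint8_t at(int x, int y) const {
        if (x < 0 || y < 0 || x >= w || y >= h) return 0;
        return px[y * w + x];
    }
};

// M maps *output* pixel coordinates back to *input* coordinates (homogeneous 3x3).
Image warp(const Image& src, const float M[9], int outW, int outH) {
    Image dst{outW, outH, std::vector<uint8_t>(size_t(outW) * outH, 0)};
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            float u = M[0] * x + M[1] * y + M[2];
            float v = M[3] * x + M[4] * y + M[5];
            float w = M[6] * x + M[7] * y + M[8];
            // nearest-neighbour fetch; swap in bilinear weights if you need them
            dst.px[y * outW + x] = src.at(int(u / w + 0.5f), int(v / w + 0.5f));
        }
    }
    return dst;
}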
You might find this tutorial useful (it's a bit old, but note that it does contain some OpenGL 2.x GLSL after the Cg section). I don't believe there are any shortcuts to image processing in GLSL, if that's what you're looking for... you do need to understand a lot of the 3D rasterization aspect and historical baggage to use it effectively, although once you do have a framework for inputs and outputs set up you can forget about that and play around with your own algorithms in shader code relatively easily.
Having been doing this sort of thing for years (initially using Direct3D shaders, but more recently with CUDA), I have to say that I entirely agree with the posts here recommending CUDA/OpenCL. It makes life much simpler, and generally runs faster. I'd have to be pretty desperate to go back to a graphics API implementation of non-graphics algorithms now.

Does it make sense to use own mipmap creation algorithm for OpenGL textures?

I was wondering if the quality of texture mipmaps would be better if I used my own algorithm for pre-generating them, instead of the built-in automatic one. I'd probably use a slow but pretty algorithm, like Lanczos resampling.
Does it make sense? Will I get any quality gain on modern graphics cards?
There are good reasons to generate your own mipmaps. However, the quality of the downsampling is not one of them.
Game and graphics programmers have experimented with all kinds of downsampling algorithms in the past. In the end it turned out that the very simple "average four pixels" method gives the best results. Although more advanced methods are in theory mathematically more correct, they tend to take a lot of sharpness out of the mipmaps. This gives a flat look (try it!).
For some reason (which I don't fully understand), the simple average method seems to have the best tradeoff between antialiasing and keeping the mipmaps sharp.
However, you may want to calculate your mipmaps with gamma correction. OpenGL does not do this on its own. This can make a real visual difference, especially for darker textures.
Doing so is simple. Instead of averaging four values together like this:
float average (float a, float b, float c, float d)
{
    return (a + b + c + d) / 4.0f;
}
Do this:
float GammaCorrectedAverage (float a, float b, float c, float d)
{
    // Assume a gamma of 2.0. In this case we can just square the
    // components before averaging and take the square root afterwards.
    return sqrtf ((a*a + b*b + c*c + d*d) / 4.0f);
}
This code assumes your color components are normalized to be in the range of 0 to 1.
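For context, a downsampling pass using that function might look like the following sketch; the buffer names and row-major layout are placeholders for whatever your own texture code uses:

// One mip level from the previous one, one float channel normalized to [0,1].
void downsampleGammaCorrect(const float* src, int srcWidth,
                            float* dst, int dstWidth, int dstHeight) {
    for (int y = 0; y < dstHeight; ++y) {
        for (int x = 0; x < dstWidth; ++x) {
            dst[y * dstWidth + x] = GammaCorrectedAverage(
                src[(2 * y)     * srcWidth + 2 * x],
                src[(2 * y)     * srcWidth + 2 * x + 1],
                src[(2 * y + 1) * srcWidth + 2 * x],
                src[(2 * y + 1) * srcWidth + 2 * x + 1]);
        }
    }
}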
What is motivating you to try? Are the mipmaps you currently have poorly generated? (i.e. have you looked?) Bear in mind your results will often still be (tri)linearly interpolated anyway, so between that and motion there are often steeply diminishing returns from improved resampling.
It depends on the kind of assets you display. The Lanczos filter gets closer to an ideal low-pass filter, and the results are noticeable if you compare the mipmaps side by side. Most people will mistake aliasing for sharpness - again, it depends on whether your assets tend to contain high frequencies - I've definitely seen cases where a box filter was not a good option. But since the mipmap is then linearly interpolated anyway, the gain might not be that noticeable. There is another thing to mention - most people use a box filter and pass the output as the input into the next stage - this way you lose both precision and visual energy (although gamma correction will help with the latter). If you can come up with code that uses an arbitrary filter (mind you that most of them are separable into two passes), you would typically scale the filter kernel itself and produce each mipmap level from the base texture, which is a good thing.
As an addition to this question, I have found that some completely different mipmapping algorithms (rather than those simply trying to achieve the best down-scaling quality, like Lanczos filtering) have good effects on certain textures.
For instance, on some textures that are supposed to represent high-frequency information, I have tried using an algorithm that simply takes one random pixel of the four that are being considered for each iteration. The results depend much on the texture and what it is supposed to convey, but I have found that it gives great effect on some; not least for ground textures.
Another one I've tried is taking the most deviating of the four pixels to preserve contrasts. It has even fewer uses, but they do exist.
As such, I've implemented the option to choose mipmapping algorithm per texture.
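For reference, a sketch of the randomized variant for a single 8-bit channel (the buffer layout and names are placeholders, and this is just one way to pick the texel):

#include <cstdint>
#include <cstdlib>

// "Randomized" downsample: keep one randomly chosen texel of each 2x2 block
// instead of averaging, which preserves apparent high-frequency detail.
void downsampleRandom(const uint8_t* src, int srcWidth,
                      uint8_t* dst, int dstWidth, int dstHeight) {
    for (int y = 0; y < dstHeight; ++y) {
        for (int x = 0; x < dstWidth; ++x) {
            int ox = 2 * x + (std::rand() & 1);   // pick a column of the block
            int oy = 2 * y + (std::rand() & 1);   // pick a row of the block
            dst[y * dstWidth + x] = src[oy * srcWidth + ox];
        }
    }
}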
EDIT: I thought I might provide some examples of the differences in practice. Here's a piece of grass texture on the ground, the leftmost picture being with standard average mipmapping, and the rightmost being with randomized mipmapping:
I hope the viewer can appreciate how much "apparent detail" is lost in the averaged mipmap, and how much flatter it looks for this kind of texture.
Also for reference, here are the same samples with 4× anisotropic filtering turned on (the above being tri-linear):
Anisotropic filtering makes the difference less pronounced, but it's still there.