When performance is a concern, we (unwillingly) use texture compression. The artefacts introduced by the compression may be more or less acceptable. What are the different possibilities and workarounds that can be applied at the original-image level to minimize the artefacts introduced by the compression algorithm? In my current situation, most of the artefacts appear when using gradients.
DXT compression is all about interpolation, or gradients if you will. However, you must understand well what exactly it does. DXT compression is a compromise: it offers pretty bad quality at a pretty bad compression ratio, but it does offer some compression, it is almost trivial to implement in hardware, and it runs at practically zero cost. That's why it is used.
There are a few ways to improve quality, but if the quality issues are not acceptable, the only solution is not to use DXT. (Besides, DXT4, which you have in your question's title, is not very widely used; it is DXT5 with premultiplied alpha.)
First of all, note that:
DXT encodes the colors of 4x4 blocks of texels by storing two 5:6:5 colors and interpolating along the line between them in RGB space (see the sketch after this list). The interpolation is quantized to 2 bits, so you only have 4 values to choose from per pixel.
The alpha channel in DXT4/5 stores two 8-bit alpha values and uses 3-bit interpolators.
DXT2/3 uses an explicit 4 bit per texel alpha channel (i.e. not interpolating between some chosen 8-bit values).
DXT1 can fake a 1-bit alpha channel, but this is a sorting/encoding trick and a different story.
It is impossible to get pure greys in DXT without converting to another color space (due to 5:6:5 storage of endpoints).
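To make that "line in RGB space" concrete, here is a minimal C++ sketch (function names are mine, not from any DXT library) of how a decoder derives the four selectable colors of a DXT1-style color block from its two 5:6:5 endpoints:

#include <array>
#include <cstdint>

struct RGB { float r, g, b; };

// Expand a packed 5:6:5 color to floats in [0, 1].
static RGB unpack565(uint16_t c)
{
    return { ((c >> 11) & 31) / 31.0f,
             ((c >>  5) & 63) / 63.0f,
             ( c        & 31) / 31.0f };
}

// The four colors a DXT1 block can address: the two stored endpoints
// plus two interpolated points at 1/3 and 2/3 along the line between
// them. (The "c0 <= c1" punch-through mode with 1-bit alpha is omitted.)
std::array<RGB, 4> dxt1Palette(uint16_t c0, uint16_t c1)
{
    RGB a = unpack565(c0), b = unpack565(c1);
    auto lerp = [](RGB x, RGB y, float t) {
        return RGB{ x.r + (y.r - x.r) * t,
                    x.g + (y.g - x.g) * t,
                    x.b + (y.b - x.b) * t };
    };
    return { a, b, lerp(a, b, 1.0f / 3.0f), lerp(a, b, 2.0f / 3.0f) };
}

Every texel in the 4x4 block must pick one of these four colors with its 2-bit index, which is where the limitations below come from.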
This means that DXT can, in principle, (more or less) perfectly reproduce many horizontal, vertical, or diagonal 1D gradients that do not change too abruptly, but it is entirely unable to reproduce most other patterns (though it can usually produce something close).
For example, if you have a 2D gradient or a rotated gradient, there is no way (except by sheer coincidence!) that there exists a pair of two colors which will allow the entire 4x4 block to interpolate nicely. Also, since the interpolation is quantized to only 4 choices, the vast majority of "odd rotations" simply cannot be encoded, nor can many combinations of colors. However, for most "kind of natural" textures, this is acceptable.
The DXT compressor will usually attempt to find the best possible fit within each 4x4 cell (though some compressors may do something else). This can lead to visible stepping between cells even when the gradient inside each cell is represented well.
What you can do about DXT is:
Use different compressors and choose the best result. There are at least 3 different (non-brute force) strategies used by different compressors, giving quite different results.
Give the crunch library a try. There is a Windows GUI compressor for it around somewhere, too. While crunch primarily aims to produce files that are smaller after zip compression, and it is ultimately bound by the technical limitations of the DXT block format, it uses an unusual search technique that takes much larger ranges into account. It may happen that it gives one or another of your gradients a better look.
Avoid gradients that are 2D or that are not aligned to u/v or the diagonals, because you know it is impossible to encode these.
Convert natural images to a non-RGB color space before compressing. YCoCg and YCbCr are candidates, as is the JPEG-LS transform. The human eye is not equally sensitive in every respect, and some errors are less obvious; using a different color space exploits this (see the sketch after this list).
Use the alpha channel for the most important channel if you don't need alpha, since you know that it has 8 bit endpoint resolution (instead of 5/6) and 3 bit interpolators instead of 2 bit. Otherwise, use green for the most important channel since it has one more bit.
For hand-drawn textures, you may try to use 5:6:5 matched colors, which will at least give perfect hits for these. Though for "drawn" textures (something that looks like a comic strip) DXT is a very unwise choice in general.
For cases where the quality just doesn't cut it, don't use DXT (yes, this sounds like stupid advice, but it really is that simple: if it doesn't fit, don't use it).
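For the color-space point above, here is a sketch of one plausible choice, the lossless YCoCg-R lifting transform (YCbCr or the JPEG-LS transform would be wired up similarly; arithmetic right shifts on negative values are assumed):

#include <cstdint>

struct YCoCg { int y, co, cg; };

// Forward RGB -> YCoCg-R; exactly invertible in integers.
YCoCg rgbToYCoCg(int r, int g, int b)
{
    int co = r - b;
    int t  = b + (co >> 1);
    int cg = g - t;
    int y  = t + (cg >> 1);
    return { y, co, cg };
}

void ycocgToRgb(YCoCg c, int& r, int& g, int& b)
{
    int t = c.y - (c.cg >> 1);
    g = c.cg + t;
    b = t - (c.co >> 1);
    r = b + c.co;
}

A common arrangement is to store Y in the alpha channel (8-bit endpoints, 3-bit interpolators) and Co/Cg in the color channels, so most of the compression error lands where the eye is least sensitive.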
Related
I need to implement GIF output from BMP images to animate an Abelian sandpile model, using only the C++ standard library.
Ideally, your starting points would be specifications for GIF and BMP.
The GIF specification is a pretty easy thing to find.
Unfortunately, at least to the best of my knowledge, Microsoft has never brought all the information about the BMP format together into a single document to act as a specification. There's a lot of documentation in various places, but no one place that has all of it together and completely organized.
That means you're kind of stuck with a piecemeal approach. Fortunately, you probably don't need to read every possible legitimate BMP file--it's been around a long time, so there are lots of variations, many of which are rarely used any more (e.g., 16-color bitmaps).
At a guess, you probably only need to deal with one or two specific variants (e.g., 24 or 32-bits per pixel), which makes life a great deal easier. Here's a page that gives at least a starting point for documentation on how BMP files are formatted.
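To illustrate what that documentation boils down to for the simple cases, here is a sketch (not production code; the byte offsets follow the BITMAPFILEHEADER/BITMAPINFOHEADER layout, and the helper names are mine) of pulling the basics out of a BMP header:

#include <cstdint>
#include <cstdio>

static uint32_t u32(const uint8_t* p) { return p[0] | p[1] << 8 | p[2] << 16 | (uint32_t)p[3] << 24; }
static uint16_t u16(const uint8_t* p) { return p[0] | p[1] << 8; }

// Read the dimensions and bit depth of a simple BMP; many legitimate
// variants (OS/2 headers, compressed bitmaps, ...) are deliberately ignored.
bool readBmpHeader(const char* path, int& width, int& height, int& bpp, uint32_t& pixelOffset)
{
    uint8_t h[54];
    std::FILE* f = std::fopen(path, "rb");
    if (!f || std::fread(h, 1, 54, f) != 54) { if (f) std::fclose(f); return false; }
    std::fclose(f);
    if (h[0] != 'B' || h[1] != 'M') return false; // magic number
    pixelOffset = u32(h + 10);  // where the pixel data starts
    width  = (int)u32(h + 18);
    height = (int)u32(h + 22);  // negative height means top-down rows
    bpp    = u16(h + 28);       // e.g. 24 or 32
    return true;
}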
You'll probably need to consider at least a few ancillary problems, though. Unless your input BMP files use 8 bits per pixel with a palette defining the color for each of those 256 values, you're going to face at least one other problem: you'll most likely be starting with a file that has lots of colors (e.g., as noted above, 24 or 32 bits per pixel), but a GIF file is limited to 8 bits per pixel. You'll therefore need to choose up to 256 palette colors that best represent the pictures you care about, then map each input pixel to whichever of those colors represents it best.
Depending on how much you care about color fidelity vs. spatial resolution, there are multitudes of ways of doing this job, varying from fairly simple (but with results that may be rather mediocre) to extremely complex (with results that may be somewhat better, but will probably still be fairly mediocre).
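At the fairly-simple-but-mediocre end of that spectrum, here is a sketch (names are illustrative) of a fixed 6x6x6 color-cube palette with nearest-color mapping; with no dithering, gradients will band visibly:

#include <array>
#include <cstdint>

struct Color { uint8_t r, g, b; };

// 216 evenly spaced colors, fitting comfortably in GIF's 256-entry table.
std::array<Color, 216> makeCubePalette()
{
    std::array<Color, 216> pal{};
    int i = 0;
    for (int r = 0; r < 6; ++r)
        for (int g = 0; g < 6; ++g)
            for (int b = 0; b < 6; ++b)
                pal[i++] = { uint8_t(r * 51), uint8_t(g * 51), uint8_t(b * 51) };
    return pal;
}

// Map a 24-bit pixel to its palette index by rounding each channel
// to the nearest of the six levels (0, 51, ..., 255).
uint8_t quantize(Color c)
{
    int r = (c.r + 25) / 51, g = (c.g + 25) / 51, b = (c.b + 25) / 51;
    return uint8_t(r * 36 + g * 6 + b);
}

Median-cut or octree palettes plus error-diffusion dithering are the usual next steps up in quality (and complexity).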
For an application I'm developing, I need to be able to:
draw lines of different widths and colours
draw solid color filled triangles
draw textured (no alpha) quads
Very easy...but...
All coordinates are integers in pixel space and, very important: reading back all the pixels from the framebuffer (glReadPixels) on two different machines, with two different graphics cards, running two different OSes (Linux and FreeBSD), must result in exactly the same sequence of bits (given an appropriate constant format conversion).
I think it is impossible to achieve this safely using OpenGL and hardware acceleration, since I bet different graphics cards (from different vendors) may implement different algorithms for rasterization. (The OpenGL specs are clear about this: they propose an algorithm, but they also state that implementations may differ under certain circumstances.)
Also, I don't really need hardware acceleration, since I will be rendering very simple graphics at very low speed.
Do you think I can achieve this by just disabling hardware acceleration? What happens in that case under Linux: will I fall back to the Mesa software rasterizer? And in that case, can I be sure it will always work, or am I missing something?
That you're reading back rendered pixels and strongly depending on their mathematical exactness/reproducibility sounds like a design flaw. What's the purpose of this action? If you, for example, need to extract some information from the image, why don't you try to extract this information from the abstract, vectorized information prior to rendering?
Anyhow, if you depend on external rendering code and there's no way to make your reading code more robust to small errors, you're signing up for lots of pain and maintenance work. Other people could break your code with every tiny patch, because that kind of pixel exactness to the bit-level is usually a non-issue when they're doing their unit tests etc. Let alone the infinite permutations of hard- and software layers that are possible, and all might have influence on the exact pixel bits.
If you only need those operations: lines (with different widths and colors) and quads (with/without texture), I recommend writing your own rendering/rasterizer code which operates on an 8-bit uint array representing the image pixels (R8G8B8). The operations you're proposing aren't too nasty, so if performance is unimportant, this might actually be the better way to go in the long run.
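As a sketch of what the core of such a rasterizer could look like (type and function names are mine), here is an integer-only Bresenham line into an interleaved R8G8B8 buffer; being pure integer arithmetic, it produces bit-identical output on any machine:

#include <cstdint>
#include <cstdlib>
#include <vector>

struct Canvas {
    int w, h;
    std::vector<uint8_t> rgb;  // w * h * 3 bytes, row-major
    Canvas(int w, int h) : w(w), h(h), rgb(size_t(w) * h * 3, 0) {}
    void set(int x, int y, uint8_t r, uint8_t g, uint8_t b) {
        if (x < 0 || y < 0 || x >= w || y >= h) return;
        size_t i = (size_t(y) * w + x) * 3;
        rgb[i] = r; rgb[i + 1] = g; rgb[i + 2] = b;
    }
};

// Classic Bresenham: one pixel per step, no floating point anywhere.
void drawLine(Canvas& c, int x0, int y0, int x1, int y1,
              uint8_t r, uint8_t g, uint8_t b)
{
    int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    for (;;) {
        c.set(x0, y0, r, g, b);
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
}

Wide lines, filled triangles, and textured quads follow the same pattern: integer scanline loops writing into the byte array.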
I thought about:
1) Implement everything for the b/w images, then make wrappers for the methods that check if it's a color image. If it is, split the channels, make the operations on each individually and then merge them.
2) Use functors to correctly update the values depending on what I'm dealing with. Problem is that the compiler errors would be really complicated and I'm not used to it, and I think I may end up needing quite a few of them. Not sure if this is a good idea tbh.
There might also be a suitable design pattern here that I'm not seeing. There could be a way to do this that's channel/color agnostic in OpenCV, too, though I haven't found it yet, and so far the book I'm reading (OpenCV 2 Computer Vision Application Programming Cookbook) hasn't shown me such a possibility.
If speed is important, don't.
It sounds like you're trying to encapsulate or abstract away the type of pixel using OO techniques or the like. This could add an extra level of indirection for every pixel access, killing your performance.
Calling straight to a function rather than through a pointer (e.g., a delegate, an overridden method, a functor) can still be faster for the CPU, but if you're doing function calls per pixel at all, reconsider: they're still extra work. If you can nest everything in the outer FOR loop, it will look ugly and functional-programming snobs will sneer at you, but remember, this isn't a big LOB app that will get hard to maintain. That's why engineers can still perfectly maintain 30-year-old QuickBasic code: the problem space doesn't need anything smarter (though usually their problems themselves need something a lot smarter than I!).
It's best to implement simple things (e.g., a threshold op or resizing) optimized for each kind of image if you want speed. You can also research transformation matrices and see if you can accomplish your work that way. That way you can write only two transformer algorithms (one per image type) and, using a similar (or the same) matrix, do the same thing for both types of pictures.
Hence you accomplish a major goal of abstraction anyway: seamless reuse and separation of concerns. And speed to boot (but hopefully not reboot!). Good luck.
Splitting the channels could work well with algorithms that work with the channels independently; not all of them do, so this will be quite limiting. You'll also spend a bit of time and space making all those copies.
By functors I presume you mean making templates out of your algorithm functions, with a pixel type as the template parameter. That could work also, but it means defining your basic pixel operations in a way that they could be implemented as functions or operators on a generic pixel type. This is harder than it looks and should be done after you've had some experience in implementing the algorithms.
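For illustration, a minimal sketch of that template approach (all names are mine): the algorithm is written once against a generic pixel type, and a small overload set supplies the per-type operations:

#include <cstdint>
#include <vector>

struct Rgb { uint8_t r, g, b; };

// Per-type "basic pixel operation": how to get a gray value.
inline uint8_t luminance(uint8_t p) { return p; }
inline uint8_t luminance(Rgb p)     { return uint8_t((p.r * 77 + p.g * 150 + p.b * 29) >> 8); }

// One algorithm, both image types.
// e.g. threshold<uint8_t>(gray, 128, 0, 255);
//      threshold<Rgb>(color, 128, Rgb{0, 0, 0}, Rgb{255, 255, 255});
template <typename Pixel>
void threshold(std::vector<Pixel>& img, uint8_t cut, Pixel lo, Pixel hi)
{
    for (Pixel& p : img)
        p = luminance(p) < cut ? lo : hi;
}

The hard part, as noted, is deciding on that set of basic operations before you know which ones your algorithms actually need.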
A third option not mentioned is to promote the b/w images to full color, process them, and convert back to b/w. This optimizes the full color processing at the expense of the b/w.
For most algorithms it is not necessary to worry about monochrome vs. colour images. You either use the grey value of the monochrome image or you calculate the luminance/intensity/whatever of the colour and use that. You choose the measure luminance etc. by looking at which colour space will give you the result you want.
When you have calculated how you are going to modify your images, you use some pixel-aware processing; e.g. blending two pixels might be pixel_a*0.5 + pixel_b*0.5, and your pixel class will sort out how to apply that to the different colour channels, i.e. Pixel::operator+(const Pixel &), Pixel::operator*(float), and so on.
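A minimal sketch of such a pixel class (illustrative; this is not OpenCV's API):

#include <cstdint>

struct Pixel {
    float r, g, b;
    Pixel operator+(const Pixel& o) const { return { r + o.r, g + o.g, b + o.b }; }
    Pixel operator*(float s)        const { return { r * s, g * s, b * s }; }
};

// Blending then reads exactly like the formula in the text:
Pixel blend(const Pixel& a, const Pixel& b)
{
    return a * 0.5f + b * 0.5f;
}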
There are algorithms that are applied individually to each colour channel but they are not as common and often there is some correlation between the spatiotemporal changes in the colours so you wouldn't do something as basic as process each channel totally independently of each other.
My own Image class uses a planar structure (that is, color channels are separate) instead of an interleaved structure. However this is VERY limiting when it comes to image quantization and other joint color processing tasks.
I am planning to rewrite it to use the other approach: simply a two-dimensional array of pixels. At the moment I am not sure how exactly I will implement it (a template pixel class, a Pixel base class, or a simple three-dimensional array).
I also plan to write a planar wrapper for this interleaved image structure to ease any disadvantage I might encounter. One thing is sure: this wrapper will be much more efficient than a pixel wrapper would be for planar images.
Frankly, I believe splitting planes is rather inefficient, since you calculate various overheads several times. For example, if you want to resize an image, calculating the various filter coefficients is very expensive, and it would be MUCH better to calculate them just once and apply Pixel::operator* and + (sketched below) instead of doing the same with the underlying subpixel components.
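A sketch of that idea (using the same kind of Pixel struct as above; names are illustrative): in a 1-D linear resample, the weights are computed once per output pixel and applied to whole pixels, not once per channel:

#include <vector>

struct Pixel {
    float r, g, b;
    Pixel operator+(const Pixel& o) const { return { r + o.r, g + o.g, b + o.b }; }
    Pixel operator*(float s)        const { return { r * s, g * s, b * s }; }
};

std::vector<Pixel> resampleRow(const std::vector<Pixel>& src, int dstWidth)
{
    std::vector<Pixel> dst;
    dst.reserve(dstWidth);
    float scale = float(src.size()) / dstWidth;
    for (int x = 0; x < dstWidth; ++x) {
        float pos = (x + 0.5f) * scale - 0.5f;             // source position
        int   i0  = pos < 0 ? 0 : int(pos);
        int   i1  = i0 + 1 < int(src.size()) ? i0 + 1 : i0;
        float w   = pos < 0 ? 0.0f : pos - i0;             // weight: computed once,
        dst.push_back(src[i0] * (1.0f - w) + src[i1] * w); // reused for all channels
    }
    return dst;
}

A real Lanczos kernel has more taps, but the structure (and the saving) is the same.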
I have a device that acquires X-ray images. Due to some technical constraints, the detector is made of pixels of heterogeneous size, laid out in multiple tilted and partially overlapping tiles. The image is thus distorted. The detector geometry is known precisely.
I need a function that converts these distorted images into a flat image with homogeneous pixel size. I have already done this on the CPU, but I would like to give OpenGL a try in order to use the GPU in a portable way.
I have no experience with OpenGL programming, and most of the information I could find on the web was useless for this use case. How should I proceed? How do I do this?
Images are 560x860 pixels and we have batches of 720 images to process. I'm on Ubuntu.
OpenGL is for rendering polygons. You might be able to do multiple passes and use shaders to get what you want, but you are better off rewriting the algorithm in OpenCL. The bonus then is that you have something portable that will even use multi-core CPUs if no graphics accelerator card is available.
Rather than OpenGL, this sounds like a CUDA, or more generally GPGPU problem.
If you have C or C++ code to do it already, CUDA should be little more than figuring out the types you want to use on the GPU and how the algorithm can be tiled.
If you want to do this with OpenGL, you'd normally do it by supplying the current data as a texture, writing a fragment shader that processes that data, and setting it up to render to a texture. Once the output texture is fully rendered, you can retrieve it back to the CPU and write it out as a file.
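To give a flavour of that setup, here is a hedged sketch; it assumes an already-created GL context, uploaded textures, and a linked program (the uniform names are mine). The fragment shader does the undistortion by reading, for each output pixel, a precomputed source coordinate from a lookup texture built from the known detector geometry:

// GLSL 1.20 fragment shader, embedded as a C++ raw string literal.
const char* kFragmentShader = R"(
    #version 120
    uniform sampler2D source;  // distorted detector image
    uniform sampler2D remap;   // RG channels = source u,v for this output pixel
    void main()
    {
        vec2 uv = texture2D(remap, gl_TexCoord[0].st).rg;
        gl_FragColor = texture2D(source, uv);
    }
)";

// Per-image flow, using standard GL entry points:
//   1. glTexImage2D(...)                upload the 560x860 source image
//   2. glBindFramebuffer(...) and
//      glFramebufferTexture2D(...)      render into an output texture
//   3. draw one full-screen quad        the shader runs once per output pixel
//   4. glReadPixels(...)                copy the flattened image back to the CPU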
I'm afraid it's hard to do much more than a very general sketch of the overall flow without knowing more about what you're doing -- but if (as you said) you've already done this on the CPU, you apparently already have a pretty fair idea of most of the details.
At heart what you are asking here is "how can I use a GPU to solve this problem?"
Modern GPUs are essentially linear algebra engines, so your first step would be to define your problem as a matrix that transforms an input coordinate <x, y> to its output in homogeneous space.
For example, assuming screen coordinates with y pointing down, you would represent a transformation of scaling x by ½, scaling y by 1.2, and translating up and left by two units as:

[ 0.5  0.0  -2.0 ]
[ 0.0  1.2  -2.0 ]
[ 0.0  0.0   1.0 ]
and you can work out analogous transforms for rotation, shear, etc., as well.
Once you've got your transform represented as a matrix-vector multiplication, all you need to do is load your source data into a texture, specify your transform as the projection matrix, and render it to the result. The GPU performs the multiplication per pixel. (You can also write shaders, etc., that do more complicated math, factor in multiple vectors and matrices and what-not, but this is the basic idea.)
That said, once you have your problem expressed as a linear transform, you can make it run a lot faster on the CPU too by leveraging e.g. SIMD or one of the many linear algebra libraries out there. Unless you need real-time performance or have a truly immense amount of data to process, using CUDA/GL/shaders etc. may be more trouble than it's strictly worth, as there's a bit of clumsy machinery involved in initializing the libraries, setting up render targets, learning the details of graphics development, and so on.
Simply converting your inner loop from ad-hoc math to a well-optimized linear algebra subroutine may give you enough of a performance boost on the CPU that you're done right there.
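For example, here is a sketch of such an inner loop (plain C++, single channel, nearest-neighbour sampling, output assumed to be the same size as the input; a tuned linear algebra library or SIMD intrinsics would replace the arithmetic):

#include <cstdint>
#include <vector>

struct Mat3 { float m[9]; };  // row-major 3x3

// Apply the inverse transform to every output pixel:
// [sx, sy, sw]^T = inv * [x, y, 1]^T, then sample the source.
std::vector<uint8_t> warp(const std::vector<uint8_t>& src, int w, int h,
                          const Mat3& inv /* maps output -> source coords */)
{
    std::vector<uint8_t> dst(src.size(), 0);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float sx = inv.m[0] * x + inv.m[1] * y + inv.m[2];
            float sy = inv.m[3] * x + inv.m[4] * y + inv.m[5];
            float sw = inv.m[6] * x + inv.m[7] * y + inv.m[8];
            int ix = int(sx / sw + 0.5f), iy = int(sy / sw + 0.5f);
            if (ix >= 0 && iy >= 0 && ix < w && iy < h)
                dst[size_t(y) * w + x] = src[size_t(iy) * w + ix];
        }
    }
    return dst;
}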
You might find this tutorial useful (it's a bit old, but note that it does contain some OpenGL 2.x GLSL after the Cg section). I don't believe there are any shortcuts to image processing in GLSL, if that's what you're looking for... you do need to understand a lot of the 3D rasterization aspect and historical baggage to use it effectively, although once you do have a framework for inputs and outputs set up you can forget about that and play around with your own algorithms in shader code relatively easily.
Having been doing this sort of thing for years (initially using Direct3D shaders, but more recently with CUDA), I have to say that I entirely agree with the posts here recommending CUDA/OpenCL. It makes life much simpler, and generally runs faster. I'd have to be pretty desperate to go back to a graphics API implementation of non-graphics algorithms now.
I was wondering if the quality of texture mipmaps would be better if I used my own algorithm for pre-generating them, instead of the built-in automatic one. I'd probably use a slow but pretty algorithm, like Lanczos resampling.
Does it make sense? Will I get any quality gain on modern graphics cards?
There are good reasons to generate your own mipmaps. However, the quality of the downsampling is not one of them.
Game and graphics programmers have experimented with all kinds of downsampling algorithms in the past. In the end it turned out that the very simple "average four pixels" method gives the best results. Although more advanced methods are in theory mathematically more correct, they tend to take a lot of sharpness out of the mipmaps. This gives a flat look (try it!).
For some reason (not understandable to me) the simple average method seems to have the best tradeoff between antialiasing and keeping the mipmaps sharp.
However, you may want to calculate your mipmaps with gamma correction. OpenGL does not do this on its own. This can make a real visual difference, especially for darker textures.
Doing so is simple. Instead of averaging four values together like this:
float average(float a, float b, float c, float d)
{
    return (a + b + c + d) / 4.0f;
}
Do this:
#include <cmath>

float GammaCorrectedAverage(float a, float b, float c, float d)
{
    // Assume a gamma of 2.0. In this case we can just square
    // the components, average them, and take the square root.
    return std::sqrt((a*a + b*b + c*c + d*d) / 4.0f);
}
This code assumes your color components are normalized to be in the range of 0 to 1.
What is motivating you to try? Are the mipmaps you currently have being generated poorly? (i.e., have you looked?) Bear in mind that your results will often still be (tri)linearly interpolated anyway, so between that and motion there are often steeply diminishing returns on improved resampling.
It depends on the kind of assets you display. The Lanczos filter gets closer to an ideal low-pass filter, and the results are noticeable if you compare the mipmaps side by side. Most people will mistake aliasing for sharpness; again, it depends on whether your assets tend to contain high frequencies. I've definitely seen cases where the box filter was not a good option. But since the mipmap is then linearly interpolated anyway, the gain might not be that noticeable.
There is another thing to mention: most people use a box filter and pass the output of each stage as the input to the next, and this way you lose both precision and visual energy (although gamma correction helps with the latter). If you can come up with code that uses an arbitrary filter (mind you, most of them are separable into two passes), you would typically scale the filter kernel itself and produce each mipmap level directly from the base texture, which is a good thing.
As an addition to this question, I have found that some completely different mipmapping algorithms (rather than those simply trying to achieve the best downscaling quality, like Lanczos filtering) have good effects on certain textures.
For instance, on some textures that are supposed to represent high-frequency information, I have tried using an algorithm that simply takes one random pixel of the four being considered in each reduction. The results depend very much on the texture and what it is supposed to convey, but I have found that it gives a great effect on some, not least ground textures.
Another one I've tried is taking the most deviating of the four pixels to preserve contrasts. It has even fewer uses, but they do exist.
As such, I've implemented the option to choose the mipmapping algorithm per texture (a sketch of the randomized reduction follows below).
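For the curious, here is a sketch of the randomized reduction described above (grayscale, square levels, fixed seed so the mip chain is reproducible between runs; swapping the picking rule for the "most deviating" variant is a one-line change):

#include <cstdint>
#include <random>
#include <vector>

// Halve one square mip level by keeping one random texel of each 2x2 group.
std::vector<uint8_t> downsampleRandom(const std::vector<uint8_t>& src, int size)
{
    std::mt19937 rng(1234);  // fixed seed: the same mips every build
    std::uniform_int_distribution<int> pick(0, 3);
    int half = size / 2;
    std::vector<uint8_t> dst(size_t(half) * half);
    for (int y = 0; y < half; ++y) {
        for (int x = 0; x < half; ++x) {
            uint8_t q[4] = { src[size_t(2 * y)     * size + 2 * x],
                             src[size_t(2 * y)     * size + 2 * x + 1],
                             src[size_t(2 * y + 1) * size + 2 * x],
                             src[size_t(2 * y + 1) * size + 2 * x + 1] };
            dst[size_t(y) * half + x] = q[pick(rng)];
        }
    }
    return dst;
}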
EDIT: I thought I might provide some examples of the differences in practice. Here's a piece of grass texture on the ground, the leftmost picture being with standard average mipmapping, and the rightmost being with randomized mipmapping:
I hope the viewer can appreciate how much "apparent detail" is lost in the averaged mipmap, and how much flatter it looks for this kind of texture.
Also for reference, here are the same samples with 4× anisotropic filtering turned on (the ones above are trilinear):
Anisotropic filtering makes the difference less pronounced, but it's still there.