I need to generate an animated GIF from BMP frames, to animate the Abelian sandpile model, using only the C++ standard library.
Ideally, your starting points would be the specifications for GIF and BMP.
The GIF specification is pretty easy to find.
Unfortunately, at least to the best of my knowledge, Microsoft has never brought all the information about the BMP format together into a single document that acts as a specification. There's a lot of documentation in various places, but no one place I'm aware of that has all of it together and completely organized.
That means you're kind of stuck with a piecemeal approach. Fortunately, you probably don't need to read every possible legitimate BMP file; the format has been around a long time, so there are lots of variations, many of which are rarely used any more (e.g., 16-color bitmaps).
At a guess, you probably only need to deal with one or two specific variants (e.g., 24 or 32 bits per pixel), which makes life a great deal easier. Here's a page that gives at least a starting point for documentation on how BMP files are formatted.
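For whatever it's worth, here's a minimal sketch (not a robust loader) of pulling out just the fields you need, assuming the common case: an uncompressed, bottom-up 24-bit file with the 40-byte BITMAPINFOHEADER, read on a little-endian machine. The names (load_bmp24 and the helpers) are my own, not any Windows API; anything outside that common case is rejected.

    #include <cstdint>
    #include <fstream>
    #include <stdexcept>
    #include <vector>

    struct Bmp {
        int32_t width = 0, height = 0;
        std::vector<uint8_t> pixels;   // BGR, bottom-up, rows padded to 4 bytes
    };

    static uint32_t read_u32(std::istream& in) {
        uint8_t b[4];
        in.read(reinterpret_cast<char*>(b), 4);
        return b[0] | (b[1] << 8) | (b[2] << 16) | (uint32_t(b[3]) << 24);
    }
    static uint16_t read_u16(std::istream& in) {
        uint8_t b[2];
        in.read(reinterpret_cast<char*>(b), 2);
        return uint16_t(b[0] | (b[1] << 8));
    }

    Bmp load_bmp24(const char* path) {
        std::ifstream in(path, std::ios::binary);
        if (in.get() != 'B' || in.get() != 'M') throw std::runtime_error("not a BMP");
        read_u32(in);                              // file size (unused here)
        read_u32(in);                              // two reserved 16-bit fields
        uint32_t pixel_offset = read_u32(in);      // where the pixel data starts
        read_u32(in);                              // info header size (40 expected)
        Bmp bmp;
        bmp.width  = int32_t(read_u32(in));
        bmp.height = int32_t(read_u32(in));        // negative height = top-down
        read_u16(in);                              // planes (always 1)
        uint16_t bpp         = read_u16(in);
        uint32_t compression = read_u32(in);       // 0 = BI_RGB (uncompressed)
        if (bpp != 24 || compression != 0 || bmp.width <= 0 || bmp.height <= 0)
            throw std::runtime_error("unsupported BMP variant");
        size_t stride = (size_t(bmp.width) * 3 + 3) & ~size_t(3); // 4-byte row padding
        bmp.pixels.resize(stride * size_t(bmp.height));
        in.seekg(pixel_offset);
        in.read(reinterpret_cast<char*>(bmp.pixels.data()),
                std::streamsize(bmp.pixels.size()));
        return bmp;
    }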
You'll probably need to consider at least a few ancillary problems though. Unless your input BMP files use 8 bits per pixel with a palette defining the color for each of those 256 values, you're probably going to have at least one other problem: you'll most likely be starting with a file that has lots of colors (e.g., as noted above, 24 or 32 bits per pixel), but for a GIF file you need to reduce that to only 8 bits per pixel. That means choosing (up to) 256 colors that best represent those in the pictures you care about, then, for each input pixel, picking one of those colors to represent that pixel as well as possible.
Depending on how much you care about color fidelity vs. spatial resolution, there are multitudes of ways of doing this job, varying from fairly simple (but with results that may be rather mediocre) to extremely complex (with results that may be somewhat better, but will probably still be fairly mediocre).
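At the simple end of that spectrum, you can sidestep palette selection entirely with a fixed 3-3-2 palette: 256 entries, and each 24-bit pixel maps to an index by plain bit truncation. A sketch (the names are made up; expect visible banding, since median-cut or octree quantization, ideally plus dithering, does considerably better):

    #include <array>
    #include <cstdint>

    struct Rgb { uint8_t r, g, b; };

    // The 256 palette entries, expanding each truncated field back to 0..255.
    std::array<Rgb, 256> make_332_palette() {
        std::array<Rgb, 256> pal;
        for (int i = 0; i < 256; ++i) {
            pal[i].r = uint8_t(((i >> 5) & 7) * 255 / 7);
            pal[i].g = uint8_t(((i >> 2) & 7) * 255 / 7);
            pal[i].b = uint8_t((i & 3) * 255 / 3);
        }
        return pal;
    }

    // Map a 24-bit pixel to its palette index: top 3 bits of red,
    // top 3 bits of green, top 2 bits of blue.
    uint8_t quantize_332(Rgb p) {
        return uint8_t((p.r & 0xE0) | ((p.g & 0xE0) >> 3) | (p.b >> 6));
    }

For the sandpile animation in the original question, this problem largely evaporates anyway: each cell takes only a handful of values, so a tiny hand-picked palette covers them exactly.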
I'm writing a general LZW decoder in C++ and I'm having trouble finding documentation on the length (in bits) of the codewords used. Some articles I've found say that codewords are 12 bits long, while others say 16 bits, and still others say that a variable bit length is used. So which is it? It would make sense to me that the bit length is variable, since that would give the best compression (i.e., initially start with 9 bits, then move to 10 when necessary, then to 11, etc.). But I can't find any "official" documentation on what the industry standard is.
For example, suppose I open Microsoft Paint, create a simple 100x100-pixel all-black image, and save it as a TIFF. The image is stored in the TIFF using LZW compression. So in this scenario, when I'm parsing the LZW codewords, should I read in 9 bits, 12 bits, or 16 bits for the first codeword? And how would I know which to use?
Thanks for any help you can provide.
LZW can be done any of these ways. By far the most common (at least in my experience) is to start with 9-bit codes, then, when the dictionary gets full, move to 10-bit codes, and so on up to some maximum size.
From there, you typically have a couple of choices. One is to clear the dictionary and start over. Another is to continue using the current dictionary, without adding new entries. In the latter case, you typically track the compression rate, and if it drops too far, then you clear the dictionary and start over.
I'd have to dig through docs to be sure, but if I'm not mistaken, the specific implementation of LZW used in TIFF starts at 9 and goes up to 12 bits (when it was being designed, MS-DOS was a major target, and the dictionary for 12-bit codes used most of the available 640K of RAM). If memory serves, it clears the table as soon as the last 12-bit code has been used.
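To make the variable-width bookkeeping concrete, here's a sketch of a TIFF-style reader: codes are packed MSB-first (GIF packs LSB-first instead), start at 9 bits, and widen up to 12. The names here are my own, not from any spec.

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Reads raw bits MSB-first (TIFF's packing; GIF uses LSB-first).
    class BitReader {
        const std::vector<uint8_t>& data_;
        size_t bitpos_ = 0;
    public:
        explicit BitReader(const std::vector<uint8_t>& data) : data_(data) {}
        uint32_t read(int width) {
            uint32_t v = 0;
            for (int i = 0; i < width; ++i) {
                size_t byte = bitpos_ >> 3;
                int    bit  = 7 - int(bitpos_ & 7);
                v = (v << 1) | ((data_[byte] >> bit) & 1u);
                ++bitpos_;
            }
            return v;
        }
    };

    // Inside the decode loop the code width tracks the dictionary size:
    //   int    width     = 9;
    //   size_t next_code = 258;   // 256 literals + Clear (256) + EndOfInfo (257)
    //   ...
    //   if (next_code == (size_t(1) << width) && width < 12) ++width;
    // TIFF has the "early change" quirk: it bumps the width one code sooner,
    // when next_code == (size_t(1) << width) - 1.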
For an application I'm developing, I need to be able to:
- draw lines of different widths and colours
- draw solid color filled triangles
- draw textured (no alpha) quads
Very easy...but...
All coordinates are integers in pixel space and, very important: reading back all the pixels from the framebuffer (glReadPixels) on two different machines, with two different graphics cards, running two different OSes (Linux and FreeBSD), must result in exactly the same sequence of bits (given an appropriate constant format conversion).
I think this is impossible to achieve safely using OpenGL with hardware acceleration, since I bet different graphics cards (from different vendors) may implement different algorithms for rasterization. (The OpenGL specs are clear about this: they propose an algorithm, but they also state that implementations may differ under certain circumstances.)
Also, I don't really need hardware acceleration, since I will be rendering simple graphics at very low speed.
Do you think I can achieve this by just disabling hardware acceleration? What happens in that case under Linux: will I fall back to the Mesa software rasterizer? And in that case, can I be sure it will always work, or am I missing something?
That you're reading back rendered pixels and strongly depend on their mathematical exactness/reproducibility sounds like a design flaw. What's the purpose of this? If you, for example, need to extract some information from the image, why don't you try to extract that information from the abstract, vectorized data prior to rendering?
Anyhow, if you depend on external rendering code and there's no way to make your reading code more robust to small errors, you're signing up for lots of pain and maintenance work. Other people could break your code with every tiny patch, because that kind of bit-level pixel exactness is usually a non-issue in their unit tests and the like. Let alone the infinite permutations of hardware and software layers that are possible, all of which might influence the exact pixel bits.
If you only need those two operations, lines (with different widths and colors) and quads (with/without texture), I recommend writing your own rendering/rasterizer code which operates on an 8-bit uint array representing the image pixels (R8G8B8). The operations you're proposing aren't too nasty, so if performance is unimportant, this might actually be the better way to go in the long run.
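To illustrate, here's a sketch of that approach: a plain R8G8B8 byte array plus an integer-only Bresenham line (Canvas and draw_line are made-up names). Since nothing touches floating point or a driver, the output is bit-identical everywhere.

    #include <cstdint>
    #include <cstdlib>
    #include <vector>

    struct Canvas {
        int w, h;
        std::vector<uint8_t> rgb;   // w*h*3 bytes, row-major, R8G8B8
        Canvas(int w_, int h_) : w(w_), h(h_), rgb(size_t(w_) * size_t(h_) * 3, 0) {}
        void put(int x, int y, uint8_t r, uint8_t g, uint8_t b) {
            if (x < 0 || y < 0 || x >= w || y >= h) return;
            size_t i = (size_t(y) * size_t(w) + size_t(x)) * 3;
            rgb[i] = r; rgb[i + 1] = g; rgb[i + 2] = b;
        }
    };

    // 1-pixel-wide Bresenham; wider lines can be built by stamping a disc at
    // each step or by filling a quad, both of which also stay in integers.
    void draw_line(Canvas& c, int x0, int y0, int x1, int y1,
                   uint8_t r, uint8_t g, uint8_t b) {
        int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
        int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
        int err = dx + dy;
        for (;;) {
            c.put(x0, y0, r, g, b);
            if (x0 == x1 && y0 == y1) break;
            int e2 = 2 * err;
            if (e2 >= dy) { err += dy; x0 += sx; }
            if (e2 <= dx) { err += dx; y0 += sy; }
        }
    }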
When performance is kept in mind, we (unwillingly) use texture compression. The artefacts introduced by the compression may be more or less acceptable. What are the different possibilities and workarounds that can be applied at the original-image level to minimize the artefacts introduced by the compression algorithm? In my current situation, most of the artefacts appear when using gradients.
DXT compression is all about interpolation, or gradients if you will. However, you must understand well what exactly it does. DXT compression is a compromise, it offers pretty bad quality at pretty bad compression, but it does offer some compression and is almost trivial to implement in hardware and runs at practically zero cost. That's why it is used.
There are a few means to improve quality, but if the quality issues are not acceptable, the only solution is to not use DXT. (Besides, DXT4, which you have in your question's title, is not very widely used; it is DXT5 with premultiplied alpha.)
First of all, note that:
DXT encodes the colors of 4x4 blocks of texels by storing two 5:6:5 colors and interpolating along the line between them in RGB space. The interpolation is quantized to 2 bits, so you only have 4 values to choose from per texel.
The alpha channel in DXT4/5 stores two 8-bit alpha values and uses 3-bit interpolators.
DXT2/3 uses an explicit 4-bit-per-texel alpha channel (i.e., not interpolating between some chosen 8-bit values).
DXT1 can fake a 1-bit alpha channel, but this is a sorting/encoding trick, and a different story.
It is impossible to get pure greys in DXT without converting to another color space (due to 5:6:5 storage of endpoints).
This means that DXT can in principle (more or less) perfectly reproduce many horizontal, vertical, or diagonal 1D gradients that do not have too harsh changes, but it is entirely unable to reproduce most other patterns (though it can usually reproduce something close).
For example, if you have a 2D gradient or a rotated gradient, there is no way (except by sheer coincidence!) that there exists a pair of two colors which will allow the entire 4x4 block to interpolate nicely. Also, since the interpolation is quantized to only 4 choices, the vast majority of "odd rotations" simply cannot be encoded, nor can many combinations of colors. However, for most "kind of natural" textures, this is acceptable.
The DXT compressor will usually make an attempt at finding the best possible fit within each 4x4 cell (though some compressors may do something else). This can lead to visible stepping between adjacent blocks even when the gradient inside each cell is represented well.
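To make those constraints concrete, here's a sketch of decoding a single DXT1 color block (the names are mine; the layout is the standard BC1 one). The entire 4x4 cell has to come from just four points on one segment in RGB space:

    #include <cstdint>

    struct Rgb8 { uint8_t r, g, b; };

    // Expand a 5:6:5 endpoint to 8 bits per channel by replicating top bits.
    static Rgb8 expand565(uint16_t c) {
        uint8_t r = (c >> 11) & 0x1F, g = (c >> 5) & 0x3F, b = c & 0x1F;
        return { uint8_t((r << 3) | (r >> 2)),
                 uint8_t((g << 2) | (g >> 4)),
                 uint8_t((b << 3) | (b >> 2)) };
    }

    // block: two little-endian 5:6:5 endpoints, then 32 bits of 2-bit indices.
    void decode_dxt1_block(const uint8_t block[8], Rgb8 out[16]) {
        uint16_t c0 = uint16_t(block[0] | (block[1] << 8));
        uint16_t c1 = uint16_t(block[2] | (block[3] << 8));
        Rgb8 p[4] = { expand565(c0), expand565(c1), {}, {} };
        if (c0 > c1) {   // 4-color mode: the two interpolated thirds
            p[2] = { uint8_t((2*p[0].r + p[1].r) / 3), uint8_t((2*p[0].g + p[1].g) / 3),
                     uint8_t((2*p[0].b + p[1].b) / 3) };
            p[3] = { uint8_t((p[0].r + 2*p[1].r) / 3), uint8_t((p[0].g + 2*p[1].g) / 3),
                     uint8_t((p[0].b + 2*p[1].b) / 3) };
        } else {         // 3-color mode: midpoint, plus a black/transparent slot
            p[2] = { uint8_t((p[0].r + p[1].r) / 2), uint8_t((p[0].g + p[1].g) / 2),
                     uint8_t((p[0].b + p[1].b) / 2) };
            p[3] = { 0, 0, 0 };
        }
        for (int i = 0; i < 16; ++i)   // 2-bit index per texel, LSB-first per row byte
            out[i] = p[(block[4 + i / 4] >> (2 * (i % 4))) & 3];
    }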
What you can do about DXT is:
Use different compressors and choose the best result. There are at least 3 different (non-brute force) strategies used by different compressors, giving quite different results.
Give the crunch library a try. There is a Windows GUI compressor for it around somewhere too. While crunch primarily aims to produce smaller (after zip compression) files, and it is ultimately bound by the technical limitations of the DXT block format, it uses an unusual search technique that takes much greater ranges into account. It might happen that it manages to give one or the other of your gradients a better look.
Avoid gradients that are 2D or that are not aligned to the u/v axes or a diagonal, because you know it is impossible to encode these.
Convert natural images to a non-RGB color space before compressing; see the sketch after this list. YCoCg and YCbCr may be candidates, or the JPEG-LS transform. The human eye is not equally sensitive in every respect; some errors are less obvious, and using a different color space exploits this.
Use the alpha channel for the most important channel if you don't need alpha, since you know that it has 8 bit endpoint resolution (instead of 5/6) and 3 bit interpolators instead of 2 bit. Otherwise, use green for the most important channel since it has one more bit.
For hand-drawn textures, you may try to use 5:6:5 matched colors, which will at least give perfect hits for these. Though for "drawn" textures (something that looks like a comic strip) DXT is a very unwise choice in general.
For cases where the quality just doesn't cut it, don't use DXT (yes, this sounds like stupid advice, but it really is that: if it doesn't fit, don't use it).
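Regarding the color-space point above, here's a sketch of the YCoCg-R lifting transform (the function names are mine). It is exactly invertible in integers, because the inverse reuses the same co/2 and cg/2 terms the forward pass computed:

    #include <cstdint>

    // Luma ends up in one channel, which you can then route to wherever the
    // format spends the most bits (e.g., DXT5's 8-bit alpha block).
    struct YCoCg { int y, co, cg; };   // co/cg are signed and need an extra bit

    YCoCg rgb_to_ycocg(int r, int g, int b) {
        int co = r - b;
        int t  = b + co / 2;
        int cg = g - t;
        return { t + cg / 2, co, cg };
    }

    void ycocg_to_rgb(const YCoCg& c, int& r, int& g, int& b) {
        int t = c.y - c.cg / 2;
        g = c.cg + t;
        b = t - c.co / 2;
        r = b + c.co;
    }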
I thought about:
1) Implement everything for the b/w images, then make wrappers for the methods that check whether it's a color image. If it is, split the channels, run the operation on each individually, and then merge them.
2) Use functors to correctly update the values depending on what I'm dealing with. The problem is that the compiler errors would be really complicated and I'm not used to them, and I think I may end up needing quite a few functors. Not sure if this is a good idea, tbh.
There might also be a design pattern here that I'm not seeing. There could be a way to do this that's channel/color agnostic in OpenCV too, though I haven't found it; so far the book I'm reading (OpenCV 2 Computer Vision Application Programming Cookbook) hasn't shown me such a possibility.
If speed is important, don't.
It sounds like you're trying to encapsulate or abstract away the type of pixel using OO techniques or the like. This could add an extra level of indirection for every pixel access, killing your performance.
Calling a function directly, as opposed to through a pointer (e.g., delegate, overridden method, functor), can still be faster for the CPU, but if you're doing function calls at all, reconsider; they're still extra work. If you can nest everything in the outer for loop, it will look ugly and functional programming snobs will sneer at you, but remember: this isn't a big LOB app that will get hard to maintain. That's why engineers can still perfectly well maintain 30-year-old QuickBasic code; the problem space doesn't need anything smarter (though usually their problems themselves need something a lot smarter than I!).
It's best to implement simple things (e.g., a threshold op or resizing) optimized for each kind of image if you want speed. You can also research transformation matrices and see if you can accomplish your work that way. Then you only need to write the two transform algorithms (color and b&w), and, using a similar (or the same) matrix, do the same thing for both types of pictures.
That accomplishes major goals of abstraction anyway: seamless reuse and separation of concerns. And speed to boot (but hopefully not reboot!). Good luck.
Splitting the channels could work well with algorithms that work with the channels independently; not all of them do, so this will be quite limiting. You'll also spend a bit of time and space making all those copies.
By functors I presume you mean making templates out of your algorithm functions, with a pixel type as the template parameter. That could also work, but it means defining your basic pixel operations in a way that lets them be implemented as functions or operators on a generic pixel type. This is harder than it looks, and is best done after you've had some experience implementing the algorithms.
A third option not mentioned is to promote the b/w images to full color, process them, and convert back to b/w. This optimizes the full color processing at the expense of the b/w.
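For illustration, here's roughly what the template route can look like; Gray8, Rgb8, blend, and crossfade are all made-up names. The algorithm is written once against a generic pixel type, and each pixel type supplies the handful of operations the algorithm actually uses:

    #include <cstdint>
    #include <vector>

    struct Gray8 { uint8_t v; };
    struct Rgb8  { uint8_t r, g, b; };

    // Per-type pixel ops: each overload knows its own channels.
    inline Gray8 blend(Gray8 a, Gray8 b, float t) {
        return { uint8_t(a.v + (b.v - a.v) * t) };
    }
    inline Rgb8 blend(Rgb8 a, Rgb8 b, float t) {
        return { uint8_t(a.r + (b.r - a.r) * t),
                 uint8_t(a.g + (b.g - a.g) * t),
                 uint8_t(a.b + (b.b - a.b) * t) };
    }

    // The algorithm is written once; the compiler picks the right blend().
    template <typename Pixel>
    void crossfade(const std::vector<Pixel>& a, const std::vector<Pixel>& b,
                   std::vector<Pixel>& out, float t) {
        out.resize(a.size());
        for (size_t i = 0; i < a.size(); ++i)
            out[i] = blend(a[i], b[i], t);
    }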
For most algorithms it is not necessary to worry about monochrome vs. colour images. You either use the grey value of the monochrome image, or you calculate the luminance/intensity/whatever of the colour image and use that. You choose the measure (luminance etc.) by looking at which colour space will give you the result you want.
Once you have calculated how you are going to modify your images, you use some pixel-aware processing, e.g. blending two pixels might be pixel_a*0.5 + pixel_b*0.5; your pixel class will sort out how to apply that to the different colour channels, i.e. Pixel::operator+(const Pixel &), Pixel::operator*(float), and so on.
There are algorithms that are applied individually to each colour channel but they are not as common and often there is some correlation between the spatiotemporal changes in the colours so you wouldn't do something as basic as process each channel totally independently of each other.
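A sketch of such a pixel class, just enough to make that blend expression compile; the float storage and the clamping on conversion back to 8 bits are my own choices:

    #include <cstdint>

    struct Pixel {
        float r, g, b;   // float storage avoids clipping mid-expression
        Pixel operator+(const Pixel& o) const { return { r + o.r, g + o.g, b + o.b }; }
        Pixel operator*(float s)        const { return { r * s, g * s, b * s }; }
        // Clamp when converting back to 8-bit storage (same for g and b).
        uint8_t r8() const { return uint8_t(r < 0 ? 0 : r > 255 ? 255 : r); }
    };

    // Usage, exactly the expression from the text:
    //   Pixel mixed = pixel_a * 0.5f + pixel_b * 0.5f;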
My own Image class uses a planar structure (that is, color channels are stored separately) instead of an interleaved structure. However, this is VERY limiting when it comes to image quantization and other joint color-processing tasks.
I am planning to rewrite it to use the other approach: simply a two-dimensional array of pixels. At the moment I am not sure exactly how I will implement it (a templated pixel class, a Pixel base class, or a simple three-dimensional array).
I also plan to write a planar wrapper for this interleaved image structure to ease any disadvantages I might encounter. One thing is sure: this wrapper will be much more efficient than a pixel wrapper would be for planar images.
Frankly, I believe splitting planes is rather inefficient, since you pay various overheads several times. For example, if you want to resize an image, calculating the various filter coefficients is very expensive, and it would be MUCH better to just calculate them once and apply Pixel::operator* and operator+ instead of doing the same with the underlying subpixel components.
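A sketch of that point, using a deliberately simplified 2-tap horizontal resize (the names are hypothetical): the weight pair depends only on the output x, so with interleaved pixels it is computed once per output pixel and applied to all three channels through the Pixel operators:

    #include <vector>

    struct Pixel {
        float r, g, b;
        Pixel operator+(const Pixel& o) const { return { r + o.r, g + o.g, b + o.b }; }
        Pixel operator*(float s)        const { return { r * s, g * s, b * s }; }
    };

    // Linear resize of one row; dst must already have the desired size.
    void resize_row(const std::vector<Pixel>& src, std::vector<Pixel>& dst) {
        float scale = float(src.size()) / float(dst.size());
        for (size_t x = 0; x < dst.size(); ++x) {
            float pos = (x + 0.5f) * scale - 0.5f;
            size_t i0 = pos <= 0 ? 0 : size_t(pos);
            size_t i1 = i0 + 1 < src.size() ? i0 + 1 : i0;
            float  w  = pos <= 0 ? 0.0f : pos - float(i0);  // the "expensive" part, once
            dst[x] = src[i0] * (1.0f - w) + src[i1] * w;    // applied to R, G, B together
        }
    }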
I'm wondering if there is a way to extract the necessary data out of an AutoCAD .dxf file, so I can visualize the structure in OpenGL.
I've found some old code snippets for Windows written in C++, but since the standard has changed, I assume 15-year-old code is a little outdated.
Also, there is a book about the .dxf file standard, but it's from the 90s and, aside from that, rarely available.
Another way might be to convert it to some other file format and then extract the data I need.
Trying to look into the .dxf files didn't give much insight either, since even a simple cuboid contains a lot of data!
Can anyone give me a hint on how to approach this?
The references are a good place to start, but if you are doing heavy 3D work it may not be possible to accomplish what you are attempting.
We recently wrote a DXF converter in Java based entirely on the references. Although many of the entities are relatively straightforward, several others (3DSOLID, BODY, REGION, SURFACE, Swept Surface) are not really possible to translate, since the reference states that their groups are primarily proprietary data. Other objects (Extruded Surface, Revolved Surface, Swept Surface (again)) have significant chunks of binary data which may hold important information you need.
These entities were not vital for our efforts, but if you are looking to convert to OpenGL, these may be the entities you were particularly concerned with.
Autodesk has references for the DXF formats used by recent revisions of AutoCAD. I'd take a second look at that 15-year-old code though. Even if you can't or don't use it as-is, it may provide a decent starting point. The DXF specification is sufficiently large and complex that having something to start from, and just adding new bits and pieces where needed, can be a big help. As an interchange format, DXF has to be pretty conservative anyway, only including elements that essentially all programs can interpret reasonably directly.
I'd be more concerned about the code itself than about changes in the DXF format. A lot of code that old uses deep, monolithic class hierarchies, which is quite a bit different from what you'd expect in modern C++.
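The encouraging part is that ASCII DXF is trivial to tokenize: the file is just alternating lines, an integer group code followed by a value. Here's a sketch (ignoring binary DXF, blocks, and nearly everything else) that fishes the 2D endpoints of LINE entities out of a file; group codes 10/20 are the start point's x/y and 11/21 the end point's:

    #include <fstream>
    #include <iostream>
    #include <string>

    static std::string trim(const std::string& s) {
        size_t a = s.find_first_not_of(" \t\r");
        size_t b = s.find_last_not_of(" \t\r");
        return a == std::string::npos ? std::string() : s.substr(a, b - a + 1);
    }

    int main(int argc, char** argv) {
        if (argc < 2) return 1;
        std::ifstream in(argv[1]);
        std::string code_str, value;
        bool in_line = false;
        double x1 = 0, y1 = 0, x2 = 0, y2 = 0;
        // Each record is two lines: an integer group code, then its value.
        while (std::getline(in, code_str) && std::getline(in, value)) {
            int code = std::stoi(code_str);
            value = trim(value);
            if (code == 0) {                   // group code 0 starts a new entity
                if (in_line)
                    std::cout << x1 << ',' << y1 << " -> " << x2 << ',' << y2 << '\n';
                in_line = (value == "LINE");
            } else if (in_line) {
                switch (code) {                // 30/31 (z) ignored in this sketch
                    case 10: x1 = std::stod(value); break;
                    case 20: y1 = std::stod(value); break;
                    case 11: x2 = std::stod(value); break;
                    case 21: y2 = std::stod(value); break;
                }
            }
        }
        if (in_line)
            std::cout << x1 << ',' << y1 << " -> " << x2 << ',' << y2 << '\n';
    }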