I've implemented a plotting class that currently handles integer values only. I would like advice about techniques/mechanisms for handling floating-point numbers. The library used is GDI.
Thanks,
Adi
At some point, they need to be converted to integers to draw actual pixels.
Generally speaking, however, you do not want to just cast each float to int, and draw -- you'll almost certainly get a mess. Instead, you need/want to scale the floats, then round the scaled value to an integer. In most cases, you'll want to make the scaling factor variable so the user can zoom in and out as needed.
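For instance, a minimal sketch of the idea (function and parameter names are illustrative, not from your class), with the scale factor left variable so the user can zoom:

#include <cmath>

// Map a floating-point data coordinate to a pixel coordinate:
// subtract the visible origin, apply the current zoom scale, round to the nearest pixel.
inline int toPixel(float value, float origin, float pixelsPerUnit)
{
    return static_cast<int>(std::lround((value - origin) * pixelsPerUnit));
}

The resulting integers then go to the usual GDI calls such as MoveTo()/LineTo().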
Another possibility is to let the hardware handle most of the work -- you could use OpenGL (for one example) to render your points, leaving them as floating point internally, and letting the driver/hardware handle issues like scaling and conversion to integers. This has a rather steep cost up-front (learning enough OpenGL to get it to do anything useful), but can have a fairly substantial payoff as well, such as fast, hardware-based rendering, and making it relatively easy to handle some things like scaling and (if you ever need it) being able to display 3D points as easily as 2D.
Edit (mostly a response to the comment): Ultimately it comes down to this: the resolution of a screen is lower than the resolution of a floating point number. For example, a really high resolution screen might display 2048 pixels horizontally -- that's 11 bits of resolution. Even a single precision floating point number has around 24 bits of precision. No matter how you do it, reducing 24-bit resolution to 11-bit resolution is going to lose something -- usually a lot.
That's why you pretty nearly have to make your scaling factor variable -- so the user can choose whether to zoom out and see the whole picture with reduced resolution, or zoom in to see a small part at high resolution.
Since sub-pixel resolution was mentioned: it does help, but only a little. It's not going to resolve a thousand different items that map to a single pixel.
What do these float values represent? I will assume they are some co-ordinates. You will need to know two things:
The source resolution (i.e. the dpi at which these co-ordinates are drawn)
The range that you need to address
After that, this becomes a problem of scaling the points to suitable integer co-ordinates (based on your screen-resolution).
Edit: A simple formula will be:
X(dst) = X(src) * DPI(dst) / DPI(src)
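In code that could look something like this (a minimal sketch; the function name is illustrative):

#include <cmath>

// Map a source coordinate drawn at dpiSrc to a device coordinate at dpiDst,
// rounding to the nearest integer pixel.
inline int mapToDevice(float xSrc, float dpiDst, float dpiSrc)
{
    return static_cast<int>(std::lround(xSrc * dpiDst / dpiSrc));
}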
You'll have to convert them to integers and then pass them to functions like MoveTo() and LineTo().
Scale. For example, multiply all the integral values by 10. Multiply the floating point values by 10.0 and then truncate or round (your choice). Now plot as normal.
This will give you extra precision in your graphing. Just remember the scale factor when you look at the picture.
Otherwise convert the floats to int before plotting.
You can try using GDI+ instead of GDI; it has functions that accept float coordinates.
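For example (a minimal sketch, assuming GDI+ has already been initialized with GdiplusStartup; the coordinates are arbitrary):

#include <windows.h>
#include <gdiplus.h>

void drawFloatLine(HDC hdc)
{
    Gdiplus::Graphics graphics(hdc);
    Gdiplus::Pen pen(Gdiplus::Color(255, 0, 0, 0), 1.0f);
    // DrawLine accepts float (REAL) coordinates directly, so no manual conversion is needed.
    graphics.DrawLine(&pen, Gdiplus::PointF(10.25f, 20.5f), Gdiplus::PointF(200.75f, 100.125f));
}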
I've been wondering:
1. Would a 64-bit positioning system make it possible to transition from Earth to space and further to another planet, kind of like Kerbal Space Program but without the origin shifting, so that real data could be derived from it?
2. Would it be possible to do so, and how?
Just figure out what each integer increment represents and you're basically done. Millimetres should be sufficient, since a signed 64-bit value still gives you roughly a +/- nine trillion kilometre range.
That's going to offer more consistency in positioning than a floating point value, but the downside is each xyz vector will be 192 bits, or 24 bytes.
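As a minimal sketch (the struct name is illustrative), with one unit per millimetre:

#include <cstdint>

// One unit = one millimetre; a signed 64-bit axis covers roughly +/- 9.2 trillion km.
struct WorldPosition
{
    std::int64_t x;
    std::int64_t y;
    std::int64_t z;
};
// sizeof(WorldPosition) is 24 bytes: 3 * 64 bits = 192 bits, as noted above.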
I am implementing a conventional (that is, not fast), separable Fourier transform for images. I know that in floating point a sum over one period of sin or cos at equally spaced samples is not perfectly zero, and that this is more of a problem for the conventional transform than for the fast one.
The algorithm works with 2D double arrays and is correct. The inverse is done inside (via a double sign flag and a conditional check when using the asymmetric formula), not outside with conjugations. Results are nearly 100% as expected, so it's a question about details:
When I perform a forward transform, save logarithmed magnitude and angle to images, reload them, and do an inverse transform, I experience different types of rounding errors with different types of implemented formulas:
F(u,v) = Sum(x=0->M-1) Sum(y=0->N-1) f(x,y) * e^(-i*2*pi*u*x/M) * e^(-i*2*pi*v*y/N)
f(x,y) = 1/(M*N) * (like above, but with +i in the exponents)
F(u,v) = 1/sqrt(M*N) * (like above)
f(x,y) = 1/sqrt(M*N) * (like above, but with +i in the exponents)
So the first one is the asymmetric transform pair, the second one the symmetric. With the asymmetric pair, the rounding errors are more in the bright spots of the image (some pixels are rounded slightly outside the value range, e.g. to 256). With the symmetric pair, the errors are more in the constant mid-range areas of the image (with no exceeding of the value range!). In total, it seems that the symmetric pair produces a bit more rounding error.
It also depends on the input: when the image is stored in [0,255] the rounding errors are different than when it is stored in [0,1].
So my question: how should an optimal, most accurate algorithm be implemented (theoretically, no code): with the asymmetric or the symmetric pair? With the input value range in [0,255] or [0,1]? And how should the result be linearly scaled before the logarithmed version is saved to file?
Edit:
My algorithm simply computes the separable asymmetric or symmetric DFT formula. The factors are decomposed into real and imaginary parts using Euler's identity, then expanded and summed up separately as real and imaginary parts:
sum_re += f_re * cos(-mode*pi*((2.0*v*y)/N)) -   // mode = 1 for the forward,
          f_im * sin(-mode*pi*((2.0*v*y)/N));    // -1 for the inverse transform
// sum_im is computed analogously, with sin and cos swapped and + instead of -
This value grouping inside cos and sin should, in my eyes, give the lowest rounding error (compared to e.g. cos(-mode*2*pi*v*y/N)), because the inexactly rounded transcendental pi is not multiplied/divided several times, but only once. Is that right?
The scale factor 1/(M*N) or 1/sqrt(M*N) is applied separately after each separation, outside of the innermost sum. Would it be better inside? Or combined completely at the end of both separations?
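For concreteness, a minimal sketch of one 1-D pass with the scale factor applied once, outside the innermost sum (array and parameter names are illustrative, not my exact code):

#include <cmath>

// One 1-D pass of the separable DFT over a line of N complex samples.
// mode = +1 for the forward transform, -1 for the inverse.
// scale is applied once per output value (e.g. 1.0 for the asymmetric forward
// pass, 1.0/(M*N) for its inverse, 1.0/sqrt(M*N) for the symmetric pair).
void dft1d(const double* in_re, const double* in_im,
           double* out_re, double* out_im,
           int N, int mode, double scale)
{
    const double pi = 3.14159265358979323846;
    for (int u = 0; u < N; ++u)
    {
        double sum_re = 0.0, sum_im = 0.0;
        for (int x = 0; x < N; ++x)
        {
            double angle = -mode * 2.0 * pi * u * x / N;
            double c = cos(angle), s = sin(angle);
            sum_re += in_re[x] * c - in_im[x] * s;
            sum_im += in_re[x] * s + in_im[x] * c;
        }
        out_re[u] = scale * sum_re;   // scaling applied once, after the sum
        out_im[u] = scale * sum_im;
    }
}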
For some deeper analysis, I have dropped the input -> transform -> save-to-file -> read-from-file -> transform^-1 -> output workflow and chosen to compare directly in double precision: input -> transform -> transform^-1 -> output.
Here are the results for a real-life 704x528 8-bit image (delta = maximum absolute difference between the real parts of input and output):
with input inside [0,1] and asymmetric formula: delta = 2.6609e-13 (corresponds to 6.785295e-11 for [0,255] range).
with input inside [0,1] and symmetric formula: delta = 2.65232e-13 (corresponds to 6.763416e-11 for [0,255] range).
with input inside [0,255] and asymmetric formula: delta = 6.74731e-11.
with input inside [0,255] and symmetric formula: delta = 6.7871e-11.
These are not really significant differences; however, the full-range input with the asymmetric transform performs best. I think the values may get worse with 16-bit input.
But in general I see that my observed issues come more from the scaling-before-saving-to-file (or the inverse) rounding errors than from real transformation rounding errors.
However, I am curious: what is the most commonly used implementation of the Fourier transform, the symmetric or the asymmetric one? Which value range is generally used for the input, [0,1] or [0,255]? And for the spectra usually shown in log scale: is, e.g., the [0,M*N] result of an asymmetric transform of [0,1] input log-scaled directly to [0,255], or first linearly scaled to [0,255*M*N]?
The errors you report are tiny, normal, and generally can be ignored. Simply scale your results and clamp any results outside the target interval to the endpoints.
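For example, with C++17's std::clamp (a minimal sketch; the [0, 255] target interval is just an example):

#include <algorithm>

// Clamp a reconstructed pixel value back into the displayable range.
inline double clampPixel(double value)
{
    return std::clamp(value, 0.0, 255.0);
}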
In library implementations of FFTs (that is, FFT routines written to be used generally by diverse applications, not custom designed for a single application), little regard is given to scaling; the routine often simply returns data that has been naturally scaled by the arithmetic, with no additional multiplication operations used to adjust the scale. This is because the scale is often either irrelevant for the application (e.g., finding the frequencies with the largest energies works no matter what the scale is) or that the scale may be distributed through multiply operations and performed just once (e.g., instead of scaling in a forward transform and in an inverse transform, the application can get the same effect by explicitly scaling just once). So, since scaling is often not needed, there is no point in including it in a library routine.
The target interval that data are scaled to depends on the application.
Regarding the question on what transform to use (logarithmic or linear) for showing spectra, I cannot advise; I do not work with visualizing spectra.
Scaling causes roundoff errors. Hence, solution 1 (which scales once) is better than solution 2 (which does it twice). Similarly, scaling once after summation is better than scaling everything before summation.
Do you run y from 0 to 2*N or from -N to +N ? Mathematically it's the same, but you have an extra bit of precision in the latter case.
BTW, what's mode doing in cos(-mode * stuff) ?
I am working on an application that is basically about drawing annotations on an image using MFC's API.
The coordinates required for drawing these annotations are persisted in an XML file.
It also handles the scaling of annotations when the zoom level of the image changes.
The problem is that when scaling the coordinates the intermediate result is a double or float, but we save the result as an integer, which results in lots of errors/deviations.
Would it be better to save the coordinates as floats in the XML, and also perform the intermediate operations on floats?
And finally convert them to integers for use in APIs like LineTo() and MoveTo(), which need long values.
Any suggestion or advice on this will be very helpful.
Thanks
I've worked with graphics pipelines for quite some time.
For something that involves scaling, I insist that you store all your data as doubles. Especially when you plan to go from integer to floating-point and floating-point to integer. Far less error when scaling as well.
There is no harm in storing these values in XML; it is no different from storing integers.
Also, CPUs these days are quite optimized for floating-point operations.
When serializing the coordinates into the XML, you can save each float as an integer with the same binary representation (note that this requires a bit-level copy such as memcpy; a plain value cast would change the number). Conversely, during deserialization, reverse the copy to recover the exact original float. You shouldn't lose precision on saving/loading this way.
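A minimal sketch of such a lossless bit-level round trip (the helper names are illustrative):

#include <cstdint>
#include <cstring>

// Pack a float's bits into a 32-bit unsigned integer for serialization.
inline std::uint32_t floatToBits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Recover the exact original float from the stored bits.
inline float bitsToFloat(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}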
As far as errors go, the solution is trivial: don't save as integer. Keep the floats (I'd even get behind PhoenixX_2's suggestion to upgrade to doubles), then, while drawing, cast them to a temporary int variable.
Edit: Note that if you do decide to use double instead of float, you'll need to account for that during serialization, as doubles are 64-bit, not 32. You could also just save the number as a human-readable decimal, which is probably the most obvious way to do it.
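If you go the human-readable route, writing enough significant digits guarantees an exact round trip (9 for float, 17 for double); a minimal sketch:

#include <cstdio>

// Write a double with enough digits that parsing it back yields the identical value.
void writeCoordinate(FILE* out, double value)
{
    std::fprintf(out, "%.17g", value);
}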
I'm using floats to represent a position in my game:
struct Position
{
float x;
float y;
};
I'm wondering if this is the best choice and what the consequences will be as the position values continue to grow larger. I took some time to brush up on how floats are stored and realized that I am a little confused.
(I'm using Microsoft Visual C++ compiler.)
In float.h, FLT_MAX is defined as follows:
#define FLT_MAX 3.402823466e+38F /* max value */
which is 340282346600000000000000000000000000000.
That value is much greater than UINT_MAX which is defined as:
#define UINT_MAX 0xffffffff
and corresponds to the value 4294967295.
Based on this, it seems like a float would be a good choice to store a very large number like a position. Even though FLT_MAX is very large, I'm wondering how the precision issues will come into play.
Based on my understanding, a float uses 1 bit to store the sign, 8 bits to store the exponent, and 23 bits to store the mantissa (a leading 1 is assumed):
S EEEEEEEE MMMMMMMMMMMMMMMMMMMMMMM
That means FLT_MAX looks like:
0 11111110 11111111111111111111111
(the all-ones exponent pattern is reserved for infinity and NaN), which is the equivalent of:
1.11111111111111111111111 x 2^127
or, written out in binary, 24 one-bits followed by 104 zero-bits.
Even knowing this, I have trouble visualizing the loss of precision and I'm getting confused thinking about what will happen as the values continue to increase.
Is there any easier way to think about this? Are floats or doubles generally used to store very large numbers over something like an unsigned int?
A way of thinking about the precision of a float is to consider that it has roughly six to seven significant decimal digits of accuracy. So if your units are meters and you have something 10 km away, that's 10,000 m - attempting to deal with that object at a resolution of 1 mm (0.001 m) or less may be problematic.
The usual approach in a game would be to use floats, but to divide the world up such that positions are relative to local co-ordinate systems (for example, divide the world into a grid, and for each grid square have a translation value). Everything will have enough precision until it gets transformed relative to the camera for rendering, at which point the imprecision for far away things is not a problem.
As an example, imagine a game set in the solar system. If the origin of your co-ordinate system is in the heart of the sun, then co-ordinates on the surface of planets will be impossible to represent accurately in a float. However if you instead have a co-ordinate system relative to the planet's surface, which in turn is relative to the center of the planet, and then you know where the planet is relative to the sun, you can operate on things in a local space with accuracy, and then transform into whatever space you want for rendering.
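A minimal sketch of that grid idea (all names are illustrative): keep the cell index as integers and only a small local offset as floats, and combine them relative to the camera when rendering.

#include <cstdint>

struct GridPosition
{
    std::int32_t cellX, cellY;   // which grid square the object is in
    float localX, localY;        // offset within the cell, stays small and precise
};

// Position relative to the camera's cell, so the values handed to the renderer
// stay small even in a very large world.
inline float cameraRelativeX(const GridPosition& obj, const GridPosition& cam, float cellSize)
{
    return (obj.cellX - cam.cellX) * cellSize + (obj.localX - cam.localX);
}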
No, they're not.
Let's say your position needs to increase by 10 cm for a certain frame since the game object moved.
Assuming a game world scaled in meters, this is 0.10. But if your float value is large enough it won't be able to represent a difference of 0.10 any more, and your attempt to increase the value will simply fail.
Do you need to store a value greater than about 16.7 million (2^24) with a fractional part? Then float will be too small.
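A small demonstration of the effect described above (the numbers are only an example):

#include <stdio.h>

int main()
{
    float position = 20000000.0f;  // e.g. 20,000 km expressed in meters
    position += 0.1f;              // try to move by 10 cm
    printf("%f\n", position);      // still prints 20000000.000000: the step was lost
    return 0;
}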
This series by Bruce Dawson may help.
If you really need to handle very large numbers, then consider using an arbitrary-precision arithmetic library. You will have to profile your code, because these libraries are slower than arithmetic on built-in types.
It is possible that you do not really need very large coordinate values. For example, you could wrap around the edges of your world, and use modulo arithmetic for handling positions.
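A minimal sketch of such wrap-around handling (the function name and world size are illustrative):

#include <cmath>

// Wrap a coordinate into [0, worldSize) so positions never grow without bound.
inline float wrapCoordinate(float x, float worldSize)
{
    float r = std::fmod(x, worldSize);        // magnitude now below worldSize, sign of x kept
    return (r < 0.0f) ? r + worldSize : r;    // shift negatives into [0, worldSize)
}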
I am writing a simulation program that proceeds in discrete steps. The simulation consists of many nodes, each of which has a floating-point value associated with it that is re-calculated on every step. The result can be positive, negative or zero.
In the case where the result is zero or less something happens. So far this seems straightforward - I can just do something like this for each node:
if (value <= 0.0f) something_happens();
A problem has arisen, however, after some recent changes I made to the program in which I re-arranged the order in which certain calculations are done. In a perfect world the values would still come out the same after this re-arrangement, but because of the imprecision of floating point representation they come out very slightly different. Since the calculations for each step depend on the results of the previous step, these slight variations in the results can accumulate into larger variations as the simulation proceeds.
Here's a simple example program that demonstrates the phenomenon I'm describing:
#include <stdio.h>

int main()
{
    float f1 = 0.000001f, f2 = 0.000002f;
    f1 += 0.000004f;         // This part happens first here
    f1 += (f2 * 0.000003f);
    printf("%.16f\n", f1);

    f1 = 0.000001f, f2 = 0.000002f;
    f1 += (f2 * 0.000003f);
    f1 += 0.000004f;         // This time this happens second
    printf("%.16f\n", f1);
    return 0;
}
The output of this program is
0.0000050000057854
0.0000050000062402
even though, in exact arithmetic, the order of the additions would not matter, so both results should be the same. Note: I understand perfectly well why this is happening - that's not the issue. The problem is that these variations can mean that sometimes a value that used to come out negative on step N, triggering something_happens(), now may come out negative a step or two earlier or later, which can lead to very different overall simulation results because something_happens() has a large effect.
What I want to know is whether there is a good way to decide when something_happens() should be triggered that is not going to be affected by the tiny variations in calculation results that result from re-ordering operations so that the behavior of newer versions of my program will be consistent with the older versions.
The only solution I've so far been able to think of is to use some value epsilon like this:
if (value < epsilon) something_happens();
but because the tiny variations in the results accumulate over time I need to make epsilon quite large (relatively speaking) to ensure that the variations don't result in something_happens() being triggered on a different step. Is there a better way?
I've read this excellent article on floating point comparison, but I don't see how any of the comparison methods described could help me in this situation.
Note: Using integer values instead is not an option.
Edit: the possibility of using doubles instead of floats has been raised. This wouldn't solve my problem since the variations would still be there, they'd just be of a smaller magnitude.
I've worked with simulation models for 2 years and the epsilon approach is the sanest way to compare your floats.
Generally, using suitable epsilon values is the way to go if you need to use floating point numbers. Here are a few things which may help:
If your values are in a known range and you don't need divisions, you may be able to scale the problem and use exact operations on integers. In general, though, these conditions don't apply.
A variation is to use rational numbers to do exact computations. This still has restrictions on the operations available and it typically has severe performance implications: you trade performance for accuracy.
The rounding mode can be changed. This can be used to compute an interval rather than an individual value (possibly with 3 values resulting from rounding up, rounding down, and rounding to nearest). Again, it won't work for everything, but you may get an error estimate out of it.
Keeping track of the value and the number of operations (possibly multiple counters) may also be used to estimate the current size of the error.
To possibly experiment with different numeric representations (float, double, interval, etc.) you might want to implement your simulation as templates parameterized for the numeric type.
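For instance, a minimal sketch of such a parameterization (the type and function names are illustrative; something_happens() stands in for your existing trigger):

void something_happens();   // provided elsewhere by the simulation

template <typename Real>
struct Node
{
    Real value;
};

template <typename Real>
void step(Node<Real>& node)
{
    // ... recompute node.value from the previous step's results here ...
    if (node.value <= Real(0))
        something_happens();
}

// The same code can then be instantiated with float, double, or an interval type.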
There are many books written on estimating and minimizing errors when using floating point arithmetic. This is the topic of numerical mathematics.
In most cases I'm aware of, people experiment briefly with some of the methods mentioned above, conclude that the model is imprecise anyway, and don't bother with the effort. Also, doing something other than using float may yield better results but is just too slow, even just using double, due to the doubled memory footprint and the reduced opportunity to use SIMD operations.
I recommend that you single-step - preferably in assembly mode - through the calculations while doing the same arithmetic on a calculator. You should be able to determine which calculation orderings yield results of lesser quality than you expect and which work. You will learn from this and probably write better-ordered calculations in the future.
In the end - given the examples of numbers you use - you will probably need to accept the fact that you won't be able to do equality comparisons.
As to the epsilon approach, you usually need one epsilon for every possible exponent. For the single-precision floating-point format you would need 256 single-precision values, as the exponent is 8 bits wide. Some exponents correspond to exceptional values, but for simplicity it is better to have a 256-member vector than to do a lot of extra testing.
One way to do this could be to determine your base epsilon for the case where the exponent is 0, i.e. the value to be compared against is in the range 1.0 <= x < 2.0. Preferably the epsilon should be chosen to be base-2 adapted, i.e. a value that can be exactly represented in the single-precision floating-point format - that way you know exactly what you are testing against and won't have to think about rounding problems in the epsilon as well. For exponent -1 you would use your base epsilon divided by two, for -2 divided by four, and so on. As you approach the lowest and highest parts of the exponent range you gradually run out of precision - bit by bit - so you need to be aware that extreme values can cause the epsilon method to fail.
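Instead of a 256-entry table you can also derive the per-exponent epsilon on the fly with frexp/ldexp; a minimal sketch (the base value 2^-20 is just an example):

#include <cmath>

// Scale a base epsilon, chosen for values in [1.0, 2.0), to the magnitude of x.
inline float scaledEpsilon(float x)
{
    const float baseEpsilon = 0x1p-20f;           // exactly representable in binary
    int exponent;
    std::frexp(x, &exponent);                     // x == mantissa * 2^exponent, mantissa in [0.5, 1)
    return std::ldexp(baseEpsilon, exponent - 1); // exponent 1 corresponds to x in [1.0, 2.0)
}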
If it absolutely has to be floats then using an epsilon value may help but may not eliminate all problems. I would recommend using doubles for the spots in the code you know for sure will have variation.
Another way is to use floats to emulate doubles. There are many techniques out there; the most basic one is to use two floats and do a little bit of math to save most of the number in one float and the remainder in the other (I saw a great guide on this; if I find it I'll link it).
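The most basic split looks something like this (a minimal sketch; the names are illustrative):

// Store a double as two floats: hi carries most of the value, lo the remainder.
struct SplitDouble
{
    float hi;
    float lo;
};

inline SplitDouble split(double d)
{
    SplitDouble s;
    s.hi = static_cast<float>(d);
    s.lo = static_cast<float>(d - s.hi);  // remainder left over after the first rounding
    return s;
}

inline double join(const SplitDouble& s)
{
    return static_cast<double>(s.hi) + static_cast<double>(s.lo);
}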
Certainly you should be using doubles instead of floats. This will probably reduce the number of flipped nodes significantly.
Generally, using an epsilon threshold is only useful when you are comparing two floating-point numbers for equality, not when you are comparing them to see which is bigger. So (for most models, at least) using epsilon won't gain you anything at all - it will just change the set of flipped nodes, it won't make that set smaller. If your model itself is chaotic, then it's chaotic.