I'm trying to use Eigen in C++ for matrix manipulation.
It looks like I can choose float or double type for real numbers,
such as Eigen::Matrix4f or Eigen::Matrix4d.
In normal C++ code, I guess double is more popular nowadays than float.
However, in Eigen's documentation, float seems to be more frequently used than double.
Is there any particular reason for this?
I know this is a very basic question, but I would appreciate any help.
Thank you in advance.
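float is usually faster: it takes half the memory of double, and SIMD units can process twice as many floats per instruction. Performance matters a lot in numerical code.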
I'm in the process of converting a program to C++ from Scilab (similar to Matlab) and I'm required to maintain the same level of precision that is kept by the previous code.
Note: Maintaining the same level of precision would be ideal, but some error in the final result is acceptable. The problem I'm facing (as I'll show below) comes from looping, so the calculation error compounds rather quickly. If the final result is only off by a thousandth or so (e.g. 1/1000 vs 1/1001), it won't be a problem.
I've briefly looked into a number of different ways to do this including:
GMP (A Multiple Precision Arithmetic Library)
Using integers instead of floats (see example below)
Int vs Float Example: Instead of using the float 12.45, store it as the integer 124,500 (the value scaled by 10,000), then convert everything back when appropriate. Note: I'm not exactly sure how this will work with the code I'm working with (more detail below).
An example of how my program is producing incorrect results:
for (int i = 0; i <= 1000; i++)
{
    for (int j = 0; j <= 10000; j++)
    {
        // This calculation will be computed with less precision than in Scilab
        float1 = (1.0 / 100000.0);
        // The rounding error accumulated in float2 becomes significant by the end of the loop
        float2 = (float1 + float2);
    }
}
My question is:
Is there a generally accepted way to go about retaining accuracy in floating point arithmetic OR will one of the above methods suffice?
Maintaining precision when porting code like this is very difficult. This is not because the languages have different ideas of what a float is, but because the algorithms and their accuracy assumptions differ. For example, when performing numerical integration, Scilab may use a Gaussian quadrature method, whereas you might try a trapezoidal method. Both may be working on identical IEEE 754 single-precision floating point numbers, but you will get different answers due to the convergence characteristics of the two algorithms. So how do you get around this?
Well, you can go through the Scilab source code and look at all of the algorithms it uses for each thing you need. You can then replicate these algorithms taking care of any pre- or post-conditioning of the data that Scilab implicitly does (if any at all). That's a lot of work. And, frankly, probably not the best way to spend your time. Rather, I would look into using the Interfacing with Other Languages section from the developer's documentation to see how you can call the Scilab functions directly from your C, C++, Java, or Fortran code.
Of course, with the second option, you have to consider how you are going to distribute your code (if you need to). Scilab has a GPL-compatible license, so you can just bundle it with your code. However, it is quite big (~180MB), and you may want to bundle only the pieces you need (e.g., you don't need the whole interpreter system). This is more work in a different way, but it guarantees numerical compatibility with your current Scilab solutions.
Is there a generally accepted way to go about retaining accuracy in floating point arithmetic
"Generally accepted" is too broad, so no.
will one of the above methods suffice?
Yes. GMP in particular seems to be a standard choice. I would also have a look at the Boost Multiprecision library.
A hand-coded integer approach can work as well, but it is surely not the method of choice: it requires much more coding and, more importantly, a means to store and process arbitrarily precise integers.
If your compiler supports it, use BCD (binary-coded decimal).
Sam
Well, another alternative if you use GCC compilers is to go with quadmath/__float128 types.
I am writing some numeric code in C++ and I want to be able to swap between using double and float. I have therefore added a #define MYFLT which I can make either a float or a double as needed. However, how do I deal with the various numeric literals?
For example
MYFLT someNumber = 1.2;
MYFLT someOtherNumber = 1.5f;
gives compiler warnings for the first line when MYFLT is a float and for the second line when MYFLT is a double. I know this is a trivial example, but there are other cases where I have longer expressions containing literals, and floats can end up being converted to doubles and the result back to floats, which I think is costing me significant performance. How should I deal with this?
I could do things like
MYFLT someNumber = MYFLT(1.2);
MYFLT someOtherNumber = MYFLT(1.5);
but this is quite tedious. I'm assuming that if I do this, the compiler is clever enough to just use a float when needed (can anyone confirm that?). It would be better if there were an MSVC++ compiler switch or #define that tells the compiler to treat all floating point literals as floats instead of doubles. Does such a switch exist?
Even when I wrap all my literals as above, my code runs 50% slower when I use float rather than double. I was expecting a performance boost through SIMD-type operations, not a penalty!
Phil
What you'd want is #define MYFLTCONST(x) x##f or #define MYFLTCONST(x) x, depending on whether you want an f suffix appended for float.
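A minimal sketch of how the two macro definitions could be wired to a single switch is shown below. MYFLT_IS_FLOAT and scale_and_shift are hypothetical names, and MYFLT is written as a typedef here rather than the #define from the question. The static_assert checks at compile time that a wrapped literal really has type MYFLT.

```cpp
#include <type_traits>

#define MYFLT_IS_FLOAT 1  // hypothetical build switch; set to 0 for double

#if MYFLT_IS_FLOAT
typedef float MYFLT;
#define MYFLTCONST(x) x##f  // pastes the f suffix: 1.2 becomes 1.2f
#else
typedef double MYFLT;
#define MYFLTCONST(x) x     // leaves the literal as a double
#endif

// Compile-time check that wrapped literals have the selected type.
static_assert(std::is_same<decltype(MYFLTCONST(1.2)), MYFLT>::value,
              "wrapped literal should have type MYFLT");

// Example use: no float/double conversions occur in this expression.
inline MYFLT scale_and_shift(MYFLT v) {
    return v * MYFLTCONST(1.2) + MYFLTCONST(0.5);
}
```

One limitation of the token-pasting form: the argument must already be a floating point literal with a decimal point or exponent, because MYFLTCONST(1) would expand to 1f, which is not a valid literal.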
This is a (not quite complete) answer to my own question.
I found that a small function that was called many times (a fast approximation to sin) didn't have its literals cast as MYFLT. The extra computational cost of this also meant that the compiler wasn't inlining it. This function accounted for most of the difference. Some further profiling seemed to indicate that accessing std::vector<float> was slower than std::vector<double> (I am using [] to do the access, if it matters). Replacing the std::vectors with raw fixed-size arrays sped up the double implementation a little and closed the gap significantly for the float implementation. The float version is now only about 10% slower than the double version, but there is definitely no speed increase from either RAM access or vectorization. I guess I need to think more carefully about my loops to get any benefit there.
I guess the conclusion here (yet again) is that the compiler is pretty good at optimising code - it's much better to work with it and do careful profiling than it is to try and do your own blind "optimisations" which might actually have negative effects, like stopping the compiler performing good inlining.
I apologize if this is trivial, but I've been unable to find an answer on Google.
As per the OpenCL standard (since 1.0), the half type is supported for storage reasons.
It seems to me however, that without the cl_khr_fp16 extension, it's impossible to use this for anything?
What I would like to do is to save my values as half, but perform all calculations in float.
I tried using convert_half(), but that's not supported without the cl_khr_fp16.
I tried just writing (float) before the half for an automatic C-style conversion, but that didn't work either.
So my question is, how do I utilize half for storage?
I need to be able to both read and write half values.
Use vload_halfN and vstore_halfN. The halfN values in memory are converted to floatN on load, and floatN values are converted to halfN on store; neither function requires cl_khr_fp16.
As far as I know the type half is only supported on the GPU, but you can convert it to and back from a float fairly simply, as long as you know a bit about bitwise manipulation.
Have a look at the following link for a good explanation on how to do so.
ftp://ftp.fox-toolkit.org/pub/fasthalffloatconversion.pdf
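As a rough illustration of the bitwise manipulation involved, here is a host-side C++ sketch that converts between 16-bit half bit patterns and float. It is not the table-driven method from the linked paper, just a plain branch-based version. It assumes float is IEEE 754 binary32, and the function names half_to_float and float_to_half are hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Decode an IEEE 754 binary16 bit pattern into a float (exact for every input).
float half_to_float(std::uint16_t h) {
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp  = (h >> 10) & 0x1Fu;
    const std::uint32_t mant = h & 0x3FFu;
    std::uint32_t bits;
    if (exp == 0) {
        // Zero or subnormal: the value is mant * 2^-24.
        const float v = std::ldexp(static_cast<float>(mant), -24);
        return sign ? -v : v;
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (mant << 13);           // Inf or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);  // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Encode a float as binary16, rounding to nearest (ties to even).
std::uint16_t float_to_half(float f) {
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::uint32_t mag  = x & 0x7FFFFFFFu;
    if (mag > 0x7F800000u)   // NaN stays NaN
        return static_cast<std::uint16_t>(sign | 0x7E00u);
    if (mag >= 0x47800000u)  // 65536 and above, or Inf
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    const std::uint32_t e = mag >> 23;  // biased float exponent
    std::uint32_t m = mag & 0x007FFFFFu;
    std::uint32_t shift, half;
    if (e >= 113u) {                    // normal half range (>= 2^-14)
        shift = 13u;
        half = ((e - 112u) << 10) | (m >> 13);
    } else {
        if (e < 102u)                   // too small: rounds to signed zero
            return static_cast<std::uint16_t>(sign);
        m |= 0x00800000u;               // make the implicit leading 1 explicit
        shift = 126u - e;
        half = m >> shift;              // subnormal half mantissa
    }
    const std::uint32_t rem = m & ((1u << shift) - 1u);
    const std::uint32_t halfway = 1u << (shift - 1u);
    if (rem > halfway || (rem == halfway && (half & 1u)))
        ++half;                         // a carry may roll into the exponent
    return static_cast<std::uint16_t>(sign | half);
}
```

A host program could use float_to_half to fill a buffer of 16-bit values before uploading it, and half_to_float to inspect results read back from the device.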
Since it wasn't mentioned in any of the other answers I thought I'd add: You can also use half float in OpenCL images and the read_imagef and write_imagef functions will do the conversion to/from float for you (cl_khr_fp16 extension not required). That extension is only for having variables in (and doing math in) half.
Let's say I have a snippet of code like this:
typedef double My_fp_t;
My_fp_t my_fun( My_fp_t input )
{
// some fp computation, it uses operator+, operator- and so on for type My_fp_t
}
My_fp_t input = 0.;
My_fp_t output = my_fun( input );
Is it possible to retrofit my existing code with a floating point arbitrary precision C++ library?
I would like to simply add #include <cpp_arbitrary_precision_fp>, change my typedef double My_fp_t; into typedef arbitrary_double_t My_fp_t; and let C++ operator overloading do its job...
My main problem is that my code actually does NOT have the typedef :-( so maybe my plan is doomed to failure.
Assuming that my code had the typedef, what other problems would I face?
This might be tough. I used a template approach in my PhD thesis code to deal with different numerical types. You might want to take a look at it to see the problems I encountered.
The thing is you are fine if all you do with your numbers is use the standard arithmetic operators. However, as soon as you use a square root or some other non operator function you need to create helper objects to detect your object's type (at compile time as it is too slow to do this at run time; see the boost metaprogramming library for help on that) and then call the correct function and return it as the correct type. It is all totally doable, but is likely to take longer than you think and will add considerably to the complexity of your code.
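One alternative to hand-written type-detection helpers, for functions like the square root, is to let argument-dependent lookup choose the overload. The sketch below is not the approach described above, just a commonly used C++ idiom. toy::Wide is a hypothetical stand-in for an arbitrary-precision type (it merely wraps long double so the example builds without any library), and hypotenuse is a hypothetical function.

```cpp
#include <cmath>

namespace toy {
// Hypothetical stand-in for an arbitrary-precision number type.
struct Wide {
    long double v;
    Wide(long double d = 0.0L) : v(d) {}
};
inline Wide operator+(Wide a, Wide b) { return Wide(a.v + b.v); }
inline Wide operator*(Wide a, Wide b) { return Wide(a.v * b.v); }
// A sqrt overload that lives in the same namespace as the type.
inline Wide sqrt(Wide a) { return Wide(std::sqrt(a.v)); }
}  // namespace toy

// Generic code: works for built-in types and for toy::Wide.
template <typename T>
T hypotenuse(T a, T b) {
    using std::sqrt;             // makes std::sqrt visible for float/double
    return sqrt(a * a + b * b);  // unqualified call: ADL finds toy::sqrt
}
```

Calling hypotenuse(3.0, 4.0) uses std::sqrt, while hypotenuse(toy::Wide(3.0L), toy::Wide(4.0L)) picks toy::sqrt, with no run-time dispatch. Literals inside such templates still need care: a bare 0.5 is a double, so it would typically be written as T(0.5).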
In my experience, (I was using GMP which must be the fastest arbitrary precision library available for C++) after all of the effort and complexity I had introduced, I found that GMP was just too slow for the sorts of computation that I was doing; so it was academically interesting, but practically useless. Before you start on this do some speed tests to see whether your library will still be usable if you use arbitrary precision arithmetic.
If the library defines a type that correctly overloads the operators you use, I don't see any problem...
I am trying to figure out how to use complex numbers in both my host and device code. I came across cuComplex (but can't find any documentation!) and float2 which at least gets a mention in the CUDA programming guide.
What should I use? In the header file for cuComplex, it looks like the functions are declared with __host__ __device__ so I am assuming that means that it would be ok to use them in either place.
My original data is being read in from a file into a std::complex<float>, so I don't really want to mess with that. I guess in order to use the complex values on the GPU, though, I will have to copy from the original complex<float> to the cuComplex?
cuComplex is defined in /usr/local/cuda/include/cuComplex.h (modulo your install dir). The relevant snippets:
typedef float2 cuFloatComplex;
typedef cuFloatComplex cuComplex;
typedef double2 cuDoubleComplex;
There are also handy functions in there for working with complex numbers -- multiplying them, building them, etc.
As for whether to use float2 or cuComplex, you should use whichever is semantically appropriate -- is it a vector or a complex number? Also, if it is a complex number, you may want to consider using cuFloatComplex or cuDoubleComplex just to be fully explicit.
If you're trying to work with cuBLAS or cuFFT, you should use cuComplex. If you are going to write your own functions, there should be no difference in performance, as both are just a structure of two floats.
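On the question of getting std::complex<float> data into that two-float structure: since C++11, std::complex<float> is guaranteed to be laid out as two consecutive floats (real part, then imaginary part), which matches the x/y order that cuComplex uses. The host-only sketch below illustrates a byte-for-byte copy under that assumption. Float2 is a hypothetical stand-in for CUDA's float2 so that the example builds without the CUDA toolkit, and to_float2 is a hypothetical helper.

```cpp
#include <complex>
#include <cstring>
#include <vector>

// Hypothetical stand-in for CUDA's float2 / cuFloatComplex:
// x holds the real part, y the imaginary part.
struct alignas(8) Float2 {
    float x, y;
};

static_assert(sizeof(std::complex<float>) == sizeof(Float2),
              "both types are expected to be two packed floats");

// Copy the complex values into the two-float layout without touching
// individual elements.
std::vector<Float2> to_float2(const std::vector<std::complex<float>>& in) {
    std::vector<Float2> out(in.size());
    if (!in.empty())
        std::memcpy(out.data(), in.data(), in.size() * sizeof(Float2));
    return out;
}
```

With the real toolkit, the same raw copy would typically be done by passing the std::complex<float> buffer directly to cudaMemcpy with a cuComplex device pointer as the destination, so no separate host-side conversion pass should be needed.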
IIRC, float2 is an array of 2 numbers. cuComplex (from the name alone) sounds like CUDA's complex format.
This post seems to point to where to find more on cuComplex: http://forums.nvidia.com/index.php?showtopic=81514