From reading other questions about floating-point performance on modern desktop processors, it is my understanding that the answer to the question "Which is faster, double or float?" is dependent on which of these types is implemented in hardware, or in the CPU's ALU (Arithmatic Logic Unit, I think).
My understanding is that if float is implemented in hardware, then using double data types is slower because math using that data type is implemented through software using the float data type. Therefore double is slower and uses more ram.
On the other hand, if double is implemented in hardware, it is my understanding that a conversion (sort of like a truncation) must be done to convert to float data types. Hence using float will be slower, although it will use less ram.
On the Raspberry-Pi, which data type is implemented in hardware? (Equivalently, which is faster, float or double?)
I tried reading the limited parts of the datasheets for the BCM2835, but I didn't find the information I was searching for.
I should explain that I couldn't think of a good method of testing the performance, and so I didn't run any timed tests. By good method I mean one that assessed all possible calculations, or at least the calculations which I should test, and a test that would give consistent results, with enough difference that it could be concluded with reasonable certainty that one data type is faster than the other.
Raspberry Pi uses an ARM1176JZF-S(*) as its CPU, according to Wikipedia, which has hardware support for pipelined single- and double-precision floating point. You can look up the exact latency and throughput data in the TRM on ARM's website. Short version: the latency is comparable for single and double; double-precision multiplies have half the throughput of single-precision.
Note that floating-point is only supported in arm mode on the 1176; the "thumb 1" instruction set does not allow access to the floating-point registers at all.
(*) The 1176 is ancient. I'm slightly surprised to not find something a bit more modern like an A9 or an M4.
Related
What would be the expected performance gain with using (u)int16 over float in an OpenCL kernel ? If any ?
I expect the memory transfer to be roughly divided by two but what of the device load ?
Strangely I can hardly find any benchs or documentations on the subject. (or maybe my google fu is just failing me...)
I'm working on image processing (filtering mostly). The precision is not that critical, indeed the result of several kernels operations is cast into a char. We narrowed several operations where using shorter data types is acceptable. So I was wondering if those operations can be speed up by using shorter data where the precision is not critical.
thanks for your help.
GPUs tend to do floating-point operations better than integral. For example, some will have extra pipelines for floating-point ops, and making everything integral just reduces the GPU's throughput. Data copy may not be your bottleneck and halving the amount by using 16-bit integers may not help. Moreover, on integrated GPU's like Intel or AMD's you can get zero-copy behavior. So the effect on image or buffer size is minimal (to a point).
Also, you might look into 16-bit floating point number support. That gets you the best part of both worlds (half the data w/ floating point numbers).
I am reading through code for optimization routines (Nelder Mead, SQP...). Languages are C++, Python. I observe that often conversion from double to float is performed, or methods are duplicated with double resp. float arguments. Why is it profitable in optimization routines code, and is it significant? In my own code in C++, should I be careful for types double and float and why?
Kind regards.
Often the choice between double and float is made more on space demands than speed. Modern processors are capable of operating on double quite fast.
Floats may be faster than doubles when using SIMD instructions (such as SSE) which can operate on multiple values at a time. Also if the operations are faster than the memory pipeline, the smaller memory requirements of float will speed things overall.
Other times that I've come across the need to consider the choice between double and float types in terms of optimisation include:
Networking. Sending double precision data across a socket connection
will obviously require more time than sending half that amount of
data.
Mobile and embedded processors may only be able to handle high
speed single precision calculations efficiently on a coprocessor.
As mentioned in another answer, modern desktop processors can handle double precision Processing quite fast. However, you have to ask yourself if the
double precision processing is really required. I work with audio,
and the only time that I can think of where I would need to process
double precision data would be when using high order filters where
numerical errors can accumulate. Most of the time this can be avoided
by paying more careful attention to the algorithm design. There are,
of course, other scientific or engineering applications where double
precision data is required in order to correctly represent a huge
dynamic range.
Even so, the question of how much effort to spend on considering the data type to use really depends on your target platform. If the platform can crunch through doubles with negligible overhead and you have memory to spare then there is no need to concern yourself. Profile small sections of test code to find out.
In certain optimization algorithms, the choice between double and float is not made more on space demands than speed. For example, with penalty or barrier methods that are used for interior point methods in nonlinear optimization, a float has insufficient precision compared to a double, and using floats in the algorithm will yield garbage. For this reason, penalty and barrier methods were not used in the 1960s, but were rediscovered later on with the advent of the double precision data type. (For more on these methods, consult Nonlinear Programming: Sequential Unconstrained Minimization Techniques (Classics in Applied Mathematics) by Fiacco and McCormick.)
Another consideration is the conditioning of the underlying linear systems solved in many optimization algorithms. If the linear systems you're solving in something like a Newton iteration are sufficiently ill-conditioned, you will not be able to obtain an accurate solution to those systems.
Only if the loss in precision will not jeopardize your numerics should you consider replacing doubles with floats; even if space constraints force you to do so, you should make sure that the accuracy of your numerical results is not compromised. Once sufficient accuracy is assured for the problems you're working on, you can then worry about space and performance optimizations. You can use the CUTEr test set to validate your optimization routines.
This question already has answers here:
Is using double faster than float?
(10 answers)
Closed 8 years ago.
I am reading "accelerated C++". I found one sentence which states "sometimes double is faster in execution than float in C++". After reading sentence I got confused about float and double working. Please explain this point to me.
Depends on what the native hardware does.
If the hardware is (or is like) x86 with legacy x87 math, float and double are both extended (for free) to an internal 80-bit format, so both have the same performance (except for cache footprint / memory bandwidth)
If the hardware implements both natively, like most modern ISAs (including x86-64 where SSE2 is the default for scalar FP math), then usually most FPU operations are the same speed for both. Double division and sqrt can be slower than float, as well as of course being significantly slower than multiply or add. (Float being smaller can mean fewer cache misses. And with SIMD, twice as many elements per vector for loops that vectorize).
If the hardware implements only double, then float will be slower if conversion to/from the native double format isn't free as part of float-load and float-store instructions.
If the hardware implements float only, then emulating double with it will cost even more time. In this case, float will be faster.
And if the hardware implements neither, and both have to be implemented in software. In this case, both will be slow, but double will be slightly slower (more load and store operations at the least).
The quote you mention is probably referring to the x86 platform, where the first case was given. But this doesn't hold true in general.
Also beware that x * 3.3 + y for float x,y will trigger promotion to double for both variables. This is not the hardware's fault, and you should avoid it by writing 3.3f to let your compiler make efficient asm that actually keeps numbers as floats if that's what you want.
You can find a complete answer in this article:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
This is a quote from a previous Stack Overflow thread, about how float and double variables affect memory bandwidth:
If a double requires
more storage than a float, then it
will take longer to read the data.
That's the naive answer. On a modern
IA32, it all depends on where the data
is coming from. If it's in L1 cache,
the load is negligible provided the
data comes from a single cache line.
If it spans more than one cache line
there's a small overhead. If it's from
L2, it takes a while longer, if it's
in RAM then it's longer still and
finally, if it's on disk it's a huge
time. So the choice of float or double
is less imporant than the way the data
is used. If you want to do a small
calculation on lots of sequential
data, a small data type is preferable.
Doing a lot of computation on a small
data set would allow you to use bigger
data types with any significant
effect. If you're accessing the data
very randomly, then the choice of data
size is unimportant - data is loaded
in pages / cache lines. So even if you
only want a byte from RAM, you could
get 32 bytes transfered (this is very
dependant on the architecture of the
system). On top of all of this, the
CPU/FPU could be super-scalar (aka
pipelined). So, even though a load may
take several cycles, the CPU/FPU could
be busy doing something else (a
multiply for instance) that hides the
load time to a degree
Short answer is: it depends.
CPU with x87 will crunch floats and doubles equally fast. Vectorized code will run faster with floats, because SSE can crunch 4 floats or 2 doubles in one pass.
Another thing to consider is memory speed. Depending on your algorithm, your CPU could be idling a lot while waiting for the data. Memory intensive code will benefit from using floats, but ALU limited code won't (unless it is vectorized).
I can think of two basic cases when doubles are faster than floats:
Your hardware supports double operations but not float operations, so floats will be emulated by software and therefore be slower.
You really need the precision of doubles. Now, if you use floats anyway you will have to use two floats to reach similar precision to double. The emulation of a true double with floats will be slower than using floats in the first place.
You do not necessarily need doubles but your numeric algorithm converges faster due to the enhanced precision of doubles. Also, doubles might offer enough precision to use a faster but numerically less stable algorithm at all.
For completeness' sake I also give some reasons for the opposite case of floats being faster. You can see for yourself whichs reasons dominate in your case:
Floats are faster than doubles when you don't need double's
precision and you are memory-bandwidth bound and your hardware
doesn't carry a penalty on floats.
They conserve memory-bandwidth because they occupy half the space
per number.
There are also platforms that can process more floats than doubles
in parallel.
On Intel, the coprocessor (nowadays integrated) will handle both equally fast, but as some others have noted, doubles result in higher memory bandwidth which can cause bottlenecks. If you're using scalar SSE instructions (default for most compilers on 64-bit), the same applies. So generally, unless you're working on a large set of data, it doesn't matter much.
However, parallel SSE instructions will allow four floats to be handled in one instruction, but only two doubles, so here float can be significantly faster.
In experiments of adding 3.3 for 2000000000 times, results are:
Summation time in s: 2.82 summed value: 6.71089e+07 // float
Summation time in s: 2.78585 summed value: 6.6e+09 // double
Summation time in s: 2.76812 summed value: 6.6e+09 // long double
So double is faster and default in C and C++. It's more portable and the default across all C and C++ library functions. Alos double has significantly higher precision than float.
Even Stroustrup recommends double over float:
"The exact meaning of single-, double-, and extended-precision is implementation-defined. Choosing the right precision for a problem where the choice matters requires significant understanding of floating-point computation. If you don't have that understanding, get advice, take the time to learn, or use double and hope for the best."
Perhaps the only case where you should use float instead of double is on 64bit hardware with a modern gcc. Because float is smaller; double is 8 bytes and float is 4 bytes.
float is usually faster. double offers greater precision. However performance may vary in some cases if special processor extensions such as 3dNow or SSE are used.
There is only one reason 32-bit floats can be slower than 64-bit doubles (or 80-bit 80x87). And that is alignment. Other than that, floats take less memory, generally meaning faster access, better cache performance. It also takes fewer cycles to process 32-bit instructions. And even when (co)-processor has no 32-bit instructions, it can perform them on 64-bit registers with the same speed. It probably possible to create a test case where doubles will be faster than floats, and v.v., but my measurements of real statistics algos didn't show noticeable difference.
We have a measurement data processing application and currently all data is held as C++ float which means 32bit/4byte on our x86/Windows platform. (32bit Windows Application).
Since precision is becoming an issue, there have been discussions to move to another datatype. The options currently discussed are switching to double (8byte) or implementing a fixed decimal type on top of __int64 (8byte).
The reason the fixed-decimal solution using __int64 as underlying type is even discussed is that someone claimed that double performance is (still) significantly worse than processing floats and that we might see significant performance benefits using a native integer type to store our numbers. (Note that we really would be fine with fixed decimal precision, although the code would obviously become more complex.)
Obviously we need to benchmark in the end, but I would like to ask whether the statement that doubles are worse holds any truth looking at modern processors? I guess for large arrays doubles may mess up cache hits more that floats, but otherwise I really fail to see how they could differ in performance?
It depends on what you do. Additions, subtractions and multiplies on double are just as fast as on float on current x86 and POWER architecture processors. Divisions, square roots and transcendental functions (exp, log, sin, cos, etc.) are usually notably slower with double arguments, since their runtime is dependent on the desired accuracy.
If you go fixed point, multiplies and divisions need to be implemented with long integer multiply / divide instructions which are usually slower than arithmetic on doubles (since processors aren't optimized as much for it). Even more so if you're running in 32 bit mode where a long 64 bit multiply with 128 bit results needs to be synthesized from several 32-bit long multiplies!
Cache utilization is a red herring here. 64-bit integers and doubles are the same size - if you need more than 32 bits, you're gonna eat that penalty no matter what.
Look it up. Both and Intel publish the instruction latencies for their CPUs in freely available PDF documents on their websites.
However, for the most part, performance won't be significantly different, or a couple of reasons:
when using the x87 FPU instead of SSE, all floating point operations are calculated at 80 bits precision internally, and then rounded off, which means that the actual computation is equally expensive for all floating-point types. The only cost is really memory-related then (in terms of CPU cache and memory bandwidth usage, and that's only an issue in float vs double, but irrelevant if you're comparing to int64)
with or without SSE, nearly all floating-point operations are pipelined. When using SSE, the double instructions may (I haven't looked this up) have a higher latency than their float equivalents, but the throughput is the same, so it should be possible to achieve similar performance with doubles.
It's also not a given that a fixed-point datatype would actually be faster either. It might, but the overhead of keeping this datatype consistent after some operations might outweigh the savings. Floating-point operations are fairly cheap on a modern CPU. They have a bit of latency, but as mentioned before, they're generally pipelined, potentially hiding this cost.
So my advice:
Write some quick tests. It shouldn't be that hard to write a program that performs a number of floating-point ops, and then measure how much slower the double version is relative to the float one.
Look it up in the manuals, and see for yourself if there's any significant performance difference between float and double computations
I've trouble the understand the rationale "as double as slower than float we'll use 64 bits int". Guessing performance has always been an black art needing much of experience, on today hardware it is even worse considering the number of factors to take into account. Even measuring is difficult. I know of several cases where micro-benchmarks lent to one solution but in context measurement showed that another was better.
First note that two of the factors which have been given to explain the claimed slower double performance than float are not pertinent here: bandwidth needed will the be same for double as for 64 bits int and SSE2 vectorization would give an advantage to double...
Then consider than using integer computation will increase the pressure on the integer registers and computation units when apparently the floating point one will stay still. (I've already seen cases where doing integer computation in double was a win attributed to the added computation units available)
So I doubt that rolling your own fixed point arithmetic would be advantageous over using double (but I could be showed wrong by measures).
Implementing 64 fixed points isn't really fun. Especially for more complex functions like Sqrt or logarithm. Integers will probably still a bit faster for simple operations like additions. And you'll need to deal with integer overflows. And you need to be careful when implementing rounding, else errors can easily accumulate.
We're implementing fixed points in a C# project because we need determinism which floatingpoint on .net doesn't guarantee. And it's relatively painful. Some formula contained x^3 bang int overflow. Unless you have really compelling reasons not to, use float or double instead of fixedpoint.
SIMD instructions from SSE2 complicate the comparison further, since they allow operation on several floating point numbers(4 floats or 2 doubles) at the same time. I'd use double and try to take advantage of these instructions. So double will probably be significantly slower than floats, but comparing with ints is difficult and I'd prefer float/double over fixedpoint is most scenarios.
It's always best to measure instead of guess. Yes, on many architectures, calculations on doubles process twice the data as calculations on floats (and long doubles are slower still). However, as other answers, and comments on this answer, have pointed out, the x86 architecture doesn't follow the same rules as, say, ARM processors, SPARC processors, etc. On x86 floats, doubles and long doubles are all converted to long doubles for computation. I should have known this, because the conversion causes x86 results to be more accurate than SPARC and Sun went through a lot of trouble to get the less accurate results for Java, sparking some debate (note, that page is from 1998, things have since changed).
Additionally, calculations on doubles are built in to the CPU where calculations on a fixed decimal datatype would be written in software and potentially slower.
You should be able to find a decent fixed sized decimal library and compare.
With various SIMD instruction sets you can perform 4 single precision floating point operations at the same cost as one, essentially you pack 4 floats into a single 128 bit register. When switching to doubles you can only pack 2 doubles into these registers and hence you can only do two operations at the same time.
As many people have said, a 64bit int is probably not worth it if double is an option. At least when SSE is available. This might be different on micro controllers of various kinds but I guess that is not your application. If you need additional precision in long sums of floats, you should keep in mind that this operation is sometimes problematic with floats and doubles and would be more exact on integers.
What are the advantages and disadvantages of using one instead of the other in C++?
If you want to know the true answer, you should read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
In short, although double allows for higher precision in its representation, for certain calculations it would produce larger errors. The "right" choice is: use as much precision as you need but not more and choose the right algorithm.
Many compilers do extended floating point math in "non-strict" mode anyway (i.e. use a wider floating point type available in hardware, e.g. 80-bits and 128-bits floating), this should be taken into account as well. In practice, you can hardly see any difference in speed -- they are natives to hardware anyway.
Unless you have some specific reason to do otherwise, use double.
Perhaps surprisingly, it is double and not float that is the "normal" floating-point type in C (and C++). The standard math functions such as sin and log take doubles as arguments, and return doubles. A normal floating-point literal, as when you write 3.14 in your program, has the type double. Not float.
On typical modern computers, doubles can be just as fast as floats, or even faster, so performance is usually not a factor to consider, even for large calculations. (And those would have to be large calculations, or performance shouldn't even enter your mind. My new i7 desktop computer can do six billion multiplications of doubles in one second.)
This question is impossible to answer since there is no context to the question. Here are some things that can affect the choice:
Compiler implementation of floats, doubles and long doubles. The C++ standard states:
There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double.
So, all three can be the same size in memory.
Presence of an FPU. Not all CPUs have FPUs and sometimes the floating point types are emulated and sometimes the floating point types are just not supported.
FPU Architecture. The IA32's FPU is 80bit internally - 32 bit and 64 bit floats are expanded to 80bit on load and reduced on store. There's also SIMD which can do four 32bit floats or two 64bit floats in parallel. Use of SIMD is not defined in the standard so it would require a compiler that does more complex analysis to determine if SIMD can be used, or requires the use of special functions (libraries or intrinsics). The upshot of the 80bit internal format is that you can get slightly different results depending on how often the data is saved to RAM (thus, losing precision). For this reason, compilers don't optimise floating point code particularly well.
Memory bandwidth. If a double requires more storage than a float, then it will take longer to read the data. That's the naive answer. On a modern IA32, it all depends on where the data is coming from. If it's in L1 cache, the load is negligible provided the data comes from a single cache line. If it spans more than one cache line there's a small overhead. If it's from L2, it takes a while longer, if it's in RAM then it's longer still and finally, if it's on disk it's a huge time. So the choice of float or double is less important than the way the data is used. If you want to do a small calculation on lots of sequential data, a small data type is preferable. Doing a lot of computation on a small data set would allow you to use bigger data types with any significant effect. If you're accessing the data very randomly, then the choice of data size is unimportant - data is loaded in pages / cache lines. So even if you only want a byte from RAM, you could get 32 bytes transferred (this is very dependant on the architecture of the system). On top of all of this, the CPU/FPU could be super-scalar (aka pipelined). So, even though a load may take several cycles, the CPU/FPU could be busy doing something else (a multiply for instance) that hides the load time to a degree.
The standard does not enforce any particular format for floating point values.
If you have a specification, then that will guide you to the optimal choice. Otherwise, it's down to experience as to what to use.
Double is more precise but is coded on 8 bytes. float is only 4 bytes, so less room and less precision.
You should be very careful if you have double and float in your application. I had a bug due to that in the past. One part of the code was using float while the rest of the code was using double. Copying double to float and then float to double can cause precision error that can have big impact. In my case, it was a chemical factory... hopefully it didn't have dramatic consequences :)
I think that it is because of this kind of bug that the Ariane 6 rocket has exploded a few years ago!!!
Think carefully about the type to be used for a variable
I personnaly go for double all the time until I see some bottlenecks. Then I consider moving to float or optimizing some other part
This depends on how the compiler implements double. It's legal for double and float to be the same type (and it is on some systems).
That being said, if they are indeed different, the main issue is precision. A double has a much higher precision due to it's difference in size. If the numbers you are using will commonly exceed the value of a float, then use a double.
Several other people have mentioned performance isssues. That would be exactly last on my list of considerations. Correctness should be your #1 consideration.
I think regardless of the differences (which as everyone points out, floats take up less space and are in general faster)... does anyone ever suffer performance issues using double? I say use double... and if later on you decide "wow, this is really slow"... find your performance bottleneck (which is probably not the fact you used double). THEN, if it's still too slow for you, see where you can sacrifice some precision and use float.
Use whichever precision is required to achieve the appropriate results. If you then find that your code isn't performing as well as you'd like (you used profiling correct?) take a look at:
Intel 64 and IA-32 Architectures Optimization Reference Manual
Software Optimization for the AMD64 Processor
It depends highly on the CPU the most obvious trade-offs are between precision and memory. With GBs of RAM, memory is not much of an issue, so it's generally better to use doubles.
As for performance, it depends highly on the CPU. floats will usually get better performance than doubles on a 32 bit machine. On 64 bit, doubles are sometimes faster, since it is (usually) the native size. Still, what will matter much more than your choice of data types is whether or not you can take advantage of SIMD instructions on your processor.
double has higher precision, whereas floats take up less memory and are faster. In general you should use float unless you have a case where it isn't accurate enough.
The main difference between float and double is precision. Wikipedia has more info about
Single precision (float) and Double precision.