Why isn't `int pow(int base, int exponent)` in the standard C++ libraries? - c++

I feel like I must just be unable to find it. Is there any reason that the C++ pow function does not implement the "power" function for anything except floats and doubles?
I know the implementation is trivial, I just feel like I'm doing work that should be in a standard library. A robust power function (i.e. handles overflow in some consistent, explicit way) is not fun to write.

As of C++11, special cases were added to the suite of power functions (and others). C++11 [c.math] /11 states, after listing all the float/double/long double overloads (my emphasis, and paraphrased):
Moreover, there shall be additional overloads sufficient to ensure that, if any argument corresponding to a double parameter has type double or an integer type, then all arguments corresponding to double parameters are effectively cast to double.
So, basically, integer parameters will be upgraded to doubles to perform the operation.
Prior to C++11 (which was when your question was asked), no integer overloads existed.
Since I was neither closely associated with the creators of C nor C++ in the days of their creation (though I am rather old), nor part of the ANSI/ISO committees that created the standards, this is necessarily opinion on my part. I'd like to think it's informed opinion but, as my wife will tell you (frequently and without much encouragement needed), I've been wrong before :-)
Supposition, for what it's worth, follows.
I suspect that the reason the original pre-ANSI C didn't have this feature is because it was totally unnecessary. First, there was already a perfectly good way of doing integer powers (with doubles and then simply converting back to an integer, checking for integer overflow and underflow before converting).
Second, another thing you have to remember is that the original intent of C was as a systems programming language, and it's questionable whether floating point is desirable in that arena at all.
Since one of its initial use cases was to code up UNIX, the floating point would have been next to useless. BCPL, on which C was based, also had no use for powers (it didn't have floating point at all, from memory).
As an aside, an integral power operator would probably have been a binary operator rather than a library call. You don't add two integers with x = add (y, z) but with x = y + z - part of the language proper rather than the library.
Third, since the implementation of integral power is relatively trivial, it's almost certain that the developers of the language would better use their time providing more useful stuff (see below comments on opportunity cost).
That's also relevant for the original C++. Since the original implementation was effectively just a translator which produced C code, it carried over many of the attributes of C. Its original intent was C-with-classes, not C-with-classes-plus-a-little-bit-of-extra-math-stuff.
As to why it was never added to the standards before C++11, you have to remember that the standards-setting bodies have specific guidelines to follow. For example, ANSI C was specifically tasked to codify existing practice, not to create a new language. Otherwise, they could have gone crazy and given us Ada :-)
Later iterations of that standard also have specific guidelines and can be found in the rationale documents (rationale as to why the committee made certain decisions, not rationale for the language itself).
For example the C99 rationale document specifically carries forward two of the C89 guiding principles which limit what can be added:
Keep the language small and simple.
Provide only one way to do an operation.
Guidelines (not necessarily those specific ones) are laid down for the individual working groups and hence limit the C++ committees (and all other ISO groups) as well.
In addition, the standards-setting bodies realise that there is an opportunity cost (an economic term meaning what you have to forego for a decision made) to every decision they make. For example, the opportunity cost of buying that $10,000 uber-gaming machine is cordial relations (or probably all relations) with your other half for about six months.
Eric Gunnerson explains this well with his -100 points explanation as to why things aren't always added to Microsoft products- basically a feature starts 100 points in the hole so it has to add quite a bit of value to be even considered.
In other words, would you rather have a integral power operator (which, honestly, any half-decent coder could whip up in ten minutes) or multi-threading added to the standard? For myself, I'd prefer to have the latter and not have to muck about with the differing implementations under UNIX and Windows.
I would like to also see thousands and thousands of collection the standard library (hashes, btrees, red-black trees, dictionary, arbitrary maps and so forth) as well but, as the rationale states:
A standard is a treaty between implementer and programmer.
And the number of implementers on the standards bodies far outweigh the number of programmers (or at least those programmers that don't understand opportunity cost). If all that stuff was added, the next standard C++ would be C++215x and would probably be fully implemented by compiler developers three hundred years after that.
Anyway, that's my (rather voluminous) thoughts on the matter. If only votes were handed out based on quantity rather than quality, I'd soon blow everyone else out of the water. Thanks for listening :-)

For any fixed-width integral type, nearly all of the possible input pairs overflow the type, anyway. What's the use of standardizing a function that doesn't give a useful result for vast majority of its possible inputs?
You pretty much need to have an big integer type in order to make the function useful, and most big integer libraries provide the function.
Edit: In a comment on the question, static_rtti writes "Most inputs cause it to overflow? The same is true for exp and double pow, I don't see anyone complaining." This is incorrect.
Let's leave aside exp, because that's beside the point (though it would actually make my case stronger), and focus on double pow(double x, double y). For what portion of (x,y) pairs does this function do something useful (i.e., not simply overflow or underflow)?
I'm actually going to focus only on a small portion of the input pairs for which pow makes sense, because that will be sufficient to prove my point: if x is positive and |y| <= 1, then pow does not overflow or underflow. This comprises nearly one-quarter of all floating-point pairs (exactly half of non-NaN floating-point numbers are positive, and just less than half of non-NaN floating-point numbers have magnitude less than 1). Obviously, there are a lot of other input pairs for which pow produces useful results, but we've ascertained that it's at least one-quarter of all inputs.
Now let's look at a fixed-width (i.e. non-bignum) integer power function. For what portion inputs does it not simply overflow? To maximize the number of meaningful input pairs, the base should be signed and the exponent unsigned. Suppose that the base and exponent are both n bits wide. We can easily get a bound on the portion of inputs that are meaningful:
If the exponent 0 or 1, then any base is meaningful.
If the exponent is 2 or greater, then no base larger than 2^(n/2) produces a meaningful result.
Thus, of the 2^(2n) input pairs, less than 2^(n+1) + 2^(3n/2) produce meaningful results. If we look at what is likely the most common usage, 32-bit integers, this means that something on the order of 1/1000th of one percent of input pairs do not simply overflow.

Because there's no way to represent all integer powers in an int anyways:
>>> print 2**-4
0.0625

That's actually an interesting question. One argument I haven't found in the discussion is the simple lack of obvious return values for the arguments. Let's count the ways the hypthetical int pow_int(int, int) function could fail.
Overflow
Result undefined pow_int(0,0)
Result can't be represented pow_int(2,-1)
The function has at least 2 failure modes. Integers can't represent these values, the behaviour of the function in these cases would need to be defined by the standard - and programmers would need to be aware of how exactly the function handles these cases.
Overall leaving the function out seems like the only sensible option. The programmer can use the floating point version with all the error reporting available instead.

Short answer:
A specialisation of pow(x, n) to where n is a natural number is often useful for time performance. But the standard library's generic pow() still works pretty (surprisingly!) well for this purpose and it is absolutely critical to include as little as possible in the standard C library so it can be made as portable and as easy to implement as possible. On the other hand, that doesn't stop it at all from being in the C++ standard library or the STL, which I'm pretty sure nobody is planning on using in some kind of embedded platform.
Now, for the long answer.
pow(x, n) can be made much faster in many cases by specialising n to a natural number. I have had to use my own implementation of this function for almost every program I write (but I write a lot of mathematical programs in C). The specialised operation can be done in O(log(n)) time, but when n is small, a simpler linear version can be faster. Here are implementations of both:
// Computes x^n, where n is a natural number.
double pown(double x, unsigned n)
{
double y = 1;
// n = 2*d + r. x^n = (x^2)^d * x^r.
unsigned d = n >> 1;
unsigned r = n & 1;
double x_2_d = d == 0? 1 : pown(x*x, d);
double x_r = r == 0? 1 : x;
return x_2_d*x_r;
}
// The linear implementation.
double pown_l(double x, unsigned n)
{
double y = 1;
for (unsigned i = 0; i < n; i++)
y *= x;
return y;
}
(I left x and the return value as doubles because the result of pow(double x, unsigned n) will fit in a double about as often as pow(double, double) will.)
(Yes, pown is recursive, but breaking the stack is absolutely impossible since the maximum stack size will roughly equal log_2(n) and n is an integer. If n is a 64-bit integer, that gives you a maximum stack size of about 64. No hardware has such extreme memory limitations, except for some dodgy PICs with hardware stacks that only go 3 to 8 function calls deep.)
As for performance, you'll be surprised by what a garden variety pow(double, double) is capable of. I tested a hundred million iterations on my 5-year-old IBM Thinkpad with x equal to the iteration number and n equal to 10. In this scenario, pown_l won. glibc pow() took 12.0 user seconds, pown took 7.4 user seconds, and pown_l took only 6.5 user seconds. So that's not too surprising. We were more or less expecting this.
Then, I let x be constant (I set it to 2.5), and I looped n from 0 to 19 a hundred million times. This time, quite unexpectedly, glibc pow won, and by a landslide! It took only 2.0 user seconds. My pown took 9.6 seconds, and pown_l took 12.2 seconds. What happened here? I did another test to find out.
I did the same thing as above only with x equal to a million. This time, pown won at 9.6s. pown_l took 12.2s and glibc pow took 16.3s. Now, it's clear! glibc pow performs better than the three when x is low, but worst when x is high. When x is high, pown_l performs best when n is low, and pown performs best when x is high.
So here are three different algorithms, each capable of performing better than the others under the right circumstances. So, ultimately, which to use most likely depends on how you're planning on using pow, but using the right version is worth it, and having all of the versions is nice. In fact, you could even automate the choice of algorithm with a function like this:
double pown_auto(double x, unsigned n, double x_expected, unsigned n_expected) {
if (x_expected < x_threshold)
return pow(x, n);
if (n_expected < n_threshold)
return pown_l(x, n);
return pown(x, n);
}
As long as x_expected and n_expected are constants decided at compile time, along with possibly some other caveats, an optimising compiler worth its salt will automatically remove the entire pown_auto function call and replace it with the appropriate choice of the three algorithms. (Now, if you are actually going to attempt to use this, you'll probably have to toy with it a little, because I didn't exactly try compiling what I'd written above. ;))
On the other hand, glibc pow does work and glibc is big enough already. The C standard is supposed to be portable, including to various embedded devices (in fact embedded developers everywhere generally agree that glibc is already too big for them), and it can't be portable if for every simple math function it needs to include every alternative algorithm that might be of use. So, that's why it isn't in the C standard.
footnote: In the time performance testing, I gave my functions relatively generous optimisation flags (-s -O2) that are likely to be comparable to, if not worse than, what was likely used to compile glibc on my system (archlinux), so the results are probably fair. For a more rigorous test, I'd have to compile glibc myself and I reeeally don't feel like doing that. I used to use Gentoo, so I remember how long it takes, even when the task is automated. The results are conclusive (or rather inconclusive) enough for me. You're of course welcome to do this yourself.
Bonus round: A specialisation of pow(x, n) to all integers is instrumental if an exact integer output is required, which does happen. Consider allocating memory for an N-dimensional array with p^N elements. Getting p^N off even by one will result in a possibly randomly occurring segfault.

One reason for C++ to not have additional overloads is to be compatible with C.
C++98 has functions like double pow(double, int), but these have been removed in C++11 with the argument that C99 didn't include them.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3286.html#550
Getting a slightly more accurate result also means getting a slightly different result.

The World is constantly evolving and so are the programming languages. The fourth part of the C decimal TR¹ adds some more functions to <math.h>. Two families of these functions may be of interest for this question:
The pown functions, that takes a floating point number and an intmax_t exponent.
The powr functions, that takes two floating points numbers (x and y) and compute x to the power y with the formula exp(y*log(x)).
It seems that the standard guys eventually deemed these features useful enough to be integrated in the standard library. However, the rational is that these functions are recommended by the ISO/IEC/IEEE 60559:2011 standard for binary and decimal floating point numbers. I can't say for sure what "standard" was followed at the time of C89, but the future evolutions of <math.h> will probably be heavily influenced by the future evolutions of the ISO/IEC/IEEE 60559 standard.
Note that the fourth part of the decimal TR won't be included in C2x (the next major C revision), and will probably be included later as an optional feature. There hasn't been any intent I know of to include this part of the TR in a future C++ revision.
¹ You can find some work-in-progress documentation here.

Here's a really simple O(log(n)) implementation of pow() that works for any numeric types, including integers:
template<typename T>
static constexpr inline T pown(T x, unsigned p) {
T result = 1;
while (p) {
if (p & 0x1) {
result *= x;
}
x *= x;
p >>= 1;
}
return result;
}
It's better than enigmaticPhysicist's O(log(n)) implementation because it doesn't use recursion.
It's also almost always faster than his linear implementation (as long as p > ~3) because:
it doesn't require any extra memory
it only does ~1.5x more operations per loop
it only does ~1.25x more memory updates per loop

Perhaps because the processor's ALU didn't implement such a function for integers, but there is such an FPU instruction (as Stephen points out, it's actually a pair). So it was actually faster to cast to double, call pow with doubles, then test for overflow and cast back, than to implement it using integer arithmetic.
(for one thing, logarithms reduce powers to multiplication, but logarithms of integers lose a lot of accuracy for most inputs)
Stephen is right that on modern processors this is no longer true, but the C standard when the math functions were selected (C++ just used the C functions) is now what, 20 years old?

As a matter of fact, it does.
Since C++11 there is a templated implementation of pow(int, int) --- and even more general cases, see (7) in
http://en.cppreference.com/w/cpp/numeric/math/pow
EDIT: purists may argue this is not correct, as there is actually "promoted" typing used. One way or another, one gets a correct int result, or an error, on int parameters.

A very simple reason:
5^-2 = 1/25
Everything in the STL library is based on the most accurate, robust stuff imaginable. Sure, the int would return to a zero (from 1/25) but this would be an inaccurate answer.
I agree, it's weird in some cases.

Related

Enforcing order of execution

I would like to ensure that the calculations requested are executed exactly in the order I specify, without any alterations from either the compiler or CPU (including the linker, assembler, and anything else you can think of).
Operator left-to-right associativity is assumed in the C language
I am working in C (possibly also interested in C++ solutions), which states that for operations of equal precedence there is an assumed left-to-right operator associativity, and hence
a = b + c - d + e + f - g ...;
is equivalent to
a = (...(((((b + c) - d) + e) + f) - g) ...);
A small example
However, consider the following example:
double a, b = -2, c = -3;
a = 1 + 2 - 2 + 3 + 4;
a += 2*b;
a += c;
So many opportunities for optimisation
For many compilers and pre-processors they may be clever enough to recognise the "+ 2 - 2" is redundant and optimise this away. Similarly they could recognise that the "+= 2*b" followed by the "+= c" can be written using a single FMA. Even if they don't optimise in an FMA, they may switch the order of these operations etc. Furthermore, if the compiler doesn't do any of these optimisations, the CPU may well decide to do some out of order execution, and decide it can do the "+= c" before the "+= 2*b", etc.
As floating-point arithmetic is non-associative, each type of optimisation may result in a different end result, which may be noticeable if the following is inlined somewhere.
Why worry about floating point associativity?
For most of my code I would like as much optimisation as I can have and don't care about floating-point associativity or bit-wise reproduciblilty, but occasionally there is a small snippet (similar to the above example) which I would like to be untampered with and totally respected. This is because I am working with a mathematical method which exactly requires a reproducible result.
What can I do to resolve this?
A few ideas which have come to mind:
Disable compiler optimisations and out of order execution
I don't want this, as I want the other 99% of my code to be heavily optimised. (This seems to be cutting off my nose to spite my face). I also most likely won't have permission to change my hardware settings.
Use a pragma
Write some assembly
The code snippets are small enough that this might be reasonable, although I'm not very confident in this, especially if (when) it comes to debugging.
Put this in a separate file, compile separately as un-optimised as possible, and then link using a function call
Volatile variables
To my mind these are just for ensuring that memory access is respected and un-optimised, but perhaps they might prove useful.
Access everything through judicious use of pointers
Perhaps, but this seems like a disaster in readability, performance, and bugs waiting to happen.
If anyone can think of any feasibly solutions (either from any of the ideas I've suggested or otherwise) that would be ideal. The "pragma" option or "function call" to my mind seem like the best approaches.
The ultimate goal
To have something that marks off a small chuck of simple and largely vanilla C code as protected and untouchable to any (realistically most) optimisations, while allowing for the rest of the code to be heavily optimised, covering optimisations from both the CPU and compiler.
This is not a complete answer, but it is informative, partially answers, and is too long for a comment.
Clarifying the Goal
The question actually seeks reproducibility of floating-point results, not order of execution. Also, order of execution is irrelevant; we do not care if, in (a+b)+(c+d), a+b or c+d is executed first. We care that the result of a+b is added to the result of c+d, without any reassociation or other rewriting of arithmetic unless the result is known to be the same.
Reproducibility of floating-point arithmetic is in general an unsolved technological problem. (There is no theoretical barrier; we have reproducible elementary operations. Reproducibility is a matter of what hardware and software vendors have provided and how hard it is to express the computations we want performed.)
Do you want reproducibility on one platform (e.g., always using the same version of the same math library)? Does your code use any math library routines like sin or log? Do you want reproducibility across different platforms? With multithreading? Across changes of compiler version?
Addressing Some Specific Issues
The samples shown in the question can largely be handled by writing each individual floating-point operation in its own statement, as by replacing:
a = 1 + 2 - 2 + 3 + 4;
a += 2*b;
a += c;
with:
t0 = 1 + 2;
t0 = t0 - 2;
t0 = t0 + 3;
t0 = t0 + 4;
t1 = 2*b;
t0 += t1;
a += c;
The basis for this is that both C and C++ permit an implementation to use “excess precision” when evaluating an expression but require that precision to be “discarded” when an assignment or cast is performed. Limiting each assignment expression to one operation or executing a cast after each operation effectively isolates the operations.
In many cases, a compiler will then generate code using instructions of the nominal type, instead of instructions using a type with excess precision. In particular, this should avoid a fused multiply-add (FMA) being substituted for a multiplication followed by an addition. (An FMA has effectively infinite precision in the product before it is added to the addend, thus falling under the “excess precision is permitted” rule.) There are caveats, however. An implementation might first evaluate an operation with excess precision and then round it to the nominal precision. In general, this can cause a different result than doing a single operation in the nominal precision. For the elementary operations of addition, subtract, multiplication, division, and even square root, this does not happen if the excess precision is sufficient greater than the nominal precision. (There are proofs that a result with sufficient excess precision is always close enough to the infinitely precise result that the rounding to nominal precision gets the same result.) This is true for the case where the nominal precision is the IEEE-754 basic 32-bit binary floating-point format, and the excess precision is the 64-bit format. However, it is not true where the nominal precision is the 64-bit format and the excess precision is Intel’s 80-bit format.
So, whether this workaround works depends on the platform.
Other Issues
Aside from the use of excess precision and features like FMA or the optimizer rewriting expressions, there are other things that affect reproducibility, such as non-standard treatment of subnormals (notably replacing them with zeroes), variations between math library routines. (sin, log, and similar functions return different results on different platforms. Nobody has fully implemented correctly rounded math library routines with known bounded performance.)
These are discussed in other Stack Overflow questions about floating-point reproducibility, as well as papers, specifications, and standards documents.
Irrelevant Issues
The order in which a processor executes floating-point operations is irrelevant. Processor reordering of calculations obeys rigid semantics; the results are identical regardless of the chronological order of execution. (Processor timing can affect results if, for example, a task is partitioned into subtasks, such as assigning multiple threads or processes to process different parts of the arrays. Among other issues, their results could arrive in different orders, and the process receiving their results might then add or otherwise combine their results in different orders.)
Using pointers will not fix anything. As far as C or C++ is concerned, *p where p is a pointer to double is the same as a where a is a double. One the objects has a name (a) and one of them does not, but they are like roses: They smell the same. (There are issues where, if you have some other pointer q, the compiler might not know whether *q and *p refer to the same thing. But that also holds true for *q and a.)
Using volatile qualifiers will not aid in reproducibility regarding the excess precision or expression rewriting issue. That is because only an object (not a value) is volatile, which means it has no effect until you write it or read it. But, if you write it, you are using an assignment expression1, so the rule about discarding excess precision already applies. When reading the object, you would force the compiler to retrieve the actual value from memory, but this value will not be any different than the non-volatile object has after assignment, so nothing is accomplished.
Footnote
1 I would have to check on other things that modify an object, such as ++, but those are likely not significant for this discussion.
Write this critical chunk of code in assembly language.
The situation you're in is unusual. Most of the time people want the compiler to do optimizations, so compiler developers don't spend much development effort on means to avoid them. Even with the knobs you do get (pragmas, separate compilation, indirections, ...) you can never be sure something won't be optimized. Some of the undesirable optimizations you mention (constant folding, for instance) cannot be turned off by any means in modern compilers.
If you use assembly language you can be sure you're getting exactly what you wrote. If you do it any other way you won't have that level of confidence.
"clever enough to recognise the + 2 - 2 is redundant and optimise this
away"
No ! All decent compilers will apply constant propagation and figure out that a is constant and optimize all your statement away, into something equivalent to a = 1;. Here the example with assembly.
Now if you make a volatile, the compiler has to assume that any change of a could have an impact outside the C++ programme. Constant propagation will still be performed to optimise each of these calculations, but the intermediary assignments are guaranteed to happen. Here the example with assembly.
If you don't want constant propagation to happen, you need to deactivate optimizations. In this case, the best would be to keep your code separate so to compile the rest with all optilizations on.
However this is not ideal. The optimizer could outperform you and with this approach, you'll loose global optimisation across the function boundaries.
Recommendation/quote of the day:
Don't diddle code; Find better algorithms
- B.W.Kernighan & P.J.Plauger

Do type conversions slow program running?

Th title is quite obvious.
In my case, and for the sake of simplicity, I avoid using, for instance, unsigned int instead of int, as it makes coding faster and simpler.
(BTW, Im using an Android IDE, CppDroid)
Yet, the IDE frequently alerts me to implicit conversions at, for example, For loops where the incremented variable (int) is compared with the size of a vector (size_t/unsigned int).
My questions are:
Do type conversions take time?
If so, how long do they take compared to other common operations?
In the case convertions do take some time, is it worth to correctly define variables in order to avoid convertions?
Your question is valid, although the goal is misconstrued. It is paramount to correctly define variables, but not because of mysterious performance.
It is to ensure correctness. Comparing unsigned integer with signed one is a ticking bomb, as well as (most usually) comparing size_t with integer.
For example, consider following snippet:
for (int i = 0; i < vec.size(); ++i) { }
For all you know, this code can lead to undefined behavior! If the size of the vector is bigger than maximum size signed integer can hold (which is usually the case with 64bit systems) your integer will be overflowing, which is undefined. Compiler might just remove the loop altogether, if it can proove that size of the vector is bigger than maximum int!
Similar looking (and incorrecet as well) line
for (unsigned int i = 0; i < vec.size(), ++i) { }
Is not going to cause undefined behaviour, but it will hang the program when vector size is greater than maximum int. No good thing either.
And of course, the correct way of doing this is
for (decltype(vec.size()) i = 0; i < vec.size(), ++i) { }
Depends what you convert to what.
That particular warning of signed/unsigned mismatch results in zero overhead, but you may end treating negative number as huge unsigned one (or other way around) - so as long as you are using int, and you don't expect to break into 2^31 numbers land, you are safe.
As safe, as people writing file I/O routines around 1990 (never expecting to see 3GiB file in their life). ...not very funny nowadays (still so much SW is broken on 2+GiB file size).
Some other conversions like from int to uint_8 may have tiny overhead, so it's better to avoid them - if possible (by designing the code to use the desired data type all around).
I would firstly address clarity and functionality of the code, and that usually leads to usage of particular data type for particular value all the time, without any conversion.
After the code works, you can measure the performance and consider what optimization makes sense (including usage of mismatched data types with conversions between them).
conclusion: just fix it, use proper data type.
Type conversions might give performance hits (when signedness, bitness or conversion between floating-point types are involved), but, as a general rule, the type identity of the many things in a program is merely a conceptual language front-end feature. When such hits do happen, however, it is because the types involved mean reasonably different things, and hence code must be emitted in order to properly solve the conversion.
Another thing which is completely different from the above is the invocation of type conversion operators in C++, which can run arbitrary code and, thus, most obviously influence in the final program behavior (not only performance).
As mentioned by others, correct use of the type system is most important for program correctness, at least or specially in languages such as C and C++. Using mismatched types can affect the program behavior in some corner cases, albeit can have no impact whatsoever on the execution time otherwise.
It depends.
Actually converting the data between types will require extra calculations. That much should probably be obvious. Usually those calculations take extra time, so they will have a performance impact. However, there are several factors that mitigate the actual impact of this:
The compiler can optimize types in some cases to minimize conversions.
Some platforms implement certain conversions in hardware.
The primary concern surrounding calculations with unlike types typically has much less to do with performance and more to do with safety and producing expected results. That is why the compiler is warning you; the vast majority of compilers will not tell you that you are doing something inefficient, but they will tell you that you are doing something dangerous.
For example, comparing an int with an unsigned int is asking for trouble. The int has a negative range, the unsigned int has a larger positive range. On conversion to unsigned int, the negative range of the int will appear to be larger than its own positive range. It is very easy to generate endless loops or out-of-bounds errors with such a construct.
You should normally only worry yourself about type conversion performance if you are dealing with huge amounts of data - like large vectors/arrays that need converted between formats. You would have to loop over the data in some way, so it would be a conscious action. For example, converting a 10000 element vector of chars to ints. In these cases, you might need to consider if you have a design flaw that is requiring needless conversion of data.
It is worth pointing out that in the above example, even if the conversion itself were instant, the iteration and copy is not.
As for an example of platforms where this can be done to an extent in hardware, most video cards are able to interpret integers as floats on a normalized range, of the sort 255 --> 1.0. However, many other conversions, like conversions between image formats, are still done in software.
Given the platform and optimization details vary greatly, answering how long a given conversion takes relative to other operations is effectively impossible. If you are dealing with enough data that a conversion is creating a noticeable performance bottleneck, then profile that conversion.
It is worth it to make sure your types match to the best of your ability if you are dealing with enough data for it to matter; that is a subjective measurement of value, though, and will depend on what you are doing.
It is always worth it to make sure implicit type conversions do not cause errors, as errors due to them can be some of the worst possible in C/C++ (memory leaks, buffer overflows, access violations, etc.).

What is the standard way to maintain accuracy when dealing with incredibly precise floating point calculations in C++?

I'm in the process of converting a program to C++ from Scilab (similar to Matlab) and I'm required to maintain the same level of precision that is kept by the previous code.
Note: Although maintaining the same level of precision would be ideal. It's acceptable if there is some error with the finished result. The problem I'm facing (as I'll show below) is due to looping, so the calculation error compounds rather quickly. But if the final result is only a thousandth or so off (e.g. 1/1000 vs 1/1001) it won't be a problem.
I've briefly looked into a number of different ways to do this including:
GMP (A Multiple Precision
Arithmetic Library)
Using integers instead of floats (see example below)
Int vs Float Example: Instead of using the float 12.45, store it as an integer being 124,500. Then simply convert everything back when appropriate to do so. Note: I'm not exactly sure how this will work with the code I'm working with (more detail below).
An example of how my program is producing incorrect results:
for (int i = 0; i <= 1000; i++)
{
for (int j = 0; j <= 10000; j++)
{
// This calculation will be computed with less precision than in Scilab
float1 = (1.0 / 100000.0);
// The above error of float2 will become significant by the end of the loop
float2 = (float1 + float2);
}
}
My question is:
Is there a generally accepted way to go about retaining accuracy in floating point arithmetic OR will one of the above methods suffice?
Maintaining precision when porting code like this is very difficult to do. Not because the languages have implicitly different perspectives on what a float is, but because of what the different algorithms or assumptions of accuracy limits are. For example, when performing numerical integration in Scilab, it may use a Gaussian quadrature method. Whereas you might try using a trapezoidal method. The two may both be working on identical IEEE754 single-precision floating point numbers, but you will get different answers due to the convergence characteristics of the two algorithms. So how do you get around this?
Well, you can go through the Scilab source code and look at all of the algorithms it uses for each thing you need. You can then replicate these algorithms taking care of any pre- or post-conditioning of the data that Scilab implicitly does (if any at all). That's a lot of work. And, frankly, probably not the best way to spend your time. Rather, I would look into using the Interfacing with Other Languages section from the developer's documentation to see how you can call the Scilab functions directly from your C, C++, Java, or Fortran code.
Of course, with the second option, you have to consider how you are going to distribute your code (if you need to).Scilab has a GPL-compatible license, so you can just bundle it with your code. However, it is quite big (~180MB) and you may want to just bundle the pieces you need (e.g., you don't need the whole interpreter system). This is more work in a different way, but guarantees numerical-compatibility with your current Scilab solutions.
Is there a generally accepted way to go about retaining accuracy in floating
point arithmetic
"Generally accepted" is too broad, so no.
will one of the above methods suffice?
Yes. Particularly gmp seems to be a standard choice. I would also have a look at the Boost Multiprecision library.
A hand-coded integer approach can work as well, but is surely not the method of choice: it requires much more coding, and more severe a means to store and process aritrarily precise integers.
If your compiler supports it use BCD (Binary-coded decimal)
Sam
Well, another alternative if you use GCC compilers is to go with quadmath/__float128 types.

C/C++ fastest cmath log operation

I'm trying to calculate logab (and get a floating point back, not an integer). I was planning to do this as log(b)/log(a). Mathematically speaking, I can use any of the cmath log functions (base 2, e, or 10) to do this calculation; however, I will be running this calculation a lot during my program, so I was wondering if one of them is significantly faster than the others (or better yet, if there is a faster, but still simple, way to do this). If it matters, both a and b are integers.
First, precalculate 1.0/log(a) and multiply each log(b) by that expression instead.
Edit: I originally said that the natural logarithm (base e) would be fastest, but others state that base 2 is supported directly by the processor and would be fastest. I have no reason to doubt it.
Edit 2: I originally assumed that a was a constant, but in re-reading the question that is never stated. If so then there would be no benefit to precalculating. If it is however, you can maintain readability with an appropriate choice of variable names:
const double base_a = 1.0 / log(a);
for (int b = 0; b < bazillions; ++b)
double result = log(b) * base_a;
Strangely enough Microsoft doesn't supply a base 2 log function, which explains why I was unfamiliar with it. Also the x86 instruction for calculating logs includes a multiplication automatically, and the constants required for the different bases are also available via an optimized instruction, so I'd expect the 3 different log functions to have identical timing (even base 2 would have to multiply by 1).
Since b and a are integers, you can use all the glory of bit twiddling to find their logs to the base 2. Here are some:
Find the log base 2 of an integer with the MSB N set in O(N) operations (the obvious way)
Find the integer log base 2 of an integer with an 64-bit IEEE float
Find the log base 2 of an integer with a lookup table
Find the log base 2 of an N-bit integer in O(lg(N)) operations
Find the log base 2 of an N-bit integer in O(lg(N)) operations with multiply and lookup
I'll leave it to you to choose the best "fast-log" function for your needs.
On the platforms for which I have data, log2 is very slightly faster than the others, in line with my expectations. Note however, that the difference is extremely slight (only a couple percent). This is really not worth worrying about.
Write an implementation that is clear. Then measure the performance.
In the 8087 instruction set, there is only an instruction for the logarithm to base 2, so I would guess this one to be the fastest.
Of course this kind of question depends largely on your processor/architecture, so I would suggest to make a simple test and time it.
The answer is:
it depends
profile it
You don't even mention your CPU type, the variable type, the compiler flags, the data layout. If you need to do lot's of these in parallel, I'm sure there will be a SIMD option. Your compiler will optimize that as long as you use alignment and clear simple loops (or valarray if you like archaic approaches).
Chances are, the intel compiler has specific tricks for intel processors in this area.
If you really wanted you could use CUDA and leverage GPU.
I suppose, if you are unfortunate enough to lack these instruction sets you could go down at the bit fiddling level and write an algorithm that does a nice approximation. In this case, I can bet more than one apple-pie that 2-log is going to be faster than any other base-log

Any better alternatives for getting the digits of a number? (C++)

I know that you can get the digits of a number using modulus and division. The following is how I've done it in the past: (Psuedocode so as to make students reading this do some work for their homework assignment):
int pointer getDigits(int number)
initialize int pointer to array of some size
initialize int i to zero
while number is greater than zero
store result of number mod 10 in array at index i
divide number by 10 and store result in number
increment i
return int pointer
Anyway, I was wondering if there is a better, more efficient way to accomplish this task? If not, is there any alternative methods for this task, avoiding the use of strings? C-style or otherwise?
Thanks. I ask because I'm going to be wanting to do this in a personal project of mine, and I would like to do it as efficiently as possible.
Any help and/or insight is greatly appreciated.
The time it takes to extract the digits will be dwarfed by the time required to dynamically allocate the array. Consider returning the result in a struct:
struct extracted_digits
{
int number_of_digits;
char digits[12];
};
You'll want to pick a suitable value for the maximum number of digits (12 here, which is enough for a 32-bit integer). Alternatively, you could return a std::array<char, 12> and encode the terminal by using an invalid value (so, after the last value, store a 10 or something else that isn't a digit).
Depending on whether you want to handle negative values, you'll also have to decide how to report the unary minus (-).
Unless you want the representation of the number in a base that's a power of 2, that's about the only way to do it.
Smacks of premature optimisation. If profiling proves it matters, then be sure to compare your algo to itoa - internally it may use some CPU instructions that you don't have explicit access to from C++, and which your compiler's optimiser may not be clever enough to employ (e.g. AAM, which divs while saving the mod result). Experiment (and benchmark) coding the assembler yourself. You might dig around for assembly implementations of ITOA (which isn't identical to what you're asking for, but might suggest the optimal CPU instructions).
By "avoiding the use of strings", I'm going to assume you're doing this because a string-only representation is pretty inefficient if you want an integer value.
To that end, I'm going to suggest a slightly unorthodox approach which may be suitable. Don't store them in one form, store them in both. The code below is in C - it will work in C++ but you may want to consider using c++ equivalents - the idea behind it doesn't change however.
By "storing both forms", I mean you can have a structure like:
typedef struct {
int ival;
char sval[sizeof("-2147483648")]; // enough for 32-bits
int dirtyS;
} tIntStr;
and pass around this structure (or its address) rather than the integer itself.
By having macros or inline functions like:
inline void intstrSetI (tIntStr *is, int ival) {
is->ival = i;
is->dirtyS = 1;
}
inline char *intstrGetS (tIntStr *is) {
if (is->dirtyS) {
sprintf (is->sval, "%d", is->ival);
is->dirtyS = 0;
}
return is->sval;
}
Then, to set the value, you would use:
tIntStr is;
intstrSetI (&is, 42);
And whenever you wanted the string representation:
printf ("%s\n" intstrGetS(&is));
fprintf (logFile, "%s\n" intstrGetS(&is));
This has the advantage of calculating the string representation only when needed (the fprintf above would not have to recalculate the string representation and the printf only if it was dirty).
This is a similar trick I use in SQL with using precomputed columns and triggers. The idea there is that you only perform calculations when needed. So an extra column to hold the indexed lowercased last name along with an insert/update trigger to calculate it, is usually a lot more efficient than select lower(non_lowercased_last_name). That's because it amortises the cost of the calculation (done at write time) across all reads.
In that sense, there's little advantage if your code profile is set-int/use-string/set-int/use-string.... But, if it's set-int/use-string/use-string/use-string/use-string..., you'll get a performance boost.
Granted this has a cost, at the bare minimum extra storage required, but most performance issues boil down to a space/time trade-off.
And, if you really want to avoid strings, you can still use the same method (calculate only when needed), it's just that the calculation (and structure) will be different.
As an aside: you may well want to use the library functions to do this rather than handcrafting your own code. Library functions will normally be heavily optimised, possibly more so than your compiler can make from your code (although that's not guaranteed of course).
It's also likely that an itoa, if you have one, will probably outperform sprintf("%d") as well, given its limited use case. You should, however, measure, not guess! Not just in terms of the library functions, but also this entire solution (and the others).
It's fairly trivial to see that a base-100 solution could work as well, using the "digits" 00-99. In each iteration, you'd do a %100 to produce such a digit pair, thus halving the number of steps. The tradeoff is that your digit table is now 200 bytes instead of 10. Still, it easily fits in L1 cache (obviously, this only applies if you're converting a lot of numbers, but otherwise efficientcy is moot anyway). Also, you might end up with a leading zero, as in "0128".
Yes, there is a more efficient way, but not portable, though. Intel's FPU has a special BCD format numbers. So, all you have to do is just to call the correspondent assembler instruction that converts ST(0) to BCD format and stores the result in memory. The instruction name is FBSTP.
Mathematically speaking, the number of decimal digits of an integer is 1+int(log10(abs(a)+1))+(a<0);.
You will not use strings but go through floating points and the log functions. If your platform has whatever type of FP accelerator (every PC or similar has) that will not be a big deal ,and will beat whatever "sting based" algorithm (that is noting more than an iterative divide by ten and count)