Is there a value of type double (IEEE 64-bit float / binary64), K, such that K * K == 3.0? (The irrational number is of course "square root of 3")
I tried:
static constexpr double Sqrt3 = 1.732050807568877293527446341505872366942805253810380628055806;
static_assert(Sqrt3 * Sqrt3 == 3.0);
but the static assert fails.
(I'm guessing neither the next higher nor next lower floating-point representable number square to 3.0 after rounding? Or is the parser of the floating point literal being stupid? Or is it doable in IEEE standard but fast math optimizations are messing it up?)
I think the digits are right:
$ python
>>> N = 1732050807568877293527446341505872366942805253810380628055806
>>> N * N
2999999999999999999999999999999999999999999999999999999999996\
607078976886330406910974461358291614910225958586655450309636
Update
I've discovered that:
static_assert(Sqrt3 * Sqrt3 < 3.0); // pass
static_assert(Sqrt3 * Sqrt3 > 2.999999999999999); // pass
static_assert(Sqrt3 * Sqrt3 > 2.9999999999999999); // fail
So the literal must produce the next lower value.
I guess I need to check the next higher value. Could bit-dump the representation maybe and then increment the last bit of the mantissa.
Update 2
For posterity: I wound up going with this for the Sqrt3 constant and the test:
static constexpr double Sqrt3 = 1.7320508075688772;
static_assert(0x1.BB67AE8584CAAP+0 == 1.7320508075688772);
static_assert(Sqrt3 * Sqrt3 == 2.9999999999999996);
The answer is no; there is no such K.
The closest binary64 value to the actual square root of 3 is equal to 7800463371553962 × 2^-52. Its square is:
60847228810955004221158677897444 × 2^-104
This value is not exactly representable. It falls between (3 - 2^-51) and 3, which are respectively equal to
60847228810955002264642499117056 × 2^-104
and
60847228810955011271841753858048 × 2^-104
As you can see, K * K is much closer to 3 - 2^-51 than it is to 3. So IEEE 754 requires the result of the operation K * K to yield 3 - 2^-51, not 3. (The compiler might convert K to an extended-precision format for the calculation, but the result will still be 3 - 2^-51 after conversion back to binary64.)
Furthermore, if we go to the next representable value after K in the binary64 format, we will find that its square is closest to 3 + 2^-51, which is the next representable value after 3.
This result should not be too surprising; in general, incrementing a number by 1 ulp will increment its square by roughly 2 ulps, so you have about a 50% chance, given some value x, that there is a K with the same precision as x such that K * K == x.
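The figures above can be double-checked with exact rational arithmetic. Here is a rough Python sketch of that check; it assumes math.sqrt is correctly rounded on your platform (as IEEE 754 requires for square root), and the variable names are just for illustration.

```python
from fractions import Fraction
import math

K = math.sqrt(3.0)            # assumed correctly rounded: the double nearest the real sqrt(3)
m = Fraction(K) * 2**52       # Fraction(K) is exact, so K == m * 2^-52
assert m.denominator == 1     # m is an integer

exact_square = Fraction(K) ** 2            # K*K with no rounding at all
lower = Fraction(3) - Fraction(1, 2**51)   # the double just below 3
dist_lower = exact_square - lower          # distance to 3 - 2^-51
dist_three = Fraction(3) - exact_square    # distance to 3

print(m.numerator)                  # integer significand of K
print(dist_lower < dist_three)      # is the exact square closer to 3 - 2^-51?
print(K * K == 3.0 - 2.0**-51)      # does the rounded product land on that neighbour?

# The next double above K: its square should round to 3 + 2^-51.
K_up = float(Fraction(m.numerator + 1, 2**52))
print(K_up * K_up == 3.0 + 2.0**-51)
```

Because Fraction never rounds, the two distances are compared exactly, which is the same argument as the one made with the big integers above.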
The C standard does not dictate the default rounding mode. While it is typically round-to-nearest, ties-to-even, it could be round-upward, and some implementations support changing the mode. In that case, squaring 1.732050807568877193176604123436845839023590087890625 while rounding upward produces exactly 3.
#include <fenv.h>
#include <math.h>
#include <stdio.h>
#pragma STDC FENV_ACCESS ON
int main(void)
{
volatile double x = 1.732050807568877193176604123436845839023590087890625;
fesetround(FE_UPWARD);
printf("%.99g\n", x*x); // Prints “3”.
}
x is declared volatile to prevent the compiler from computing x*x at compile-time with a different rounding mode. Some compilers do not support #pragma STDC FENV_ACCESS but may support fesetround once the #pragma line is removed.
Testing with Python is valid I think, since both use the IEEE-754 representation for doubles along with the rules for operations on same.
The closest possible double to the square root of 3 is slightly low.
>>> Sqrt3 = 3**0.5
>>> Sqrt3*Sqrt3
2.9999999999999996
The next available value is too high.
>>> import numpy as np
>>> Sqrt3p = np.nextafter(Sqrt3,999)
>>> Sqrt3p*Sqrt3p
3.0000000000000004
If you could split the difference, you'd have it.
>>> Sqrt3*Sqrt3p
3.0
In the Ruby language, the Float class uses "the native architecture's double-precision floating point representation" and it has methods named prev_float and next_float that let you iterate through different possible floats using the smallest possible steps. Using this, I was able to do a simple test and see that there is no double (at least on x86_64 Linux) that meets your criterion. The Ruby interpreter is written in C, so I think my results should be applicable to the C double type.
Here is the Ruby code:
x = Math.sqrt(3)
4.times { x = x.prev_float }
9.times do
puts "%.20f squared is %.20f" % [x, x * x]
puts "Success!" if x * x == 3
x = x.next_float
end
And the output:
1.73205080756887630500 squared is 2.99999999999999644729
1.73205080756887652704 squared is 2.99999999999999733546
1.73205080756887674909 squared is 2.99999999999999822364
1.73205080756887697113 squared is 2.99999999999999866773
1.73205080756887719318 squared is 2.99999999999999955591
1.73205080756887741522 squared is 3.00000000000000044409
1.73205080756887763727 squared is 3.00000000000000133227
1.73205080756887785931 squared is 3.00000000000000177636
1.73205080756887808136 squared is 3.00000000000000266454
Is there a value of type double, K, such that K * K == 3.0?
Yes.
K = sqrt(n); and K * K == n may be true, even when √n is irrational.
Note that K, the result of sqrt(n), as a double, is a rational number.
Various rounding modes: #Eric
K * K rounds to n
Example: for n = 11, 14 and 17, the root squared is exactly n.
for (int i = 10; i < 20; i++) {
double x = sqrt(i);
double y = x * x;
printf("%2d %.25g\n", i, y);
}
10 10.00000000000000177635684
11 11
12 11.99999999999999822364316
13 12.99999999999999822364316
14 14
15 15.00000000000000177635684
16 16
17 17
18 17.99999999999999644728632
19 19.00000000000000355271368
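The same scan can be run in Python, whose float is IEEE binary64 on common platforms. This is only a sketch: it assumes math.sqrt is correctly rounded, and the helper name is made up for the example.

```python
import math

def exact_square_roots(lo, hi):
    """Return every n in [lo, hi) for which sqrt(n) * sqrt(n) == n in double arithmetic."""
    hits = []
    for n in range(lo, hi):
        root = math.sqrt(n)
        if root * root == n:   # the product is rounded once, to nearest
            hits.append(n)
    return hits

print(exact_square_roots(10, 20))      # compare with the table above
print(3 in exact_square_roots(2, 10))  # the case asked about in the question
```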
Different precision
Rather than the 53 bits of precision of the common double, say the FP math was done with 24 bits (float). For n = 3, 5 and 10, the root squared is exactly n.
for (int i = 2; i < 11; i++) {
float x = sqrtf(i);
printf("%2d %.25g\n", i, x*x);
}
2 1.99999988079071044921875
3 3
4 4
5 5
6 6.000000476837158203125
7 6.999999523162841796875
8 7.999999523162841796875
9 9
10 10
Or say the FP math was done with 64 bits of precision (long double here). For n = 5, 6 and 10, the root squared is exactly n.
for (int i = 2; i < 11; i++) {
long double x = sqrtl(i);
printf("%2d %.35Lg\n", i, x*x);
}
2 1.9999999999999999998915797827514496
3 3.0000000000000000002168404344971009
4 4
5 5
6 6
7 6.9999999999999999995663191310057982
8 7.9999999999999999995663191310057982
9 9
10 10
With some precisions (note that C does not specify a fixed precision), K * K == 3.0 is possible.
FLT_EVAL_METHOD == 2
When FLT_EVAL_METHOD == 2, intermediate calculations may be done at higher precision, thus affecting the product of k*k.
(Have yet to come up with a good simple example.)
sqrt(3) is irrational, which means that there is no rational number k such that k*k == 3. A double can only represent rational numbers; therefore, there is no double k such that k*k == 3.
If you can accept a number that is close to satisfying k*k == 3, then you can use std::numeric_limits (in <limits>) to see if you're within some minimal interval around 3. It may look like:
assert(std::abs(k*k - 3.) <= std::abs(k*k + 3.) * std::numeric_limits<double>::epsilon() * X);
Epsilon is the difference between 1 and the next larger value that double can represent. We scale it by the sum of the two values being compared in order to bring its magnitude in line with the numbers we're checking. X is a scaling factor that lets you adjust the precision you accept.
If this is a theoretical question: no. If it's a practical question: yes, up to some level of precision.
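For anyone who wants to try the tolerance idea quickly, here is a small Python sketch of the same comparison. The function name and the default scale of 4 are my own choices (scale plays the role of X above), not something from a library.

```python
import math
import sys

def nearly_equal(a, b, scale=4.0):
    """Relative comparison; `scale` is the hypothetical X factor from the text."""
    eps = sys.float_info.epsilon   # gap between 1.0 and the next larger double
    return abs(a - b) <= abs(a + b) * eps * scale

k = math.sqrt(3.0)
print(k * k == 3.0)              # exact comparison
print(nearly_equal(k * k, 3.0))  # comparison with a few ulps of slack
print(nearly_equal(3.1, 3.0))    # clearly different values
```

Python also ships math.isclose, which does a similar relative comparison with a rel_tol parameter.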
I have to find nth root of numbers that can be as large as 10^18, with n as large as 10^4.
I know that with pow() we can find the nth root using
x = (long int)(1e-7 + pow(number, 1.0 / n))
This gives wrong answers on online programming judges, although it gives correct results on all the cases I have tried. Is there something wrong with this method for the given constraints?
Note: nth root here means the largest integer whose nth power is less than or equal to the given number, i.e., the largest 'x' for which x^n <= number.
Following the answers, I know this approach is wrong. What is the way I should do it?
You can just use
x = (long int)pow(number, 1.0 / n)
Given the high value of n, most answers will be 1.
UPDATE:
Following the OP comment, this approach is indeed flawed, because in most cases 1/n does not have an exact floating-point representation and the floor of the 1/n-th power can be off by one.
Rounding is not a better solution: it can make the root too large by one.
Another problem is that not every value up to 10^18 can be represented exactly in double precision, whereas 64-bit integers can hold them all.
My proposal:
1) truncate the 11 low-order bits of number before the (implicit) cast to double, to avoid rounding up by the FP unit (unsure if this is useful).
2) use the pow function to get a lower estimate of the n-th root; call it r.
3) compute the n-th power of r+1 using integer arithmetic only (by repeated squaring).
4) the solution is r+1 rather than r if that n-th power does not exceed number.
There remains a possibility that the FP unit rounds up when computing 1/n, leading to a slightly too large result. I doubt that this "too large" can get as large as one unit in the final result, but this should be checked.
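The proposal can be sketched in Python, where integers never overflow, so the check in steps 3 and 4 is exact. This is an illustration of the idea, not a drop-in for a 64-bit C implementation (there the integer power needs overflow checks); the function name is made up. It also steps down, to cover the "slightly too large" possibility mentioned above.

```python
def integer_nth_root(number, n):
    """Largest integer r with r**n <= number, for number >= 0 and n >= 1."""
    r = int(number ** (1.0 / n))       # floating-point estimate; may be slightly off
    while r > 0 and r ** n > number:   # estimate too high: step down
        r -= 1
    while (r + 1) ** n <= number:      # estimate too low: step up
        r += 1
    return r

print(integer_nth_root(15625, 3))
print(integer_nth_root(10**18, 3))
print(integer_nth_root(10**18, 10**4))
```

The two correction loops normally run zero or one time, since the floating-point estimate is already within one of the true root for inputs of this size.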
I think I finally understood your problem. All you want to do is raise a value, say X, to the reciprocal of a number, say n (i.e., find ⁿ√X̅), and round down. If you then raise that answer to the n-th power, it will never be larger than your original X. The problem is that the computer sometimes runs into rounding error.
#include <cmath>
long find_nth_root(double X, int n)
{
long nth_root = std::trunc(std::pow(X, 1.0 / n));
// because of rounding error, it's possible that nth_root + 1 is what we actually want; let's check
if (std::pow(nth_root + 1, n) <= X) {
return nth_root + 1;
}
return nth_root;
}
Of course, the original question was to find the largest integer, Y, that satisfies the inequality Yⁿ ≤ X. That's easy enough to write:
long find_nth_root(double x, int d)
{
long i = 0;
for (; std::pow(i + 1, d) <= x; ++i) { }
return i;
}
This will probably run faster than you'd expect. But you can do better with a binary search:
#include <cmath>
long find_nth_root(double x, int d)
{
long low = 0, high = 1;
while (std::pow(high, d) <= x) {
low = high;
high *= 2;
}
while (low != high - 1) {
long step = (high - low) / 2;
long candidate = low + step;
double value = std::pow(candidate, d);
if (value == x) {
return candidate;
}
if (value < x) {
low = candidate;
continue;
}
high = candidate;
}
return low;
}
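One caveat with the binary search above: it compares doubles, and a double cannot hold every integer beyond 2^53, so powers near 10^18 may be rounded. Here is a sketch of the same search using integer arithmetic only, written in Python to sidestep overflow; the name nth_root_bisect is mine.

```python
def nth_root_bisect(x, n):
    """Largest integer r with r**n <= x, for x >= 0 and n >= 1, by bisection."""
    low, high = 0, 1
    while high ** n <= x:          # grow the bracket until high**n exceeds x
        low = high
        high *= 2
    # Invariant from here on: low**n <= x < high**n
    while high - low > 1:
        mid = (low + high) // 2
        if mid ** n <= x:
            low = mid
        else:
            high = mid
    return low

print(nth_root_bisect(2**63 - 1, 2))
print(nth_root_bisect(10**18, 18))
```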
I use this routine I wrote. It's the fastest of the ones I've seen here. It also handles up to 64 bits. BTW, n1 is the input number.
for (n3 = 0; ((mnk) < n1) ; n3+=0.015625, nmrk++) {
mk += 0.0073125;
dad += 0.00390625;
mnk = pow(n1, 1.0/(mk+n3+dad));
mnk = pow(mnk, (mk+n3+dad));
}
Although not always perfect, it does come the closest.
You can try this to get the nth_root with unsigned in C :
// return a number that, when multiplied by itself nth times, makes N.
unsigned nth_root(const unsigned n, const unsigned nth) {
unsigned a = n, c, d, r = nth ? n + (n > 1) : n == 1 ;
for (; a < r; c = a + (nth - 1) * r, a = c / nth)
for (r = a, a = n, d = nth - 1; d && (a /= r); --d);
return r;
}
Note that it does not need <math.h>. Example output:
24 == (int) pow(15625, 1.0/3)
25 == nth_root(15625, 3)
0 == nth_root(0, 0)
1 == nth_root(1, 0)
4 == nth_root(4096, 6)
13 == nth_root(18446744073709551614, 17) // 64-bit 20 digits
11 == nth_root(340282366920938463463374607431768211454, 37) // 128-bit 39 digits
The default guess is the variable a, set to n.
Math:
If you have an equation like this:
x = 3 mod 7
x could be ... -4, 3, 10, 17, ..., or more generally:
x = 3 + k * 7
where k can be any integer. I don't know if a modulo operation is defined in mathematics, but the factor ring certainly is.
Python:
In Python, you will always get non-negative values when you use % with a positive m:
#!/usr/bin/python
# -*- coding: utf-8 -*-
m = 7
for i in range(-8, 10 + 1):
    print(i % m)
Results in:
6 0 1 2 3 4 5 6 0 1 2 3 4 5 6 0 1 2 3
C++:
#include <iostream>
using namespace std;
int main(){
int m = 7;
for(int i=-8; i <= 10; i++) {
cout << (i % m) << endl;
}
return 0;
}
Will output:
-1 0 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 0 1 2 3
ISO/IEC 14882:2003(E) - 5.6 Multiplicative operators:
The binary / operator yields the quotient, and the binary % operator
yields the remainder from the division of the first expression by the
second. If the second operand of / or % is zero the behavior is
undefined; otherwise (a/b)*b + a%b is equal to a. If both operands are
nonnegative then the remainder is nonnegative; if not, the sign of the
remainder is implementation-defined 74).
and
74) According to work underway toward the revision of ISO C, the
preferred algorithm for integer division follows the rules defined in
the ISO Fortran standard, ISO/IEC 1539:1991, in which the quotient is
always rounded toward zero.
Source: ISO/IEC 14882:2003(E)
(I couldn't find a free version of ISO/IEC 1539:1991. Does anybody know where to get it from?)
The operation seems to be defined like this:
Question:
Does it make sense to define it like that?
What are the arguments for this specification? Is there a place where the people who create such standards discuss it? Where can I read about the reasons why they decided to make it this way?
Most of the time when I use modulo, I want to access elements of a data structure. In this case, I have to make sure that mod returns a non-negative value. So, for this case, it would be good if mod always returned a non-negative value.
(Another usage is the Euclidean algorithm. As you could make both numbers positive before using this algorithm, the sign of modulo would not matter.)
Additional material:
See Wikipedia for a long list of what modulo does in different languages.
On x86 (and other processor architectures), integer division and modulo are carried out by a single operation, idiv (div for unsigned values), which produces both quotient and remainder (for word-sized arguments, in AX and DX respectively). This is used in the C library function div, which can be optimised by the compiler to a single instruction!
Integer division respects two rules:
Non-integer quotients are rounded towards zero; and
the equation dividend = quotient*divisor + remainder is satisfied by the results.
Accordingly, when dividing a negative number by a positive number, the quotient will be negative (or zero).
So this behaviour can be seen as the result of a chain of local decisions:
Processor instruction set design optimises for the common case (division) over the less common case (modulo);
Consistency (rounding towards zero, and respecting the division equation) is preferred over mathematical correctness;
C prefers efficiency and simplicity (especially given the tendency to view C as a "high level assembler"); and
C++ prefers compatibility with C.
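To see the two rules side by side with Python's floored behaviour, here is a small sketch that emulates truncating division in Python. The helper name truncated_divmod is made up; Python's own divmod and % round the quotient down instead.

```python
def truncated_divmod(a, b):
    """Quotient rounded toward zero plus the remainder that keeps a == q*b + r."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):   # operands of opposite sign give a negative quotient
        q = -q
    r = a - q * b            # remainder follows from the division equation
    return q, r

m = 7
print([truncated_divmod(i, m)[1] for i in range(-8, 11)])  # C++-style remainders
print([i % m for i in range(-8, 11)])                      # Python's floored remainders
```

Both conventions satisfy the division equation; they only differ in which way the quotient is rounded, and the sign of the remainder follows from that.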
Back in the day, someone designing the x86 instruction set decided it was right and good to round integer division toward zero rather than round down. (May the fleas of a thousand camels nest in his mother's beard.) To keep some semblance of math-correctness, operator REM, which is pronounced "remainder", had to behave accordingly. DO NOT read this: https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzatk/REM.htm
I warned you. Later someone doing the C spec decided it would be conforming for a compiler to do it either the right way or the x86 way. Then a committee doing the C++ spec decided to do it the C way. Then later yet, after this question was posted, a C++ committee decided to standardize on the wrong way. Now we are stuck with it. Many a programmer has written the following function or something like it. I have probably done it at least a dozen times.
inline int mod(int a, int b) {int ret = a%b; return ret>=0? ret: ret+b; }
There goes your efficiency.
These days I use essentially the following, with some type_traits stuff thrown in. (Thanks to Clearer for a comment that gave me an idea for an improvement using latter day C++. See below.)
// Superseded by the if constexpr version below.
#include <cassert>
template<class T>
inline T mod(T a, T b) {
    assert(b > 0);
    T ret = a%b;
    return (ret>=0)?(ret):(ret+b);
}
template<>
inline unsigned mod(unsigned a, unsigned b) {
assert(b > 0);
return a % b;
}
True fact: I lobbied the Pascal standards committee to do mod the right way until they relented. To my horror, they did integer division the wrong way. So they do not even match.
EDIT: Clearer gave me an idea. I am working on a new one.
#include <cassert>
#include <type_traits>
template<class T1, class T2>
inline T1 mod(T1 a, T2 b) {
assert(b > 0);
T1 ret = a % b;
if constexpr ( std::is_unsigned_v<T1>)
{
return ret;
} else {
return (ret >= 0) ? (ret) : (ret + b);
}
}
What are arguments for this specification?
One of the design goals of C++ is to map efficiently to hardware. If the underlying hardware implements division in a way that produces negative remainders, then that's what you'll get if you use % in C++. That's all there is to it really.
Is there a place where the people who create such standards discuss about it?
You will find interesting discussions on comp.lang.c++.moderated and, to a lesser extent, comp.lang.c++
Others have described the why well enough and unfortunately the question which asks for a solution is marked a duplicate of this one and a comprehensive answer on that aspect seems to be missing. There seem to be 2 commonly used general solutions and one special-case I would like to include:
// 724ms
inline int mod1(int a, int b)
{
const int r = a % b;
return r < 0 ? r + b : r;
}
// 759ms
inline int mod2(int a, int b)
{
return (a % b + b) % b;
}
// 671ms (see NOTE1!)
inline int mod3(int a, int b)
{
return (a + b) % b;
}
int main(int argc, char** argv)
{
volatile int x;
for (int i = 0; i < 10000000; ++i) {
for (int j = -argc + 1; j < argc; ++j) {
x = modX(j, argc);
if (x < 0) return -1; // Sanity check
}
}
}
NOTE1: This is not generally correct (i.e. if a < -b). The reason I included it is because almost every time I find myself taking the modulus of a negative number is when doing math with numbers that are already modded, for example (i1 - i2) % n where the 0 <= iX < n (e.g. indices of a circular buffer).
As always, YMMV with regards to timing.
I am trying to convert a double to a string in a native NT application, i.e. an application that only depends on ntdll.dll. Unfortunately, ntdll's version of vsnprintf does not support %f et al., forcing me to implement the conversion on my own.
The aforementioned ntdll.dll exports only a few of the math.h functions (floor, ceil, log, pow, ...). However, I am reasonably sure that I can implement any of the unavailable math.h functions if necessary.
There is an implementation of floating point conversion in GNU's libc, but the code is extremely dense and difficult to comprehend (the GNU indentation style does not help here).
I've already implemented the conversion by normalizing the number (i.e. multiplying/dividing the number by 10 until it's in the interval [1, 10)) and then generating each digit by cutting the integral part off with modf and multiplying the fractional part by 10. This works, but there is a loss of precision (only the first 15 digits are correct). The loss of precision is, of course, inherent to the algorithm.
I'd settle with 17 digits, but an algorithm that would be able to generate an arbitrary number of digits correctly would be preferred.
Could you please suggest an algorithm or point me to a good resource?
Double-precision numbers do not have more than 15 significant (decimal) figures of precision. There is absolutely no way you can get "an arbitrary number of digits correctly"; doubles are not bignums.
Since you say you're happy with 17 significant figures, use long double; on Windows, I think, that will give you 19 significant figures.
I've thought about this a bit more. You lose precision because you normalize by multiplying by some power of 10 (you chose [1,10) rather than [0,1), but that's a minor detail). If you did so with a power of 2, you'd lose no precision, but then you'd get "decimal digits"*2^e; you could implement bcd arithmetic and compute the product yourself, but that doesn't sound like fun.
I'm pretty confident that you could split the double g=m*2^e into two parts: h=floor(g*10^k) and i=modf(g*10^k) for some k, and then separately convert to decimal digits and then stitch them together, but how about a simpler approach: use "long double" (80 bits, but I've heard that Visual C++ may not support it?) with your current approach and stop after 17 digits.
_gcvt should do it (edit - it's not in ntdll.dll, it's in some msvcrt*.dll?)
As for decimal digits of precision, IEEE binary64 has 53 significant binary digits (52 stored explicitly). 53*log10(2)=15.95... (edit: as you pointed out, to round trip, you need more than 16 digits)
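A quick way to see the round-trip point is to format a double with 15, 16 and 17 significant digits and parse the text back. This Python sketch uses 0.1 + 0.2 as an arbitrary sample value; counting the implicit leading bit, binary64 carries 53 significant bits.

```python
import math

x = 0.1 + 0.2                      # arbitrary sample value
print(53 * math.log10(2))          # decimal digits carried by 53 significant bits

for digits in (15, 16, 17):
    text = "%.*g" % (digits, x)    # format with `digits` significant digits
    print(digits, text, float(text) == x)
```

For binary64, 17 significant digits are always enough to recover the original value when the text is parsed back with correct rounding.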
After a lot of research, I found a paper titled Printing Floating-Point Numbers Quickly and Accurately. It uses exact rational arithmetic to avoid precision loss. It cites a slightly older paper, How to Print Floating-Point Numbers Accurately, which however seems to require an ACM subscription to access.
Since the former paper was reprinted in 2006, I am inclined to believe that it is still current. The exact rational arithmetic (which requires dynamic allocation) seems to be a necessary evil.
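To illustrate why exact rational arithmetic removes the precision loss, here is a deliberately naive Python sketch. It is not the algorithm from the paper: it is the normalize-and-peel-digits loop from the question, with the double first converted to an exact fraction so that no step rounds. The function name is made up, and it truncates rather than rounds the last digit.

```python
from fractions import Fraction

def exact_digits(x, count):
    """First `count` significant decimal digits (truncated) of a positive finite
    double, plus the decimal exponent of the leading digit."""
    if not x > 0:
        raise ValueError("positive finite value expected")
    f = Fraction(x)            # exact: every finite double is a rational number
    exp10 = 0
    while f >= 10:             # normalize into [1, 10) without any rounding
        f /= 10
        exp10 += 1
    while f < 1:
        f *= 10
        exp10 -= 1
    digits = []
    for _ in range(count):
        d = int(f)             # integer part is the next digit
        digits.append(d)
        f = (f - d) * 10       # shift the next digit into place, exactly
    return digits, exp10

print(exact_digits(0.1, 20))
```

The numerators and denominators grow as needed, which is the dynamic allocation cost mentioned above.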
A complete implementation of the C code for the fastest known (as of today) algorithm:
http://code.google.com/p/double-conversion/downloads/list
It even includes a test suite.
This is the C code behind the algorithm described in this PDF:
Printing Floating-Point Numbers Quickly and Accurately
http://www.cs.indiana.edu/~burger/FP-Printing-PLDI96.pdf
#include <cstdint>
#include <limits>
// --------------------------------------------------------------------------
// Return number of decimal-digits of a given unsigned-integer
// N is uint8_t/uint16_t/uint32_t/uint64_t
template <class N> inline uint8_t GetUnsignedDecDigits(const N n)
{
static_assert(std::numeric_limits<N>::is_integer && !std::numeric_limits<N>::is_signed,
"GetUnsignedDecDigits: unsigned integer type expected" );
const uint8_t anMaxDigits[]= {3, 5, 8, 10, 13, 15, 17, 20};
const uint8_t nMaxDigits = anMaxDigits[sizeof(N)-1];
uint8_t nDigits= 1;
N nRoof = 10;
while ((n >= nRoof) && (nDigits<nMaxDigits))
{
nDigits++;
nRoof*= 10;
}
return nDigits;
}
// --------------------------------------------------------------------------
// Convert floating-point value to NULL-terminated string representation
TCHAR* DoubleToStr(double f , // [i ]
TCHAR* pczStr , // [i/o] caller should allocate enough space
int nDigitsI, // [i ] digits of integer part including sign / <1: auto
int nDigitsF ) // [i ] digits of fractional part / <0: auto
{
switch (_fpclass(f))
{
case _FPCLASS_SNAN:
case _FPCLASS_QNAN: _tcscpy_s(pczStr, 5, _T("NaN" )); return pczStr;
case _FPCLASS_NINF: _tcscpy_s(pczStr, 5, _T("-INF")); return pczStr;
case _FPCLASS_PINF: _tcscpy_s(pczStr, 5, _T("+INF")); return pczStr;
}
if (nDigitsI> 18) nDigitsI= 18; if (nDigitsI< 1) nDigitsI= -1;
if (nDigitsF> 18) nDigitsF= 18; if (nDigitsF< 0) nDigitsF= -1;
bool bNeg= (f<0);
if (f<0)
f= -f;
int nE= 0; // exponent (displayed if != 0)
if ( ((-1 == nDigitsI) && (f >= 1e18 )) || // large value: switch to scientific representation
((-1 != nDigitsI) && (f >= pow(10., nDigitsI))) )
{
nE= (int)log10(f);
f/= (double)pow(10., nE);
if (-1 != nDigitsF)
nDigitsF= __max(nDigitsF, nDigitsI+nDigitsF-(bNeg?2:1)-4);
nDigitsI= (bNeg?2:1);
}
else if (f>0)
if ((-1 == nDigitsF) && (f <= 1e-10)) // small value: switch to scientific representation
{
nE= (int)log10(f)-1;
f/= (double)pow(10., nE);
if (-1 != nDigitsF)
nDigitsF= __max(nDigitsF, nDigitsI+nDigitsF-(bNeg?2:1)-4);
nDigitsI= (bNeg?2:1);
}
double fI;
double fF= modf(f, &fI); // fI: integer part, fF: fractional part
if (-1 == nDigitsF) // figure out number of meaningful digits in fF
{
double fG, fGI, fGF;
do
{
nDigitsF++;
fG = fF*pow(10., nDigitsF);
fGF= modf(fG, &fGI);
}
while (fGF > 1e-10);
}
const double afPower10[20]= {1e0 , 1e1 , 1e2 , 1e3 , 1e4 , 1e5 , 1e6 , 1e7 , 1e8 , 1e9 ,
1e10, 1e11, 1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19 };
uint64_t uI= (uint64_t)round(fI );
uint64_t uF= (uint64_t)round(fF*afPower10[nDigitsF]);
if (uF)
if (GetUnsignedDecDigits(uF) > nDigitsF) // X.99999 was rounded to X+1
{
uF= 0;
uI++;
if (nE)
{
uI/= 10;
nE++;
}
}
uint8_t nRealDigitsI= GetUnsignedDecDigits(uI);
if (bNeg)
nRealDigitsI++;
int nPads= 0;
if (-1 != nDigitsI)
{
nPads= nDigitsI-nRealDigitsI;
for (int i= nPads-1; i>=0; i--) // leading spaces
pczStr[i]= _T(' ');
}
if (bNeg) // minus sign
{
pczStr[nPads]= _T('-');
nRealDigitsI--;
nPads++;
}
for (int j= nRealDigitsI-1; j>=0; j--) // digits of integer part
{
pczStr[nPads+j]= (uint8_t)(uI%10) + _T('0');
uI /= 10;
}
nPads+= nRealDigitsI;
if (nDigitsF)
{
pczStr[nPads++]= _T('.'); // decimal point
for (int k= nDigitsF-1; k>=0; k--) // digits of fractional part
{
pczStr[nPads+k]= (uint8_t)(uF%10)+ _T('0');
uF /= 10;
}
}
nPads+= nDigitsF;
if (nE)
{
pczStr[nPads++]= _T('e'); // exponent sign
if (nE<0)
{
pczStr[nPads++]= _T('-');
nE= -nE;
}
else
pczStr[nPads++]= _T('+');
for (int l= 2; l>=0; l--) // digits of exponent
{
pczStr[nPads+l]= (uint8_t)(nE%10) + _T('0');
nE /= 10;
}
pczStr[nPads+3]= 0;
}
else
pczStr[nPads]= 0;
return pczStr;
}
Does vsnprintf support I64?
double x = SOME_VAL; // allowed to be from -1.e18 to 1.e18
bool sign = (SOME_VAL < 0);
if ( sign ) x = -x;
__int64 i = static_cast<__int64>( x );
double xm = x - static_cast<double>( i );
__int64 w = static_cast<__int64>( xm*pow(10.0, DIGITS_VAL) ); // DIGITS_VAL indicates how many digits after the decimal point you want to get
char out[100];
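_snprintf( out, sizeof out, "%s%I64d.%I64d", (sign?"-":""), i, w ); // variadic call, so _snprintf rather than vsnprintf (which takes a va_list); note that w is printed without leading zeros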
Another option is to try to find implementation of gcvt.
Have you looked at the uClibc implementation of printf?
The problem is to derive a formula for determining the number of digits a given decimal number has in a given base.
For example: The decimal number 100006 can be represented by 17,11,9,8,7,6,6 digits in bases 2,3,4,5,6,7,8 respectively.
The formula I have derived so far is: (log10(num) / log10(base)) + 1.
In C/C++ I used this formula to compute the results given above.
long long int size = ((double)log10(num) / (double)log10(base)) + 1.0;
But sadly the formula does not give the correct answer in some cases, like these:
Number 8 in base 2 : 1,0,0,0
Number of digits: 4
Formula returned: 3
Number 64 in base 2 : 1,0,0,0,0,0,0
Number of digits: 7
Formula returned: 6
Number 64 in base 4 : 1,0,0,0
Number of digits: 4
Formula returned: 3
Number 125 in base 5 : 1,0,0,0
Number of digits: 4
Formula returned: 3
Number 128 in base 2 : 1,0,0,0,0,0,0,0
Number of digits: 8
Formula returned: 7
Number 216 in base 6 : 1,0,0,0
Number of digits: 4
Formula returned: 3
Number 243 in base 3 : 1,0,0,0,0,0
Number of digits: 6
Formula returned: 5
Number 343 in base 7 : 1,0,0,0
Number of digits: 4
Formula returned: 3
So the error is 1 digit. I just want somebody to help me correct the formula so that it works for every possible case.
Edit: As per the input specification I have to deal with cases like 10000000000, i.e. 10^10. I don't think log10() in either C or C++ can handle such cases? So any other procedure or formula for this problem would be highly appreciated.
Your compiler settings enable fast floating-point operations; you need precise floating-point operations. The thing is that log10(8)/log10(2) is always 3 in math, but your computed result may be 2.99999, for example. That is bad. You must add a small additive, but not 0.5. It should be about .00001 or something like that.
Almost true formula:
int size = static_cast<int>((log10((double)num) / log10((double)base)) + 1.00000001);
Really true solution
You should check the result of your formula. Complexity is O(log log n) or O(log result)!
int fast_power(int base, int s)
{
int res = 1;
while (s) {
if (s%2) {
res*=base;
s--;
} else {
s/=2;
base*=base;
}
}
return res;
}
int digits_size(int n, int base)
{
int s = int(log10(1.0*n)/log10(1.0*base)) + 1;
return fast_power(base, s) > n ? s : s+1;
}
This check is better than Brute-force test with base multiplications.
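The same "estimate with logarithms, then verify with integer powers" idea can be sketched in Python, where integers do not overflow. The function name is made up, and the correction goes in both directions in case the estimate is off either way.

```python
import math

def digits_checked(n, base):
    """Digit count of n >= 0 in base >= 2: log estimate, then exact integer correction."""
    if n == 0:
        return 1
    s = int(math.log(n) / math.log(base)) + 1   # floating-point estimate, possibly off by one
    while base ** (s - 1) > n:                  # estimate too large
        s -= 1
    while base ** s <= n:                       # estimate too small
        s += 1
    return s

print([digits_checked(100006, b) for b in range(2, 9)])
print(digits_checked(10**10, 10))
```

After the two loops, base**(s-1) <= n < base**s holds exactly, so the answer no longer depends on how the logarithm happened to round.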
Either of the following will work:
>>> from math import *
>>> def digits(n, b=10):
...     return int(1 + floor(log(n, b))) if n else 1
...
>>> def digits(n, b=10):
...     return int(ceil(log(n + 1, b))) if n else 1
...
The first version is explained at mathpath.org. In the second version the + 1 is necessary to yield the correct answer for any number n that is the smallest number with d digits in base b. That is, those numbers which are written 10...0 in base b. Observe that input 0 must be treated as a special case.
Decimal examples:
>>> digits(1)
1
>>> digits(9)
1
>>> digits(10)
2
>>> digits(99)
2
>>> digits(100)
3
Binary:
>>> digits(1, 2)
1
>>> digits(2, 2)
2
>>> digits(3, 2)
2
>>> digits(4, 2)
3
>>> digits(1027, 2)
11
Edit: The OP states that the log solution may not work for large inputs. I don't know about that, but if so, the following code should not break down, because it uses integer arithmetic only (this time in C):
unsigned int
digits(unsigned long long n, unsigned long long b)
{
unsigned int d = 0;
while (d++, n /= b);
return d;
}
This code will probably be less efficient. And yes, it was written for maximum obscurity points. It simply uses the observation that every number has at least one digit, and that every division by b which does not yield 0 implies the existence of an additional digit. A more readable version is the following:
unsigned int
digits(unsigned long long n, unsigned long long b)
{
unsigned int d = 1;
while (n /= b) {
d++;
}
return d;
}
Number of digits of a numeral in a given base
Since your formula is correct (I just tried it), I would think that it's a rounding error in your division, causing the number to be just slightly less than the integer value it should be. So when you truncate to an integer, you lose 1. Try adding an additional 0.5 to your final value (so that truncating is actually a round operation).
What you want is ceiling ( = smallest integer not less than) logb (n+1), rather than what you're calculating right now, floor(1+logb(n)).
You might try:
int digits = (int) ceil( log((double)(n+1)) / log((double)base) );
As others have pointed out, you have rounding error, but the proposed solutions simply move the danger zone or make it smaller, they don't eliminate it. If your numbers are integers then you can verify -- using integer arithmetic -- that one power of the base is less than or equal to your number, and the next is above it (the first power is the number of digits). But if you use floating point arithmetic anywhere in the chain then you will be vulnerable to error (unless your base is a power of two, and maybe even then).
EDIT:
Here is crude but effective solution in integer arithmetic. If your integer classes can hold numbers as big as base*number, this will give the correct answer.
size = 0, k = 1;
while(k<=num)
{
k *= base;
size += 1;
}
Using your formula,
log(8)/log(2) + 1 = 4
the problem is in the precision of the logarithm calculation. Using
ceil(log(n+1)/log(b))
ought to resolve that problem. This isn't quite the same as
ceil(log(n)/log(b))
because this gives the answer 3 for n=8 b=2, nor is it the same as
log(n+1)/log(b) + 1
because this gives the answer 4 for n=7 b=2 (when calculated to full precision).
I actually get some curious results implementing and compiling the first form with g++:
double n = double(atoi(argv[1]));
double b = double(atoi(argv[2]));
int i = int(std::log(n)/std::log(b) + 1.0);
fails (i.e., gives the answer 3), while,
double v = std::log(n)/std::log(b) + 1.0;
int i = int(v);
succeeds (gives the answer 4). Looking at it some more I think a third form
ceil(log(n+0.5)/log(b))
would be more stable, because it avoids the "critical" case when n (or n+1 for the second form) is an integer power of b (for integer values of n).
It may be beneficial to wrap a rounding function (e.g. + 0.5) into your code somewhere: it's quite likely that the division is producing (e.g.) 2.99989787, to which 1.0 is added, giving 3.99989787 and when that's converted to an int, it gives 3.
Looks like the formula is right to me:
Number 8 in base 2 : 1,0,0,0
Number of digits: 4
Formula returned: 3
log10(8) = 0.903089
log10(2) = 0.301029
Division => 3
+1 => 4
So it's definitely just a rounding error.
Floating point rounding issues.
log10(216) / log10(6) = 2.9999999999999996
But you cannot add 0.5 as suggested, because it would not work for the following
log10(1295) / log10(6) = 3.9995691928566091 // 5, 5, 5, 5
log10(1296) / log10(6) = 4.0 // 1, 0, 0, 0, 0
Maybe using the log(value, base) function would avoid these rounding errors.
I think that the only way to get the rounding error eliminated without producing other errors is to use or implement integer logarithms.
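An integer logarithm is short to write by repeated division. Here is a Python sketch (the function name is mine) that returns floor(log_base(n)) with no floating point involved, so the digit count is just that plus one.

```python
def integer_log(n, base):
    """floor(log_base(n)) for n >= 1 and base >= 2, using integer division only."""
    if n < 1 or base < 2:
        raise ValueError("need n >= 1 and base >= 2")
    result = 0
    while n >= base:     # each division strips one base-`base` digit
        n //= base
        result += 1
    return result

for value in (216, 1295, 1296):
    print(value, integer_log(value, 6) + 1)   # number of base-6 digits
```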
Here is a solution in bash:
% digits() { echo $1 $2 opq | dc | sed 's/ .//g;s/.//' | wc -c; }
% digits 10000000000 42
7
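static int numInBase(int num, int theBase)
{
    // Every value below the base, including 0, is a single digit.
    if (num < theBase) return 1;
    // Otherwise strip the lowest digit and count the remaining ones.
    return 1 + numInBase(num / theBase, theBase);
}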