Density of doubles between 2 given numbers - c++

Important Edit: The original question was about getting the density of both doubles and fractions. Since I got an answer for doubles but not for fractions, I'm narrowing the topic so this question can be closed. The other half of the original question is here
New question
I want to find the density of doubles between 2 given numbers but I can't think of a good way. So I'm looking for a closed-form expression doublesIn(a,b), or some code that does the work in a reasonable time.
With doubles I should presumably use some formula involving the mantissa and exponent that I'm not aware of. I already have code using nextafter, and it's awfully slow close to [-1,1] (below 1e6 it is very slow).
Any ideas? Thanks in advance! :)
PS: If you want to know, I'm coding some math stuff for myself and I want to find how useful would be to replace double with a fraction (long,long or similar) on certain algorithms (like Gaussian elimination, newton's method for finding roots, etc), and for that I want to have some measures.

In what follows, including the program, I am assuming double is represented by IEEE 754 64-bit binary floating point. That is the most likely case, but not guaranteed by the C++ standard.
You can count doubles in a range in constant time, because you can count unsigned integers in a range in constant time by subtracting the start from the end and adjusting for whether the range is open or closed.
The doubles in a finite non-negative range have bit patterns that form a consecutive sequence of integers. For example, the range [1.0,2.0] contains one double for each integer in the range [0x3ff0_0000_0000_0000, 0x4000_0000_0000_0000].
Finite non-positive ranges of doubles behave the same way except the unsigned bit patterns increase in value as the doubles become more negative.
If your range includes both positive and negative numbers, split it at zero, so that you deal with one non-negative range and another non-positive range.
Most of the complications arise when you want to get the count exactly right. In that case, you need to adjust for whether the range is open or closed, and to count zero exactly once.
For your purpose, being off by one or two in a few hundred million may not matter much.
Here is a simple program that demonstrates the idea. It has received little error checking, so use at your own risk.
#include <iostream>
#include <cmath>
#include <cstdint> // uint64_t
using namespace std;
uint64_t count(double start, double end);
void testit(uint64_t expected, double start, double end) {
cout << hex << "Should be " << expected << ": " << count(start, end)
<< endl;
}
double increment(double data, int count) {
int i;
for (i = 0; i < count; i++) {
data = nextafter(data, INFINITY);
}
return data;
}
double decrement(double data, int count) {
int i;
for (i = 0; i < count; i++) {
data = nextafter(data, -INFINITY);
}
return data;
}
int main() {
testit((uint64_t) 1 << 52, 1.0, 2.0);
testit(5, 3.0, increment(3.0, 5));
testit(2, decrement(0, 1), increment(0, 1));
testit((uint64_t) 1 << 52, -2.0, -1.0);
testit(1, -0.0, increment(0, 1));
testit(10, decrement(0,10), -0.0);
return 0;
}
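// Return the bit pattern representing a double as
// a 64-bit unsigned integer. std::memcpy (from <cstring>)
// avoids the undefined behavior of reading a double
// through a uint64_t pointer.
#include <cstring>
uint64_t toInteger(double data) {
uint64_t bits;
std::memcpy(&bits, &data, sizeof bits);
return bits;
}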
// Count the doubles in a range, assuming double
// is IEEE 754 64-bit binary.
// Counts [start,end), including start but excluding end
uint64_t count(double start, double end) {
if (!(isfinite(start) && isfinite(end) && start <= end)) {
// Insert real error handling here
cerr << "error" << endl;
return 0;
}
if (start < 0) {
if (end < 0) {
return count(fabs(end), fabs(start));
} else if (end == 0) {
return count(0, fabs(start));
} else {
return count(start, 0) + count(0, end);
}
}
if (start == -0.0) {
start = 0.0;
}
return toInteger(end) - toInteger(start);
}
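An alternative to splitting the range at zero can be sketched under the same IEEE 754 binary64 assumption: map each finite double to a signed integer key that rises with the double's value, then subtract the keys. The names orderedKey and countDoubles are made up for this sketch, and it does no validation (both arguments must be finite, with start <= end).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: map a finite double to an integer key that grows
// with the double's value. Assumes double is IEEE 754 binary64.
std::int64_t orderedKey(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined way to read the bits
    const std::uint64_t signBit = std::uint64_t{1} << 63;
    const std::int64_t magnitude = static_cast<std::int64_t>(bits & ~signBit);
    // Negative doubles get negative keys; +0.0 and -0.0 both map to 0.
    return (bits & signBit) ? -magnitude : magnitude;
}

// Count the doubles in [start, end). The caller guarantees that both
// values are finite and that start <= end.
std::uint64_t countDoubles(double start, double end) {
    // Unsigned subtraction is modular, so the difference cannot overflow.
    return static_cast<std::uint64_t>(orderedKey(end)) -
           static_cast<std::uint64_t>(orderedKey(start));
}
```

With these, countDoubles(1.0, 2.0) is 2^52, the same value the first test in the program above expects, and a range that straddles zero counts zero only once because both signed zeros share the key 0.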

Related

Binary Conversion with recursive function is giving an odd value

I am making a function that converts numbers to binary with a recursive function, but when I enter big values it gives me an odd result. For example, when I enter 2000 it gives me -1773891888. When I follow the function with the debugger, it shows the correct binary value of 2000 until the last second.
Thank you!!
#include <iostream>
int Binary(int n);
int main() {
int n;
std::cin >> n;
std::cout << n << " = " << Binary(n) << std::endl;
}
int Binary(int n) {
if (n == 0)return 0;
if (n == 1)return 1;
return Binary(n / 2)*10 + n % 2;
}
Integer values in C++ can only store values in a bounded range (usually -2^31 to +2^31 - 1), which maxes out around two billion and change. That means that if you try storing in an integer a binary value with more than ten digits, you’ll overflow this upper limit. Signed overflow is formally undefined behavior in C++, but on most systems the value wraps around in practice, hence the negative outputs.
To fix this, I’d recommend having your function return a std::string storing the bits rather than an integer, since logically speaking the number you’re returning really isn’t a base-10 integer that you’d like to do further arithmetic operations on. This will let you generate binary sequences of whatever length you need without risking an integer overflow.
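A minimal sketch of that suggestion, keeping the recursive structure of the original function. The name toBinary and the choice of an unsigned parameter are assumptions made for this example, not part of the question.

```cpp
#include <string>

// Build the binary digits of n as text, most significant bit first.
// Returning a string avoids storing the digits in a base-10 int.
std::string toBinary(unsigned int n) {
    if (n < 2) {
        return std::string(1, static_cast<char>('0' + n));  // "0" or "1"
    }
    // Digits of n/2 first, then the lowest bit of n.
    return toBinary(n / 2) + static_cast<char>('0' + n % 2);
}
```

Here toBinary(2000) yields "11111010000", eleven digits, which is one more than fits in a 32-bit int when the digits are packed into a decimal number.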
At least your logic is correct!

Truncating a double floating point at a certain number of digits

I have written the following routine, which is supposed to truncate a C++ double at the n'th decimal place.
double truncate(double number_val, int n)
{
double factor = 1;
double previous = std::trunc(number_val); // remove integer portion
number_val -= previous;
for (int i = 0; i < n; i++) {
number_val *= 10;
factor *= 10;
}
number_val = std::trunc(number_val);
number_val /= factor;
number_val += previous; // add back integer portion
return number_val;
}
Usually, this works great... but I have found that with some numbers, most notably those that do not seem to have an exact representation within double, have issues.
For example, if the input is 2.0029, and I want to truncate it at the fifth place, internally, the double appears to be stored as something somewhere between 2.0028999999999999996 and 2.0028999999999999999, and truncating this at the fifth decimal place gives 2.00289, which might be right in terms of how the number is being stored, but is going to look like the wrong answer to an end user.
If I were rounding instead of truncating at the fifth decimal, everything would be fine, of course, and if I give a double whose decimal representation has more than n digits past the decimal point it works fine as well, but how do I modify this truncation routine so that inaccuracies due to imprecision in the double type and its decimal representation will not affect the result that the end user sees?
I think I may need some sort of rounding/truncation hybrid to make this work, but I'm not sure how I would write it.
Edit: thanks for the responses so far but perhaps I should clarify that this value is not producing output necessarily but this truncation operation can be part of a chain of many different user specified actions on floating point numbers. Errors that accumulate within the double precision over multiple operations are fine, but no single operation, such as truncation or rounding, should produce a result that differs from its actual ideal value by more than half of an epsilon, where epsilon is the smallest magnitude represented by the double precision with the current exponent. I am currently trying to digest the link provided by iinspectable below on floating point arithmetic to see if it will help me figure out how to do this.
Edit: well the link gave me one idea, which is sort of hacky but it should probably work which is to put a line like number_val += std::numeric_limits<double>::epsilon() right at the top of the function before I start doing anything else with it. Dunno if there is a better way, though.
Edit: I had an idea while I was on the bus today, which I haven't had a chance to thoroughly test yet, but it works by rounding the original number to 16 significant decimal digits, and then truncating that:
double truncate(double number_val, int n)
{
bool negative = false;
if (number_val == 0) {
return 0;
} else if (number_val < 0) {
number_val = -number_val;
negative = true;
}
int pre_digits = std::log10(number_val) + 1;
if (pre_digits < 17) {
int post_digits = 17 - pre_digits;
double factor = std::pow(10, post_digits);
number_val = std::round(number_val * factor) / factor;
factor = std::pow(10, n);
number_val = std::trunc(number_val * factor) / factor;
} else {
number_val = std::round(number_val);
}
if (negative) {
number_val = -number_val;
}
return number_val;
}
Since a double precision floating point number only can have about 16 digits of precision anyways, this just might work for all practical purposes, at a cost of at most only one digit of precision that the double would otherwise perhaps support.
I would like to further note that this question differs from the suggested duplicate above in that a) this is using C++, and not Java... I don't have a DecimalFormatter convenience class, and b) I am wanting to truncate, not round, the number at the given digit (within the precision limits otherwise allowed by the double datatype), and c) as I have stated before, the result of this function is not supposed to be a printable string... it is supposed to be a native floating point number that the end user of this function might choose to further manipulate. Accumulated errors over multiple operations due to imprecision in the double type are acceptable, but any single operation should appear to perform correctly to the limits of the precision of the double datatype.
OK, if I understand this right, you've got a floating point number and you want to truncate it to n digits:
10.099999
^^ n = 2
becomes
10.09
^^
But your function is truncating the number to an approximately close value:
10.08999999
^^
Which is then displayed as 10.08?
How about you keep your truncate formula, which does truncate as well as it can, and use std::setprecision and std::fixed to round the truncated value to the required number of decimal places? (Assuming it is std::cout you're using for output?)
#include <iostream>
#include <iomanip>
using std::cout;
using std::setprecision;
using std::fixed;
using std::endl;
int main() {
double foo = 10.08995; // let's imagine this is the output of `truncate`
cout << foo << endl; // displays 10.0899
cout << setprecision(2) << fixed << foo << endl; // rounds to 10.09
}
I've set up a demo on wandbox for this.
I've looked into this. It's hard because you have inaccuracies due to the floating point representation, then further inaccuracies due to the decimal. 0.1 cannot be represented exactly in binary floating point. However you can use the built-in function sprintf with a %g argument that should round accurately for you.
char out[64];
double x = 0.11111111;
int n = 3;
double xrounded;
sprintf(out, "%.*g", n, x);
xrounded = strtod(out, 0);
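The rounding/truncation hybrid the question asks about can also be sketched without going through a string: scale the value, and if the scaled value sits within a few ULPs of a whole number, treat it as that whole number; otherwise truncate. The function name truncateDecimals and the 4-ULP tolerance are assumptions chosen for illustration, not a tested standard.

```cpp
#include <cmath>
#include <limits>

// Truncate value at the n-th decimal place, but snap to the nearest
// whole scaled value first when the difference is only representation
// noise. The tolerance (4 ULPs of the scaled value) is an assumption.
double truncateDecimals(double value, int n) {
    double factor = 1.0;
    for (int i = 0; i < n; ++i) {
        factor *= 10.0;  // powers of ten are exact in double up to 10^22
    }
    const double scaled = value * factor;
    const double nearest = std::round(scaled);
    const double tol =
        4.0 * std::numeric_limits<double>::epsilon() * std::fabs(scaled);
    if (std::fabs(scaled - nearest) <= tol) {
        return nearest / factor;  // e.g. 2.0029 stays 2.0029 at n = 5
    }
    return std::trunc(scaled) / factor;
}
```

With this sketch, truncateDecimals(2.0029, 5) gives back 2.0029 rather than 2.00289, while truncateDecimals(10.099999, 2) still truncates to 10.09. It does not handle values so large that value * factor loses its fractional digits.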
Get double as a string
If you are looking just to print the output, then it is very easy and straightforward using stringstream:
#include <cmath>
#include <iostream>
#include <iomanip>
#include <limits>
#include <sstream>
using namespace std;
string truncateAsString(double n, int precision) {
stringstream ss;
double remainder = static_cast<double>((int)floor((n - floor(n)) * precision) % precision);
ss << setprecision(numeric_limits<double> ::max_digits10 + __builtin_ctz(precision))<< floor(n);
if (remainder)
ss << "." << remainder;
cout << ss.str() << endl;
return ss.str();
}
int main(void) {
double a = 9636346.59235;
int precision = 1000; // as many digits as you add zeroes. 3 zeroes means precision of 3.
string s = truncateAsString(a, precision);
return 0;
}
Getting the divided floating point with an exact value
Maybe you are looking for true value for your floating point, you can use boost multiprecision library
The Boost.Multiprecision library can be used for computations requiring precision exceeding that of standard built-in types such as float, double and long double. For extended-precision calculations, Boost.Multiprecision supplies a template data type called cpp_dec_float. The number of decimal digits of precision is fixed at compile-time via template parameter.
Demonstration
#include <boost/math/constants/constants.hpp>
#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>
#include <limits>
#include <cmath>
#include <iomanip>
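using boost::multiprecision::cpp_dec_float_50;
// The code below uses these standard names unqualified.
using std::cout;
using std::endl;
using std::setprecision;
using std::numeric_limits;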
cpp_dec_float_50 truncate(cpp_dec_float_50 n, int precision) {
cpp_dec_float_50 remainder = static_cast<cpp_dec_float_50>((int)floor((n - floor(n)) * precision) % precision) / static_cast<cpp_dec_float_50>(precision);
return floor(n) + remainder;
}
int main(void) {
int precision = 100000; // as many digits as you add zeroes. 5 zeroes means precision of 5.
cpp_dec_float_50 n = 9636346.59235789;
n = truncate(n, precision); // first part is remainder, floor(n) is int value truncated.
cout << setprecision(numeric_limits<cpp_dec_float_50> ::max_digits10 + __builtin_ctz(precision)) << n << endl; // __builtin_ctz(precision) will equal the number of trailing 0, exactly the precision we need!
return 0;
}
Output:
9636346.59235
NB: Requires sudo apt-get install libboost-all-dev

if(a == b) doesn't work for doubles in a for loop

I am at the moment trying to code a titration curve simulator. But I am running into some trouble with comparing two values.
I have created a small working example that perfectly replicates the bug that I encounter:
#include <iostream>
#include <math.h>
using namespace std;
int main()
{
double a, b;
a = 5;
b = 0;
for(double i = 0; i<=(2*a); i+=0.1){
b = i;
cout << "a=" << a << "; b="<<b;
if(a==b)
cout << "Equal!" << endl;
else
cout << endl;
}
return 0;
}
The output at the relevant section is
a=5; b=5
However, if I change the iteration increment from i+=0.1 to i+=1 or i+=0.5 I get an output of
a=5; b=5Equal!
as you would expect.
I am compiling with g++ on linux using no further flags and I am frankly at a loss how to solve this problem. Any pointers (or even a full-blown solution to my problem) are very appreciated.
Unlike integer arithmetic, floating-point arithmetic rounds after every operation, so adding 0.1 fifty times does not give exactly the same double as 5.0.
So the best practice is to check whether the absolute value of their difference is small enough.
If you have some idea on the size of the numbers, you can use a constant:
if (fabs(a - b) < EPS) // equal
If you don't (much slower!):
double a1 = fabs(a), b1 = fabs(b); // keep double precision; float would discard digits
double mn = min(a1,b1), mx = max(a1,b1);
if (mx == 0 || mn / mx > (1 - EPS)) // equal (mx == 0 means both are zero)
Note:
In your code, you can use std::abs instead. Same for std::min/max. The code is clearer/shorter when using the C functions.
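The two checks above can be combined into one helper, sketched here. The name nearlyEqual and the default tolerances are assumptions for illustration; suitable values depend on the magnitudes in your own calculation.

```cpp
#include <algorithm>
#include <cmath>

// Treat a and b as equal when they differ by at most absTol, or by at
// most relTol relative to the larger magnitude. Defaults are assumed.
bool nearlyEqual(double a, double b,
                 double absTol = 1e-12, double relTol = 1e-9) {
    const double diff = std::fabs(a - b);
    if (diff <= absTol) {
        return true;  // covers values at or near zero
    }
    return diff <= relTol * std::max(std::fabs(a), std::fabs(b));
}
```

In the loop from the question, replacing if(a==b) with if(nearlyEqual(a, b)) would report the match at b = 5 even though the accumulated sum is not bit-for-bit equal to 5.0.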
I would recommend restructuring your loop to iterate using integers and then converting the integers into doubles, like this:
double step = 0.1;
for(int i = 0; i*step<=2*a; ++i){
b = i*step;
cout << "a=" << a << "; b="<<b;
if(a==b)
cout << "Equal!" << endl;
else
cout << endl;
}
This still isn't perfect. You possibly have some loss of precision in the multiplication; however, the floating point errors don't accumulate like they do when iterating using floating point values.
Floating point arithmetic is... interesting. Testing equality is annoying with floats/doubles in most languages because many decimal values, 0.1 among them, cannot be represented exactly in IEEE binary floating point. So where you expect an expression to come out as exactly 5.0, the accumulated rounding error can leave you with something like 4.9999999 instead.
Because these numbers are slightly different, you end up with an inequality. Because it's unmaintainable to try to predict which number you will see, you can't/shouldn't attempt to hard code either one of them into your source to test equality with. As a hard rule, avoid directly checking equality of floating point numbers.
Instead, test that they are extremely close to being equal with something like the following:
template<typename T>
bool floatEqual(const T& a, const T& b) {
auto delta = (a < 0 ? -a : a) * 0.03; // use |a| so the range is not inverted for negative a
auto minAccepted = a - delta;
auto maxAccepted = a + delta;
return b > minAccepted && b < maxAccepted;
}
This checks whether b is within a range of + or - 3% of the value of a.

How to generate random double numbers with high precision in C++?

I am trying to generate a number of series of double random numbers with high precision. For example, 0.856365621 (has 9 digits after decimal).
I've found some methods from internet, however, they do generate double random number, but the precision is not as good as I request (only 6 digits after the decimal).
Thus, may I know how to achieve my goal?
In C++11 you can use the <random> header, and in this specific example, using std::uniform_real_distribution, I am able to generate random numbers with more than 6 digits. To set the number of digits that will be printed via std::cout, we need to use std::setprecision:
#include <iostream>
#include <random>
#include <iomanip>
int main()
{
std::random_device rd;
std::mt19937 e2(rd());
std::uniform_real_distribution<> dist(1, 10);
for( int i = 0 ; i < 10; ++i )
{
std::cout << std::fixed << std::setprecision(10) << dist(e2) << std::endl ;
}
return 0 ;
}
You can use std::numeric_limits<double>::digits10 to determine the precision available.
std::cout << std::numeric_limits<double>::digits10 << std::endl;
On a typical system, RAND_MAX is 2^31 - 1 or something similar. So your "precision" from using a method like:
double r = (double)rand() / RAND_MAX;
would be 1/(2^31 - 1), which should give you 8-9 digits of "precision" in the random number. (The cast matters: rand()/RAND_MAX on its own is integer division and yields 0 almost every time.) Make sure you print with high enough precision:
cout << r << endl;
will not do. This will work better:
cout << fixed << setprecision(15) << r << endl;
Of course, there are some systems out there with much smaller RAND_MAX, in which case the results may be less "precise" - however, you should still get digits down in the 9-12 range, just that they are more likely to be "samey".
Why not create your value out of multiple calls of the random function instead?
For instance:
const int numDecimals = 9;
double result = 0.0;
double div = 1.0;
double mul = 1.0;
for (int n = 0; n < numDecimals; ++n)
{
int t = rand() % 10;
result += t * mul;
mul *= 10.0;
div /= 10.0;
}
result = result * div;
I would personally try a new implementation of the rand function though or at least multiply with the current time or something..
In my case, I'm using MQL5, a very close derivative of C++ for a specific market, whose only random generator produces a random integer from 0 to 32767 (= (2^15)-1). Far too low precision.
So I've adapted his idea -- randomly generate a string of digits any length I want -- to solve my problem, more reliably (and arguably more randomly also), than anything else I can find or think of. My version builds a string and converts it to a double at the end -- avoids any potential math/rounding errors along the way (because we all know 0.1 + 0.2 != 0.3 😉 )
Posting it here in case it helps anyone.
(Disclaimer: The following is valid MQL5. MQL5 and C++ are very close, but some differences. eg. No RAND_MAX constant (so I've hard-coded the 32767). I'm not entirely sure of all the differences, so there may be C++ syntax errors here. Please adapt accordingly).
const int RAND_MAX_INCL = 32767;
const int RAND_MAX_EXCL = RAND_MAX_INCL + 1;
int iRandomDigit() {
const double dRand = rand() / (double)RAND_MAX_EXCL; // cast forces floating-point division; double 0.0 <= dRand < 1.0
return (int)(dRand * 10); // int 0 <= result < 10
};
double dRandom0IncTo1Exc(const int iPrecisionDigits) {
int iPrecisionDigits2 = iPrecisionDigits;
if ( iPrecisionDigits > DBL_DIG ) { // DBL_DIG == "Number of significant decimal digits for double type"
Print("WARNING: Can't generate random number with precision > ", DBL_DIG, ". Adjusted precision to ", DBL_DIG, " accordingly.");
iPrecisionDigits2 = DBL_DIG;
};
string sDigits = "";
for (int i = 0; i < iPrecisionDigits2; i++) {
sDigits += (string)iRandomDigit();
};
const string sResult = "0." + sDigits;
const double dResult = StringToDouble(sResult);
return dResult;
}
Noted in a comment on #MasterPlanMan's answer -- the other answers use more "official" methods designed for the question, from standard library, etc. However, I think conceptually it's a good solution when faced with limitations that the other answers can't address.
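For readers working in standard C++ rather than MQL5, the same digit-string idea can be sketched with <random>. The function name randomFromDigits is made up for this example, and drawing digits from std::mt19937 is a substitution for the rand()-based helper above, not the poster's code.

```cpp
#include <algorithm>
#include <cfloat>
#include <random>
#include <string>

// Build "0.d1d2...dn" from n random decimal digits and convert it to a
// double. digits is clamped to [1, DBL_DIG], mirroring the MQL5 version.
double randomFromDigits(int digits, std::mt19937& gen) {
    digits = std::max(1, std::min(digits, DBL_DIG));
    std::uniform_int_distribution<int> digit(0, 9);
    std::string text = "0.";
    for (int i = 0; i < digits; ++i) {
        text += static_cast<char>('0' + digit(gen));
    }
    return std::stod(text);  // result lies in [0.0, 1.0)
}
```

A caller would seed a generator once, for example std::mt19937 gen(std::random_device{}()), and then call randomFromDigits(9, gen) for each value with nine decimal places.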

Force sum of weights in vector to be 100

I need to populate a vector so that it holds a sum of weights, where the sum must be 100. In other words, the number of items is equal to the divisor, and its values are the quotient, to ensure (force) the sum of the vector to equal 100.
Something like this: 100/3=3.333333...
vector[0]=33.33
vector[1]=33.34
vector[2]=33.33
The sum of this needs to be exactly 100 (some sort of selective rounding?)
Another example: 100/6 = 16.66666667
vector[0]=16.67
vector[1]=16.67
vector[2]=16.66
vector[3]=16.67
vector[4]=16.67
vector[5]=16.66
I've seen something like this done in grocery stores where something on sale might be 3 for $11, so the register displays the prices like 3.67, 3.66, and so on.
The values must add up to exactly 100 though I was thinking of doing this with an epsilon but that wouldn't work.
const int divisor = 6;
const int dividend = 10;
std::vector<double> myVec;
myVec.resize(6);
for (int i = 0; i < divisor; ++i)
{
...some magic that I don't know how to do
}
EDIT: The client wants the values stored (and displayed) in values fixed at two decimal places to visually see they add to 100.
Like the comments say, store money in terms of cents.
#include <vector>
#include <iostream>
#include <iomanip>
std::vector<int> function(int divisor, int total) {
std::vector<int> myVec(divisor);
for (int i = 0; i < divisor; ++i) {
myVec[i] = total/divisor; //rounding down
if (i < total%divisor) //for each leftover
myVec[i] += 1; //add one of the leftovers
}
return myVec;
}
void print_dollars(int cents) {
std::cout << (cents/100) << '.';
std::cout << std::setw(2) << std::setfill('0') << (cents%100) << ' ';
}
int main() {
std::vector<int> r = function(6, 10000);
int sum=0;
for(int i=0; i<r.size(); ++i) {
print_dollars(r[i]);
sum += r[i];
}
std::cout << '\n';
print_dollars(sum);
}
//16.67 16.67 16.67 16.67 16.66 16.66
//100.00
When you divide 10000 cents by 6, you get 1666, with 4 left over. The loop gives one of those 4 leftover cents to each of the first four slots of the vector. Proof of compilation: http://ideone.com/jrInai
There's no one "correct" way to do this. A place to start would be to add up the contents of the vector, and find the difference between 100 and that result you obtained. How you'd fold that in to the individual items would inherently be a heuristic. There are a couple of routes you could take:
Add the difference you found divided by the number of elements in the vector to each element in the vector. This has the advantage that it'll affect an individual value by the smallest amount possible in order to achieve your constraint.
You might want to just add the difference to the first or last element in the vector. This has the advantage that the fewest number of elements in the vector are modified.
You might want to list a separate rounding error element in the vector, which will just be the difference. This gives the most "correct" answer, but might not be what your users want.
Only you can decide what kind of heuristic to use based on the application you're building.
It should be noted that using floating point numbers (e.g. float, double, and long double) may result in errors when storing money values -- you should use fixed point decimal arithmetic for such calculations, because that's how money calculations are done in "the real world". Because floating point uses the base 2 number system internally (on most systems), there will be small rounding errors induced in the conversion from decimal to binary and back. You'll likely have no problem with small values, but if the dollar value is large you'll start seeing problems with the number of digits of precision available in a double.
You can divide into whatever is left as you go, subtracting the last value from the remaining amount.
const int divisor = 6;
const int dividend = 10;
std::vector<double> myVec;
myVec.reserve(6);
double remain = 100.0;
for (int i = divisor; i >= 1; --i)
{
double val = remain / (double)i;
remain -= val;
myVec.push_back(val);
}
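As written, that loop stores unrounded doubles. A variant of the same idea in integer cents, rounding each share to the nearest cent as it goes, might look like the sketch below. The name splitByRemaining and the use of cents are assumptions for this example, and it expects a non-negative total and at least one part.

```cpp
#include <vector>

// Divide what is left by the number of slots still to fill, round to
// the nearest cent, and subtract. The last slot takes exactly the rest,
// so the shares always add up to totalCents.
std::vector<long long> splitByRemaining(long long totalCents, int parts) {
    std::vector<long long> shares;
    shares.reserve(parts);
    long long remain = totalCents;
    for (int i = parts; i >= 1; --i) {
        // Integer round-half-up of remain / i (valid for remain >= 0).
        const long long val = (2 * remain + i) / (2 * i);
        remain -= val;
        shares.push_back(val);
    }
    return shares;
}
```

For 10000 cents in 3 parts this produces 3333, 3334, 3333, which is the 33.33 / 33.34 / 33.33 pattern from the question.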
In your example,
100/6=16.67 (rounded)
Then you just multiply it by 6-1=5 and get 83.35
And now you know that in order to make the sum to be exactly 100, you need to make the price of the last element to be equal to
100 - 83.35 = 16.65
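The arithmetic above can be sketched in integer cents so that the sum is exact. The function name splitRemainderLast is hypothetical, and the sketch assumes a non-negative total and at least one part.

```cpp
#include <vector>

// Give every slot the rounded share, then set the last slot to whatever
// is needed to reach the total exactly.
std::vector<long long> splitRemainderLast(long long totalCents, int parts) {
    // Integer round-half-up of totalCents / parts.
    const long long share = (2 * totalCents + parts) / (2 * parts);
    std::vector<long long> shares(parts, share);
    shares.back() = totalCents - share * (parts - 1);
    return shares;
}
```

For 10000 cents in 6 parts, the first five slots hold 1667 and the last holds 1665, matching 16.67 and 16.65 in the example.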