How to obtain a value based on a certain probability - c++

I have some functions which generate double, float, short, long random values. I have another function to which I pass the datatype and which should return a random value. Now I need to choose in that function the return value based on the passed datatype. For example, if I pass float, I need:
the probability that the return is a float is 70%, the probability that the return is a double, short or long is 10% each. I can make calls to the other function for generating the corresponding random values, but how do I fit in the probabilistic weights for the final return? My code is in C++.
Some pointers are appreciated.
Thanks.

C++ random numbers have uniform distribution. If you need random variables of another distribution you need to base its mathematical formula on uniform distribution.
If you don't have a mathematical formula for your random variable you can do something like this:
int x = rand() % 10;
if (x < 7)
{
// return float
}
else if (x == 7)
{
// return double
}
else if (x == 8)
{
// return short
}
else if (x == 9)
{
// return long
}

As an alternative for future reference, the following approach handles
precise probabilities such as 99.999% or 0.0001%.
To test against a real-valued probability, do this:
//70%
double probability = 0.7;
double result = (double)rand() / RAND_MAX; // cast first: integer division would almost always give 0
if(result < probability)
//do something
I have used this method to create very large percolated grids and it works like a charm for precision values.
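The same test can also be written with the C++11 <random> facilities instead of rand(). This is only a minimal sketch: the function name happens is made up for illustration, and the caller is assumed to own and seed the std::mt19937 engine.

```cpp
#include <random>

// Returns true with the given probability (expected to lie in [0, 1]).
// `happens` is a hypothetical helper name, not part of any library.
bool happens(double probability, std::mt19937& rng) {
    std::bernoulli_distribution trial(probability);
    return trial(rng);
}
```

Calling happens(0.7, rng) repeatedly should come out true in roughly 70% of the calls.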

I do not know if I understand correctly what you want to do, but if you just want to assure that the probabilities are 70-10-10-10, do the following:
generate a random number r in (1,2,3,4,5,6,7,8,9,10)
if r <= 7: float
if r == 8: short
if r == 9: double
if r == 10: long
I think you recognize and can adapt the pattern to arbitrary probability values.
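For arbitrary weights, one possible sketch uses std::discrete_distribution from <random>, which picks an index with probability proportional to its weight. The name pick_index and the mapping of indices to types are assumptions made for this example.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Picks an index in [0, weights.size()) with probability proportional
// to weights[i]. `pick_index` is a hypothetical helper name.
std::size_t pick_index(const std::vector<double>& weights, std::mt19937& rng) {
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}
```

With weights {7, 1, 1, 1}, index 0 (say, float) should be chosen about 70% of the time and each of the other three about 10% of the time.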

mmonem has a nice probabilistic switch, but returning different types isn't trivial either. You need a single type that may adequately (for your purposes) encode any of the values - check out boost::any, boost::variant, union, or convert to the most capable type (probably double), or a string representation.
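As a rough illustration of the union option, a tagged union can carry any of the four types together with a tag saying which one is stored. Everything here (Kind, RandomValue, make_value, and the stand-in generators) is hypothetical; the asker's own generator functions would replace the distributions used below.

```cpp
#include <random>

// Hypothetical tag recording which member of the union is valid.
enum class Kind { Float, Double, Short, Long };

struct RandomValue {
    Kind kind;
    union {          // only the member named by `kind` may be read
        float f;
        double d;
        short s;
        long l;
    };
};

// Stand-in generators: reals in [0, 1), integers in [0, 100].
RandomValue make_value(Kind kind, std::mt19937& rng) {
    RandomValue v;
    v.kind = kind;
    std::uniform_real_distribution<double> real(0.0, 1.0);
    std::uniform_int_distribution<long> whole(0, 100);
    switch (kind) {
        case Kind::Float:  v.f = static_cast<float>(real(rng)); break;
        case Kind::Double: v.d = real(rng); break;
        case Kind::Short:  v.s = static_cast<short>(whole(rng)); break;
        case Kind::Long:   v.l = whole(rng); break;
    }
    return v;
}
```

The caller checks v.kind before reading a member. boost::variant (or std::variant in C++17) does the same bookkeeping with more type safety.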

Related

Inputting Huge Binary Numbers C++

So I am working on a competitive programming problem where you take two numbers, one in base two and one in base three, and then do some operations on them. I got the code working correctly, but it fails on large inputs, for example 10010111011100101101100011011 (base two) and 211010102022001220 (base three). This is because I read them as regular integers and then convert them to their actual base 10 values.
This is my conversion function (just for bases 2 and 3)
int conv(int base, ll n){
int result = 0;
if(base == 2){
int a = 0;
while(n > 0){
if(n % 2 == 1){
result += pow(2, a);
}
a++;
n /= 10;
}
return result;
}
else if(base == 3){
int a = 0;
while(n > 0){
result += (n%10)%3 * pow(3, a);
a++;
n /= 10;
}
return result;
}
return result;
}
When I run this function on really small numbers like conv(2, 1010), it works; I get 10 (converting 1010 base 2 into base 10)
However, if I want to take in 10010111011100101101100011011(base 2), my code doesn't seem to work. Is there any other way I can do this?
Depending on what particular operations you want to perform on your base-2 and base-3 numbers, it may be feasible to just operate on string values. The advantage is that you are virtually unbounded.
Read your input numbers as strings. Write a function to verify that a given string is a valid number in a given base. Then write those operations to work on strings. It will be easy for addition and subtraction, but will require a few more steps for multiplication and division. If you need to handle trigonometry and transcendental functions, then it will be a difficult but doable math problem. Basically you want to implement exactly the steps you undergo when you perform the operations by hand, complete with carrying and borrowing.
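If the values themselves fit in 64 bits and only the decimal-looking input overflows, reading the digits as a string and accumulating them with Horner's rule may already be enough. A minimal sketch, assuming bases up to 10 and no overflow checking (parse_base is a made-up name):

```cpp
#include <stdexcept>
#include <string>

// Converts a digit string in the given base (2..10) to its numeric value.
// Assumption: the result fits in unsigned long long; overflow is not detected.
unsigned long long parse_base(const std::string& digits, int base) {
    unsigned long long value = 0;
    for (char c : digits) {
        const int d = c - '0';
        if (d < 0 || d >= base)
            throw std::invalid_argument("digit out of range for base");
        value = value * base + d;  // Horner's rule: shift left one digit, add the next
    }
    return value;
}
```

For example, parse_base("1010", 2) gives 10, and the 29-digit binary input from the question fits comfortably in 64 bits.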
There are different tricks for dealing with large integers in C++. In general you can use Boost's cpp_int.
cpp_int does not have a [theoretical] maximum value. In practice, its size is limited by the memory a pointer can address, which on 64-bit architectures is usually, but not always, 2^64 bytes. The limit is therefore on the number of digits rather than on the value: a cpp_int can hold values far larger than 2^64-1, until memory runs out.

Getting a float [0,1] uniform distribution from xorshift64* (uint64_t)

I'm implementing the Xorshift generators and others to compare their performance on my system (Windows and Linux).
https://en.wikipedia.org/wiki/Xorshift
http://xoroshiro.di.unimi.it/
I'm just now checking the generators with 64-bit state, like the xorshift64star from Wikipedia (shown here with my changes to trace the error):
double xorshift64star() {
uint64_t x = global_state[0]; /* The state must be seeded with a nonzero value. */
x ^= x >> 12; // a
x ^= x << 25; // b
x ^= x >> 27; // c
global_state[0] = x;
auto u64val = x * 0x2545F4914F6CDD1D;
double dval = (double)u64val;
return dval;
}
However, running on an online compiler (https://www.onlinegdb.com/), the double value returned is always 0 or 3.1148823182455562e-317.
I haven't been able to find out how to normalize the output of this function into a [0,1] uniform distribution without losing much precision and entropy.
What is the "correct" transformation I should apply to the output?
SOLVED!
Thank you @RetiredNinja.
The generator was already normalizing the uint64 value. However, simply casting it to double doesn't seem to work with that particular compiler.
The solution was to use the custom cast from http://xoroshiro.di.unimi.it/
static inline double to_double(uint64_t x) {
const union { uint64_t i; double d; } u = {.i = UINT64_C(0x3FF) << 52 | x >> 12 };
return u.d - 1.0;
}
Assuming that your u64val is uniform between 0 and numeric_limits<uint64_t>::max(), the obvious transform is double(u64val) / numeric_limits<uint64_t>::max(); the conversion to double is needed because integer division would yield 0 almost every time.
This is not the most accurate transform, though. The problem here is that you end up generating multiples of 1.0/numeric_limits<uint64_t>::max(). This obviously gives many small values a probability of zero. But consider this: the probabilities of all numbers between 0 and 1e-100 combined have to be 1e-100. That means you'd need to generate about 1e100 numbers to get any one of these numbers.
This basically means we've got an underspecified engineering problem here. Exactly how close should the approximation to uniform be?
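One common compromise, shown here only as a sketch, keeps the top 53 bits (the width of a double's significand) and scales them by 2^-53, which gives evenly spaced values in [0, 1). The name to_unit_double is an assumption for this example.

```cpp
#include <cstdint>

// Maps a uniform 64-bit integer to a double in [0, 1) using its top 53 bits.
// Every result is an exact multiple of 2^-53, so no rounding takes place.
double to_unit_double(std::uint64_t x) {
    const double two_pow_53 = 9007199254740992.0;  // 2^53
    return static_cast<double>(x >> 11) / two_pow_53;
}
```

The low 11 bits are discarded, so values below 2^-53 other than 0 can never appear. That is exactly the kind of trade-off the answer above describes.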

Calculating Probability C++ Bernoulli Trials

The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
Flawlessly, the program creates a random number between 0 and 1. 0's are considered heads and success.
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the numbers come out to weird values. It happens if 15 or twenty are put into the input. I have been getting 0's and negative values for the value that should be xxx.
While debugging, I noticed that nCk comes out negative and incorrect for the larger values, and I believe this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
long fact(int x)
{
int e; // local counter
factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.
You haven't mentioned how factor is declared. I think you are getting integer overflows. I suggest you use double: since you are calculating expected values and probabilities, you shouldn't need to be too concerned about exact precision.
Try changing your fact function to:
double fact(double x)
{
int e; // local counter
double factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
EDIT:
Also to calculate nCk, you need not calculate factorials 3 times. You can simply calculate this value in the following way.
if k > n/2, k = n-k.
nCk = n(n-1)(n-2)...(n-k+1) / factorial(k)
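That shortcut can be sketched as follows. Multiplying and dividing one factor at a time keeps every intermediate value a binomial coefficient, so nothing ever grows as large as n!. The names choose and binomial_pmf are made up for this example.

```cpp
#include <cmath>

// nCk computed incrementally; each intermediate result is itself a
// binomial coefficient, so it stays far smaller than n!.
double choose(int n, int k) {
    if (k < 0 || k > n) return 0.0;
    if (k > n - k) k = n - k;              // use the symmetry nCk == nC(n-k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}

// Probability of exactly k successes in n trials: nCk * p^k * (1-p)^(n-k).
double binomial_pmf(int n, int k, double p) {
    return choose(n, k) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

For instance, choose(20, 10) is 184756, well within range even though 20! overflows a 32-bit long.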
You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number--what type that is will depend on what values you need.
Long is a signed integer, and as soon as you pass 2^31 - 1 (for a 32-bit long), the value will wrap around and become negative (it's using 2's complement math).
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++ as it will provide a class that acts like a number and that would use standard arithmetic operators.

Modulo operator usage while dealing with doubles

I have to find the value of ( 1+sqrt(3) )^n where n < 10^9. As this number can be very large, we have to print ans%1000000007.
I have written the following function for this.
double power(double x, int y)
{
double temp;
if( y == 0)
return 1;
temp = power(x, y/2);
if (y%2 == 0)
return temp*temp;
else
{
if(y > 0)
return x*temp*temp;
else
return (temp*temp)/x;
}
}
Now, I am unable to understand how to take care of the modulo condition. Can somebody please help?
You can't do that. You could use fmod, but since sqrt(3) cannot be exactly represented, you'd get bogus values for large exponents.
I'm rather confident that you actually need integer results ((1 + sqrt(3))^n + (1 - sqrt(3))^n), so you should use integer math, exponentiation by squaring with a modulo operation at each step. cf. this question
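Under that assumption (that the wanted quantity is the integer (1 + sqrt(3))^n + (1 - sqrt(3))^n), a sketch of the integer approach represents a + b*sqrt(3) as a pair (a, b) and reduces modulo 1000000007 at every step. Surd, mul, power and conjugate_sum are names invented for this example.

```cpp
#include <cstdint>

const std::uint64_t MOD = 1000000007ULL;

// Represents a + b*sqrt(3), with a and b already reduced modulo MOD.
struct Surd { std::uint64_t a, b; };

// (a1 + b1*sqrt3)(a2 + b2*sqrt3) = (a1*a2 + 3*b1*b2) + (a1*b2 + b1*a2)*sqrt3
Surd mul(Surd x, Surd y) {
    return { (x.a * y.a + 3 * (x.b * y.b % MOD)) % MOD,
             (x.a * y.b + x.b * y.a) % MOD };
}

// Exponentiation by squaring: O(log n) multiplications.
Surd power(Surd base, std::uint64_t n) {
    Surd result = {1, 0};
    while (n > 0) {
        if (n & 1) result = mul(result, base);
        base = mul(base, base);
        n >>= 1;
    }
    return result;
}

// The sqrt(3) parts cancel in the sum, leaving twice the rational part.
std::uint64_t conjugate_sum(std::uint64_t n) {
    return 2 * power({1, 1}, n).a % MOD;
}
```

The first few values are 2, 2, 8, 20 for n = 0, 1, 2, 3, matching the recurrence S(n) = 2*S(n-1) + 2*S(n-2).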
This approach is infeasible. As shown in this question, you would need hundreds of millions of decimal digits more precision than the double type supplies. The problem you are actually trying to solve is discussed here. Are you two in the same class?
modulo needs integer type, you could use union for your double type unioned with an integer to use the modulo(if this is C)

C++: how can I test if a number is power of ten?

I want to test if a number double x is an integer power of 10. I could perhaps use cmath's log10 and then test if x == (int) x?
edit: Actually, my solution does not work because doubles can be very big, much bigger than int, and also very small, like fractions.
A lookup table will be by far the fastest and most precise way to do this; only about 600 powers of 10 are representable as doubles. You can use a hash table, or if the table is ordered from smallest to largest, you can rapidly search it with binary chop.
This has the advantage that you will get a "hit" if and only if your number is exactly the closest possible IEEE double to some power of 10. If this isn't what you want, you need to be more precise about exactly how you would like your solution to handle the fact that many powers of 10 can't be exactly represented as doubles.
The best way to construct the table is probably to use string -> float conversion; that way hopefully your library authors will already have solved the problem of how to do the conversion in a way that gives the most precise answer possible.
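A minimal sketch of such a table, built with std::strtod and searched with std::binary_search. It covers only the normal double range (exponents -307 to 308), which is an assumption of this example, as are the function names.

```cpp
#include <algorithm>
#include <cstdlib>
#include <limits>
#include <string>
#include <vector>

// Builds the table in ascending order by parsing "1e-307" ... "1e308",
// letting the library's string conversion pick the nearest double.
static std::vector<double> build_powers_of_ten() {
    std::vector<double> table;
    for (int k = std::numeric_limits<double>::min_exponent10;
         k <= std::numeric_limits<double>::max_exponent10; ++k) {
        const std::string text = "1e" + std::to_string(k);
        table.push_back(std::strtod(text.c_str(), nullptr));
    }
    return table;
}

// True only if x is exactly the double nearest to some power of ten.
bool is_power_of_ten(double x) {
    if (!(x > 0.0)) return false;  // rejects zero, negatives and NaN
    static const std::vector<double> table = build_powers_of_ten();
    return std::binary_search(table.begin(), table.end(), x);
}
```

Because the comparison is exact, a value such as 999.9999999 is rejected. Whether that is desirable depends on how much tolerance the caller wants.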
Your solution sounds good but I would replace the exact comparison with a tolerance one.
double exponent = log10(value);
double rounded = floor(exponent + 0.5);
if (fabs(exponent - rounded) < some_tolerance) {
//Power of ten
}
I am afraid you're in for a world of hurt. There is no way to cast a very large or very small floating point number down to a BigInt class, because you lose precision when using the floating point number.
For example, float only has 6 digits of precision. So if you represent 10^9 as a float, chances are it will be converted back as 1 000 000 145 or something like that: nothing guarantees what the last digits will be, as they are beyond the precision.
You can of course use a much more precise representation, like double, which has 15 digits of precision. So normally you should be able to represent integers from 0 to 10^14 faithfully.
Finally, some platforms may have a long long type with an even greater precision.
But anyway, as soon as your value exceeds the number of digits that can be converted back to an integer without loss... you can't test it for being a power of ten.
If you really need this precision, my suggestion is not to use a floating point number. There are mathematical libraries available with BigInt implementations or you can roll your own (though efficiency is difficult to achieve).
bool power_of_ten(double x) {
if(x < 1.0 || x > 10E15) {
warning("IEEE754 doubles can only precisely represent powers "
"of ten between 1 and 10E15, answer will be approximate.");
}
double exponent;
// power of ten if log10 of absolute value has no fractional part
return !modf(log10(fabs(x)), &exponent);
}
Depending on the platform your code needs to run on the log might be very expensive.
Since the amount of numbers that are 10^n (where n is natural) is very small,
it might be faster to just use a hardcoded lookup table.
(Ugly pseudo code follows:)
bool isPowerOfTen( int16 x )
{
if( x == 10 // n=1
|| x == 100 // n=2
|| x == 1000 // n=3
|| x == 10000 ) // n=4
return true;
return false;
}
This covers the whole int16 range and if that is all you need might be a lot faster.
(Depending on the platform.)
How about a code like this:
#include <stdio.h>
#define MAX 512 /* large enough for the "%f" output of any double */
bool check_pow10(double num)
{
char arr[MAX];
snprintf(arr, sizeof arr, "%f", num);
char* ptr = arr;
bool isFirstOne = true;
while (*ptr)
{
switch (*ptr++)
{
case '1':
if (isFirstOne)
isFirstOne = false;
else
return false;
break;
case '0':
break;
case '.':
break;
default:
return false;
}
}
return !isFirstOne; // require exactly one '1', so that 0 is rejected
}
int main()
{
double number;
scanf("%lf",&number);
printf("isPower10: %s\n",check_pow10(number)?"yes":"no");
}
That would not work for negative powers of 10 though.
EDIT: works for negative powers also.
if you don't need it to be fast, use recursion. Pseudocode:
bool checkifpoweroften(double Candidate)
if Candidate <= 0
return 0
elsif Candidate >= 10
return checkifpoweroften(Candidate/10)
elsif Candidate <= 0.1
return checkifpoweroften(Candidate*10)
elsif Candidate == 1
return 1
else
return 0
You still need to choose between false positives and false negatives and add tolerances accordingly, as other answers pointed out. The tolerances should apply to all comparisons, or else, for example, 9.99999999 would fail the >=10 comparison.
how about that:
bool isPow10(double number, double epsilon)
{
if (number > 0)
{
for (int i=1; i <16; i++)
{
if ( (number >= (pow((double)10,i) - epsilon)) &&
(number <= (pow((double)10,i) + epsilon)))
{
return true;
}
}
}
return false;
}
I guess if performance is an issue the few values could be precomputed, with or without the epsilon according to the needs.
A variant of this one:
double log10_value= log10(value);
double integer_value;
double fractional_value= modf(log10_value, &integer_value);
return fractional_value==0.0;
Note that the comparison to 0.0 is exact rather than within a particular epsilon since you want to ensure that log10_value is an integer.
EDIT: Since this sparked a bit of controversy due to log10 possibly being imprecise and the generic understanding that you shouldn't compare doubles without an epsilon, here's a more precise way of determining if a double is a power of 10 using only properties of powers of 10 and IEEE 754 doubles.
First, a clarification: a double can exactly represent powers of 10 only up to 1E22, as 1e22 has only 52 significant bits. Luckily, 5^22 also has only 52 significant bits, so we can determine if a double is (2*5)^n for n = [0, 22]:
bool is_pow10(double value)
{
int exponent;
double mantissa= frexp(value, &exponent);
int exponent_adjustment= exponent/10;
int possible_10_exponent= (exponent - exponent_adjustment)/3;
if (possible_10_exponent>=0 &&
possible_10_exponent<=22)
{
mantissa*= pow(2.0, exponent - possible_10_exponent);
return mantissa==pow(5.0, possible_10_exponent);
}
else
{
return false;
}
}
Since 2^10==1024, that adds an extra bit of significance that we have to remove from the possible power of 5.