Best way of testing randomized function

Let's say I have a random number generator for 3 categories:
Prob    Yield
0.1     10
0.2     5
0.7     2
The expected yield is 1 + 1 + 1.4 = 3.4
Currently I have something like this:
sum = 0
N = 10000
for i in 1 to N:
    sum += getYield()
assert(abs(sum / N - 3.4) < some threshold)
So what is the best criterion I can use for this unit test to make an assertion?
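One possible criterion, offered only as a sketch: derive the threshold from the standard error of the sample mean instead of picking it by hand. The names get_yield, expected_and_std and sample_mean_within_tolerance are hypothetical stand-ins (the question's generator is not shown), and the fixed seed and the 5-sigma factor are assumptions chosen to keep the test repeatable and false alarms rare.

```python
import math
import random

# (probability, yield) pairs taken from the table in the question.
CATEGORIES = [(0.1, 10), (0.2, 5), (0.7, 2)]


def get_yield(rng):
    """Hypothetical stand-in for getYield(): draw one yield value."""
    values = [v for _, v in CATEGORIES]
    weights = [p for p, _ in CATEGORIES]
    return rng.choices(values, weights=weights, k=1)[0]


def expected_and_std():
    """Return the theoretical mean and standard deviation of one draw."""
    mean = sum(p * v for p, v in CATEGORIES)
    variance = sum(p * v * v for p, v in CATEGORIES) - mean ** 2
    return mean, math.sqrt(variance)


def sample_mean_within_tolerance(n=10_000, seed=12345, sigmas=5.0):
    """Check that the sample mean lies within `sigmas` standard errors."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    mean, std = expected_and_std()
    sample_mean = sum(get_yield(rng) for _ in range(n)) / n
    tolerance = sigmas * std / math.sqrt(n)  # standard error shrinks as 1/sqrt(n)
    return abs(sample_mean - mean) < tolerance
```

For this distribution the variance of a single draw is 0.1*100 + 0.2*25 + 0.7*4 - 3.4^2 = 6.24, so with N = 10000 the standard error is roughly 0.025 and a 5-sigma threshold is roughly 0.125.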

Related

prime numbers algorithm efficiency

I have a question about a prime number algorithm.
Why, in the following pseudocode, does i increase by 6 and not by 2 on every iteration?
function is_prime(n)
    if n ≤ 1
        return false
    else if n ≤ 3
        return true
    else if n mod 2 = 0 or n mod 3 = 0
        return false
    let i ← 5
    while i * i ≤ n
        if n mod i = 0 or n mod (i + 2) = 0
            return false
        i ← i + 6
    return true
Thanks!

If it increased by 2, it would be testing almost everything twice, which wouldn't make any sense. So I assume you mean: how can it get away with not testing every odd number?
This is because every prime p greater than 3 is of the form 6n±1. Proof:
Consider the remainder r = p mod 6. r must be odd, because p is odd and 6 is even. Notice also that r cannot be 3, because then p would be divisible by 3, making it not a prime. This leaves only the possibilities 1 and 5, which correspond to p being of the form 6n+1 or the form 6n-1 respectively.
The effect is that it avoids testing odd multiples of 3 as divisors. Dividing by a multiple of 3 is redundant, because we already know that n is not a multiple of 3, so it cannot be a multiple of a multiple of 3 either.
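As a rough check of the argument above, here is a Python transcription of the pseudocode from the question, compared against plain trial division. The helper names is_prime_naive and candidate_divisors are made up for this sketch.

```python
def is_prime(n):
    """Transcription of the question's pseudocode (6k +/- 1 trial division)."""
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        # i is of the form 6k - 1 and i + 2 is of the form 6k + 1.
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True


def is_prime_naive(n):
    """Reference implementation: try every divisor from 2 to n - 1."""
    return n > 1 and all(n % d != 0 for d in range(2, n))


def candidate_divisors(limit):
    """List the divisors the loop would try while i <= limit."""
    out = []
    i = 5
    while i <= limit:
        out.extend([i, i + 2])
        i += 6
    return out
```

The candidate list contains every number that is not a multiple of 2 or 3, which includes some composites such as 25. Testing those is harmless, just slightly redundant.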

The assignment in the loop body is i <- i + 6, not i <- i + 2. In the if statement the expression i + 2 just becomes a new value. There is no assignment operator in that expression.

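The algorithm is based on the fact that every prime number greater than 3 can be written as 6k ± 1 (this does not apply to 2 and 3).
For instance
(6 * 1) - 1 = 5
(6 * 2) - 1 = 11
(6 * 3) - 1 = 17
The list goes on and on. Note that the converse does not hold: not every number of the form 6k ± 1 is prime (for example (6 * 4) + 1 = 25), so the formula narrows down the candidates rather than predicting primes.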

Big O Notation Confusion (C++)

#include <vector>

int f(const std::vector<int>& v) {
    int result = 0;
    for (int i = 0; i < v.size(); ++i) {           // O(N)
        for (int j = v.size(); j >= 0; j -= 2) {   // O(N/2)
            result += v.at(i) * j;
        }
    }
    return result;
}
The inner for loop is said to be O(N/2); however, I am wondering why this is.
For example, if v.size() is 10, then
10 >= 0 ✓
8 >= 0 ✓
6 >= 0 ✓
4 >= 0 ✓
2 >= 0 ✓
0 >= 0 ✓
-2 Fails
So the inner for loop is executed 6 times with an input size of 10, not 5.
What am I missing?
EDIT: I understand that only the highest-order term is taken into consideration. This question was more about how to arrive at the original O(N/2 + 1).

Complexity describes how the running time grows with the size of the input, not the exact time a particular run takes.
Therefore, when dealing with complexity, you only keep the fastest-growing term and drop constant factors:
O(N/2 + 1) = O(N/2) = O(N)

In a comment, you said:
I understand this, but I am just curious as to how O(N/2) is obtained
Take a look at the following table:
Size of vector    Number of times the inner loop body is executed
0                 1
1                 1
2                 2
3                 2
...
100               51
101               51
...
2x                x + 1
2x + 1            x + 1
In other words, for a vector of size N the inner loop runs floor(N/2) + 1 times. If you take the constant 1 out of that expression, the inner loop is O(N/2).
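The table can be reproduced by simply counting iterations. This is a small Python sketch of the inner loop's control flow only; the function name inner_loop_iterations is invented for illustration.

```python
def inner_loop_iterations(n):
    """Count how often `for (j = n; j >= 0; j -= 2)` runs its body."""
    count = 0
    j = n
    while j >= 0:
        count += 1
        j -= 2
    return count


# Print a few rows next to the formula floor(N/2) + 1 for comparison.
for size in (0, 1, 2, 3, 10, 100, 101):
    print(size, inner_loop_iterations(size), size // 2 + 1)
```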

How should I go about solving this recursion without trial and error

int sum_down(int x)
{
    if (x >= 0)
    {
        x = x - 1;
        int y = x + sum_down(x);
        return y + sum_down(x);
    }
    else
    {
        return 1;
    }
}
What is the smallest integer value of the parameter x for which the returned value is greater than 1,000,000?
Right now I am just doing it by trial and error, but the question is asked on paper, so I don't think I will have enough time for that. How do you visualise this quickly so that it can be solved easily? I am new to programming, so thanks in advance!

The recursion logic:
x = x - 1;
int y = x + sum_down(x);
return y + sum_down(x);
can be simplified to:
x = x - 1;
int y = x + sum_down(x) + sum_down(x);
return y;
which can be simplified to:
int y = (x-1) + sum_down(x-1) + sum_down(x-1);
return y;
which can be simplified to:
return (x-1) + 2*sum_down(x-1);
Put in mathematical form,
f(N) = (N-1) + 2*f(N-1)
with the recursion terminating when N is -1. f(-1) = 1.
Hence,
f(0) = -1 + 2*1 = 1
f(1) = 0 + 2*1 = 2
f(2) = 1 + 2*2 = 5
...
f(18) = 17 + 2*f(17) = 524269
f(19) = 18 + 2*524269 = 1048556
So the smallest x for which the returned value exceeds 1,000,000 is 19.
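To double-check that the simplification preserves the original behaviour, here is a Python sketch that transcribes the original two-call function and compares it with the simplified recurrence. The names sum_down_original, f and smallest_x_exceeding are made up for this sketch.

```python
def sum_down_original(x):
    """Direct transcription of the question's function (two recursive calls)."""
    if x >= 0:
        x = x - 1
        y = x + sum_down_original(x)
        return y + sum_down_original(x)
    return 1


def f(n):
    """Simplified recurrence: f(n) = (n - 1) + 2 * f(n - 1), f(-1) = 1."""
    return 1 if n < 0 else (n - 1) + 2 * f(n - 1)


def smallest_x_exceeding(limit):
    """Walk the recurrence upwards until the value passes `limit`."""
    x, value = -1, 1  # f(-1) = 1
    while value <= limit:
        x += 1
        value = (x - 1) + 2 * value
    return x, value
```

Walking the recurrence upwards needs only one multiplication and one addition per step, which is also how the table of values above can be filled in by hand.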

Your program can be written this way (sorry about c#):
public static void Main()
{
    int i = 0;
    int j = 0;
    do
    {
        i++;
        j = sum_down(i);
        Console.Out.WriteLine("j:" + j);
    } while (j <= 1000000); // stop at the first value greater than 1,000,000
    Console.Out.WriteLine("i:" + i);
}
static int sum_down(int x)
{
    if (x >= 0)
    {
        return x - 1 + 2 * sum_down(x - 1);
    }
    else
    {
        return 1;
    }
}
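So the first iterations give 2, then 5, then 12, and so on. The x-1 part stays small compared to the doubling, so you can neglect it and treat each step as a multiplication by 2.
So we have:
i = 1 => sum_down ~= 4 (real is 2)
i = 2 => sum_down ~= 8 (real is 5)
i = 3 => sum_down ~= 16 (real is 12)
i = 4 => sum_down ~= 32 (real is 27)
i = 5 => sum_down ~= 64 (real is 58)
So we can say that sum_down(x) ~= 2^(x+1). Then it's just basic math: 2^(x+1) > 1 000 000 first holds for x+1 = 20 (2^19 = 524 288, 2^20 = 1 048 576), which gives x = 19. Because the estimate is slightly too high, it is worth checking the exact value at that point: sum_down(19) = 1 048 556, which is still greater than 1 000 000.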

A bit late, but it's not that hard to get an exact non-recursive formula.
Write it up mathematically, as explained in other answers already:
f(-1) = 1
f(x) = 2*f(x-1) + x-1
This is the same as
f(-1) = 1
f(x+1) = 2*f(x) + x
(just switched from x and x-1 to x+1 and x, difference 1 in both cases)
The first few x and f(x) are:
x: -1 0 1 2 3 4
f(x): 1 1 2 5 12 27
And while there are many arbitrarily complicated ways to transform this into a non-recursive formula, with easy ones it often helps to write down the difference between each pair of neighbouring elements:
x:      -1   0   1   2   3   4
f(x):    1   1   2   5  12  27
diff:      0   1   3   7  15
So, for some x
f(x+1) - f(x) = 2^(x+1) - 1
f(x+2) - f(x) = (f(x+2) - f(x+1)) + (f(x+1) - f(x)) = 2^(x+2) + 2^(x+1) - 2
f(x+n) - f(x) = sum[0<=i<n](2^(x+1+i)) - n
Now insert, for example, x=0, which turns f(x+n) into f(n):
f(x+n) - f(x) = sum[0<=i<n](2^(x+1+i)) - n
f(0+n) - f(0) = sum[0<=i<n](2^(0+1+i)) - n
f(n) - 1 = sum[0<=i<n](2^(i+1)) - n
f(n) = sum[0<=i<n](2^(i+1)) - n + 1
f(n) = sum[0<i<=n](2^i) - n + 1
f(n) = (2^(n+1) - 2) - n + 1
f(n) = 2^(n+1) - n - 1
No recursion anymore.
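A quick way to gain confidence in the closed form is to compare it with the recurrence for a range of inputs. This Python sketch uses the hypothetical names f_recursive, f_closed and smallest_n_exceeding.

```python
def f_recursive(x):
    """Recurrence from above: f(-1) = 1, f(x) = 2*f(x-1) + x - 1."""
    return 1 if x < 0 else 2 * f_recursive(x - 1) + x - 1


def f_closed(n):
    """Closed form derived above: f(n) = 2^(n+1) - n - 1, for n >= -1."""
    return 2 ** (n + 1) - n - 1


def smallest_n_exceeding(limit):
    """Smallest n >= 0 with f(n) > limit, using only the closed form."""
    n = 0
    while f_closed(n) <= limit:
        n += 1
    return n
```

With the closed form, the paper exercise reduces to finding the first power of two that clears the limit: 2^20 - 20 = 1048556 is the first value above 1,000,000, so n = 19.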

How about this:
int x = 0;
while (sum_down(x) <= 1000000)
{
    x++;
}
The loop increments x until the result of sum_down(x) is greater than 1,000,000.
Edit: The result is 19.
While trying to understand and simplify the recursion logic behind sum_down() is enlightening and informative, this snippet is pragmatic: it does not try to analyze the problem, it just computes the result.

Two lines of Python code to answer your question:
>>> from itertools import count, dropwhile
Define the recursive function (see R Sahu's answer):
>>> f = lambda x: 1 if x<0 else (x-1) + 2*f(x-1)
Then use dropwhile() to skip the elements of the sequence 0, 1, 2, 3, ... for which f(x)<=1000000, leaving the integers for which f(x) > 1000000. Note: count() yields the infinite sequence 0, 1, 2, ....
dropwhile() returns an iterator, so we use next() to get its first value:
>>> next(dropwhile(lambda x: f(x)<=1000000, count()))
19

How to efficiently cycle consecutive numbers c++

I am looking for a way to cycle through consecutive numbers based on an input value. It is similar to modulo, but behaves differently for negative numbers. Is there a better solution than the inefficient code below? Here are some input/output examples:
Numbers range 0 to 2
-2 -> 1
-1 -> 2
0 -> 0
1 -> 1
2 -> 2
3 -> 0
4 -> 1
// Inefficient code example
int getConsecutiveVal(int min, int max, int input) // Inclusive in this scenario
{
    while (input > max)
        input -= (1 + max - min);
    while (input < min)
        input += (1 + max - min);
    return input;
}
// Incorrect code example, since func(0,2,-1) should return 2 but returns -1
int getConsecutiveVal(int min, int max, int input)
{
    return (input % (1 + max - min)) + min;
}

To be able to increment or decrement, I used the following function. It's more than 1 line, but fewer math operations. It's similar in spirit to the original poster's format. Tested for positive and negative cases.
#include <cstdint>

int16_t cycleIncDec(int16_t x, int16_t dir, int16_t xmin, int16_t xmax) {
    // inc/dec with constrained range
    // the supplied xmax must be greater than xmin
    x += dir;
    if (x > xmax) x = xmin;       // past the top: restart at the bottom
    else if (x < xmin) x = xmax;  // past the bottom: restart at the top
    return x;
}
Output of cycleIncDec() with various start values and step sizes
x: 11: +1 0 1 2 3 4 5 6 0 1 2 3
x: 4: -1 3 2 1 0 -1 -2 -3 -4 -5 -6 -7
x: -8: -1 -13 -12 -11 -10 -9 -8 -13 -12 -11 -10 -9
x:-190: -2 -192 -194 -196 -198 -200 -170 -172 -174 -176 -178 -180

In principle, you need the modulo operator. The problem is that in C and C++ the % operator truncates toward zero, so a negative input gives a negative (or zero) result instead of wrapping around.
If you know the minimum input value, you can just add a positive number x big enough to transform all negative numbers to positive. It won't affect the result if x % R = 0 (in your example R=3.)
In your example, if you add, say, 3*10 to all inputs and perform the modulo operation you'll get the desired result:
mod(3*10+[-2 -1 0 1 2 3 4], 3)
= 1 2 0 1 2 0 1
(the above is matlab notation and is specialized to the example you have presented. I'll leave it to you to extend it to arbitrary min/max)
A specific formula for the case you have presented:
You have suggested using
((input+abs(input)*(1+max-min)) % (1+max-min))+min
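However, this formula does not work, for two reasons:
First, if input=0, abs() returns 0 and you get the minimum value as output. (This is not always what your explicit while-based loop produces.)
Second, you forgot to subtract min from the input before the operation.
So a corrected formula is the following (using x for input):
(x - xmin + (1+abs(x))*(1+xmax-xmin)) % (1+xmax-xmin) + xmin
Note that this only works while the dividend x - xmin + (1+abs(x))*(1+xmax-xmin) is non-negative. For example, with xmin=10, xmax=12 and x=0 the dividend is -7, and the result is 9, which is outside the range.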

You can call % twice to get the right behaviour, since a%b, for positive b, is guaranteed to lie in [-b+1, b-1].
int getConsecutiveVal(int min, int max, int input)
{
    int range_len = (1 + max - min);
    input -= min;
    return (((input % range_len) + range_len) % range_len) + min;
}
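The double-modulo trick can be tried out without a C++ compiler. Python's own % already rounds toward negative infinity, so this sketch first emulates the truncating C/C++ remainder in a hypothetical helper c_rem and then applies the same two-step formula. The function names are assumptions made for illustration.

```python
def c_rem(a, b):
    """Emulate C/C++ `a % b`: the result takes the sign of `a`."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q  # the quotient is truncated toward zero, as in C/C++
    return a - b * q


def get_consecutive_val(lo, hi, value):
    """Wrap `value` into the inclusive range [lo, hi] using two remainders."""
    range_len = 1 + hi - lo
    offset = value - lo
    # The first remainder lies in (-range_len, range_len); adding range_len
    # makes it positive, and the second remainder brings it into
    # [0, range_len).
    return c_rem(c_rem(offset, range_len) + range_len, range_len) + lo


# Reproduce the input/output examples from the question for the range 0 to 2.
for v in range(-2, 5):
    print(v, "->", get_consecutive_val(0, 2, v))
```

Unlike the abs()-based formula, this version does not depend on the size of the input or of min, as long as the intermediate sums do not overflow in the real integer type.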

C/C++: 1.00000 <= 1.0f = False

Can someone explain why 1.000000 <= 1.0f is false?
The code:
#include <iostream>
#include <stdio.h>
using namespace std;
int main(int argc, char **argv)
{
    float step = 1.0f / 10;
    float t;
    for(t = 0; t <= 1.0f; t += step)
    {
        printf("t = %f\n", t);
        cout << "t = " << t << "\n";
        cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
    }
    printf("t = %f\n", t );
    cout << "t = " << t << "\n";
    cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
    cout << "\n(1.000000 <= 1.0f) = " << (1.000000 <= 1.0f) << "\n";
}
The result:
t = 0.000000
t = 0
(t <= 1.0f) = 1
t = 0.100000
t = 0.1
(t <= 1.0f) = 1
t = 0.200000
t = 0.2
(t <= 1.0f) = 1
t = 0.300000
t = 0.3
(t <= 1.0f) = 1
t = 0.400000
t = 0.4
(t <= 1.0f) = 1
t = 0.500000
t = 0.5
(t <= 1.0f) = 1
t = 0.600000
t = 0.6
(t <= 1.0f) = 1
t = 0.700000
t = 0.7
(t <= 1.0f) = 1
t = 0.800000
t = 0.8
(t <= 1.0f) = 1
t = 0.900000
t = 0.9
(t <= 1.0f) = 1
t = 1.000000
t = 1
(t <= 1.0f) = 0
(1.000000 <= 1.0f) = 1

As correctly pointed out in the comments, the value of t is not actually the same 1.000000 that you compare against in the last line.
Printing t with higher precision, using std::setprecision(20), reveals its actual value: 1.0000001192092895508.
The common way to avoid these kinds of issues is to compare not with 1, but with 1 + epsilon, where epsilon is a very small number, perhaps one or two orders of magnitude greater than your floating point precision.
So you would write your for loop condition as
for(t = 0; t <= 1.000001f; t += step)
Note that in your case, epsilon should be at least ten times greater than the maximum possible floating point error of a single addition, as the step is added ten times.
As pointed out by Muepe and Alain, the reason for t != 1.0f is that 1/10 cannot be represented exactly as a binary floating point number.
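The accumulated error can be reproduced outside C++. Python floats are doubles, so this sketch rounds every intermediate result to 32 bits with the struct module to imitate a C++ float; to_float32 and accumulate are names invented for this example.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accumulate(step, count):
    """Add `step` to 0 `count` times, rounding to float32 after every add."""
    step = to_float32(step)
    t = 0.0
    for _ in range(count):
        t = to_float32(t + step)
    return t


total = accumulate(0.1, 10)
print(repr(total))            # slightly above 1.0
print(total <= 1.0)           # False, as in the question's output
print(total <= 1.000001)      # True once a small epsilon is allowed
print(accumulate(0.125, 8))   # 1/8 is exact in binary, so this is exactly 1.0
```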

Floating point types in C++ (and most other languages) are implemented using an approach that uses the available bytes (for example 4 or 8) for the following 3 components:
Sign
Exponent
Mantissa
Let's have a look at it for a 32-bit (4-byte) type, which is usually what float is in C++.
The sign is a single bit: 0 means the number is positive and 1 means it is negative. (Without a standard you could just as well define it the other way round.)
The exponent uses 8 bits. Unlike in daily life, this exponent is not meant for base 10 but for base 2. That means an exponent of 1 does not correspond to 10 but to 2, and an exponent of 2 means 4 (= 2^2) and not 100 (= 10^2).
Another important point is that for floating point variables we also want negative exponents, like 2^-1 being 0.5, 2^-2 being 0.25 and so on. Thus we define a bias value that gets subtracted from the stored exponent to yield the real one. With 8 bits the bias is 127, so a stored exponent of 127 means 2^0. There is an exception, though: two values of the stored exponent are reserved. In IEEE 754, 255 marks infinity and NaN, and 0 marks zero and subnormal numbers. Therefore the stored exponent of a normal number runs from 1 to 254, giving a range from 2^-126 to 2^127.
The mantissa fills up the remaining 23 bits. If we see the mantissa as a series of 0s and 1s, you can imagine its value to be 1.m, where m is the series of those bits, read not in powers of 10 but in powers of 2. So 1.1 would be 1 * 2^0 + 1 * 2^-1 = 1 + 0.5 = 1.5. As an example, let's have a look at the following (very short) mantissa:
m = 100101 -> 1.100101 to base 2 -> 1 * 2^0 + 1 * 2^-1 + 0 * 2^-2 + 0 * 2^-3 + 1 * 2^-4 + 0 * 2^-5 + 1 * 2^-6 = 1 * 1 + 1 * 0.5 + 1 * 1/16 + 1 * 1/64 = 1.578125
The final value of a float is then calculated as follows, where e is the stored exponent:
2^(e - 127) * 1.m * (sign ? -1 : 1)
What exactly is going wrong in your loop: your step is 0.1! 0.1 is a very bad number for base-2 floating point. Let's have a look at why:
sign -> 0 (as it's non-negative)
exponent -> The largest power of two not exceeding 0.1 is 2^-4. So the stored exponent is -4 + 127 = 123
mantissa -> For this we check how many times 2^-4 fits into 0.1 and then convert the fractional part to a mantissa. 0.1 / (2^-4) = 0.1/0.0625 = 1.6. Since the mantissa represents 1.m, our m should encode 0.6. So let's convert that to binary:
0.6 = 1 * 2^-1 + 0.1 -> m = 1
0.1 = 0 * 2^-2 + 0.1 -> m = 10
0.1 = 0 * 2^-3 + 0.1 -> m = 100
0.1 = 1 * 2^-4 + 0.0375 -> m = 1001
0.0375 = 1 * 2^-5 + 0.00625 -> m = 10011
0.00625 = 0 * 2^-6 + 0.00625 -> m = 100110
0.00625 = 0 * 2^-7 + 0.00625 -> m = 1001100
0.00625 = 1 * 2^-8 + 0.00234375 -> m = 10011001
We could continue like this until we have our 23 mantissa bits, but I can tell you that you get:
m = 10011001100110011001...
Therefore 0.1 in binary floating point is like 1/3 in base 10: an infinitely repeating fraction. As the space in a float is limited, the number has to be cut off after the 23rd bit. Following the pattern, the 23rd bit would be a 0, but because the discarded bits amount to more than half a step, it gets rounded up to a 1. The stored value is therefore a tiny bit greater than 0.1.
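The three fields can also be read straight out of the stored bit pattern. This Python sketch uses the struct module to view 0.1 as a 32-bit float and assumes the IEEE 754 single-precision layout; decode_float32 and rebuild are hypothetical helper names.

```python
import struct


def decode_float32(x):
    """Split the 32-bit float nearest to x into (sign, exponent, mantissa)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits, the m in 1.m
    return sign, exponent, mantissa


def rebuild(sign, exponent, mantissa):
    """Recompute the value of a normal float32 from its three fields."""
    value = (1 + mantissa / 2 ** 23) * 2.0 ** (exponent - 127)
    return -value if sign else value


sign, exponent, mantissa = decode_float32(0.1)
print(sign, exponent, format(mantissa, "023b"))
print(repr(rebuild(sign, exponent, mantissa)))  # a little more than 0.1
```

The printed mantissa follows the repeating 1001 pattern from the manual conversion and ends in 101 instead of 100, which is the final round-up described above.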

The reason is that 1.0/10.0 = 0.1 can not be represented exactly in binary, just as 1.0/3.0 = 0.333.. can not be represented exactly in decimals.
If we use
float step = 1.0f / 8;
for example, the result is as expected.
To avoid such problems, use a small offset as shown in the answer of mic_e.
