I have implemented a C++ method that calculates the maximum ulp error between an approximation and a reference function on a given interval. Both the approximation and the reference are calculated as single-precision floating-point values. The method starts at the low bound of the interval and iterates over every single-precision value within the range.
Since there are a lot of existing values depending on the range that is chosen, I would like to estimate the total runtime of this method, and print it to the user.
I tried executing the comparison several times to measure the runtime of one iteration. My approach was to multiply the duration of one iteration by the total number of floats in the range. But the execution time of one iteration is evidently not constant and depends on the number of iterations, so my estimated duration is not accurate at all. Maybe one could update the total runtime estimate inside the main loop?
My question is: Is there any other way to estimate the total runtime for this particular case?
Here is my code:
void FloatEvaluateMaxUlp(float(*testFunction)(float), float(*referenceFunction)(float), float lowBound, float highBound)
{
/*initialization*/
float x = lowBound, output, output_ref;
int ulp = 0;
long long duration = 0, numberOfFloats=0;
/*calculate number of floats between lowBound and highBound*/
numberOfFloats = *(int*)&highBound - *(int*)&lowBound;
/*measure execution time of iterationsToEstimateTime (1000) iterations*/
int iterationsToEstimateTime = 1000;
auto t1 = std::chrono::high_resolution_clock::now();
for (int i = 0; i < iterationsToEstimateTime; i++)
{
printProgressInteger(i+1, iterationsToEstimateTime);
output = testFunction(x);
output_ref = referenceFunction(x);
int ulp_local = FloatCompareULP(output, output_ref);
if (abs(ulp_local) > abs(ulp))
ulp = ulp_local;
x= std::nextafter(x, highBound + 0.001f);
}
auto t2 = std::chrono::high_resolution_clock::now();
duration = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
duration /= iterationsToEstimateTime;
x = lowBound;
/*output of estimated time*/
std::cout <<std::endl<<std::endl<< " Number of floats: " << numberOfFloats << " Time per iteration: " << duration << " Estimated total time: " << numberOfFloats * duration << std::endl;
std::cout << " Starting test in range [" << lowBound << "," << highBound << "]." << std::endl;
long long count = 0;
/*record start time*/
t1 = std::chrono::high_resolution_clock::now();
for (count; x < highBound; count++)
{
printProgressInteger(count, numberOfFloats);
output = testFunction(x);
output_ref = referenceFunction(x);
int ulp_local = FloatCompareULP(output, output_ref);
if (abs(ulp_local) > abs(ulp))
ulp = ulp_local;
x = std::nextafter(x, highBound + 0.001f);
}
/*record stop time and compute duration*/
t2 = std::chrono::high_resolution_clock::now();
duration = std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count();
/*result output*/
std::cout <<std::endl<< std::endl << std::endl << std::endl << "*********************************************************" << std::endl;
std::cout << " RESULT " << std::endl;
std::cout << "*********************************************************" << std::endl;
std::cout << " Iterations: " << count << " Total execution time: " << duration << std::endl;
std::cout << " Max ulp: " << ulp <<std::endl;
std::cout << "*********************************************************" << std::endl;
}
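One way to act on the idea of adapting the estimate inside the main loop is to extrapolate from the progress made so far instead of from a short warm-up run. The following is only a sketch under stated assumptions: `OrderedKey`, `CountFloatsBetween` and `EstimateRemainingSeconds` are hypothetical helper names that do not appear in the code above, and the float count assumes IEEE-754 single precision. Unlike the `*(int*)&` subtraction, the count here also covers ranges that cross zero.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit IEEE-754 floats");

// Hypothetical helper: maps a float to an integer that grows monotonically
// with the float's value (+0.0f and -0.0f both map to 0).
std::int64_t OrderedKey(float value)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined bit copy
    const std::int64_t magnitude = bits & 0x7FFFFFFFu;
    return (bits & 0x80000000u) ? -magnitude : magnitude;
}

// Number of std::nextafter steps needed to walk from lowBound up to highBound.
std::int64_t CountFloatsBetween(float lowBound, float highBound)
{
    return OrderedKey(highBound) - OrderedKey(lowBound);
}

// Hypothetical helper: linear extrapolation of the remaining time from the
// progress so far. Returns a negative value while nothing has completed.
double EstimateRemainingSeconds(double elapsedSeconds, std::int64_t done, std::int64_t total)
{
    if (done <= 0) return -1.0;
    return elapsedSeconds * static_cast<double>(total - done) / static_cast<double>(done);
}
```

In the main loop one could call `EstimateRemainingSeconds` every few million iterations with the time elapsed since `t1`, `count`, and `numberOfFloats`, and print the result. Because the estimate is refreshed as the loop runs, it follows changes in per-iteration cost (for example slower subnormal inputs) rather than assuming the first 1000 iterations are representative.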
I ran this code for a preliminary benchmark that compares the time taken to generate a certain number of random two-state values, once by scaling a random double and once by using a Bernoulli distribution. The code is below:
int main()
{
std::random_device s;
std::mt19937 engine(s());
std::bernoulli_distribution bernp50(0.5000000000000000);
std::uniform_real_distribution<double> d;
long int limit = 10000000000; //10^10
int counter[2] = {0};
{
Timer bernstate("Bern Two States");
for(int i = limit; i>0; i--)
{
int tmp = bernp50(engine);
//Implicit bool to int conversion
counter[tmp]++;
}
}
cout << " Bern Two States - 0,1 \n\nCounter:\n" << "0: " <<
counter[0] <<"\n1: " << counter[1]<<"\n"
<< "Counter additions: " << counter[0] + counter[1] << "\n\n"
<< "\n0: " << (double)((double)counter[0]*100/(double)limit) << "%"
<< "\n1: " << (double)((double)counter[1]*100/(double)limit) << "%"
<< "\n\n" << endl;
counter[0]=0;
counter[1]=0;
{
Timer double_comp("Two State - Double");
for(int i = limit; i>0; i--)
{
double temp = d(engine)*2;
if(temp < 1)
{
counter[0]++;
}
else
{
counter[1]++;
}
}
}
cout << " Double Two States - 0,1 \n\nCounter:\n" << "0: " <<
counter[0] <<"\n1: " << counter[1]<<"\n"
<< "Counter additions: " << counter[0] + counter[1] << "\n\n"
<< "\n0: " << (double)((double)counter[0]*100/(double)limit) << "%"
<< "\n1: " << (double)((double)counter[1]*100/(double)limit) << "%"
<< "\n\n" << endl;
} //End of Main()
For limit = 10^10 I get the result below, where the sum of the counters does not match the limit variable. The same happens for 10^11:
Timer Object: Bern Two States
Timer Object Destroyed: Bern Two States
Duration Elapsed: 85.9409 s
Bern Two States - 0,1
Counter:
0: 705044031
1: 705021377
Counter additions: 1410065408
0: 7.05044%
1: 7.05021%
Timer Object: Two State - Double
Timer Object Destroyed: Two State - Double
Duration Elapsed: 87.6082 s
Double Two States - 0,1
Counter:
0: 705029886
1: 705035522
Counter additions: 1410065408
0: 7.0503%
1: 7.05036%
However, for limit = 10^9, the results are fine:
Timer Object: Bern Two States
Timer Object Destroyed: Bern Two States
Duration Elapsed: 62.5088 s
Bern Two States - 0,1
Counter:
0: 500005067
1: 499994933
Counter additions: 1000000000
0: 50.0005%
1: 49.9995%
Timer Object: Two State - Double
Timer Object Destroyed: Two State - Double
Duration Elapsed: 62.6709 s
Double Two States - 0,1
Counter:
0: 500015398
1: 499984602
Counter additions: 1000000000
0: 50.0015%
1: 49.9985%
Resolved: I actually used long int for the counters as well, but the problem was the loop variable, which was a 4-byte int. The limit did not fit into it, so the loop ran the wrong number of times.
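The reported total of 1410065408 is exactly what remains of 10^10 when it is reduced modulo 2^32, since 10000000000 - 2 * 4294967296 = 1410065408. The sketch below reproduces that arithmetic. `LowBits32` and `FitsInInt` are hypothetical helper names used only for illustration, and the code assumes a 32-bit `int`, as on the asker's platform.

```cpp
#include <cstdint>
#include <limits>

// What is left of a 64-bit value after squeezing it into 32 bits.
// Conversion to an unsigned type is defined as reduction modulo 2^32.
std::uint32_t LowBits32(long long value)
{
    return static_cast<std::uint32_t>(value);
}

// True if the value can be stored in an int without loss.
bool FitsInInt(long long value)
{
    return value >= std::numeric_limits<int>::min()
        && value <= std::numeric_limits<int>::max();
}
```

Declaring the loop variable with the same 64-bit type as `limit`, for example `for (long long i = limit; i > 0; i--)`, avoids the narrowing. Compiling with a warning flag such as GCC's or Clang's `-Wconversion` is one way to have this kind of narrowing reported.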
I am trying to compute the time history of the velocity described by the equation:
dV/dt = g − (C_d/m) * V^2. g = 9.81, m = 1.0, and C_d = 1.5.
To do this I need to create a program in C++ that uses the explicit Euler method to solve the equation numerically. I am trying to find the velocity from t = 0 to t = 1 seconds with three different step sizes of delta_t = 0.05, 0.1, and 0.2 seconds. The program should then show the percent error relative to the analytical solution, given as: V(t) = sqrt((m*g)/C_d) * tanh(sqrt((g*C_d)/m) * t).
My problem is I am not sure how to iterate through Euler's method multiple times with different time intervals. So far I have solved the analytical equation, but am unsure where to go from here. If anyone could help point me in the right direction it would be greatly appreciated.
#include <iostream>
#include <iomanip>
#include <cmath>
#include <math.h>
using namespace std;
int main() {
double m = 1.0; // units in [kg]
double g = 9.81; // units in [m/s^2]
double C_d = 1.5; // units in [kg/m]
double t; // units in [s]
double v; // units in [m/s]
cout << "The velocity will be examined from the time t = 0 to t = 1 seconds." << endl;
cout << "Please select either 0.05, 0.1, or 0.2 to be the time interval:" << endl;
cin >> t;
cout << "You have chosen the time interval of: " << t << " seconds." << endl;
v = sqrt((m * g) / C_d) * tanh(sqrt((g * C_d) / m) * t);
cout << "The velocity at a time of "<< t << " seconds is equal to: " << v << " m/s." << endl;
return 0;
}
If you want to iterate over t with increments of A, calculating the result of the formula with each t, you would write a for loop.
#include <iostream>
#include <cmath> // sqrt, tanh, lround
int main()
{
double m = 1.0; // units in [kg]
double g = 9.81; // units in [m/s^2]
double C_d = 1.5; // units in [kg/m]
std::cout << "The velocity will be examined from the time t = 0 to t = 1 seconds." << std::endl;
std::cout << "Please select the time interval:" << std::endl;
std::cout << "1: 0.05" << std::endl;
std::cout << "2: 0.1" << std::endl;
std::cout << "3: 0.2" << std::endl;
double A = 0; // increment in for loop
int x;
std::cin >> x;
switch (x) { // check what the input is equal to
case 1: A = 0.05; break;
case 2: A = 0.1; break;
case 3: A = 0.2; break;
default: std::cout << "Unknown option!" << std::endl; return 1;
}
std::cout << "You have chosen the time interval of: " << A << " seconds." << std::endl;
std::cout << "Results of V(t):" << std::endl;
// Step with an integer counter and compute t from it.
// Accumulating t += A in a double can overshoot or undershoot the end point
// (for A = 0.1 the original loop printed an extra sample near t = 1.1).
const int steps = static_cast<int>(std::lround(1.0 / A));
for (int i = 0; i <= steps; ++i) {
double t = i * A;
std::cout << "at t = " << t << ": " << std::sqrt((m*g) / C_d) * std::tanh(std::sqrt((g*C_d) / m) * t) << std::endl;
}
return 0;
}
Refer to https://beginnersbook.com/2017/08/cpp-for-loop/ for more information. Note: I've also introduced a switch statement into the code to prevent unknown values from being input. https://beginnersbook.com/2017/08/cpp-switch-case/
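The loop above only evaluates the analytical formula. For the Euler part of the question, a minimal sketch could look like the following. It assumes the initial condition V(0) = 0, which is consistent with the analytical solution but is not stated explicitly in the question, and the names `AnalyticVelocity`, `EulerVelocity` and `PercentError` are made up for this illustration.

```cpp
#include <cmath>

// Constants taken from the question.
constexpr double kMass = 1.0;      // [kg]
constexpr double kGravity = 9.81;  // [m/s^2]
constexpr double kDrag = 1.5;      // [kg/m]

// Analytical solution V(t) = sqrt(m*g/C_d) * tanh(sqrt(g*C_d/m) * t).
double AnalyticVelocity(double t)
{
    return std::sqrt(kMass * kGravity / kDrag)
         * std::tanh(std::sqrt(kGravity * kDrag / kMass) * t);
}

// Explicit Euler for dV/dt = g - (C_d/m) * V^2, starting from V(0) = 0
// (assumed initial condition) and marching to tEnd in steps of dt.
double EulerVelocity(double dt, double tEnd)
{
    const long steps = std::lround(tEnd / dt);  // integer step count
    double v = 0.0;
    for (long i = 0; i < steps; ++i) {
        v += dt * (kGravity - (kDrag / kMass) * v * v);
    }
    return v;
}

// Relative error of approx with respect to exact, in percent.
double PercentError(double approx, double exact)
{
    return std::fabs(approx - exact) / std::fabs(exact) * 100.0;
}
```

To cover all three step sizes, one could loop over an array such as `{0.05, 0.1, 0.2}` and, for each entry, print `EulerVelocity(dt, 1.0)` next to `PercentError(EulerVelocity(dt, 1.0), AnalyticVelocity(1.0))`. Printing the velocity at every intermediate time instead only requires moving the output statement inside the Euler loop.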
I collected the time measurements below for repeated runs of a simple summation on my Windows machine with a 3.2 GHz quad-core CPU and 24 GB of RAM.
The code follows the results.
From the results, the summation takes less than 3 ms most of the time, but sometimes it takes more than 20 times longer. I can understand the large maximum, because the distribution of the time measurements looks exponential with a very long right tail.
But what I am not sure of is:
What is the cause of the randomness (variation)? Note that I ran the application while CPU usage was 2-4% and memory usage was 10%.
Is there a possible solution for the randomness? Is there any way to avoid the rare maximum duration?
Results
Time Statistics (ms)
N : 10000
Minimum: 2.31406
Maximum: 64.7171
Mean : 2.43556
Std : 0.676273
M+6Std : 3.11184
Code:
#include "stdafx.h"
#include <Windows.h>
#include <iostream>
#include <cmath> // sqrt
int main()
{
LARGE_INTEGER t_start, t_end, Frequency;
double tdiff,minx=1e+307,maxx=-1e+307,meanx=0,stdx=0;
int niter = 10000;
for (int j = 0;j < niter;j++)
{
QueryPerformanceFrequency(&Frequency);
QueryPerformanceCounter(&t_start);
double s = 0;
for (int i = 0;i < 1000000;i++) s += i;
QueryPerformanceCounter(&t_end);
tdiff = (double)(t_end.QuadPart - t_start.QuadPart) / (double)Frequency.QuadPart * 1000;
minx = min(minx, tdiff);
maxx = max(maxx, tdiff);
meanx += tdiff;
stdx += tdiff*tdiff;
//std::cout << "Iteration: " << j << " Time (ms): " << tdiff << std::endl;
}
meanx /= (double)niter;
stdx = sqrt((stdx - (double)niter*meanx*meanx) / (double)(niter - 1));
std::cout << "Time Statistics (ms) " << std::endl << std::endl;
std::cout << "N : " << niter << std::endl;
std::cout << "Minimum: " << minx << std::endl;
std::cout << "Maximum: " << maxx << std::endl;
std::cout << "Mean : " << meanx << std::endl;
std::cout << "Std : " << stdx << std::endl;
// Note: despite the "M+6Std" label, this prints mean + 1 * std.
std::cout << "M+6Std : " << meanx+stdx << std::endl;
return 0;
}
A general-purpose computing system has many tasks going on. At any moment, the system may have to respond to I/O interrupts (disk drive completion notices, timer interrupts, network activity,…) and run various housekeeping tasks (background backups, check for scheduled events, indexing user files,…).
The times at which they occur are effectively random. Measuring execution time repeatedly and discarding outliers is a common technique.
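The technique of measuring repeatedly and discarding outliers can be sketched as follows. This is an illustration rather than a definitive benchmark harness: `TimeRepeatedMs` and `TrimmedMean` are hypothetical names, the sketch uses the portable `std::chrono::steady_clock` instead of `QueryPerformanceCounter`, and how many of the slowest samples to drop is a judgment call left to the caller.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs work() `runs` times and returns each duration in milliseconds.
template <typename Work>
std::vector<double> TimeRepeatedMs(Work&& work, int runs)
{
    std::vector<double> samples;
    if (runs > 0) samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return samples;
}

// Mean of the samples after discarding the `dropSlowest` largest values.
// Returns 0.0 if nothing would be left.
double TrimmedMean(std::vector<double> samples, std::size_t dropSlowest)
{
    if (samples.size() <= dropSlowest) return 0.0;
    std::sort(samples.begin(), samples.end());
    samples.resize(samples.size() - dropSlowest);
    double sum = 0.0;
    for (double s : samples) sum += s;
    return sum / static_cast<double>(samples.size());
}
```

For the summation in the question, one could pass the inner loop as a lambda to `TimeRepeatedMs` with 10000 runs and then report `TrimmedMean(samples, 100)`, which ignores the slowest 1% of runs. The rare long runs still happen, because the operating system keeps interrupting the process, but they no longer distort the reported figure.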
+++ See update below +++
This code prints the contents of an array in reverse. I used three slightly different methods: indexing with the array dimension directly in the for loop, using an iterator, and using a reverse_iterator. I measured the execution time of each printing loop.
#include <iostream>
#include <vector>
#include <chrono>
using get_time = std::chrono::high_resolution_clock;
int main() {
std::cout << "Enter the array dimension:";
int N;
std::cin >> N;
//Read the array elements
std::cout << "Enter the array elements:" <<'\n';
std::vector <int> v;
int input;
for(size_t i=0; i<N; i++){
std::cin >> input;
v.push_back(input);
}
auto start = get_time::now();
for(int i=N-1; i>=0; i--){
std::cout << v[i] <<" ";
}
auto finish = get_time::now();
auto time_diff=finish-start;
std::cout << "Elapsed time,non-iterator= " << std::chrono::duration<double>
(time_diff).count() << " Seconds" << '\n';
auto start2 = get_time::now();
std::vector <int>::reverse_iterator ri;
for(ri=v.rbegin(); ri!=v.rend(); ri++){
std::cout << *ri <<" ";
}
auto finish2 = get_time::now();
auto time_diff2=finish2-start2;
std::cout << "Elapsed time, reverse iterator= " << std::chrono::duration<double>
(time_diff2).count() << " Seconds" << '\n';
auto start3 = get_time::now();
std::vector <int>::iterator i;
// Note: the final i-- moves the iterator before v.begin(), which is undefined behavior.
for(i=v.end()-1; i>=v.begin(); i--){
std::cout << *i <<" ";
}
auto finish3 = get_time::now();
auto time_diff3=finish3-start3;
std::cout << "Elapsed time, iterator= " << std::chrono::duration<double>
(time_diff3).count() << " Seconds" << '\n';
return 0;
}
The output is as follows:
Output:
5 4 3 2 1 Elapsed time,non-iterator= 2.7913e-05 Seconds
5 4 3 2 1 Elapsed time, reverse iterator= 5.57e-06 Seconds
5 4 3 2 1 Elapsed time, iterator= 4.56e-06 Seconds
My question is:
Why is the direct method almost 5 times slower than both the iterator and reverse_iterator methods? Also, is this faster execution with iterators machine dependent?
This is a prototype, but I will need to deal with much bigger matrices; that is why I am asking this question. Thank you.
+++ Update +++
I am posting the updated results after incorporating the comments. It was too big for a comment.
I changed the for loop to evaluate the sum of an array with 100000 elements. I evaluated the same sum using the above mentioned methods (compiled with -O3 in clang++) and I have averaged the execution time for 3 methods over 10000 runs. Here are the results:
Average (10000 runs) elapsed time, non-iterator= 2.50183e-05
Average (10000 runs) elapsed time, reverse-iterator= 3.48299e-05
Average (10000 runs) elapsed time, iterator= 7.35307e-05
The results are much more uniform now, and the non-iterator method is now the fastest! Any insights? Or is even this result meaningless, and should I do some more tests?
The updated code:
#include <iostream>
#include <vector>
#include <chrono>
using get_time = std::chrono::high_resolution_clock;
int main() {
double time1=0,time2=0,time3=0; // accumulators must start at zero
int run=10000;
for(int k=0; k<run; k++){
//Read the array elements
std::vector <int> v;
int input,N=100000;
for(size_t i=0; i<N; i++){
v.push_back(i);
}
int sum1{0},sum2{0},sum3{0};
auto start = get_time::now();
for(int i=N-1; i>=0; i--){
sum1+=v[i];
}
auto finish = get_time::now();
auto time_diff=finish-start;
std::cout << "Sum= " << sum1 << " " << "Elapsed time,non-iterator= " << std::chrono::duration<double>
(time_diff).count() << " Seconds" << '\n';
auto start2 = get_time::now();
std::vector <int>::reverse_iterator ri;
for(ri=v.rbegin(); ri!=v.rend(); ri++){
sum2+=*ri;
}
auto finish2 = get_time::now();
auto time_diff2=finish2-start2;
std::cout << "Sum= " << sum2 <<" Elapsed time, reverse iterator= " << std::chrono::duration<double>
(time_diff2).count() << " Seconds" << '\n';
auto start3 = get_time::now();
std::vector <int>::iterator i;
// Note: the final i-- moves the iterator before v.begin(), which is undefined behavior.
for(i=v.end()-1; i>=v.begin(); i--){
sum3+=*i;
}
auto finish3 = get_time::now();
auto time_diff3=finish3-start3;
std::cout << "Sum= " <<sum3 << " Elapsed time, iterator= " << std::chrono::duration<double>
(time_diff3).count() << " Seconds" << '\n';
time1+=std::chrono::duration<double>(time_diff).count();
time2+=std::chrono::duration<double>(time_diff2).count();
time3+=std::chrono::duration<double>(time_diff3).count();
}
std::cout << "Average (" << run << " runs)" << " elapsed time, non-iterator= " << time1/double(run) <<'\n';
std::cout << "Average (" << run << " runs)" << " elapsed time, reverse-iterator= " << time2/double(run) <<'\n';
std::cout << "Average (" << run << " runs)" << " elapsed time, iterator= " << time3/double(run) <<'\n';
return 0;
}
I am having an issue with primitive types using built in operators. All of my operators work for all datatypes except for float and (un)signed long long int.
Why is the result wrong even when multiplying by one? Also, why do +10 and -10 give the same number as +1, -1, /1, and *1?
The number 461168601 was chosen because it fits within the max float and max signed long long int.
Ran the following code and got the following output:
fmax : 340282346638528859811704183484516925440
imax : 9223372036854775807
i : 461168601
f : 10
f2 : 1
461168601 / 10 = 46116860
461168601 + 10 = 461168608
461168601 - 10 = 461168608
461168601 * 1 = 461168608
461168601 / 1 = 461168608
461168601 + 1 = 461168608
461168601 - 1 = 461168608
The following code can be run here.
#include <iostream>
#include <sstream>
#include <iomanip>
#include <limits>
#define fmax std::numeric_limits<float>::max()
#define imax std::numeric_limits<signed long long int>::max()
int main()
{
signed long long int i = 461168601;
float f = 10;
float f2 = 1;
std::cout << std::setprecision(40);
std::cout <<"fmax : " << fmax << std::endl;
std::cout <<"imax : " << imax << std::endl;
std::cout <<"i : " << i << std::endl;
std::cout <<"f : " << f << std::endl;
std::cout <<"f2 : " << f2 << std::endl;
std::cout <<std::endl;
std::cout << i << " / " << f << " = " << i / f << std::endl;
std::cout << i << " + " << f << " = " << i + f << std::endl;
std::cout << i << " - " << f << " = " << i - f << std::endl;
std::cout <<std::endl;
std::cout << i << " * " << f2 << " = " <<i * f2 << std::endl;
std::cout << i << " / " << f2 << " = " << i / f2 << std::endl;
std::cout << i << " + " << f2 << " = " << i + f2 << std::endl;
std::cout << i << " - " << f2 << " = " << i - f2 << std::endl;
}
The error is caused by the very large difference in magnitude between 4611686018427387904 and 1 or 10. Avoid adding numbers that differ this much in magnitude, because the gap between two adjacent floating-point numbers grows with the exponent.
When two floating-point numbers are added, they are first aligned to the same exponent (the larger one). So before the operation you have, e.g., 1e10 and 1e-10; after alignment you have 1e10 and 0e10, and the result is 1e10.
Dug around some and found this article.
Casting opens up its own can of worms. You have to be careful, because your float might not have enough precision to preserve an entire integer. A 32-bit integer can represent any 9-digit decimal number, but a 32-bit float only offers about 7 digits of precision. So if you have large integers, making this conversion will clobber them. Thankfully, doubles have enough precision to preserve a whole 32-bit integer (notice, again, the analogy between floating point precision and integer dynamic range). Also, there is some overhead associated with converting between numeric types, going from float to int or between float and double.
So, essentially, once the integer part of a number grows beyond about seven digits, a float can only keep roughly the seven most significant digits, and the lower digits are rounded away. That rounding is the floating-point inaccuracy seen in the output.
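For the value in the question this can be checked directly. Near 461168601 adjacent floats are 32 apart, so the integer is rounded to 461168608 when it is converted, and adding or subtracting 10 lands closer to 461168608 than to either neighbor. The sketch below assumes IEEE-754 single precision with round-to-nearest, and `ToFloat`, `SpacingAbove` and `AddAsFloat` are hypothetical helper names.

```cpp
#include <cmath>
#include <limits>

// Converts the integer to float, as i + f does implicitly in the question.
float ToFloat(long long value)
{
    return static_cast<float>(value);
}

// Distance from value to the next larger representable float.
float SpacingAbove(float value)
{
    return std::nextafter(value, std::numeric_limits<float>::infinity()) - value;
}

// Adds two floats and rounds the result back to float.
float AddAsFloat(float a, float b)
{
    const float sum = a + b;
    return sum;
}
```

Here `ToFloat(461168601)` yields 461168608, `SpacingAbove` of that value is 32, and `AddAsFloat` with +10 or -10 returns 461168608 again. Converting `i` to `double` instead keeps the value exact, because a double has a 53-bit significand.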