I am writing an in-depth test program for a data structure I had to write for a class. I am trying to time how long functions take to execute and store the timings in an array for later printing. To double-check that it was working, I decided to print one immediately, and found that it is not working.
Here is the code where I get the times and store them in an array that is in a struct.
void test1(ArrayLinkedBag<ItemType> &bag,TestAnalytics &analytics){
clock_t totalStart;
clock_t incrementalStart;
clock_t stop; //Both timers stop at the same time;
// Start TEST 1
totalStart = clock();
bag.debugPrint();
cout << "Bag Should Be Empty, Checking..." << endl;
incrementalStart = clock();
checkEmpty<ItemType>(bag);
stop = clock();
analytics.test1Times[0] = analytics.addTimes(incrementalStart,stop);
analytics.test1Times[1] = analytics.addTimes(totalStart,stop);
cout << analytics.test1Times[0] << setprecision(5) << "ms" << endl;
std::cout << "Time: "<< setprecision(5) << (stop - totalStart) / (double)(CLOCKS_PER_SEC / 1000) << " ms" << std::endl;
cout << "===========================================" << endl; //So I can find the line easier
}
Here is the code that does the calculation whose result I store in the array. This function is located in the TestAnalytics struct:
double addTimes(double start, double stop){
return (stop - start)/ (double)(CLOCKS_PER_SEC/1000);
}
Here is a snippet of the output I am getting:
Current Head: -1
Current Size: 0
Cell: 1, Index: 0, Item: 6317568, Next Index: -2
Cell: 2, Index: 1, Item: 4098, Next Index: -2
Cell: 3, Index: 2, Item: 6317544, Next Index: -2
Cell: 4, Index: 3, Item: -683175280, Next Index: -2
Cell: 5, Index: 4, Item: 4201274, Next Index: -2
Cell: 6, Index: 5, Item: 6317536, Next Index: -2
Bag Should Be Empty, Checking...
The Bag Is Empty
0ms
Time: 0 ms
===========================================
I am trying to calculate the time as per a different post on this site.
I am using the clang compiler on a UNIX system. Is it possible that the number is still too small to show above 0?
Unless you're stuck with an old (pre-C++11) compiler/library, I'd use the functions from the <chrono> header:
template <class ItemType>
void test1(ArrayLinkedBag<ItemType> &bag){
using namespace std::chrono;
auto start = high_resolution_clock::now();
bag.debugPrint();
auto first = high_resolution_clock::now();
checkEmpty(bag);
auto stop = high_resolution_clock::now();
std::cout << " first time: " << duration_cast<microseconds>(first - start).count() << " us\n";
std::cout << "second time: " << duration_cast<microseconds>(stop - start).count() << " us\n";
}
Some parts are a bit verbose (to put it nicely) but it still works reasonably well. duration_cast supports duration types down to (at least) nanoseconds, which is typically sufficient for timing even relatively small/fast pieces of code (though it's not guaranteed that the underlying timer has nanosecond precision).
In addition to Jerry's good answer (which I've upvoted), I wanted to add just a little more information that might be helpful.
For timing I recommend steady_clock over high_resolution_clock because steady_clock is guaranteed to not be adjusted (especially backwards) during your timing. Now on Visual Studio and clang, this can't possibly happen because high_resolution_clock and steady_clock are exactly the same type. However if you're using gcc, high_resolution_clock is the same type as system_clock, which is subject to being adjusted at any time (say by an NTP correction).
But if you use steady_clock, then on every platform you have a stop-watch-like timer: Not good for telling you the time of day, but not subject to being corrected at an inopportune moment.
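To see which situation applies on a given toolchain, the clocks can be asked directly. The following is only a small sketch; the function name print_clock_properties is made up for illustration, while is_steady and std::is_same are standard.

```cpp
#include <chrono>
#include <iostream>
#include <type_traits>

// Hypothetical helper: prints what this implementation says about its clocks.
void print_clock_properties() {
    using namespace std::chrono;
    std::cout << std::boolalpha
              << "system_clock is steady:          " << system_clock::is_steady << '\n'
              << "steady_clock is steady:          " << steady_clock::is_steady << '\n'
              << "high_resolution_clock is steady: " << high_resolution_clock::is_steady << '\n'
              // true where high_resolution_clock is an alias for steady_clock
              << "high_resolution_clock == steady_clock: "
              << std::is_same<high_resolution_clock, steady_clock>::value << '\n';
}
```

Calling print_clock_properties() from main shows at a glance whether high_resolution_clock can be adjusted on your platform. steady_clock::is_steady is required to be true everywhere; the other lines depend on the library.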
Also, if you use my free, open-source, header-only <chrono> extension library, it can stream out durations in a much more friendly manner, without having to use duration_cast nor .count(). It will print out the duration units right along with the value.
Finally, if you call steady_clock::now() multiple times in a row (with nothing in between), and print out that difference, then you can get a feel for how precisely your implementation is able to time things. Can it time something as short as femtoseconds? Probably not. Is it as coarse as milliseconds? We hope not.
Putting this all together, the following program was compiled like this:
clang++ test.cpp -std=c++14 -O3 -I../date/include
The program:
#include "date/date.h"
#include <iostream>
int
main()
{
using namespace std::chrono;
using date::operator<<;
for (int i = 0; i < 100; ++i)
{
auto t0 = steady_clock::now();
auto t1 = steady_clock::now();
auto t2 = steady_clock::now();
auto t3 = steady_clock::now();
auto t4 = steady_clock::now();
auto t5 = steady_clock::now();
auto t6 = steady_clock::now();
std::cout << t1-t0 << '\n';
std::cout << t2-t1 << '\n';
std::cout << t3-t2 << '\n';
std::cout << t4-t3 << '\n';
std::cout << t5-t4 << '\n';
std::cout << t6-t5 << '\n';
}
}
And output for me on macOS:
150ns
80ns
69ns
53ns
63ns
64ns
88ns
54ns
66ns
66ns
59ns
56ns
59ns
69ns
76ns
74ns
73ns
73ns
64ns
60ns
58ns
...
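If pulling in an extra header is not an option, a similar measurement can be made with the standard library alone. This is a rough sketch, not part of the program above: smallest_observed_step is a hypothetical name, and it reports the smallest step seen between two readings of steady_clock rather than every individual difference.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical helper: smallest nonzero step observed between two
// consecutive readings of steady_clock, over `samples` attempts.
std::chrono::nanoseconds smallest_observed_step(int samples = 1000) {
    using namespace std::chrono;
    auto best = nanoseconds::max();
    for (int i = 0; i < samples; ++i) {
        const auto t0 = steady_clock::now();
        auto t1 = steady_clock::now();
        // If the clock is too coarse to have advanced yet, keep reading.
        while (t1 == t0) {
            t1 = steady_clock::now();
        }
        best = std::min(best, duration_cast<nanoseconds>(t1 - t0));
    }
    return best;
}
```

Printing smallest_observed_step().count() followed by "ns" gives a lower bound on what can be timed meaningfully; anything much shorter than that will read as zero or as noise.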
I'm just comparing the speed of a couple Fibonacci functions, one gives an output almost immediately and reads it got done in 500 nanoseconds, while the other, depending on the depth, may sit there loading for many seconds, yet when it is done, it will read that it took it only 100 nanoseconds... After I just sat there and waited like 20 seconds for it.
It's not a big deal as I can prove the other is slower just with raw human perception, but why would chrono not be working? Something to do with recursion?
PS I know that fibonacci2() doesn't give the correct output on odd numbered depths, I'm just testing some things and the output is actually just there so the compiler doesn't optimize it away or something. Go ahead and just copy this code and you'll see fibonacci2() immediately output but you'll have to wait like 5 seconds for fibonacci(). Thank you.
#include <iostream>
#include <chrono>
int fibonacci2(int depth) {
static int a = 0;
static int b = 1;
if (b > a) {
a += b; //std::cout << a << '\n';
}
else {
b += a; //std::cout << b << '\n';
}
if (depth > 1) {
fibonacci2(depth - 1);
}
return a;
}
int fibonacci(int n) {
if (n <= 1) {
return n;
}
return fibonacci(n - 1) + fibonacci(n - 2);
}
int main() {
int f = 0;
auto start2 = std::chrono::steady_clock::now();
f = fibonacci2(44);
auto stop2 = std::chrono::steady_clock::now();
std::cout << f << '\n';
auto duration2 = std::chrono::duration_cast<std::chrono::nanoseconds>(stop2 - start2);
std::cout << "faster function time: " << duration2.count() << '\n';
auto start = std::chrono::steady_clock::now();
f = fibonacci(44);
auto stop = std::chrono::steady_clock::now();
std::cout << f << '\n';
auto duration = std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
std::cout << "way slower function with incorrect time: " << duration.count() << '\n';
}
I don't know what compiler you are using and with which compiler options, but I tested x64 msvc v19.28 using /O2 in godbolt. Here the compiled instructions are reordered such that it queries the perf_counter twice before invoking the fibonacci(int) function, which in code would look like
auto start = ...;
auto stop = ...;
f = fibonacci(44);
A solution to disallow this reordering might be to use an atomic_thread_fence just before and after the fibonacci function call.
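A sketch of that suggestion follows. TimedResult and timed_fibonacci are names invented here, and note the assumption: a fence is not formally guaranteed by the standard to pin a pure computation between the two clock reads, so it is worth checking the generated code for your compiler.

```cpp
#include <atomic>
#include <chrono>

int fibonacci(int n) {
    if (n <= 1) {
        return n;
    }
    return fibonacci(n - 1) + fibonacci(n - 2);
}

// Hypothetical result type: the computed value plus how long it took.
struct TimedResult {
    int value;
    std::chrono::nanoseconds elapsed;
};

TimedResult timed_fibonacci(int n) {
    using namespace std::chrono;
    const auto start = steady_clock::now();
    // Fences on both sides of the call, as suggested above, to discourage
    // the compiler from moving the work outside the timed region.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    const int value = fibonacci(n);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    const auto stop = steady_clock::now();
    return {value, duration_cast<nanoseconds>(stop - start)};
}
```

Using the returned value (printing it, for example) also matters, since an unused result gives the optimizer more freedom to drop or move the call.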
As Mestkon answered, the compiler can reorder your code.
For examples of how to prevent the compiler from reordering, see Memory Ordering - Compile Time Memory Barrier.
It would be beneficial in the future if you provided information on what compiler you were using.
gcc 7.5 with -O2 for example does not reorder the timer instructions in this given scenario.
I'm trying to figure out how to time the execution of part of my program, but when I use the following code, all I ever get back is 0. I know that can't be right. The code I'm timing recursively implements mergesort of a large array of ints. How do I get the time it takes to execute the program in milliseconds?
//opening input file and storing contents into array
index = inputFileFunction(inputArray);
clock_t time = clock();//start the clock
//this is what needs to be timed
newRecursive.mergeSort(inputArray, 0, index - 1);
//getting the difference
time = clock() - time;
double ms = double(time) / CLOCKS_PER_SEC * 1000;
std::cout << "\nTime took to execute: " << std::setprecision(9) << ms << std::endl;
You can use the chrono library in C++11. Here's how you can modify your code:
#include <chrono>
//...
auto start = std::chrono::steady_clock::now();
// do whatever you're timing
auto end = std::chrono::steady_clock::now();
auto durationMS = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
std::cout << "\n Time took " << durationMS.count() << " ms" << std::endl;
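If whole milliseconds turn out to be too coarse (a fast run would still print 0), a floating-point duration keeps the fraction. A minimal sketch, with time_ms being a made-up helper name rather than anything from the standard library:

```cpp
#include <chrono>
#include <ratio>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed wall time
// in milliseconds, including the fractional part.
template <class F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    // Converting to a floating-point duration is implicit, so no
    // duration_cast (and no truncation) is involved.
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be used as double ms = time_ms([&] { newRecursive.mergeSort(inputArray, 0, index - 1); }); with the names taken from the question.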
If you're developing on OSX, this blog post from Apple may be useful. It contains code snippets that should give you the timing resolution you need.
While I realize this is probably one of many identical questions, I can't seem to figure out how to properly use std::chrono. This is the solution I cobbled together.
#include <stdlib.h>
#include <iostream>
#include <chrono>
typedef std::chrono::high_resolution_clock Time;
typedef std::chrono::milliseconds ms;
float startTime;
float getCurrentTime();
int main () {
startTime = getCurrentTime();
std::cout << "Start Time: " << startTime << "\n";
while(true) {
std::cout << getCurrentTime() - startTime << "\n";
}
return EXIT_SUCCESS;
}
float getCurrentTime() {
auto now = Time::now();
return std::chrono::duration_cast<ms>(now.time_since_epoch()).count() / 1000;
}
For some reason, this only ever returns integer values as the difference, which increments upwards at rate of 1 per second, but starting from an arbitrary, often negative, value.
What am I doing wrong? Is there a better way of doing this?
Don't escape the chrono type system until you absolutely have to. That means don't use .count() except for I/O or interacting with legacy API.
This translates to: Don't use float as time_point.
Don't bother with high_resolution_clock. This is always a typedef to either system_clock or steady_clock. For more portable code, choose one of the latter.
#include <iostream>
#include <chrono>
using Time = std::chrono::steady_clock;
using ms = std::chrono::milliseconds;
To start, you're going to need a duration with a representation of float and the units of seconds. This is how you do that:
using float_sec = std::chrono::duration<float>;
Next you need a time_point which uses Time as the clock, and float_sec as its duration:
using float_time_point = std::chrono::time_point<Time, float_sec>;
Now your getCurrentTime() can just return Time::now(). No fuss, no muss:
float_time_point
getCurrentTime() {
return Time::now();
}
Your main, because it has to do the I/O, is responsible for unpacking the chrono types into scalars so that it can print them:
int main () {
auto startTime = getCurrentTime();
std::cout << "Start Time: " << startTime.time_since_epoch().count() << "\n";
while(true) {
std::cout << (getCurrentTime() - startTime).count() << "\n";
}
}
This program does a similar thing. Hopefully it shows some of the capabilities (and methodology) of std::chrono:
#include <iostream>
#include <chrono>
#include <thread>
int main()
{
using namespace std::literals;
namespace chrono = std::chrono;
using clock_type = chrono::high_resolution_clock;
auto start = clock_type::now();
for(;;) {
auto first = clock_type::now();
// note use of literal - this is c++14
std::this_thread::sleep_for(500ms);
// c++11 would be this:
// std::this_thread::sleep_for(chrono::milliseconds(500));
auto last = clock_type::now();
auto interval = last - first;
auto total = last - start;
// integer cast
std::cout << "we just slept for " << chrono::duration_cast<chrono::milliseconds>(interval).count() << "ms\n";
// another integer cast
std::cout << "also known as " << chrono::duration_cast<chrono::nanoseconds>(interval).count() << "ns\n";
// floating point cast
using seconds_fp = chrono::duration<double, chrono::seconds::period>;
std::cout << "which is " << chrono::duration_cast<seconds_fp>(interval).count() << " seconds\n";
std::cout << " total time wasted: " << chrono::duration_cast<chrono::milliseconds>(total).count() << "ms\n";
std::cout << " in seconds: " << chrono::duration_cast<seconds_fp>(total).count() << "s\n";
std::cout << std::endl;
}
return 0;
}
example output:
we just slept for 503ms
also known as 503144616ns
which is 0.503145 seconds
total time wasted: 503ms
in seconds: 0.503145s
we just slept for 500ms
also known as 500799185ns
which is 0.500799 seconds
total time wasted: 1004ms
in seconds: 1.00405s
we just slept for 505ms
also known as 505114589ns
which is 0.505115 seconds
total time wasted: 1509ms
in seconds: 1.50923s
we just slept for 502ms
also known as 502478275ns
which is 0.502478 seconds
total time wasted: 2011ms
in seconds: 2.01183s
I have following C code:
uint64_t combine(uint32_t const sec, uint32_t const usec){
return (uint64_t) sec << 32 | usec;
};
uint64_t now3(){
struct timeval tv;
gettimeofday(&tv, NULL);
return combine((uint32_t) tv.tv_sec, (uint32_t) tv.tv_usec);
}
What this does is combine a 32-bit timestamp and a 32-bit "something" (probably micro- or nanoseconds) into a single 64-bit integer.
I am having a really hard time rewriting it with C++11 chrono.
This is what I did so far, but I think this is wrong way to do it.
auto tse = std::chrono::system_clock::now().time_since_epoch();
auto dur = std::chrono::duration_cast<std::chrono::nanoseconds>( tse ).count();
uint64_t time = static_cast<uint64_t>( dur );
Important note - I only care about first 32 bit to be "valid" timestamp.
Second 32 bit "part" can be anything - nano or microseconds - everything is good as long as two sequential calls of this function give me different second "part".
i want seconds in one int, milliseconds in another.
Here is code to do that:
#include <chrono>
#include <iostream>
int
main()
{
auto now = std::chrono::system_clock::now().time_since_epoch();
std::cout << now.count() << '\n';
auto s = std::chrono::duration_cast<std::chrono::seconds>(now);
now -= s;
auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(now);
int si = s.count();
int msi = ms.count();
std::cout << si << '\n';
std::cout << msi << '\n';
}
This just output for me:
1447109182307707
1447109182
307
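If the two parts still need to end up in a single 64-bit value, as in the original combine(), the split above can be packed back together. This is only a sketch: pack_timestamp is a hypothetical name, it uses microseconds for the low half (the question allows any sub-second unit), and it assumes a non-negative time since the epoch and that system_clock counts from the Unix epoch (guaranteed only since C++20, though true in practice on common platforms).

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: whole seconds go in the high 32 bits, the leftover
// microseconds (0..999999 for non-negative input) in the low 32 bits.
std::uint64_t pack_timestamp(std::chrono::microseconds since_epoch) {
    using namespace std::chrono;
    const auto whole = duration_cast<seconds>(since_epoch);
    const auto frac = since_epoch - whole;  // still in microseconds
    const auto high = static_cast<std::uint32_t>(whole.count());
    const auto low = static_cast<std::uint32_t>(frac.count());
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// chrono-based counterpart of now3() from the question.
std::uint64_t now3() {
    using namespace std::chrono;
    return pack_timestamp(
        duration_cast<microseconds>(system_clock::now().time_since_epoch()));
}
```

Two calls within the same microsecond would still produce identical values, so this is no stronger a uniqueness guarantee than the gettimeofday version had.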
The C++11 chrono types use only one number to represent a time since a given Epoch, unlike the timeval (or timespec) structure which uses two numbers to precisely represent a time. So with C++11 chrono you don't need the combine() method.
The content of the timestamp returned by now() depends on the clock you use; there are three clocks, described in http://en.cppreference.com/w/cpp/chrono :
system_clock wall clock time from the system-wide realtime clock
steady_clock monotonic clock that will never be adjusted
high_resolution_clock the clock with the shortest tick period available
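If you want successive timestamps never to go backwards, use the steady clock. It is monotonic, so a later call never returns an earlier time, although two calls made back to back may return the same value:
auto t1 = std::chrono::steady_clock::now();
...
auto t2 = std::chrono::steady_clock::now();
assert (t2 >= t1);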
Edit: answer to comment
#include <iostream>
#include <chrono>
#include <cstdint>
int main()
{
typedef std::chrono::duration< uint32_t, std::ratio<1> > s32_t;
typedef std::chrono::duration< uint32_t, std::milli > ms32_t;
s32_t first_part;
ms32_t second_part;
auto t1 = std::chrono::nanoseconds( 2500000000 ); // 2.5 secs
first_part = std::chrono::duration_cast<s32_t>(t1);
second_part = std::chrono::duration_cast<ms32_t>(t1-first_part);
std::cout << "first part = " << first_part.count() << " s\n"
<< "seconds part = " << second_part.count() << " ms" << std::endl;
auto t2 = std::chrono::nanoseconds( 2800000000 ); // 2.8 secs
first_part = std::chrono::duration_cast<s32_t>(t2);
second_part = std::chrono::duration_cast<ms32_t>(t2-first_part);
std::cout << "first part = " << first_part.count() << " s\n"
<< "seconds part = " << second_part.count() << " ms" << std::endl;
}
Output:
first part = 2 s
seconds part = 500 ms
first part = 2 s
seconds part = 800 ms
Here it says that sleep_for "Blocks the execution of the current thread for at least the specified sleep_duration."
Here it says that sleep_until "Blocks the execution of the current thread until specified sleep_time has been reached."
So with this in mind I was simply doing my thing, until I noticed that my code was sleeping a lot shorter than the specified time. To make sure it was the sleep_ code being odd instead of me writing dumb code again, I created this online example: https://ideone.com/9a9MrC
(code block below the edit line) When running the online example, it does exactly what it should be doing, but running the exact same code sample on my machine gives me this output: Output on Pastebin
Now I'm truly confused and wondering what the bleep is going wrong on my machine. I'm using Code::Blocks as IDE on a Win7 x64 machine in combination with This toolchain containing GCC 4.8.2.
*I have tried This toolchain before the current one, but this one with GCC 4.8.0 strangely enough wasn't even able to compile the example code.
What could create this weird behaviour? My machine? Windows? GCC? Something else in the toolchain?
p.s. The example also works as it should on Here, which states that it uses GCC version 4.7.2
p.p.s. using #include <windows.h> and Sleep( 1 ); also sleeps a lot shorter than the specified 1 millisecond on my machine.
EDIT: code from example:
#include <iostream> // std::cout, std::fixed
#include <iomanip> // std::setprecision
//#include <string> // std::string
#include <chrono> // C++11 // std::chrono::steady_clock
#include <thread> // C++11 // std::this_thread
std::chrono::steady_clock timer;
auto startTime = timer.now();
auto endTime = timer.now();
auto sleepUntilTime = timer.now();
int main() {
for( int i = 0; i < 10; ++i ) {
startTime = timer.now();
sleepUntilTime = startTime + std::chrono::nanoseconds( 1000000 );
std::this_thread::sleep_until( sleepUntilTime );
endTime = timer.now();
std::cout << "Start time: " << std::chrono::duration_cast<std::chrono::nanoseconds>( startTime.time_since_epoch() ).count() << "\n";
std::cout << "End time: " << std::chrono::duration_cast<std::chrono::nanoseconds>( endTime.time_since_epoch() ).count() << "\n";
std::cout << "Sleep till: " << std::chrono::duration_cast<std::chrono::nanoseconds>( sleepUntilTime.time_since_epoch() ).count() << "\n";
std::cout << "It took: " << std::chrono::duration_cast<std::chrono::nanoseconds>( endTime - startTime ).count() << " nanoseconds. \n";
std::streamsize prec = std::cout.precision();
std::cout << std::fixed << std::setprecision(9);
std::cout << "It took: " << ( (float) std::chrono::duration_cast<std::chrono::nanoseconds>( endTime - startTime ).count() / 1000000 ) << " milliseconds. \n";
std::cout << std::setprecision( prec );
}
std::cout << "\n\n";
for( int i = 0; i < 10; ++i ) {
startTime = timer.now();
std::this_thread::sleep_for( std::chrono::nanoseconds( 1000000 ) );
endTime = timer.now();
std::cout << "Start time: " << std::chrono::duration_cast<std::chrono::nanoseconds>( startTime.time_since_epoch() ).count() << "\n";
std::cout << "End time: " << std::chrono::duration_cast<std::chrono::nanoseconds>( endTime.time_since_epoch() ).count() << "\n";
std::cout << "It took: " << std::chrono::duration_cast<std::chrono::nanoseconds>( endTime - startTime ).count() << " nanoseconds. \n";
std::streamsize prec = std::cout.precision();
std::cout << std::fixed << std::setprecision(9);
std::cout << "It took: " << ( (float) std::chrono::duration_cast<std::chrono::nanoseconds>( endTime - startTime ).count() / 1000000 ) << " milliseconds. \n";
std::cout << std::setprecision( prec );
}
return 0;
}
Nothing is wrong with your machine, it is your assumptions that are wrong.
Sleeping is a very system-dependent and unreliable thing. Generally, on most operating systems, you have a more-or-less-guarantee that a call to sleep will delay execution for at least the time you ask for. The C++ thread library necessarily uses the facilities provided by the operating system, hence the wording in the C++ standard that you quoted.
You will have noted the wording "more-or-less-guarantee" in the above paragraph. First of all, the way sleeping works is not what you might think. It generally does not block until a timer fires and then resumes execution. Instead, it merely marks the thread as "not ready", and additionally does something so this can be undone later (what exactly this is isn't defined, it might be setting a timer or something else).
When the time is up, the operating system will set the thread to "ready to run" again. This doesn't mean it will run, it only means it is a candidate to run, whenever the OS can be bothered and whenever a CPU core is free (and nobody with higher priority wants it).
On traditional non-tickless operating systems, this will mean that the thread will probably (or more precisely, maybe) run at the next scheduler tick. That is, if CPU is available at all. On more modern operating systems (Linux 3.x or Windows 8) which are "tickless", you're a bit closer to reality, but you still do not have any hard guarantees.
Further, under Unix-like systems, sleep may be interrupted by a signal and may actually wait less than the specified time. Also, under Windows, the interval at which the scheduler runs is configurable, and to make it worse, different Windows versions behave differently [1] [2] in respect of whether they round the sleep time up or down.
Your system (Windows 7) rounds down, so indeed yes, you may actually wait less than what you expected.
tl;dr
Sleep is unreliable and only a very rough "hint" (not a requirement) that you wish to pass control back to the operating system for some time.
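If a lower bound really matters, one workaround is to re-check a steady clock after waking and sleep again for whatever is left. The sketch below is an illustration under that assumption, not something the standard library provides; sleep_at_least is a made-up name.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: keeps sleeping until steady_clock confirms that at
// least `requested` has passed. Returns the time actually spent.
template <class Rep, class Period>
std::chrono::steady_clock::duration
sleep_at_least(std::chrono::duration<Rep, Period> requested) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    const auto deadline = start + requested;
    // A sleep that returns early just leads to another, shorter sleep.
    for (auto now = start; now < deadline; now = clock::now()) {
        std::this_thread::sleep_for(deadline - now);
    }
    return clock::now() - start;
}
```

This only fixes waking up too early. Waking up late is still entirely up to the scheduler, and on a system that rounds very short sleeps down to nothing the loop degrades into a busy wait.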