How to create an efficient portable timer in C++?

For a school project I need to (re)create a fully functional multi-player version of R-Type without the use of the following external libraries:
Boost
SFML/SDL
Qt
Use of C++11 not allowed
Moreover, this game must be fully portable between Fedora (Linux) and Windows. I am in charge of the server, so the use of any graphics libraries is strictly prohibited.
In order to create a correct game loop I need a proper Timer class, similar to those found in SDL, which implement GetTicks() or GetElapsedTime() methods. I asked myself what the best way to create such a class would be; so far this is how I would start:
Creating a threaded class using pthread (which is portable)
Using the functions time() and difftime() in a loop to determine how much time has elapsed since the last tick.
Knowing that this class will be used by dozens of instances playing at the same time, should I use the Singleton design pattern? Will this method be accurate?
EDIT: Changed the explanation of my question to fit more my needs and to be more accurate on what I am allowed to use or not.

There's not an easy way to do what you're thinking. Luckily, there are easy ways to do what you want.
First: "Using the functions time() and difftime() in a loop to determine how much time was elapsed" is a terrible idea. That will use 100% of one of your CPUs and thus slow your program to a crawl. If you want to wait a specific amount of time (a "tick" of 1/60 of a second, or 1/10 of a second), then just wait. Don't spin in a thread.
header:
long long get_time();
long long get_freq();
void wait_for(long long nanoseconds);
cpp:
#ifdef _MSC_VER //windows compiler for windows machines
#include <windows.h>

long long get_time() {
    LARGE_INTEGER r;
    QueryPerformanceCounter(&r);
    return r.QuadPart;
}

long long get_freq() {
    LARGE_INTEGER r;
    QueryPerformanceFrequency(&r);
    return r.QuadPart;
}

void wait_for(long long nanoseconds)
{
    Sleep(nanoseconds / 1000000); // Sleep takes milliseconds
}
#endif
#ifdef __GNUC__ //linux compiler for linux machines
#include <time.h> // clock_gettime/clock_getres/nanosleep; link with -lrt on older glibc

long long get_time() {
    timespec r;
    clock_gettime(CLOCK_MONOTONIC, &r);
    return (long long)(r.tv_sec) * 1000000000LL + r.tv_nsec;
}

long long get_freq() {
    timespec r;
    clock_getres(CLOCK_MONOTONIC, &r);
    return r.tv_nsec;
}

void wait_for(long long nanoseconds)
{
    timespec r = {nanoseconds / 1000000000, nanoseconds % 1000000000};
    nanosleep(&r, NULL);
}
#endif
None of this is perfect (especially since I don't code for Linux), but this is the general concept whenever you have to deal with the OS (since it isn't in the standard and you can't use libraries). The Windows and GCC implementations can be in separate files if you like.
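For completeness, here is a rough sketch of how a fixed-timestep game loop could sit on top of these three functions. It assumes get_time() returns nanoseconds (true of the Linux version above; the Windows version returns raw ticks, which you would first convert using get_freq()), and game_running()/update_world() are placeholders for your own code, not part of the answer.

// Provided elsewhere by the game; declared here only so the sketch is self-contained.
bool game_running();
void update_world(long long elapsed_ns);

const long long TICK_NS = 1000000000LL / 60; // one logic tick = 1/60 s

void game_loop()
{
    long long last = get_time();
    while (game_running())
    {
        long long now = get_time();
        long long elapsed = now - last;
        if (elapsed < TICK_NS)
        {
            wait_for(TICK_NS - elapsed); // sleep off the remainder instead of spinning
            continue;
        }
        last = now;
        update_world(elapsed); // advance the simulation by 'elapsed' nanoseconds
    }
}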

Given the spec, pthreads are out: they are not going to run on Windows and they are not included in the standard.
If you can use C++11, you can use std::chrono for the timer; it is a high-precision timer with a fairly intuitive interface. It has basically been lifted from Boost (as has thread), so most of the Boost documentation translates to std::chrono.
(Or, for low precision, just use the C time library.) For threads you can use std::thread.
N.B. these are elements of the standard library; just run a test on your platforms to make sure the stdlib you are using supports them (you will need to enable C++11, usually with --std=c++0x).
I know for sure that GCC 4.6 has the majority of thread and chrono in, and it seems to be stable.
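In case you can use C++11 after all, a minimal sketch of that combination (std::chrono to measure, std::this_thread to wait), assuming your stdlib ships both:

#include <chrono>
#include <iostream>
#include <thread>

int main()
{
    typedef std::chrono::steady_clock clock;

    clock::time_point start = clock::now();
    std::this_thread::sleep_for(std::chrono::milliseconds(16)); // roughly one 60 Hz tick
    clock::time_point end = clock::now();

    std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - start).count()
              << " us elapsed\n";
    return 0;
}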

You probably want to create a wrapper around gettimeofday for Linux, which returns the number of microseconds since the Epoch, and GetTickCount for Windows, which returns the number of milliseconds since the system was started.
You can also use clock() on Windows, which returns seconds * CLOCKS_PER_SEC (yes, wall-clock time, not CPU time) since the process started.
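A possible shape for such a wrapper, sketched here as a single millisecond-resolution function on both platforms (the name ticks_ms and the choice of milliseconds are my assumptions, not part of the answer):

#ifdef _WIN32
#include <windows.h>
unsigned long long ticks_ms()
{
    return GetTickCount(); // milliseconds since the system was started
}
#else
#include <sys/time.h>
unsigned long long ticks_ms()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (unsigned long long)tv.tv_sec * 1000ULL + tv.tv_usec / 1000; // Epoch-based
}
#endif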

To get wall time you could use QueryPerformanceCounter on Windows and clock_gettime() with CLOCK_MONOTONIC on POSIX systems (CLOCK_MONOTONIC_RAW on Linux 2.6.28+).

Related

clock_t() overflow on 32-bit machine

For statistical purposes I want to accumulate the whole CPU time used by a function of a program, in microseconds. It must work on two systems, one where sizeof(clock_t) = 8 (RedHat) and another where sizeof(clock_t) = 4 (AIX). On both machines clock_t is a signed integer type and CLOCKS_PER_SEC = 1000000 (= one microsecond, but I don't make that assumption in the code; I use the macro instead).
What I have is equivalent to something like this (but encapsulated in some fancy classes):
typedef unsigned long long u64;

u64 accum_ticks = 0;

void f()
{
    clock_t beg = clock();
    work();
    clock_t end = clock();
    accum_ticks += (u64)(end - beg); // (1)
}

u64 elapsed_CPU_us()
{
    return accum_ticks * 1e+6 / CLOCKS_PER_SEC;
}
But, on the 32-bit AIX machine where clock_t is an int, it will overflow after 35m47s. Suppose that in some call beg equals 35m43s since the program started, and work() takes 10 CPU-seconds, causing end to overflow. Can I trust line (1) for this and subsequent calls to f() from now on? f() is guaranteed to never take more than 35 minutes of execution, of course.
In case I can't trust line (1) at all, even on my particular machine, what alternatives do I have that don't imply importing any third-party library? (I can't copy-paste libraries to the system, and I can't use <chrono> because it isn't available on our AIX machines.)
NOTE: I can use kernel headers and the precision I need is in microseconds.
An alternate suggestion: Don't use clock. It's so underspecified it's nigh impossible to write code that will work fully portably, handling possible wraparound for 32 bit integer clock_t, integer vs. floating point clock_t, etc. (and by the time you write it, you've written so much ugliness you've lost whatever simplicity clock provided).
Instead, use getrusage. It's not perfect, and it might do a little more than you strictly need, but:
The times it returns are guaranteed to operate relative to 0 (where the value returned by clock at the beginning of a program could be anything)
It lets you specify if you want to include stats from child processes you've waited on (clock either does or doesn't, in a non-portable fashion)
It separates the user and system CPU times; you can use either one, or both, your choice
Each time is expressed explicitly in terms of a pair of values, a time_t number of seconds, and a suseconds_t number of additional microseconds. Since it doesn't try to encode a total microsecond count into a single time_t/clock_t (which might be 32 bits), wraparound can't occur until you've hit at least 68 years of CPU time (if you manage that, on a system with 32 bit time_t, I want to know your IT folks; only way I can imagine hitting that is on a system with hundreds of cores, running weeks, and any such system would be 64 bit at this point).
The parts of the result you need are specified by POSIX, so it's portable to just about everywhere but Windows (where you're stuck writing preprocessor controlled code to switch to GetProcessTimes when compiled for Windows)
Conveniently, since you're on POSIX systems (I think?), clock is already expressed as microseconds, not real ticks (POSIX specifies that CLOCKS_PER_SEC equals 1000000), so the values already align. You can just rewrite your function as:
#include <sys/time.h>
#include <sys/resource.h>
static inline u64 elapsed(const struct timeval *beg, const struct timeval *end)
{
    return (end->tv_sec - beg->tv_sec) * 1000000ULL + (end->tv_usec - beg->tv_usec);
}

void f()
{
    struct rusage beg, end;
    // Not checking return codes, because only two documented failure cases are passing
    // an unmapped memory address for the struct addr or an invalid who flag, neither of which
    // we're doing, easily verified by inspection
    getrusage(RUSAGE_SELF, &beg);
    work();
    getrusage(RUSAGE_SELF, &end);
    accum_ticks += elapsed(&beg.ru_utime, &end.ru_utime);
    // And if you want to include system time as well, add:
    accum_ticks += elapsed(&beg.ru_stime, &end.ru_stime);
}

u64 elapsed_CPU_us()
{
    return accum_ticks; // It's already stored natively in microseconds
}
On Linux 2.6.26+, you can replace RUSAGE_SELF with RUSAGE_THREAD to limit to the resources used solely by the calling thread alone, not just the calling process (which might help if other threads are doing unrelated work and you don't want their stats polluting yours), in exchange for less portability.
Yes, it's a little more work to compute the time (two additions/subtractions and one multiplication by a constant, doubled if you want both user and system time, where clock in the simplest usage is a single subtraction), but:
Handling clock wraparound adds more work (and branching work, which this code doesn't have; admittedly, it's a fairly predictable branch), narrowing the gap
Integer multiplication is roughly as cheap as addition and subtraction on modern chips (recent x86-64 chips can issue an integer multiply every clock cycle), so you're not adding orders of magnitude more work, and in exchange you get more control, more guarantees, and greater portability
Note: You might see code using clock_gettime with clock ID CLOCK_PROCESS_CPUTIME_ID, which would simplify your code when you just want total CPU time, not split up by user vs. system, without all the other stuff getrusage provides (perhaps it would be faster, simply by virtue of retrieving less data). Unfortunately, while clock_gettime is guaranteed by POSIX, the CLOCK_PROCESS_CPUTIME_ID clock ID is not, so you can't use it reliably on all POSIX systems (FreeBSD at least seems to lack it). All the parts of getrusage we're relying on are fully standard, so it's safe.
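For reference, the clock_gettime variant mentioned above would look roughly like this, on the systems that actually provide CLOCK_PROCESS_CPUTIME_ID (the function name here is mine, chosen for the sketch):

#include <time.h>

// Total process CPU time in microseconds, or 0 if the clock is unavailable.
unsigned long long process_cpu_us()
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0)
        return 0;
    return (unsigned long long)ts.tv_sec * 1000000ULL + ts.tv_nsec / 1000;
}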

How to time event in C++?

I'd like to be able to get number of nanoseconds it takes to do something in my C++ program.
Object creation, time for a function to do its thing, etc.
In Java, we'd do something along the lines of:
long now = System.currentTimeMillis();
// stuff
long diff = (System.currentTimeMillis() - now);
How would you do the same in C++?
The <chrono> library in standard C++ provides the best way to do this. This library provides a standard, type safe, generic API for clocks.
#include <chrono>
#include <iostream>

int main() {
    using std::chrono::duration_cast;
    using std::chrono::nanoseconds;
    typedef std::chrono::high_resolution_clock clock;

    auto start = clock::now();
    // stuff
    auto end = clock::now();
    std::cout << duration_cast<nanoseconds>(end - start).count() << "ns\n";
}
The actual resolution of the clock will vary between implementations, but this code will always show results in nanoseconds, as accurately as possible given the implementation's tick period.
In C++11 you can do it using the chrono library, where:
Class template std::chrono::duration represents a time interval.
It consists of a count of ticks of type Rep and a tick period, where the tick period is a compile-time rational constant representing the number of seconds from one tick to the next.
Currently implemented in GCC 4.5.1 (not yet in VC++). See the sample code from cppreference.com, run on Ideone.com: execution time of a function call.
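A small illustration of that Rep/period design (not from the linked sample): the tick period is a compile-time ratio, and lossy conversions must go through duration_cast.

#include <chrono>
#include <iostream>

int main()
{
    std::chrono::milliseconds ms(1500);      // integer count, tick period = 1/1000 s
    std::chrono::duration<double> s = ms;    // tick period = 1 s, floating-point count
    std::cout << s.count() << " s\n";        // prints 1.5

    // Going to a coarser integer period is lossy, so it requires duration_cast:
    std::cout << std::chrono::duration_cast<std::chrono::seconds>(ms).count() << " s\n"; // prints 1
    return 0;
}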
Take a look at clock and clock_t. For the resolution you're talking about, I don't think there's native support in C++. To get meaningful values, you'll have to time multiple calls or constructions, or use a profiler (preferred).
I asked this exact question earlier today. The best solution I have at the moment is to use SDL, and call:
uint32 a_time = SDL_GetTicks(); // Return uint32 count of milliseconds since SDL_Init was called
Although this is probably going to give you lots of overhead, even if you just init SDL with the timer functionality (SDL_Init(SDL_INIT_TIMER)).
Hope this helps you - I settled for this as a solution because it is portable.
Asked and answered many times.
How do I do High Resolution Timing in C++ on Windows?
C++ obtaining milliseconds time on Linux — clock() doesn't seem to work properly
High Resolution Timing Part of Your Code
High resolution timer with C++ and Linux?
If you're using C++11 you can consider chrono.

What is the best, most accurate timer in C++?

In C++11 you can portably get to the highest resolution timer with:
#include <iostream>
#include <chrono>
#include "chrono_io"

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    auto t1 = Clock::now();
    auto t2 = Clock::now();
    std::cout << t2 - t1 << '\n';
}
Example output:
74 nanoseconds
"chrono_io" is an extension to ease I/O issues with these new types and is freely available here.
There is also an implementation of <chrono> available in boost (might still be on tip-of-trunk, not sure it has been released).
The answer to this is platform-specific. The operating system is responsible for keeping track of timing and consequently, the C++ language itself provides no language constructs or built-in functions for doing this.
However, here are some resources for platform-dependent timers:
Windows API - SetTimer: http://msdn.microsoft.com/en-us/library/ms644906(v=vs.85).aspx
Unix - setitimer: http://linux.die.net/man/2/setitimer
A cross-platform solution might be boost::asio::deadline_timer.
Under Windows it would be QueryPerformanceCounter, though seeing as you didn't specify any conditions, it's also possible to have an external ultra-high-resolution timer that has a C++ interface for the driver.
The C++ standard doesn't say a whole lot about time. There are a few features inherited from C via the <ctime> header.
The function clock is the only way to get sub-second precision, but the precision may be as low as one second (its resolution is defined by the macro CLOCKS_PER_SEC). Also, it does not measure real time at all, but processor time.
The function time measures real time, but (usually) only to the nearest second.
To measure real time with subsecond precision, you need a nonstandard library.
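To make the distinction concrete, here is what those two <ctime> facilities look like in use (a sketch; the loop is just a dummy workload so there is something to time):

#include <cstdio>
#include <ctime>

int main()
{
    std::clock_t c0 = std::clock();    // processor time, CLOCKS_PER_SEC ticks per second
    std::time_t  t0 = std::time(NULL); // real (calendar) time, whole seconds

    volatile double sink = 0;          // dummy workload
    for (long i = 0; i < 100000000L; ++i) sink += i * 0.5;

    double cpu_seconds  = double(std::clock() - c0) / CLOCKS_PER_SEC;
    double real_seconds = std::difftime(std::time(NULL), t0);
    std::printf("cpu: %f s, real (whole seconds only): %.0f s\n", cpu_seconds, real_seconds);
    return 0;
}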

clock() vs getsystemtime()

I developed a class for calculations on multiple threads, and only one instance of this class is used per thread. I also want to measure the duration of the calculations by iterating over a container of this class from another thread. The application is Win32. The thing is, I have read that QueryPerformanceCounter is useful when comparing measurements on a single thread. Because I cannot use it for my problem, I'm thinking of clock() or GetSystemTime(). It is sad that both methods have a 'resolution' of milliseconds (since CLOCKS_PER_SEC is 1000 on Win32). Which method should I use, or to generalize, is there a better option for me?
As a rule I have to take the measurements outside the working thread.
Here is some code as an example.
unsigned long GetCounter()
{
    SYSTEMTIME ww;
    GetSystemTime(&ww);
    return ww.wMilliseconds + 1000 * ww.wSeconds; // note: wraps around every minute
    // or
    return clock();
}

class WorkClass
{
public:
    bool is_working;
    unsigned long counter;
    HANDLE threadHandle;

    void DoWork()
    {
        threadHandle = GetCurrentThread(); // pseudo-handle; a real handle needs DuplicateHandle
        is_working = true;
        counter = GetCounter();
        // Do some work
        is_working = false;
    }
};

void CheckDurations() // will work on another thread
{
    for(size_t i = 0; i < vector_of_workClass.size(); ++i)
    {
        WorkClass & wc = vector_of_workClass[i];
        if(wc.is_working)
        {
            unsigned long dur = GetCounter() - wc.counter;
            ReportDuration(wc, dur);
            if(dur > someLimitValue)
                TerminateThread(wc.threadHandle, 1);
        }
    }
}
QueryPerformanceCounter is fine for multithreaded applications. The processor instruction that may be used (rdtsc) can potentially provide invalid results when called on different processors.
I recommend reading "Game Timing and Multicore Processors".
For your specific application, the problem it appears you are trying to solve is using a timeout on some potentially long-running threads. The proper solution to this would be to use the WaitForMultipleObjects function with a timeout value. If the time expires, then you can terminate any threads that are still running - ideally by setting a flag that each thread checks, but TerminateThread may be suitable.
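A rough sketch of that timeout pattern (the handles are assumed to come from CreateThread/_beginthreadex rather than GetCurrentThread, and the function name is mine; TerminateThread stays the last resort the answer describes):

#include <windows.h>
#include <vector>

// Wait up to timeout_ms for all worker threads (at most MAXIMUM_WAIT_OBJECTS of them),
// then forcibly end any stragglers.
void wait_or_kill(const std::vector<HANDLE>& threads, DWORD timeout_ms)
{
    DWORD r = WaitForMultipleObjects((DWORD)threads.size(), &threads[0],
                                     TRUE /* wait for all */, timeout_ms);
    if (r == WAIT_TIMEOUT)
    {
        for (size_t i = 0; i < threads.size(); ++i)
        {
            if (WaitForSingleObject(threads[i], 0) == WAIT_TIMEOUT) // still running?
                TerminateThread(threads[i], 1); // last resort; prefer a "please stop" flag
        }
    }
}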
"both methods have a precision of milliseconds"
They don't. They have a resolution of a millisecond; the precision is far worse. Most machines increment the value only at intervals of 15.625 msec. That's a heck of a lot of CPU cycles, usually not good enough to get any reliable indicator of code efficiency.
QPC does much better, no idea why you couldn't use it. A profiler is the standard tool to measure code efficiency. Beats taking dependencies you don't want.
QueryPerformanceCounter should give you the best precision, but there are issues when the function runs on different processors (you get a different result on each processor). So when running in a thread, you will experience shifts when the thread switches processors. To solve this you can set processor affinity for the thread that measures time.
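To illustrate that affinity workaround (a sketch, with error handling omitted and the function name chosen for the example): pin the calling thread to one core around the QueryPerformanceCounter read, then restore the old mask.

#include <windows.h>

LONGLONG read_qpc_pinned()
{
    DWORD_PTR old_mask = SetThreadAffinityMask(GetCurrentThread(), 1); // core 0 only
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    SetThreadAffinityMask(GetCurrentThread(), old_mask); // restore the previous affinity
    return t.QuadPart;
}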
GetSystemTime gets an absolute time, clock is a relative time but both measure elapsed time, not CPU time related to the actual thread/process.
Of course clock() is more portable. Having said that I use clock_gettime on Linux because I can get both elapsed and thread CPU time with that call.
boost has some time functions that you could use that will run on multiple platforms if you want platform independent code.

C++ high precision time measurement in Windows

I'm interested in measuring a specific point in time down to the nanosecond using C++ in Windows. Is this possible? If it isn't, is it possible to get the specific time in microseconds at least? Any library should do, unless I suppose it's possible with managed code.
thanks
If you have a threaded application running on a multicore computer QueryPerformanceCounter can (and will) return different values depending on which core the code is executing on. See this MSDN article. (rdtsc has the same problem)
This is not just a theoretical problem; we ran into it with our application and had to conclude that the only reliable time source is timeGetTime, which only has ms precision (which fortunately was sufficient in our case). We also tried fixing the thread affinity for our threads to guarantee that each thread always got a consistent value from QueryPerformanceCounter; this worked, but it absolutely killed the performance of the application.
To sum things up, there isn't a reliable timer on Windows that can be used to time things with microsecond precision (at least not when running on a multicore computer).
Windows has a high-performance counter API.
You need to get the ticks form QueryPerformanceCounter and divide by the frequency of the processor, provided by QueryPerformanceFrequency.
LARGE_INTEGER frequency;
if (::QueryPerformanceFrequency(&frequency) == FALSE)
throw "foo";
LARGE_INTEGER start;
if (::QueryPerformanceCounter(&start) == FALSE)
throw "foo";
// Calculation.
LARGE_INTEGER end;
if (::QueryPerformanceCounter(&end) == FALSE)
throw "foo";
double interval = static_cast<double>(end.QuadPart - start.QuadPart) / frequency.QuadPart;
This interval should be in seconds.
For future reference: with Windows Vista, Server 2008 and higher, Windows requires hardware support for the HPET (High Precision Event Timer). This operates independently of the CPU and its clock and frequency. It is possible to obtain times with accuracies down to the sub-microsecond.
In order to implement this, you DO need to use QPC/QPF. The problem is that QPF (frequency) is a NOMINAL value, so using the raw calls will cause time drift that can exceed minutes per day. In order to account for this, you have to measure the actual frequency and check for its drift over time, as heat and other physical operating conditions will affect it.
An article that describes this can be found on MSDN (circa 2004!) at this link.
http://msdn.microsoft.com/en-us/magazine/cc163996.aspx
I did implement something similar to this myself (and just found the above link today!) but prefer not to use "microsecond time" because the QPC call itself is rather lengthy compared to other Windows calls such as GetSystemTimeAsFileTime, and synchronization adds more overhead. So I prefer to use millisecond timestamps (approx 70% less call time than using QPC) especially when I'm trying to get the time hundreds of thousands of times per second.
The best choice are the functions QueryPerformanceCounter and QueryPerformanceFrequency.
Microsoft has just recently (2014) released more detailed information about QueryPerformanceCounter:
See Acquiring high-resolution time stamps (MSDN 2014) for the details.
This is a comprehensive article with lots of examples and detailed description. A must read for users of QPC.
I think microseconds is a bit unreasonable (without hardware assistance). Milliseconds is doable, but even then not that accurate due to various nefarious counter resolution issues. Regardless, I include my own timer class (based on std::chrono) for your consideration:
#include <type_traits>
#include <chrono>

class Stopwatch final
{
public:
    using elapsed_resolution = std::chrono::milliseconds;

    Stopwatch()
    {
        Reset();
    }

    void Reset()
    {
        reset_time = clock.now();
    }

    elapsed_resolution Elapsed()
    {
        return std::chrono::duration_cast<elapsed_resolution>(clock.now() - reset_time);
    }

private:
    std::chrono::high_resolution_clock clock;
    std::chrono::high_resolution_clock::time_point reset_time;
};
Note that under the hood on Windows std::chrono::high_resolution_clock is using QueryPerformanceCounter, so it's just the same but portable.
MSDN claims that -
A Scenario object is a highly-accurate timer that logs ETW events
(Event Tracing for Windows) when you start and stop it. It's designed
to be used for performance instrumentation and benchmarking, and comes
in both C# and C++ versions. ... As a rule of thumb on modern
hardware, a call to Begin() or End() takes on the order of a
microsecond, and the resulting timestamps are accurate to 100ns (i.e.
0.1 microseconds). ... Versions are available for both .NET 3.5 (written in C#), and native C++, and run on both x86 and x64
platforms. The Scenario class was originally developed using Visual
Studio 2008, but is now targeted at developers using Visual Studio
2010.
From the Scenario Home Page. As far as I know, it was provided by the same people as PPL.
Additionally, you can read High Resolution Clocks and Timers for Performance Measurement in Windows.
In newer Windows versions you probably want GetSystemTimePreciseAsFileTime. See Acquiring high resolution timestamps.
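A sketch of that call (Windows 8 / Server 2012 and later; the function name in the sketch is mine). The FILETIME it fills in is expressed in 100-nanosecond units since January 1, 1601:

#include <windows.h>

unsigned long long precise_system_time_100ns()
{
    FILETIME ft;
    GetSystemTimePreciseAsFileTime(&ft);

    ULARGE_INTEGER u;
    u.LowPart  = ft.dwLowDateTime;
    u.HighPart = ft.dwHighDateTime;
    return u.QuadPart;
}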
Lots of this varies a rather unfortunate amount based on hardware and OS version.
If you can use the Visual Studio 2012 compiler or higher, you can use the std::chrono standard library.
#include <chrono>
::std::chrono::steady_clock::time_point time = std::chrono::steady_clock::now();
Note that the MSVC 2012 version may be only 1ms accurate. Newer versions should be accurate up to a microsecond.
You can use the Performance Counter API as Konrad Rudolf proposed, but should be warned that it is based on the CPU frequency. This frequency is not stable when e.g. a power save mode is enabled. If you want to use this API, make sure the CPU is at a constant frequency.
Otherwise, you can create some kind of 'statistical' system, correlating the CPU ticks to the PC BIOS clock. The latter is way less precise, but constant.
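Measuring an interval with that clock looks like this (the section in the middle stands in for whatever you want to time):

#include <chrono>
#include <iostream>

int main()
{
    auto start = std::chrono::steady_clock::now();

    // ... the code being measured goes here ...

    auto end = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    std::cout << us.count() << " us\n";
    return 0;
}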
Using QueryPerformanceCounter (for Windows).
With respect to Konrad Rudolph's answer, note that in my experience the frequency of the performance counter is around 3.7MHz, so sub-microsecond, but certainly not nanosecond precision. The actual frequency is hardware (and power-save mode) dependent. Nanosecond precision is somewhat unreasonable in any case since interrupt latencies and process/thread context switching times are far longer than that, and that is also the order of magnitude of individual machine instructions.
rdtsc instruction is the most accurate.
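If you do go that route, the compiler intrinsics are the usual way to issue it (x86/x64 only; note the count is raw CPU cycles, not wall time, and may not be synchronized across cores). A minimal sketch:

#ifdef _MSC_VER
#include <intrin.h>       // __rdtsc on MSVC
#else
#include <x86intrin.h>    // __rdtsc on GCC/Clang
#endif

unsigned long long read_tsc()
{
    return __rdtsc();
}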
Here is a Timer class that will work both for Windows and Linux :
#ifndef INCLUDE_CTIMER_HPP_
#define INCLUDE_CTIMER_HPP_

#if defined(_MSC_VER)
#  define NOMINMAX // workaround a bug in windows.h
#  include <windows.h>
#else
#  include <sys/time.h>
#endif

namespace Utils
{
    class CTimer
    {
    private:
#if defined(_MSC_VER)
        LARGE_INTEGER m_depart;
#else
        timeval m_depart;
#endif

    public:
        inline void start()
        {
#if defined(_MSC_VER)
            QueryPerformanceCounter(&m_depart);
#else
            gettimeofday(&m_depart, 0);
#endif
        }

        inline float GetSecondes() const
        {
#if defined(_MSC_VER)
            LARGE_INTEGER now;
            LARGE_INTEGER freq;
            QueryPerformanceCounter(&now);
            QueryPerformanceFrequency(&freq);
            return (now.QuadPart - m_depart.QuadPart) / static_cast<float>(freq.QuadPart);
#else
            timeval now;
            gettimeofday(&now, 0);
            return now.tv_sec - m_depart.tv_sec + (now.tv_usec - m_depart.tv_usec) / 1000000.0f;
#endif
        }
    };
}

#endif // INCLUDE_CTIMER_HPP_
Thanks for the input... though I couldn't get nanosecond or microsecond resolution (which would have been nice), I was able to come up with this. Maybe someone else will find it useful.
// timeGetTime comes from the Windows multimedia timer (link with winmm.lib)
#include <windows.h>

class N_Script_Timer
{
public:
    N_Script_Timer()
    {
        running = false;
        milliseconds = 0;
        seconds = 0;
        start_t = 0;
        end_t = 0;
    }

    void Start()
    {
        if(running) return;
        running = true;
        start_t = timeGetTime();
    }

    void End()
    {
        if(!running) return;
        running = false;
        end_t = timeGetTime();
        milliseconds = end_t - start_t;
        seconds = milliseconds / (float)1000;
    }

    float milliseconds;
    float seconds;

private:
    unsigned long start_t;
    unsigned long end_t;
    bool running;
};