I'm looking to implement a simple timer mechanism in C++. The code should work in Windows and Linux. The resolution should be as precise as possible (at least millisecond accuracy). This will be used to simply track the passage of time, not to implement any kind of event-driven design. What is the best tool to accomplish this?
Updated answer for an old question:
In C++11 you can portably get to the highest resolution timer with:
#include <iostream>
#include <chrono>
#include "chrono_io"

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    auto t1 = Clock::now();
    auto t2 = Clock::now();
    std::cout << t2-t1 << '\n';
}
Example output:
74 nanoseconds
"chrono_io" is an extension to ease I/O issues with these new types and is freely available here.
There is also an implementation of <chrono> available in boost (might still be on tip-of-trunk, not sure it has been released).
Update
This is in response to Ben's comment below that subsequent calls to std::chrono::high_resolution_clock take several milliseconds in VS11. Below is a <chrono>-compatible workaround. However it only works on Intel hardware, you need to dip into inline assembly (syntax to do that varies with compiler), and you have to hardwire the machine's clock speed into the clock:
#include <chrono>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <sys/types.h>    // for sysctl() on OS X
#include <sys/sysctl.h>

struct clock
{
    typedef unsigned long long rep;
    typedef std::ratio<1, 2800000000> period; // My machine is 2.8 GHz
    typedef std::chrono::duration<rep, period> duration;
    typedef std::chrono::time_point<clock> time_point;
    static const bool is_steady = true;

    static time_point now() noexcept
    {
        unsigned lo, hi;
        asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
        return time_point(duration(static_cast<rep>(hi) << 32 | lo));
    }

private:

    static
    unsigned
    get_clock_speed()
    {
        int mib[] = {CTL_HW, HW_CPU_FREQ};
        const std::size_t namelen = sizeof(mib)/sizeof(mib[0]);
        unsigned freq;
        size_t freq_len = sizeof(freq);
        if (sysctl(mib, namelen, &freq, &freq_len, nullptr, 0) != 0)
            return 0;
        return freq;
    }

    static
    bool
    check_invariants()
    {
        static_assert(1 == period::num, "period must be 1/freq");
        assert(get_clock_speed() == period::den);
        static_assert(std::is_same<rep, duration::rep>::value,
                      "rep and duration::rep must be the same type");
        static_assert(std::is_same<period, duration::period>::value,
                      "period and duration::period must be the same type");
        static_assert(std::is_same<duration, time_point::duration>::value,
                      "duration and time_point::duration must be the same type");
        return true;
    }

    static const bool invariants;
};

const bool clock::invariants = clock::check_invariants();
So it isn't portable. But if you want to experiment with a high resolution clock on your own Intel hardware, it doesn't get finer than this. Though be forewarned, today's clock speeds can dynamically change (they aren't really a compile-time constant). And with a multiprocessor machine you can even get time stamps from different processors. But still, experiments on my hardware work fairly well. If you're stuck with millisecond resolution, this could be a workaround.
This clock has a duration in terms of your cpu's clock speed (as you reported it). I.e. for me this clock ticks once every 1/2,800,000,000 of a second. If you want to, you can convert this to nanoseconds (for example) with:
using std::chrono::nanoseconds;
using std::chrono::duration_cast;
auto t0 = clock::now();
auto t1 = clock::now();
nanoseconds ns = duration_cast<nanoseconds>(t1-t0);
The conversion will truncate fractions of a cpu cycle to form the nanosecond. Other rounding modes are possible, but that's a different topic.
For me this will return a duration as low as 18 clock ticks, which truncates to 6 nanoseconds.
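For illustration, here is a small sketch of truncation versus round-to-nearest; std::chrono::round is C++17 and is not part of the answer above, just an aside on the "other rounding modes":

#include <chrono>

int main()
{
    using namespace std::chrono;

    duration<double> d(2.7e-9);                      // 2.7 ns as a floating-point duration
    auto truncated = duration_cast<nanoseconds>(d);  // 2 ns: truncates toward zero
    auto nearest   = round<nanoseconds>(d);          // 3 ns: rounds to nearest (C++17)
    (void)truncated;
    (void)nearest;
}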
I've added some "invariant checking" to the above clock, the most important of which is checking that the clock::period is correct for the machine. Again, this is not portable code, but if you're using this clock, you've already committed to that. The private get_clock_speed() function shown here gets the maximum cpu frequency on OS X, and that should be the same number as the constant denominator of clock::period.
Adding this will save you a little debugging time when you port this code to your new machine and forget to update the clock::period to the speed of your new machine. All of the checking is done either at compile-time or at program startup time. So it won't impact the performance of clock::now() in the least.
For C++03:
Boost.Timer might work, but it depends on the C function clock and so may not have good enough resolution for you.
Boost.Date_Time includes a ptime class that's been recommended on Stack Overflow before. See its docs on microsec_clock::local_time and microsec_clock::universal_time, but note its caveat that "Win32 systems often do not achieve microsecond resolution via this API."
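For reference, a minimal sketch using microsec_clock, assuming Boost.Date_Time is available (the work being timed is elided):

#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main()
{
    using boost::posix_time::microsec_clock;
    using boost::posix_time::ptime;
    using boost::posix_time::time_duration;

    ptime t1 = microsec_clock::universal_time();

    // ... work to be timed ...

    ptime t2 = microsec_clock::universal_time();
    time_duration d = t2 - t1;
    std::cout << d.total_microseconds() << " us\n";
}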
STLsoft provides, among other things, thin cross-platform (Windows and Linux/Unix) C++ wrappers around OS-specific APIs. Its performance library has several classes that would do what you need. (To make it cross platform, pick a class like performance_counter that exists in both the winstl and unixstl namespaces, then use whichever namespace matches your platform.)
For C++11 and above:
The std::chrono library has this functionality built in. See this answer by @HowardHinnant for details.
Matthew Wilson's STLSoft libraries provide several timer types, with congruent interfaces so you can plug-and-play. Amongst the offerings are timers that are low-cost but low-resolution, and ones that are high-resolution but have high cost. There are also ones for measuring per-thread times and ones for measuring per-process times, as well as ones that measure elapsed times.
There's an exhaustive article covering it in Dr. Dobb's from some years ago, although it only covers the Windows ones, those defined in the WinSTL sub-project. STLSoft also provides for UNIX timers in the UNIXSTL sub-project, and you can use the "PlatformSTL" one, which includes the UNIX or Windows one as appropriate, as in:
#include <platformstl/performance/performance_counter.hpp>
#include <iostream>

int main()
{
    platformstl::performance_counter c;
    c.start();
    for(int i = 0; i < 1000000000; ++i);
    c.stop();
    std::cout << "time (s): " << c.get_seconds() << std::endl;
    std::cout << "time (ms): " << c.get_milliseconds() << std::endl;
    std::cout << "time (us): " << c.get_microseconds() << std::endl;
}
HTH
The STLSoft open-source library provides quite a good timer on both Windows and Linux platforms. If you want to implement it on your own, just have a look at their sources.
The ACE library has portable high resolution timers also.
Doxygen for high res timer:
http://www.dre.vanderbilt.edu/Doxygen/5.7.2/html/ace/a00244.html
I have seen this implemented a few times as closed-source in-house solutions, which all resorted to #ifdef solutions around native Windows hi-res timers on the one hand and Linux kernel timers using struct timeval (see man timeradd) on the other hand.
You can abstract this and a few Open Source projects have done it -- the last one I looked at was the CoinOR class CoinTimer but there are surely more of them.
I highly recommend the boost::posix_time library for that. It supports timers at various resolutions down to microseconds, I believe.
SDL2 has an excellent cross-platform high-resolution timer. If however you need sub-millisecond accuracy, I wrote a very small cross-platform timer library here.
It is compatible with both C++03 and C++11/higher versions of C++.
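For reference, here is a minimal sketch of SDL2's performance-counter API (assuming SDL2 is installed and linked; the work being timed is elided):

#include <SDL.h>
#include <iostream>

int main(int, char**)
{
    SDL_Init(SDL_INIT_TIMER);

    Uint64 start = SDL_GetPerformanceCounter();

    // ... work to be timed ...

    Uint64 end  = SDL_GetPerformanceCounter();
    Uint64 freq = SDL_GetPerformanceFrequency();   // ticks per second

    double seconds = double(end - start) / double(freq);
    std::cout << seconds * 1000.0 << " ms\n";

    SDL_Quit();
    return 0;
}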
I found this, which looks promising and is extremely straightforward; I'm not sure if there are any drawbacks:
https://gist.github.com/ForeverZer0/0a4f80fc02b96e19380ebb7a3debbee5
/* ----------------------------------------------------------------------- */
/*
  Easy embeddable cross-platform high resolution timer function. For each
  platform we select the high resolution timer. You can call the 'ns()'
  function in your file after embedding this.
*/
#include <stdint.h>
#if defined(__linux)
#  define HAVE_POSIX_TIMER
#  include <time.h>
#  ifdef CLOCK_MONOTONIC
#    define CLOCKID CLOCK_MONOTONIC
#  else
#    define CLOCKID CLOCK_REALTIME
#  endif
#elif defined(__APPLE__)
#  define HAVE_MACH_TIMER
#  include <mach/mach_time.h>
#elif defined(_WIN32)
#  define WIN32_LEAN_AND_MEAN
#  include <windows.h>
#endif

static uint64_t ns() {
    static uint64_t is_init = 0;
#if defined(__APPLE__)
    static mach_timebase_info_data_t info;
    if (0 == is_init) {
        mach_timebase_info(&info);
        is_init = 1;
    }
    uint64_t now;
    now = mach_absolute_time();
    now *= info.numer;
    now /= info.denom;
    return now;
#elif defined(__linux)
    static struct timespec linux_rate;
    if (0 == is_init) {
        clock_getres(CLOCKID, &linux_rate);
        is_init = 1;
    }
    uint64_t now;
    struct timespec spec;
    clock_gettime(CLOCKID, &spec);
    /* use integer arithmetic so large tv_sec values don't lose precision in a double */
    now = (uint64_t)spec.tv_sec * 1000000000ULL + (uint64_t)spec.tv_nsec;
    return now;
#elif defined(_WIN32)
    static LARGE_INTEGER win_frequency;
    if (0 == is_init) {
        QueryPerformanceFrequency(&win_frequency);
        is_init = 1;
    }
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    return (uint64_t) ((1e9 * now.QuadPart) / win_frequency.QuadPart);
#endif
}
/* ----------------------------------------------------------------------- */
The first answer to C++ library questions is generally BOOST: http://www.boost.org/doc/libs/1_40_0/libs/timer/timer.htm. Does this do what you want? Probably not but it's a start.
The problem is you want portable and timer functions are not universal in OSes.
STLSoft have a Performance Library, which includes a set of timer classes, some that work for both UNIX and Windows.
I am not sure about your requirement. If you want to calculate a time interval, please see the thread below:
Calculating elapsed time in a C program in milliseconds
Late to the party here, but I'm working in a legacy codebase that can't be upgraded to C++11 yet. Nobody on our team is very skilled in C++, so adding a library like STL is proving difficult (on top of potential concerns others have raised about deployment issues). I really needed an extremely simple cross-platform timer that could live by itself without anything beyond bare-bones standard system libraries. Here's what I found:
http://www.songho.ca/misc/timer/timer.html
Reposting the entire source here just so it doesn't get lost if the site ever dies:
//////////////////////////////////////////////////////////////////////////////
// Timer.cpp
// =========
// High Resolution Timer.
// This timer is able to measure the elapsed time with 1 micro-second accuracy
// in both Windows, Linux and Unix system
//
// AUTHOR: Song Ho Ahn (song.ahn#gmail.com) - http://www.songho.ca/misc/timer/timer.html
// CREATED: 2003-01-13
// UPDATED: 2017-03-30
//
// Copyright (c) 2003 Song Ho Ahn
//////////////////////////////////////////////////////////////////////////////
#include "Timer.h"
#include <stdlib.h>
///////////////////////////////////////////////////////////////////////////////
// constructor
///////////////////////////////////////////////////////////////////////////////
Timer::Timer()
{
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceFrequency(&frequency);
startCount.QuadPart = 0;
endCount.QuadPart = 0;
#else
startCount.tv_sec = startCount.tv_usec = 0;
endCount.tv_sec = endCount.tv_usec = 0;
#endif
stopped = 0;
startTimeInMicroSec = 0;
endTimeInMicroSec = 0;
}
///////////////////////////////////////////////////////////////////////////////
// destructor
///////////////////////////////////////////////////////////////////////////////
Timer::~Timer()
{
}
///////////////////////////////////////////////////////////////////////////////
// start timer.
// startCount will be set at this point.
///////////////////////////////////////////////////////////////////////////////
void Timer::start()
{
stopped = 0; // reset stop flag
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceCounter(&startCount);
#else
gettimeofday(&startCount, NULL);
#endif
}
///////////////////////////////////////////////////////////////////////////////
// stop the timer.
// endCount will be set at this point.
///////////////////////////////////////////////////////////////////////////////
void Timer::stop()
{
stopped = 1; // set timer stopped flag
#if defined(WIN32) || defined(_WIN32)
QueryPerformanceCounter(&endCount);
#else
gettimeofday(&endCount, NULL);
#endif
}
///////////////////////////////////////////////////////////////////////////////
// compute elapsed time in micro-second resolution.
// other getElapsedTime will call this first, then convert to correspond resolution.
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInMicroSec()
{
#if defined(WIN32) || defined(_WIN32)
if(!stopped)
QueryPerformanceCounter(&endCount);
startTimeInMicroSec = startCount.QuadPart * (1000000.0 / frequency.QuadPart);
endTimeInMicroSec = endCount.QuadPart * (1000000.0 / frequency.QuadPart);
#else
if(!stopped)
gettimeofday(&endCount, NULL);
startTimeInMicroSec = (startCount.tv_sec * 1000000.0) + startCount.tv_usec;
endTimeInMicroSec = (endCount.tv_sec * 1000000.0) + endCount.tv_usec;
#endif
return endTimeInMicroSec - startTimeInMicroSec;
}
///////////////////////////////////////////////////////////////////////////////
// divide elapsedTimeInMicroSec by 1000
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInMilliSec()
{
return this->getElapsedTimeInMicroSec() * 0.001;
}
///////////////////////////////////////////////////////////////////////////////
// divide elapsedTimeInMicroSec by 1000000
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTimeInSec()
{
return this->getElapsedTimeInMicroSec() * 0.000001;
}
///////////////////////////////////////////////////////////////////////////////
// same as getElapsedTimeInSec()
///////////////////////////////////////////////////////////////////////////////
double Timer::getElapsedTime()
{
return this->getElapsedTimeInSec();
}
and the header file:
//////////////////////////////////////////////////////////////////////////////
// Timer.h
// =======
// High Resolution Timer.
// This timer is able to measure the elapsed time with 1 micro-second accuracy
// in both Windows, Linux and Unix system
//
// AUTHOR: Song Ho Ahn (song.ahn#gmail.com) - http://www.songho.ca/misc/timer/timer.html
// CREATED: 2003-01-13
// UPDATED: 2017-03-30
//
// Copyright (c) 2003 Song Ho Ahn
//////////////////////////////////////////////////////////////////////////////
#ifndef TIMER_H_DEF
#define TIMER_H_DEF
#if defined(WIN32) || defined(_WIN32) // Windows system specific
#include <windows.h>
#else // Unix based system specific
#include <sys/time.h>
#endif
class Timer
{
public:
Timer(); // default constructor
~Timer(); // default destructor
void start(); // start timer
void stop(); // stop the timer
double getElapsedTime(); // get elapsed time in second
double getElapsedTimeInSec(); // get elapsed time in second (same as getElapsedTime)
double getElapsedTimeInMilliSec(); // get elapsed time in milli-second
double getElapsedTimeInMicroSec(); // get elapsed time in micro-second
protected:
private:
double startTimeInMicroSec; // starting time in micro-second
double endTimeInMicroSec; // ending time in micro-second
int stopped; // stop flag
#if defined(WIN32) || defined(_WIN32)
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER startCount; //
LARGE_INTEGER endCount; //
#else
timeval startCount; //
timeval endCount; //
#endif
};
#endif // TIMER_H_DEF
If one is using the Qt framework in the project, the best solution is probably to use QElapsedTimer.
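For reference, a minimal QElapsedTimer sketch (assuming Qt Core is available; the work being timed is elided):

#include <QElapsedTimer>
#include <QDebug>

int main()
{
    QElapsedTimer timer;
    timer.start();

    // ... work to be timed ...

    qDebug() << "elapsed:" << timer.elapsed() << "ms,"
             << timer.nsecsElapsed() << "ns";
    return 0;
}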
Related
I've been writing an OpenCL program that I first wrote on my MacBook Pro, and since my desktop computer is stronger I wanted to port the code and see if there is any improvement.
The same code ran for:
Mac: 0.055452s
Win7: 0.359s
The specifications of both computers are:
Mac : 2.6GHz Intel Core i5, 8GB 1600MHz DDR3, Intel Iris 1536MB
PC : 3.3GHz Intel Core i5-2500k, 8GB 1600MHz DDR3, AMD Radeon HD 6900 Series
Now as you can see the code ran on my Mac almost 10x faster than on my desktop PC.
I timed the code using
#include<ctime>
clock_t begin = clock();
....// Entire main file
float timeTaken = (float)(clock() - begin) / CLOCKS_PER_SEC;
cout << "Time taken: " << timeTaken << endl;
If I am not mistaken, both the CPU and the GPU are stronger on the PC. I was able to run Battlefield 3 on Ultra settings with this desktop computer.
The only difference might be that Visual Studio on the PC compiles with another compiler? I used g++ on my Mac; I'm not sure what Visual Studio uses.
These results don't make sense to me. What do you guys think? If you want to check out the code I can post the github link
EDIT: The following github link shows the code https://github.com/Batkow/OpenCL/tree/master/OpenCL . PSO_V2 uses the type of coding used in the tutorial from : https://www.fixstars.com/en/opencl/book/OpenCLProgrammingBook/introduction-to-parallelization/
And PSO simplifies the coding using the custom headers from this github repo: https://github.com/HandsOnOpenCL/Exercises-Solutions ..
I ran the code on my friends new i7 laptop with an NVidia Geforce 950M and the code was executed even slower than on my desktop PC.
I do realize that the code isn't optimized so any hints on stupid stuff I do, please call out on it. For instance having a while loop across three different kernel functions is kind of stupid right? I'm working on to implement all of it inside a kernel and loop inside of it, which should improve performance?
UPDATE: Ran the OpenCL/PSO code on the Windows machine at home again. Timing the code before and after the while loop gives Windows faster performance, yay!
clock_t: Win7 = 0.027 and Mac = 0.036. Using the external .hpp with the Util::Timer class, Win7 ran in 0.026s while Mac took 0.085s.
Timing from the start of the main file to right before the while loop (all of the initializations), Mac scored better than Windows by almost 10 times using both clock_t and the Util::Timer. So the bottleneck seems to be in the initialization of the device?
Could be dozens of things - what the CL kernel does would be key, and how well that works on different types of GPUs. Or what compiler is used.
However, I think the problem is how you are measuring time. clock() on Windows measures "wall-clock time" (in other words, "elapsed time"); on OS X (and all other sane OSes), it reports CPU time for your process. If the work runs on the graphics processor [or in a separate process], it won't count as CPU time, whereas Windows measures the overall time.
Either measure the CPU time using an appropriate CPU-time measurement on Windows (GetProcessTimes, for example), or use C++ std::chrono to measure wall-clock time in both places.
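For example, here is a minimal sketch of the GetProcessTimes approach on Windows (process_cpu_time_ms is an illustrative helper name, not an API):

#include <windows.h>
#include <iostream>

// CPU time (user + kernel) consumed by this process so far, in milliseconds.
double process_cpu_time_ms()
{
    FILETIME creation, exit_time, kernel, user;
    if (!GetProcessTimes(GetCurrentProcess(), &creation, &exit_time, &kernel, &user))
        return -1.0;

    ULARGE_INTEGER k, u;
    k.LowPart = kernel.dwLowDateTime;  k.HighPart = kernel.dwHighDateTime;
    u.LowPart = user.dwLowDateTime;    u.HighPart = user.dwHighDateTime;

    return (k.QuadPart + u.QuadPart) / 10000.0;   // FILETIME counts 100 ns units
}

int main()
{
    std::cout << process_cpu_time_ms() << " ms of CPU time used\n";
}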
Maybe the problem is the way you measure elapsed time. For example, I made three different ways to do so in my project for different OSes:
#include <cstdio>
#include <iostream>
#if defined(_WIN32) || defined(_WIN64)
#include <windows.h>
#elif defined(__linux__)
#include <time.h>
#elif defined(__APPLE__)
#include <mach/mach.h>
#include <mach/mach_time.h>
#include <stddef.h>
#endif
#if defined(_WIN32) || defined(_WIN64)
LARGE_INTEGER frequency; // ticks per second
LARGE_INTEGER t0, t1, t2; // ticks
LARGE_INTEGER t3, t4;
#elif defined(__linux__)
timespec t0, t1, t2;
timespec t3, t4;
double us;
#elif defined(__APPLE__)
unsigned long t0, t1, t2;
#endif
double elapsedTime;
void refreshTime() {
#if defined(_WIN32) || defined(_WIN64)
QueryPerformanceFrequency(&frequency); // get ticks per second
QueryPerformanceCounter(&t1); // start timer
t0 = t1;
#elif defined(__linux__)
clock_gettime(CLOCK_MONOTONIC_RAW, &t1);
t0 = t1;
#elif defined(__APPLE__)
t1 = mach_absolute_time();
t0 = t1;
#endif
}
void watch_report(const char *str) {
#if defined(_WIN32) || defined(_WIN64)
QueryPerformanceCounter(&t2);
printf(str, (t2.QuadPart - t1.QuadPart) * 1000.0 / frequency.QuadPart);
t1 = t2;
elapsedTime = (t2.QuadPart - t0.QuadPart) * 1000.0 / frequency.QuadPart;
#elif defined(__linux__)
clock_gettime(CLOCK_MONOTONIC_RAW, &t2);
time_t sec = t2.tv_sec - t1.tv_sec;
long nsec;
if (t2.tv_nsec >= t1.tv_nsec) {
nsec = t2.tv_nsec - t1.tv_nsec;
} else {
nsec = 1000000000 - (t1.tv_nsec - t2.tv_nsec);
sec -= 1;
}
printf(str, (float)sec * 1000.f + (float)nsec / 1000000.f);
t1 = t2;
elapsedTime = (float)(t2.tv_sec - t0.tv_sec) * 1000.f +
(float)(t2.tv_nsec - t0.tv_nsec) / 1000000.f;
#elif defined(__APPLE__)
uint64_t elapsedNano;
static mach_timebase_info_data_t sTimebaseInfo;
if (sTimebaseInfo.denom == 0) {
(void)mach_timebase_info(&sTimebaseInfo);
}
t2 = mach_absolute_time();
elapsedNano = (t2 - t1) * sTimebaseInfo.numer / sTimebaseInfo.denom;
printf(str, (float)elapsedNano / 1000000.f);
t1 = t2;
elapsedNano = (t2 - t0) * sTimebaseInfo.numer / sTimebaseInfo.denom;
elapsedTime = (float)elapsedNano / 1000000.f;
#endif
}
/*This Function will work till you press q*/
void someFunction() {
while (1) {
char ch = std::cin.get();
if (ch == 'q')
break;
}
}
int main() {
refreshTime();
someFunction();
watch_report("some function was working: \t%9.3f ms\n");
}
It's because the clock() function just counts CPU time used by your process. Your host code could be invoking the kernel and then going to sleep, and that sleep time won't be counted by the clock function even if your kernel is executing. So what this means is that the clock function considers only the execution time of the host code and not the OpenCL kernel. You need to use a function that counts wall-clock time rather than CPU time.
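A small sketch that makes this visible (note the caveat above: clock() measures CPU time on POSIX systems but wall time on Windows, so the first line differs by platform):

#include <chrono>
#include <ctime>
#include <iostream>
#include <thread>

int main()
{
    std::clock_t c0 = std::clock();
    auto t0 = std::chrono::steady_clock::now();

    std::this_thread::sleep_for(std::chrono::seconds(1));  // idle: no CPU work

    std::clock_t c1 = std::clock();
    auto t1 = std::chrono::steady_clock::now();

    std::cout << "clock():      " << double(c1 - c0) / CLOCKS_PER_SEC << " s\n";
    std::cout << "steady_clock: "
              << std::chrono::duration<double>(t1 - t0).count() << " s\n";
}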
Visual Studio will use the MSVC compiler unless you specify otherwise.
I think that the answer is hidden in your CPUs' generations. On your PC it is Sandy Bridge (2nd gen) and on the Mac it is Haswell (4th gen). That is a difference of two generations.
OpenCL is one of the things that evolved significantly during these generations of Intel processors (extensive hardware support in Haswell).
Just to get the proof, find a friend with a desktop equipped with a Haswell CPU and run your tests. A desktop Haswell processor should beat your Mac's (of course, only if the other hardware specs and the overall system load match).
You've posted your timing functions but not the OpenCL code. What all is being timed? A big factor might be the kernel compilation time (clCreateProgramFromSource and clBuildProgram). Your PC might be using cached kernels while your Mac isn't. The proper way to measure OpenCL kernel execution time is to use OpenCL events.
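For reference, a minimal sketch of event-based kernel timing; it assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that kernel and global_size are set up elsewhere (kernel_time_ms is an illustrative helper name):

#include <CL/cl.h>

// Returns the device-side execution time of one kernel launch in milliseconds,
// or -1.0 on error. Profiling timestamps are reported in nanoseconds.
double kernel_time_ms(cl_command_queue queue, cl_kernel kernel, size_t global_size)
{
    cl_event evt;
    cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                                        &global_size, NULL, 0, NULL, &evt);
    if (err != CL_SUCCESS) return -1.0;

    clWaitForEvents(1, &evt);

    cl_ulong start = 0, end = 0;
    clGetEventProfilingInfo(evt, CL_PROFILING_COMMAND_START, sizeof(start), &start, NULL);
    clGetEventProfilingInfo(evt, CL_PROFILING_COMMAND_END,   sizeof(end),   &end,   NULL);
    clReleaseEvent(evt);

    return (end - start) * 1e-6;
}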
Another possible reason why you might be getting those results: maybe your program compiles to OpenCL ver. 2.0.
On Windows, your GPU is from 2010; it only supports OpenCL 1.2.
On OSX, your Intel GPU supports OpenCL 2.0.
How do I get system up time since the start of the system? All I found was time since epoch and nothing else.
For example, something like time() in ctime library, but it only gives me a value of seconds since epoch. I want something like time() but since the start of the system.
It is OS-dependent and already answered for several systems on Stack Overflow.
#include<chrono> // for all examples :)
Windows ...
using GetTickCount64() (resolution usually 10-16 milliseconds)
#include <windows.h>
// ...
auto uptime = std::chrono::milliseconds(GetTickCount64());
Linux ...
... using /proc/uptime
#include <fstream>
// ...
std::chrono::milliseconds uptime(0u);
double uptime_seconds;
if (std::ifstream("/proc/uptime", std::ios::in) >> uptime_seconds)
{
uptime = std::chrono::milliseconds(
static_cast<unsigned long long>(uptime_seconds*1000.0)
);
}
... using sysinfo (resolution 1 second)
#include <sys/sysinfo.h>
// ...
std::chrono::milliseconds uptime(0u);
struct sysinfo x;
if (sysinfo(&x) == 0)
{
uptime = std::chrono::milliseconds(
static_cast<unsigned long long>(x.uptime)*1000ULL
);
}
OS X ...
... using sysctl
#include <time.h>
#include <errno.h>
#include <sys/sysctl.h>
// ...
std::chrono::milliseconds uptime(0u);
struct timeval ts;
std::size_t len = sizeof(ts);
int mib[2] = { CTL_KERN, KERN_BOOTTIME };
if (sysctl(mib, 2, &ts, &len, NULL, 0) == 0)
{
uptime = std::chrono::milliseconds(
static_cast<unsigned long long>(ts.tv_sec)*1000ULL +
static_cast<unsigned long long>(ts.tv_usec)/1000ULL
);
}
BSD-like systems (or systems supporting CLOCK_UPTIME or CLOCK_UPTIME_PRECISE respectively) ...
... using clock_gettime (resolution see clock_getres)
#include <time.h>
// ...
std::chrono::milliseconds uptime(0u);
struct timespec ts;
if (clock_gettime(CLOCK_UPTIME_PRECISE, &ts) == 0)
{
uptime = std::chrono::milliseconds(
static_cast<unsigned long long>(ts.tv_sec)*1000ULL +
static_cast<unsigned long long>(ts.tv_nsec)/1000000ULL
);
}
+1 to the accepted answer. Nice survey. But the OS X answer is incorrect and I wanted to show the correction here.
The sysctl function with an input of { CTL_KERN, KERN_BOOTTIME } on OS X returns the Unix Time the system was booted, not the time since boot. And on this system (and every other system too), std::chrono::system_clock also measures Unix Time. So one simply has to subtract these two time_points to get the time-since-boot. Here is how you modify the accepted answer's OS X solution to do this:
std::chrono::milliseconds
uptime()
{
using namespace std::chrono;
timeval ts;
auto ts_len = sizeof(ts);
int mib[2] = { CTL_KERN, KERN_BOOTTIME };
auto constexpr mib_len = sizeof(mib)/sizeof(mib[0]);
if (sysctl(mib, mib_len, &ts, &ts_len, nullptr, 0) == 0)
{
system_clock::time_point boot{seconds{ts.tv_sec} + microseconds{ts.tv_usec}};
return duration_cast<milliseconds>(system_clock::now() - boot);
}
return 0ms;
}
Notes:
It is best to have chrono do your units conversions for you. If your code has 1000 in it (e.g. to convert seconds to milliseconds), rewrite it to have chrono do the conversion.
You can rely on implicit chrono duration unit conversions to be correct if they compile. If they don't compile, that means you're asking for truncation, and you can explicitly ask for truncation with duration_cast.
It's ok to use a using directive locally in a function if it makes the code more readable.
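A small sketch of the first two notes in action (not part of the answer above, just an illustration):

#include <chrono>

int main()
{
    using namespace std::chrono;

    milliseconds m = seconds{3};   // implicit conversion: exact, so it compiles (m == 3000ms)
    // seconds bad = milliseconds{3500};                     // does not compile: would truncate
    seconds s = duration_cast<seconds>(milliseconds{3500});  // explicit truncation: s == 3s
    (void)m;
    (void)s;
}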
There is a Boost example on how to customize logging messages.
In it the author implements a simple function unsigned int get_uptime() to get the system uptime for different platforms, including Windows, OS X, Linux, and BSD.
I have to measure the speed of my algorithm in milliseconds. In C++/C, how can I do this? I need to write something before the input and after the output, but what exactly?
You could use the clock() function from <time.h>.
clock() shows how many ticks have passed since your program started. The macro CLOCKS_PER_SEC contains the number of ticks per second, so you can actually get time.
//We start measuring here. Remember what was the amount of ticks in the
//beginning of the part of code you want to test:
int start = clock();
//<...>
//Do your stuff here
//<...>
int end = clock();//Now check what amount of ticks we have now.
//To get the time, just subtract start from end, and divide by CLOCKS_PER_SEC.
std::cout << "it took " << end - start << "ticks, or " << ((float)end - start)/CLOCKS_PER_SEC << "seconds." << std::endl;
There is no general way to measure the exact time or ticks. The method of measurement, the operating system, and the other things happening on your computer (other applications, graphical output, background processes) will influence the result. There are different ways to do "good-enough" (in many cases) measurements:
library functions
clock(...), clock_gettime(...)
from the standard lib (in time.h) and
gettimeofday(..) // for elapsed (wallclock) time
times(..) // process times
for linux and other unix systems (in sys/time.h) (edited according to Oleg's comment)
hardware counter:
__inline__ uint64_t rdtsc(void) {
uint32_t lo, hi;
__asm__ __volatile__( // serialize
"xorl %%eax,%%eax \n cpuid":::"%rax",
"%rbx", "%rcx", "%rdx");
__asm__ __volatile__("rdtsc":"=a"(lo), "=d"(hi));
return (uint64_t) hi << 32 | lo;
}
/*...*/
uint64_t t0 = rdtsc();
code_to_be_tested();
uint64_t t1 = rdtsc();
I prefer this method as it reads directly the hardware counter.
for C++11: std::chrono::high_resolution_clock
typedef std::chrono::high_resolution_clock Clock;
auto t0 = Clock::now();
code_to_be_tested();
auto t1 = Clock::now();
Keep in mind that the measurements will not be exact to the clock cycle, i.e. the nanosecond. I always treat microseconds (1e-6 s) as the smallest reasonable time unit.
Note that you can use date and time utilities from C++11 chrono library. From cppreference.com:
The chrono library defines three main types (durations, clocks, and time points) as well as utility functions and common typedefs.
See the sample from the article compiled in GCC 4.5.1 here
You can use this small function library:
// clock.c
#include <time.h>
#include "clock.h"

void start(struct clock *this) { this->c1 = clock(); }
void stop (struct clock *this) { this->c2 = clock(); }
double print(struct clock *this) { return (double)(this->c2 - this->c1) / CLOCKS_PER_SEC; }
// clock.h
#ifndef CLOCK_H_INCLUDED
# define CLOCK_H_INCLUDED

#include <time.h>

/* Note: no "typedef struct clock clock;" -- that identifier would collide
   with the C library function clock() used above. */
struct clock { clock_t c1, c2; };

extern void start(struct clock *);
extern void stop (struct clock *);
extern double print(struct clock *);

#endif // CLOCK_H_INCLUDED
But sometimes clock isn't well suited: you can use system-specific functions, which can be more accurate.
I need some way in c++ to keep track of the number of milliseconds since program execution. And I need the precision to be in milliseconds. (In my googling, I've found lots of folks that said to include time.h and then multiply the output of time() by 1000 ... this won't work.)
clock has been suggested a number of times. This has two problems. First of all, it often doesn't have a resolution even close to a millisecond (10-20 ms is probably more common). Second, some implementations of it (e.g., Unix and similar) return CPU time, while others (e.g., Windows) return wall time.
You haven't really said whether you want wall time or CPU time, which makes it hard to give a really good answer. On Windows, you could use GetProcessTimes. That will give you the kernel and user CPU times directly. It will also tell you when the process was created, so if you want milliseconds of wall time since process creation, you can subtract the process creation time from the current time (GetSystemTime). QueryPerformanceCounter has also been mentioned. This has a few oddities of its own -- for example, in some implementations it retrieves time from the CPU's cycle counter, so its frequency varies when/if the CPU speed changes. Other implementations read from the motherboard's 1.024 MHz timer, which does not vary with the CPU speed (and the conditions under which each is used aren't entirely obvious).
On Unix, you can use gettimeofday() to just get the wall time with (at least the possibility of) relatively high precision. If you want time for a process, you can use times() or getrusage() (the latter is newer and gives more complete information that may also be more precise).
Bottom line: as I said in my comment, there's no way to get what you want portably. Since you haven't said whether you want CPU time or wall time, even for a specific system, there's not one right answer. The one you've "accepted" (clock()) has the virtue of being available on essentially any system, but what it returns also varies just about the most widely.
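To illustrate the getrusage() option mentioned above, here is a minimal POSIX sketch that reports user and system CPU time for the calling process:

#include <sys/resource.h>
#include <cstdio>

int main()
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0)
    {
        double user_ms = ru.ru_utime.tv_sec * 1000.0 + ru.ru_utime.tv_usec / 1000.0;
        double sys_ms  = ru.ru_stime.tv_sec * 1000.0 + ru.ru_stime.tv_usec / 1000.0;
        std::printf("user: %.3f ms, system: %.3f ms\n", user_ms, sys_ms);
    }
}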
See std::clock()
Include time.h, and then use the clock() function. It returns the number of clock ticks elapsed since the program was launched. Just divide it by "CLOCKS_PER_SEC" to obtain the number of seconds, you can then multiply by 1000 to obtain the number of milliseconds.
Some cross platform solution. This code was used for some kind of benchmarking:
#ifdef WIN32
LARGE_INTEGER g_llFrequency = {0};
BOOL g_bQueryResult = QueryPerformanceFrequency(&g_llFrequency);
#endif
//...
long long osQueryPerfomance()
{
#ifdef WIN32
LARGE_INTEGER llPerf = {0};
QueryPerformanceCounter(&llPerf);
return llPerf.QuadPart * 1000ll / ( g_llFrequency.QuadPart / 1000ll);
#else
struct timeval stTimeVal;
gettimeofday(&stTimeVal, NULL);
return stTimeVal.tv_sec * 1000000ll + stTimeVal.tv_usec;
#endif
}
The most portable way is using the clock function. It usually reports the time that your program has been using the processor, or an approximation thereof. Note, however, the following:
The resolution is not very good for GNU systems. That's really a pity.
Take care of casting everything to double before doing divisions and assignments.
The counter is held as a 32-bit number on 32-bit GNU systems, which can be pretty annoying for long-running programs.
There are alternatives using "wall time" which give better resolution, both in Windows and Linux. But as the libc manual states: If you're trying to optimize your program or measure its efficiency, it's very useful to know how much processor time it uses. For that, calendar time and elapsed times are useless because a process may spend time waiting for I/O or for other processes to use the CPU.
Here is a C++0x solution and an example why clock() might not do what you think it does.
#include <chrono>
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <unistd.h>   // for sleep()

int main()
{
    // Note: monotonic_clock was the C++0x draft name; C++11 renamed it steady_clock.
    auto start1 = std::chrono::monotonic_clock::now();
    auto start2 = std::clock();

    sleep(1);
    for( int i=0; i<100000000; ++i);

    auto end1 = std::chrono::monotonic_clock::now();
    auto end2 = std::clock();

    auto delta1 = end1-start1;
    auto delta2 = end2-start2;
    std::cout << "chrono: " << std::chrono::duration_cast<std::chrono::duration<float>>(delta1).count() << std::endl;
    std::cout << "clock: " << static_cast<float>(delta2)/CLOCKS_PER_SEC << std::endl;
}
On my system this outputs:
chrono: 1.36839
clock: 0.36
You'll notice the clock() method is missing a second. An astute observer might also notice that clock() looks to have less resolution. On my system it's ticking by in 12 millisecond increments, terrible resolution.
If you are unable or unwilling to use C++0x, take a look at Boost.DateTime's ptime microsec_clock::universal_time().
This isn't C++ specific (nor portable), but you can do:
SYSTEMTIME systemDT;
In Windows.
From there, you can access each member of the systemDT struct.
You can record the time when the program started and compare the current time to the recorded time (systemDT versus systemDTtemp, for instance).
To refresh, you can call GetLocalTime(&systemDT);
To access each member, you would do systemDT.wHour, systemDT.wMinute, systemDT.wMilliseconds.
For more information, see the SYSTEMTIME documentation.
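A minimal sketch of the approach described above (Windows only; note that computing a difference from the individual fields is awkward, which is why other answers prefer QueryPerformanceCounter or std::chrono):

#include <windows.h>
#include <iostream>

int main()
{
    SYSTEMTIME startDT, nowDT;
    GetLocalTime(&startDT);

    // ... work to be timed ...

    GetLocalTime(&nowDT);   // refresh
    std::cout << "start: " << startDT.wHour << ':' << startDT.wMinute << ':'
              << startDT.wSecond << '.' << startDT.wMilliseconds << '\n';
    std::cout << "now:   " << nowDT.wHour << ':' << nowDT.wMinute << ':'
              << nowDT.wSecond << '.' << nowDT.wMilliseconds << '\n';
}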
Do you want wall clock time, CPU time, or some other measurement? Also, what platform is this? There is no universally portable way to get more precision than time() and clock() give you, but...
on most Unix systems, you can use gettimeofday() and/or clock_gettime(), which give at least microsecond precision and access to a variety of timers;
I'm not nearly as familiar with Windows, but one of these functions probably does what you want.
You can try this code (taken from the Stockfish chess engine source code (GPL)):
#include <iostream>
#include <cstdio>    // getchar
#include <cstdint>   // uint64_t
#if !defined(_WIN32) && !defined(_WIN64) // Linux - Unix
# include <sys/time.h>
typedef timeval sys_time_t;
inline void system_time(sys_time_t* t) {
gettimeofday(t, NULL);
}
inline long long time_to_msec(const sys_time_t& t) {
return t.tv_sec * 1000LL + t.tv_usec / 1000;
}
#else // Windows and MinGW
# include <sys/timeb.h>
typedef _timeb sys_time_t;
inline void system_time(sys_time_t* t) { _ftime(t); }
inline long long time_to_msec(const sys_time_t& t) {
return t.time * 1000LL + t.millitm;
}
#endif
struct Time {
void restart() { system_time(&t); }
uint64_t msec() const { return time_to_msec(t); }
long long elapsed() const {
return (long long)(current_time().msec() - time_to_msec(t));
}
static Time current_time() { Time t; t.restart(); return t; }
private:
sys_time_t t;
};
int main() {
sys_time_t t;
system_time(&t);
long long currentTimeMs = time_to_msec(t);
std::cout << "currentTimeMs:" << currentTimeMs << std::endl;
Time time = Time::current_time();
for (int i = 0; i < 1000000; i++) {
//Do something
}
long long e = time.elapsed();
std::cout << "time elapsed:" << e << std::endl;
getchar(); // wait for keyboard input
}
I wish to calculate the time it took for an API to return a value.
The time taken for such an action is in the space of nanoseconds. As the API is a C++ class/function, I am using the timer.h to calculate the same:
#include <ctime>
#include <iostream>
using namespace std;
int main(int argc, char** argv) {
clock_t start;
double diff;
start = clock();
diff = ( std::clock() - start ) / (double)CLOCKS_PER_SEC;
cout<<"printf: "<< diff <<'\n';
return 0;
}
The above code gives the time in seconds. How do I get the same in nano seconds and with more precision?
What others have posted about running the function repeatedly in a loop is correct.
For Linux (and BSD) you want to use clock_gettime().
#include <time.h>
int main()
{
timespec ts;
// clock_gettime(CLOCK_MONOTONIC, &ts); // Works on FreeBSD
clock_gettime(CLOCK_REALTIME, &ts); // Works on Linux
}
For windows you want to use the QueryPerformanceCounter. And here is more on QPC
Apparently there is a known issue with QPC on some chipsets, so you may want to make sure you do not have those chipsets. Additionally some dual core AMDs may also cause a problem. See the second post by sebbbi, where he states:
QueryPerformanceCounter() and QueryPerformanceFrequency() offer a bit better resolution, but have different issues. For example in Windows XP, all AMD Athlon X2 dual core CPUs return the PC of either of the cores "randomly" (the PC sometimes jumps a bit backwards), unless you specially install the AMD dual core driver package to fix the issue. We haven't noticed any other dual+ core CPUs having similar issues (p4 dual, p4 ht, core2 dual, core2 quad, phenom quad).
EDIT 2013/07/16:
It looks like there is some controversy on the efficacy of QPC under certain circumstances as stated in http://msdn.microsoft.com/en-us/library/windows/desktop/ee417693(v=vs.85).aspx
...While QueryPerformanceCounter and QueryPerformanceFrequency typically adjust for multiple processors, bugs in the BIOS or drivers may result in these routines returning different values as the thread moves from one processor to another...
However this StackOverflow answer https://stackoverflow.com/a/4588605/34329 states that QPC should work fine on any MS OS after Win XP service pack 2.
This article shows that Windows 7 can determine if the processor(s) have an invariant TSC and falls back to an external timer if they don't. http://performancebydesign.blogspot.com/2012/03/high-resolution-clocks-and-timers-for.html Synchronizing across processors is still an issue.
Other fine reading related to timers:
https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks
http://lwn.net/Articles/209101/
http://performancebydesign.blogspot.com/2012/03/high-resolution-clocks-and-timers-for.html
QueryPerformanceCounter Status?
See the comments for more details.
This new answer uses C++11's <chrono> facility. While there are other answers that show how to use <chrono>, none of them shows how to use <chrono> with the RDTSC facility mentioned in several of the other answers here. So I thought I would show how to use RDTSC with <chrono>. Additionally I'll demonstrate how you can templatize the testing code on the clock so that you can rapidly switch between RDTSC and your system's built-in clock facilities (which will likely be based on clock(), clock_gettime() and/or QueryPerformanceCounter).
Note that the RDTSC instruction is x86-specific. QueryPerformanceCounter is Windows only. And clock_gettime() is POSIX only. Below I introduce two new clocks: std::chrono::high_resolution_clock and std::chrono::system_clock, which, if you can assume C++11, are now cross-platform.
First, here is how you create a C++11-compatible clock out of the Intel rdtsc assembly instruction. I'll call it x::clock:
#include <chrono>

namespace x
{

struct clock
{
    typedef unsigned long long rep;
    typedef std::ratio<1, 2'800'000'000> period; // My machine is 2.8 GHz
    typedef std::chrono::duration<rep, period> duration;
    typedef std::chrono::time_point<clock> time_point;
    static const bool is_steady = true;

    static time_point now() noexcept
    {
        unsigned lo, hi;
        asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
        return time_point(duration(static_cast<rep>(hi) << 32 | lo));
    }
};

}  // x
All this clock does is count CPU cycles and store it in an unsigned 64-bit integer. You may need to tweak the assembly language syntax for your compiler. Or your compiler may offer an intrinsic you can use instead (e.g. now() {return __rdtsc();}).
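For instance, here is a sketch of the intrinsic-based variant, assuming your compiler provides __rdtsc() (<intrin.h> on MSVC, <x86intrin.h> on GCC/Clang); the y namespace is just to distinguish it from x::clock above:

#include <chrono>
#ifdef _MSC_VER
#  include <intrin.h>
#else
#  include <x86intrin.h>
#endif

namespace y
{

struct clock
{
    typedef unsigned long long rep;
    typedef std::ratio<1, 2'800'000'000> period;   // adjust to your machine's clock speed
    typedef std::chrono::duration<rep, period> duration;
    typedef std::chrono::time_point<clock> time_point;
    static const bool is_steady = true;

    static time_point now() noexcept
    {
        return time_point(duration(__rdtsc()));    // compiler intrinsic instead of inline asm
    }
};

}  // y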
To build a clock you have to give it the representation (storage type). You must also supply the clock period, which must be a compile time constant, even though your machine may change clock speed in different power modes. And from those you can easily define your clock's "native" time duration and time point in terms of these fundamentals.
If all you want to do is output the number of clock ticks, it doesn't really matter what number you give for the clock period. This constant only comes into play if you want to convert the number of clock ticks into some real-time unit such as nanoseconds. And in that case, the more accurate you are able to supply the clock speed, the more accurate will be the conversion to nanoseconds, (milliseconds, whatever).
Below is example code which shows how to use x::clock. Actually I've templated the code on the clock as I'd like to show how you can use many different clocks with the exact same syntax. This particular test is showing what the looping overhead is when running what you want to time under a loop:
#include <iostream>

template <class clock>
void
test_empty_loop()
{
    // Define real time units
    typedef std::chrono::duration<unsigned long long, std::pico> picoseconds;
    // or:
    // typedef std::chrono::nanoseconds nanoseconds;

    // Define double-based unit of clock tick
    typedef std::chrono::duration<double, typename clock::period> Cycle;
    using std::chrono::duration_cast;
    const int N = 100000000;

    // Do it
    auto t0 = clock::now();
    for (int j = 0; j < N; ++j)
        asm volatile("");
    auto t1 = clock::now();

    // Get the clock ticks per iteration
    auto ticks_per_iter = Cycle(t1-t0)/N;
    std::cout << ticks_per_iter.count() << " clock ticks per iteration\n";

    // Convert to real time units
    std::cout << duration_cast<picoseconds>(ticks_per_iter).count()
              << "ps per iteration\n";
}
The first thing this code does is create a "real time" unit to display the results in. I've chosen picoseconds, but you can choose any units you like, either integral or floating point based. As an example there is a pre-made std::chrono::nanoseconds unit I could have used.
As another example I want to print out the average number of clock cycles per iteration as a floating point, so I create another duration, based on double, that has the same units as the clock's tick does (called Cycle in the code).
The loop is timed with calls to clock::now() on either side. If you want to name the type returned from this function it is:
typename clock::time_point t0 = clock::now();
(as clearly shown in the x::clock example, and is also true of the system-supplied clocks).
To get a duration in terms of floating point clock ticks one merely subtracts the two time points, and to get the per iteration value, divide that duration by the number of iterations.
You can get the count in any duration by using the count() member function. This returns the internal representation. Finally I use std::chrono::duration_cast to convert the duration Cycle to the duration picoseconds and print that out.
To use this code is simple:
int main()
{
    std::cout << "\nUsing rdtsc:\n";
    test_empty_loop<x::clock>();

    std::cout << "\nUsing std::chrono::high_resolution_clock:\n";
    test_empty_loop<std::chrono::high_resolution_clock>();

    std::cout << "\nUsing std::chrono::system_clock:\n";
    test_empty_loop<std::chrono::system_clock>();
}
Above I exercise the test using our home-made x::clock, and compare those results with using two of the system-supplied clocks: std::chrono::high_resolution_clock and std::chrono::system_clock. For me this prints out:
Using rdtsc:
1.72632 clock ticks per iteration
616ps per iteration
Using std::chrono::high_resolution_clock:
0.620105 clock ticks per iteration
620ps per iteration
Using std::chrono::system_clock:
0.00062457 clock ticks per iteration
624ps per iteration
This shows that each of these clocks has a different tick period, as the ticks per iteration is vastly different for each clock. However when converted to a known unit of time (e.g. picoseconds), I get approximately the same result for each clock (your mileage may vary).
Note how my code is completely free of "magic conversion constants". Indeed, there are only two magic numbers in the entire example:
The clock speed of my machine in order to define x::clock.
The number of iterations to test over. If changing this number makes your results vary greatly, then you should probably make the number of iterations higher, or empty your computer of competing processes while testing.
With that level of accuracy, it would be better to reason in CPU ticks rather than in system calls like clock(). And do not forget that if it takes more than one nanosecond to execute an instruction, having nanosecond accuracy is pretty much impossible.
Still, something like that is a start:
Here's the actual code to retrieve the number of 80x86 CPU clock ticks passed since the CPU was last started. It will work on Pentium and above (386/486 not supported). This code is actually MS Visual C++ specific, but can probably be ported very easily to anything else, as long as it supports inline assembly.
inline __int64 GetCpuClocks()
{
// Counter
struct { __int32 low, high; } counter;
// Use RDTSC instruction to get clocks count
__asm push EAX
__asm push EDX
__asm __emit 0fh __asm __emit 031h // RDTSC
__asm mov counter.low, EAX
__asm mov counter.high, EDX
__asm pop EDX
__asm pop EAX
// Return result
return *(__int64 *)(&counter);
}
This function also has the advantage of being extremely fast - it usually takes no more than 50 CPU cycles to execute.
Using the Timing Figures:
If you need to translate the clock counts into true elapsed time, divide the results by your chip's clock speed. Remember that the "rated" GHz is likely to be slightly different from the actual speed of your chip. To check your chip's true speed, you can use several very good utilities or the Win32 call, QueryPerformanceFrequency().
To do this correctly you can use one of two ways: either go with RDTSC or with clock_gettime().
The second is about 2 times faster and has the advantage of giving the right absolute time. Note that for RDTSC to work correctly you need to use it as indicated (other comments on this page have errors, and may yield incorrect timing values on certain processors).
inline uint64_t rdtsc()
{
uint32_t lo, hi;
__asm__ __volatile__ (
"xorl %%eax, %%eax\n"
"cpuid\n"
"rdtsc\n"
: "=a" (lo), "=d" (hi)
:
: "%ebx", "%ecx" );
return (uint64_t)hi << 32 | lo;
}
and for clock_gettime: (I chose microsecond resolution arbitrarily)
#include <time.h>
#include <sys/timeb.h>
// needs -lrt (real-time lib)
// 1970-01-01 epoch UTC time, 1 mcs resolution (divide by 1M to get time_t)
uint64_t ClockGetTime()
{
timespec ts;
clock_gettime(CLOCK_REALTIME, &ts);
return (uint64_t)ts.tv_sec * 1000000LL + (uint64_t)ts.tv_nsec / 1000LL;
}
the timing and values produced:
Absolute values:
rdtsc = 4571567254267600
clock_gettime = 1278605535506855
Processing time: (10000000 runs)
rdtsc = 2292547353
clock_gettime = 1031119636
I am using the following to get the desired results:
#include <time.h>
#include <iostream>
using namespace std;
int main (int argc, char** argv)
{
// reset the clock
timespec tS;
tS.tv_sec = 0;
tS.tv_nsec = 0;
clock_settime(CLOCK_PROCESS_CPUTIME_ID, &tS);
...
... <code to check for the time to be put here>
...
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &tS);
cout << "Time taken is: " << tS.tv_sec << " " << tS.tv_nsec << endl;
return 0;
}
For C++11, here is a simple wrapper:
#include <iostream>
#include <chrono>
class Timer
{
public:
Timer() : beg_(clock_::now()) {}
void reset() { beg_ = clock_::now(); }
double elapsed() const {
return std::chrono::duration_cast<second_>
(clock_::now() - beg_).count(); }
private:
typedef std::chrono::high_resolution_clock clock_;
typedef std::chrono::duration<double, std::ratio<1> > second_;
std::chrono::time_point<clock_> beg_;
};
Or for C++03 on *nix,
class Timer
{
public:
Timer() { clock_gettime(CLOCK_REALTIME, &beg_); }
double elapsed() {
clock_gettime(CLOCK_REALTIME, &end_);
return end_.tv_sec - beg_.tv_sec +
(end_.tv_nsec - beg_.tv_nsec) / 1000000000.;
}
void reset() { clock_gettime(CLOCK_REALTIME, &beg_); }
private:
timespec beg_, end_;
};
Example of usage:
int main()
{
Timer tmr;
double t = tmr.elapsed();
std::cout << t << std::endl;
tmr.reset();
t = tmr.elapsed();
std::cout << t << std::endl;
return 0;
}
From https://gist.github.com/gongzhitaao/7062087
In general, for timing how long it takes to call a function, you want to do it many more times than just once. If you call your function only once and it takes a very short time to run, you still have the overhead of actually calling the timer functions and you don't know how long that takes.
For example, if you estimate your function might take 800 ns to run, call it in a loop ten million times (which will then take about 8 seconds). Divide the total time by ten million to get the time per call.
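A minimal sketch of that looping approach using std::chrono (function_under_test and the volatile sink are stand-ins for the real call being measured):

#include <chrono>
#include <iostream>

volatile int sink = 0;
void function_under_test() { sink = sink + 1; }   // stand-in for the real call

int main()
{
    const long long N = 10000000;   // ten million iterations

    auto t0 = std::chrono::steady_clock::now();
    for (long long i = 0; i < N; ++i)
        function_under_test();
    auto t1 = std::chrono::steady_clock::now();

    auto total_ns = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
    std::cout << double(total_ns) / N << " ns per call (including loop overhead)\n";
}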
You can use the following function with gcc running under x86 processors:
unsigned long long rdtsc()
{
    unsigned int low, high;
    // read the time stamp counter into EDX:EAX
    __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high));
    return ((unsigned long long)high << 32) | low;
}
with Digital Mars C++:
unsigned long long rdtsc()
{
_asm
{
rdtsc
}
}
which reads the high performance timer on the chip. I use this when doing profiling.
If you need subsecond precision, you need to use system-specific extensions, and will have to check the documentation for the operating system. POSIX supports microsecond precision with gettimeofday, and nanosecond precision with clock_gettime where the realtime extensions are available.
If you are using Boost, you can check boost::posix_time.
I'm using Borland; here is the code. ti_hund sometimes gives me a negative number, but the timing is fairly good.
#include <dos.h>
#include <stdio.h>   // printf
#include <conio.h>   // getch
void main()
{
struct time t;
int Hour,Min,Sec,Hun;
gettime(&t);
Hour=t.ti_hour;
Min=t.ti_min;
Sec=t.ti_sec;
Hun=t.ti_hund;
printf("Start time is: %2d:%02d:%02d.%02d\n",
t.ti_hour, t.ti_min, t.ti_sec, t.ti_hund);
....
your code to time
...
// read the time here remove Hours and min if the time is in sec
gettime(&t);
printf("\nTid Hour:%d Min:%d Sec:%d Hundreds:%d\n",t.ti_hour-Hour,
t.ti_min-Min,t.ti_sec-Sec,t.ti_hund-Hun);
printf("\n\nAlt Ferdig Press a Key\n\n");
getch();
} // end main
Using Brock Adams's method, with a simple class:
int get_cpu_ticks()
{
LARGE_INTEGER ticks;
QueryPerformanceFrequency(&ticks);
return ticks.LowPart;
}
__int64 get_cpu_clocks()
{
struct { __int32 low, high; } counter;
__asm push EAX              // balance the pops below
__asm push EDX
__asm cpuid                 // serialize before reading the time stamp counter
__asm rdtsc
__asm mov counter.low, EAX
__asm mov counter.high, EDX
__asm pop EDX
__asm pop EAX
return *(__int64 *)(&counter);
}
class cbench
{
public:
cbench(const char *desc_in)
: desc(strdup(desc_in)), start(get_cpu_clocks()) { }
~cbench()
{
printf("%s took: %.4f ms\n", desc, (float)(get_cpu_clocks()-start)/get_cpu_ticks());
if(desc) free(desc);
}
private:
char *desc;
__int64 start;
};
Usage Example:
int main()
{
{
cbench c("test");
... code ...
}
return 0;
}
Result:
test took: 0.0002 ms
Has some function call overhead, but should still be more than fast enough :)
You can use Embedded Profiler (free for Windows and Linux), which has an interface to a multiplatform timer (in a processor cycle count) and can give you the number of cycles per second:
EProfilerTimer timer;
timer.Start();
... // Your code here
const uint64_t number_of_elapsed_cycles = timer.Stop();
const uint64_t nano_seconds_elapsed =
number_of_elapsed_cycles / (double) timer.GetCyclesPerSecond() * 1000000000;
Recalculation of cycle count to time is possibly a dangerous operation with modern processors where CPU frequency can be changed dynamically. Therefore to be sure that converted times are correct, it is necessary to fix processor frequency before profiling.
If this is for Linux, I've been using the function "gettimeofday", which returns a struct that gives the seconds and microseconds since the Epoch. You can then use timersub to subtract the two to get the difference in time, and convert it to whatever precision of time you want. However, you specify nanoseconds, and it looks like the function clock_gettime() is what you're looking for. It puts the time in terms of seconds and nanoseconds into the structure you pass into it.
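A minimal sketch of the gettimeofday/timersub approach described above (Linux/BSD; timersub is a BSD extension provided by glibc, the work being timed is elided):

#include <sys/time.h>
#include <cstdio>

int main()
{
    struct timeval start, end, diff;
    gettimeofday(&start, NULL);

    // ... work to be timed ...

    gettimeofday(&end, NULL);
    timersub(&end, &start, &diff);   // diff = end - start
    long long us = (long long)diff.tv_sec * 1000000LL + diff.tv_usec;
    std::printf("elapsed: %lld us\n", us);
}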
What do you think about that:
int iceu_system_GetTimeNow(long long int *res)
{
static struct timespec buffer;
//
#ifdef __CYGWIN__
if (clock_gettime(CLOCK_REALTIME, &buffer))
return 1;
#else
if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &buffer))
return 1;
#endif
*res=(long long int)buffer.tv_sec * 1000000000LL + (long long int)buffer.tv_nsec;
return 0;
}
Here is a nice Boost timer that works well:
//Stopwatch.hpp
#ifndef STOPWATCH_HPP
#define STOPWATCH_HPP
//Boost
#include <boost/chrono.hpp>
//Std
#include <cstdint>
class Stopwatch
{
public:
Stopwatch();
virtual ~Stopwatch();
void Restart();
std::uint64_t Get_elapsed_ns();
std::uint64_t Get_elapsed_us();
std::uint64_t Get_elapsed_ms();
std::uint64_t Get_elapsed_s();
private:
boost::chrono::high_resolution_clock::time_point _start_time;
};
#endif // STOPWATCH_HPP
//Stopwatch.cpp
#include "Stopwatch.hpp"
Stopwatch::Stopwatch():
_start_time(boost::chrono::high_resolution_clock::now()) {}
Stopwatch::~Stopwatch() {}
void Stopwatch::Restart()
{
_start_time = boost::chrono::high_resolution_clock::now();
}
std::uint64_t Stopwatch::Get_elapsed_ns()
{
boost::chrono::nanoseconds nano_s = boost::chrono::duration_cast<boost::chrono::nanoseconds>(boost::chrono::high_resolution_clock::now() - _start_time);
return static_cast<std::uint64_t>(nano_s.count());
}
std::uint64_t Stopwatch::Get_elapsed_us()
{
boost::chrono::microseconds micro_s = boost::chrono::duration_cast<boost::chrono::microseconds>(boost::chrono::high_resolution_clock::now() - _start_time);
return static_cast<std::uint64_t>(micro_s.count());
}
std::uint64_t Stopwatch::Get_elapsed_ms()
{
boost::chrono::milliseconds milli_s = boost::chrono::duration_cast<boost::chrono::milliseconds>(boost::chrono::high_resolution_clock::now() - _start_time);
return static_cast<std::uint64_t>(milli_s.count());
}
std::uint64_t Stopwatch::Get_elapsed_s()
{
boost::chrono::seconds sec = boost::chrono::duration_cast<boost::chrono::seconds>(boost::chrono::high_resolution_clock::now() - _start_time);
return static_cast<std::uint64_t>(sec.count());
}
Minimalistic copy&paste-struct + lazy usage
If the idea is to have a minimalistic struct that you can use for quick tests, then I suggest you just copy and paste anywhere in your C++ file right after the #include's. This is the only instance in which I sacrifice Allman-style formatting.
You can easily adjust the precision in the first line of the struct. Possible values are: nanoseconds, microseconds, milliseconds, seconds, minutes, or hours.
#include <chrono>
#include <iostream>
#include <vector>
struct MeasureTime
{
using precision = std::chrono::microseconds;
std::vector<std::chrono::steady_clock::time_point> times;
std::chrono::steady_clock::time_point oneLast;
void p() {
std::cout << "Mark "
<< times.size()/2
<< ": "
<< std::chrono::duration_cast<precision>(times.back() - oneLast).count()
<< std::endl;
}
void m() {
oneLast = times.back();
times.push_back(std::chrono::steady_clock::now());
}
void t() {
m();
p();
m();
}
MeasureTime() {
times.push_back(std::chrono::steady_clock::now());
}
};
Usage
MeasureTime m; // first time is already in memory
doFnc1();
m.t(); // Mark 1: next time, and print difference with previous mark
doFnc2();
m.t(); // Mark 2: next time, and print difference with previous mark
doStuff = doMoreStuff();
andDoItAgain = doStuff.aoeuaoeu();
m.t(); // prints 'Mark 3: 123123' etc...
Standard output result
Mark 1: 123
Mark 2: 32
Mark 3: 433234
If you want summary after execution
If you want the report afterwards, because for example your code in between also writes to standard output. Then add the following function to the struct (just before MeasureTime()):
void s() { // summary
int i = 0;
std::chrono::steady_clock::time_point tprev;
for(auto tcur : times)
{
if(i > 0)
{
std::cout << "Mark " << i << ": "
<< std::chrono::duration_cast<precision>(tcur - tprev).count() // elapsed since previous mark
<< std::endl;
}
tprev = tcur;
++i;
}
}
So then you can just use:
MeasureTime m;
doFnc1();
m.m();
doFnc2();
m.m();
doStuff = doMoreStuff();
andDoItAgain = doStuff.aoeuaoeu();
m.m();
m.s();
Which will list all the marks just like before, but then after the other code is executed. Note that you shouldn't use both m.s() and m.t().
plf::nanotimer is a lightweight option for this; it works on Windows, Linux, Mac, BSD, etc. It has ~microsecond accuracy depending on the OS:
#include "plf_nanotimer.h"
#include <iostream>
int main(int argc, char** argv)
{
plf::nanotimer timer;
timer.start();
// Do something here
double results = timer.get_elapsed_ns();
std::cout << "Timing: " << results << " nanoseconds." << std::endl;
return 0;
}