I am curious: is there a built-in function in C++ for measuring execution time?
I am using Windows at the moment. In Linux it's pretty easy...
The best way on Windows, as far as I know, is to use QueryPerformanceCounter and QueryPerformanceFrequency.
QueryPerformanceCounter(LARGE_INTEGER*) places the performance counter's value into the LARGE_INTEGER passed.
QueryPerformanceFrequency(LARGE_INTEGER*) places the frequency the performance counter is incremented into the LARGE_INTEGER passed.
You can then find the execution time by recording the counter as execution starts, and then recording the counter when execution finishes. Subtract the start from the end to get the counter's change, then divide by the frequency to get the time in seconds.
#include <windows.h>
#include <iostream>

LARGE_INTEGER start, finish, freq;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&start);
// Do something
QueryPerformanceCounter(&finish);
std::cout << "Execution took "
          << ((finish.QuadPart - start.QuadPart) / (double)freq.QuadPart)
          << " seconds" << std::endl;
It's pretty easy under Windows too; in fact it's the same function on both platforms: std::clock, defined in <ctime>
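A minimal sketch (CLOCKS_PER_SEC scales the raw value, and the resolution is typically coarse):

#include <ctime>
#include <iostream>

int main()
{
    std::clock_t start = std::clock();
    // ... code to measure ...
    std::clock_t end = std::clock();
    std::cout << static_cast<double>(end - start) / CLOCKS_PER_SEC << " seconds" << std::endl;
}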
You can use the Windows API Function GetTickCount() and compare the values at start and end. Resolution is in the 16 ms ballpark. If for some reason you need more fine-grained timings, you'll need to look at QueryPerformanceCounter.
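A minimal sketch (GetTickCount() returns milliseconds since boot and wraps after roughly 49.7 days; GetTickCount64() avoids the wrap on Vista and later):

#include <windows.h>
#include <iostream>

int main()
{
    DWORD start = GetTickCount();
    // ... code to measure ...
    DWORD elapsedMs = GetTickCount() - start;   // unsigned arithmetic copes with a single wrap
    std::cout << "Elapsed: " << elapsedMs << " ms" << std::endl;
}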
C++ has no built-in functions for measuring code execution time at high granularity; you have to resort to platform-specific code. For Windows, try QueryPerformanceCounter: http://msdn.microsoft.com/en-us/library/ms644904(VS.85).aspx
The functions you should use depend on the timer resolution you need. Some give about 10 ms resolution and are easier to use. Others require more work but give much higher resolution (and might cause you some headaches in some environments; your dev machine might work fine, though).
http://www.geisswerks.com/ryan/FAQS/timing.html
This article mentions:
timeGetTime
RDTSC (a processor feature, not an OS feature)
QueryPerformanceCounter
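Of these, timeGetTime() is probably the simplest to drop in; a minimal sketch (assumes linking against winmm.lib; the default resolution can still be several milliseconds unless timeBeginPeriod() is used):

#include <windows.h>
#include <mmsystem.h>    // timeGetTime; link with winmm.lib
#include <iostream>

int main()
{
    DWORD start = timeGetTime();
    // ... code to measure ...
    DWORD elapsedMs = timeGetTime() - start;
    std::cout << "Elapsed: " << elapsedMs << " ms" << std::endl;
}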
C++ works on many platforms. Why not use something that also works on many platforms, such as the Boost libraries?
Look at the documentation for the Boost Timer Library
I believe that it is a header-only library, which means that it is simple to set up and use...
Related
I am currently implementing a PID controller for a project I am doing, but I realized I don't know how to ensure a fixed interval for each iteration. I want the PID controller to run at a frequency of 10 Hz, but I don't want to use any sleep functions or anything else that would slow down the thread it's running in. I've looked around, but I cannot for the life of me find any good topics/functions that simply give me an accurate measurement of milliseconds. Those that I have found simply use time_t or clock_t, but time_t only seems to give seconds(?) and clock_t will vary greatly depending on different factors.
Is there any clean and good way to simply see if it's been >= 100 milliseconds since a given point in time in C++? I'm using the Qt5 framework and the OpenCV library, and the program is running on an ODROID X-2, if that's helpful information to anyone.
Thank you for reading, Christian.
I don't know much about the ODROID X-2 platform, but if it's at all unixy you may have access to gettimeofday or clock_gettime, either of which would provide a higher-resolution clock if available on your hardware.
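If clock_gettime is there, a minimal sketch of the ">= 100 ms since a given point" check (using CLOCK_MONOTONIC so wall-clock adjustments don't disturb it; the helper name millis_since is just for illustration):

#include <time.h>
#include <stdint.h>

// Milliseconds elapsed since 'start', measured on the monotonic clock.
static int64_t millis_since(const struct timespec &start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (int64_t)(now.tv_sec - start.tv_sec) * 1000 +
           (now.tv_nsec - start.tv_nsec) / 1000000;
}

// In the control loop: run one PID iteration whenever at least 100 ms (10 Hz) have passed.
// struct timespec last;
// clock_gettime(CLOCK_MONOTONIC, &last);
// if (millis_since(last) >= 100) { /* PID step */ clock_gettime(CLOCK_MONOTONIC, &last); }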
I want to perform the above-mentioned operation with milliseconds as the unit. Which library and function call should I prefer?
Thank you.
Or if you are using Visual Studio 2010 (or another C++0x-aware compiler) use
#include <thread>
#include <chrono>

std::this_thread::sleep_for(std::chrono::milliseconds(10));
// or sleep until a specific point in time:
std::this_thread::sleep_until(std::chrono::steady_clock::now() + std::chrono::milliseconds(10));
With older compilers you can have the same convenience using the relevant Boost Libraries
Needless to say the major benefit here is portability and the ease of converting the delay parameter to 'human' units.
You could use the Sleep function from the Win32 API.
The Windows task scheduler has a granularity far above 1 ms (generally around 20 ms). You can test this by using the performance counter to measure the time actually spent inside the Sleep() call (QueryPerformanceFrequency() and QueryPerformanceCounter() let you measure time down to well under a microsecond). Note that Sleep(1) requests the shortest sleep the scheduler supports, while Sleep(0) merely yields the remainder of the current time slice.
However, you can change this behaviour by calling timeBeginPeriod() with a 1 ms period. Now Sleep(1) should return much faster.
Note that this function was introduced so multimedia streams could be played with better accuracy. I have never had any problem using it, but the need for such a short period is quite rare. Depending on what you are trying to achieve, there may be better ways to get the accuracy you want without resorting to this "hack".
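If you do go that route, a minimal sketch (timeBeginPeriod()/timeEndPeriod() come from winmm.lib and should always be paired):

#include <windows.h>
#include <mmsystem.h>    // timeBeginPeriod / timeEndPeriod; link with winmm.lib

int main()
{
    timeBeginPeriod(1);   // ask the scheduler for a 1 ms timer period
    Sleep(1);             // should now wake after roughly 1-2 ms instead of 15-20 ms
    timeEndPeriod(1);     // always restore the previous period
}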
Er, the Sleep() function from the Win32 API?
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686298%28v=vs.85%29.aspx
I'm writing an article about GPU speed-up in a cluster environment.
To do that, I'm programming in CUDA, which is basically a C++ extension.
But, as I'm a C# developer, I don't know the particularities of C++.
Is there anything I should be concerned about when logging elapsed time? Any suggestions or blogs to read?
My initial idea is to make a big loop and run the program several times, 50-100, and log every elapsed time so I can draw some speed graphs afterwards.
Depending on your needs, it can be as easy as:
#include <ctime>
#include <cstdio>

time_t start = time(NULL);
// long running process
printf("time elapsed: %ld seconds\n", (long)(time(NULL) - start));
I guess you need to say how you plan to log this (file or console) and what precision you need (seconds, ms, us, etc.). time() gives it in seconds.
I would recommend using the Boost timer library. It is platform-agnostic and is as simple as:
#include <boost/timer.hpp>   // the classic boost::timer; elapsed() reports seconds as a double
#include <iostream>

boost::timer t;
// do some stuff, up until when you want to start timing
t.restart();
// do the stuff you want to time
std::cout << t.elapsed() << std::endl;
Of course t.elapsed() returns a double that you can save to a variable.
Standard functions such as time often have a very low resolution. And yes, a good way to get around this is to run your test many times and take an average. Note that the first few times may be extra-slow because of hidden start-up costs - especially when using complex resources like GPUs.
For platform-specific calls, take a look at QueryPerformanceCounter on Windows and CFAbsoluteTimeGetCurrent on OS X. (I've not used the POSIX call clock_gettime, but that might be worth checking out.)
Measuring GPU performance is tricky because GPUs are remote processing units running their own instructions, often on many parallel units. You might want to visit Nvidia's CUDA Zone for a variety of resources and tools to help measure and optimize CUDA code. (Resources related to OpenCL are also highly relevant.)
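If you do want device-side numbers, the CUDA runtime's event API is the usual tool, since it times on the GPU rather than on the host; a minimal sketch (myKernel is a placeholder for whatever you are measuring):

#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    // myKernel<<<blocks, threads>>>(...);   // launch the kernel you want to time here
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);              // wait until the GPU has passed 'stop'

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("kernel took %f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return 0;
}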
Ultimately, you want to see how fast your results make it to the screen, right? For that reason, a call to time might well suffice for your needs.
What techniques / methods exist for getting sub-millisecond precision timing data in C or C++, and what precision and accuracy do they provide? I'm looking for methods that don't require additional hardware. The application involves waiting for approximately 50 microseconds +/- 1 microsecond while some external hardware collects data.
EDIT: OS is Windows, probably with VS2010. If I can get drivers and SDKs for the hardware on Linux, I can go there using the latest GCC.
When dealing with off-the-shelf operating systems, accurate timing is an extremely difficult and involved task. If you really need guaranteed timing, the only real option is a full real-time operating system. However, if "almost always" is good enough, here are a few tricks that will provide good accuracy under commodity Windows and Linux:
Use a shielded CPU. Basically, this means setting the IRQ affinities and the processor affinity masks of all other processes on the machine so that they avoid your targeted CPU, then setting your app's affinity so it runs only on that shielded CPU. Effectively, this should prevent the OS from ever suspending your app, as it will always be the only runnable process for that CPU.
Never let your process willingly yield control to the OS (which is inherently non-deterministic for non-realtime OSes). No memory allocation, no sockets, no mutexes, nada. Use RDTSC to spin in a while loop waiting for your target time to arrive. It'll consume 100% of a CPU, but it's the most accurate way to go.
If number 2 is a bit too draconian, you can 'sleep short' and then burn the CPU up to your target time. Here, you take advantage of the fact that the OS schedules the CPU at set intervals, usually 100 or 1000 times per second depending on your OS and configuration (on Windows you can change the default scheduling period from 100/s to 1000/s using the multimedia API). This can be a little hard to get right, but essentially you need to determine when the OS scheduling periods occur and calculate the one prior to your target wake time. Sleep for that duration and then, upon waking, spin on RDTSC (if you're on a single CPU; use QueryPerformanceCounter or the Linux equivalent if not) until your target time arrives. Occasionally, OS scheduling will cause you to miss, but generally speaking this mechanism works pretty well.
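A rough sketch of that 'sleep short, then spin' idea on Windows (wait_until_counter is just an illustrative name, and the 2 ms margin is an arbitrary assumption):

#include <windows.h>

// Sleep in coarse chunks until we are within ~2 ms of the target performance-counter
// value, then busy-wait on QueryPerformanceCounter until the target is reached.
void wait_until_counter(LONGLONG targetCounter)
{
    LARGE_INTEGER freq, now;
    QueryPerformanceFrequency(&freq);
    for (;;)
    {
        QueryPerformanceCounter(&now);
        LONGLONG remaining = targetCounter - now.QuadPart;
        if (remaining <= 0)
            return;                                   // target reached
        LONGLONG remainingMs = remaining * 1000 / freq.QuadPart;
        if (remainingMs > 2)
            Sleep((DWORD)(remainingMs - 2));          // coarse sleep, leave a margin
        // otherwise fall through and spin
    }
}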
It seems like a simple question, but attaining 'good' timing gets exponentially more difficult the tighter your timing constraints are. Good luck!
The hardware (and therefore resolution) varies from machine to machine. On Windows, specifically (I'm not sure about other platforms), you can use QueryPerformanceCounter and QueryPerformanceFrequency, but be aware you should call both from the same thread and there are no strict guarantees about resolution (QueryPerformanceFrequency is allowed to return 0 meaning no high resolution timer is available). However, on most modern desktops, there should be one accurate to microseconds.
The Boost date_time library has a microsecond-precision clock, but its accuracy depends on the platform.
The documentation states:
ptime microsec_clock::local_time()
"Get the local time using a sub second resolution clock. On Unix systems this is implemented using GetTimeOfDay. On most Win32 platforms it is implemented using ftime. Win32 systems often do not achieve microsecond resolution via this API. If higher resolution is critical to your application test your platform to see the achieved resolution."
http://www.boost.org/doc/libs/1_43_0/doc/html/date_time/posix_time.html#date_time.posix_time.ptime_class
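Usage is straightforward; a minimal sketch (remembering the resolution caveat above):

#include <boost/date_time/posix_time/posix_time.hpp>
#include <iostream>

int main()
{
    boost::posix_time::ptime start = boost::posix_time::microsec_clock::local_time();
    // ... code to measure ...
    boost::posix_time::ptime end = boost::posix_time::microsec_clock::local_time();
    std::cout << (end - start).total_microseconds() << " microseconds" << std::endl;
}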
You may try the following:
#include <sys/time.h>

struct timeval t;
gettimeofday(&t, 0);
// t.tv_sec holds whole seconds, t.tv_usec the microseconds within the current second
This gives you the current timestamp with microsecond resolution. I am not sure about the accuracy.
You could try the technique described here, but it's not portable.
Most modern processors have registers for timing or other instrumentation purposes. On x86, since Pentium days, there is the RDTSC instruction, for example. Your compiler may give you access to this instruction.
See Wikipedia for more info.
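For example, MSVC exposes it as the __rdtsc intrinsic and GCC/Clang provide the same name via <x86intrin.h>; a minimal sketch (the result is in CPU cycles, not seconds, and can be unreliable across cores and frequency changes):

#ifdef _MSC_VER
#include <intrin.h>       // __rdtsc on MSVC
#else
#include <x86intrin.h>    // __rdtsc on GCC/Clang
#endif
#include <stdint.h>
#include <iostream>

int main()
{
    uint64_t start = __rdtsc();
    // ... code to measure ...
    uint64_t cycles = __rdtsc() - start;
    std::cout << "Elapsed: " << cycles << " cycles" << std::endl;
}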
timeval in sys/time.h has a member 'tv_usec' which is microseconds.
This link and the code below will help illustrate:
http://www.opengroup.org/onlinepubs/000095399/basedefs/sys/time.h.html
#include <sys/time.h>
#include <cstdio>
#include <iostream>
using std::cout;
using std::endl;

timeval start;
timeval finish;
long int sec_diff;
long int mic_diff;

gettimeofday(&start, 0);
cout << "whooo hooo" << endl;
gettimeofday(&finish, 0);
sec_diff = finish.tv_sec - start.tv_sec;
mic_diff = finish.tv_usec - start.tv_usec;
if (mic_diff < 0) { mic_diff += 1000000; sec_diff -= 1; }   // borrow a second if the microsecond part went negative
cout << "cout-ing 'whooo hooo' took " << sec_diff << " seconds and " << mic_diff << " micros." << endl;

gettimeofday(&start, 0);
printf("whooo hooo\n");
gettimeofday(&finish, 0);
sec_diff = finish.tv_sec - start.tv_sec;
mic_diff = finish.tv_usec - start.tv_usec;
if (mic_diff < 0) { mic_diff += 1000000; sec_diff -= 1; }   // borrow a second if the microsecond part went negative
cout << "printf-ing 'whooo hooo' took " << sec_diff << " seconds and " << mic_diff << " micros." << endl;
Good luck trying to do that with MS Windows. You need a realtime operating system, that is to say, one where timing is guaranteed repeatable. Windows can switch to another thread or even another process at an inopportune moment. You will also have no control over cache misses.
When I was doing realtime robotic control, I used a very lightweight OS called OnTime RTOS32, which has a partial Windows API emulation layer. I do not know if it would be suitable for what you are doing. However, with Windows, you will probably never be able to prove that it will never fail to give a timely response.
A combination of GetSystemTimeAsFileTime and QueryPerformanceCounter can result in a reliable suite of code to obtain microsecond-resolution time services on Windows.
See this comment in another thread here.
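I can't reproduce the linked code here, but the idea is roughly this: calibrate the performance counter against the system time once, then derive high-resolution wall-clock readings from the counter. A minimal sketch (the function names are illustrative, and overflow handling for very long runtimes is omitted):

#include <windows.h>
#include <stdint.h>

// Captured once at startup: a wall-clock reading and the counter value taken
// at (approximately) the same instant.
static ULARGE_INTEGER g_baseFileTime;   // 100 ns units since 1601-01-01
static LARGE_INTEGER  g_baseCounter;
static LARGE_INTEGER  g_freq;

void calibrate()
{
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);
    QueryPerformanceCounter(&g_baseCounter);
    QueryPerformanceFrequency(&g_freq);
    g_baseFileTime.LowPart  = ft.dwLowDateTime;
    g_baseFileTime.HighPart = ft.dwHighDateTime;
}

// Current wall-clock time in 100 ns units, derived from the performance counter.
uint64_t NowAsFileTime()
{
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    uint64_t elapsed100ns =
        (uint64_t)(now.QuadPart - g_baseCounter.QuadPart) * 10000000ULL / g_freq.QuadPart;
    return g_baseFileTime.QuadPart + elapsed100ns;
}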
I essentially want to reconstruct the GetTickCount() Windows function so I can use it in basic C++ without any non-standard libraries or even the STL (so it compiles with the libraries supplied with the Android NDK).
I have looked at
clock()
localtime
time
But I'm still unsure whether it is possible to replicate the GetTickCount Windows function with the time library.
Can anyone point me in the right direction as to how to do this, or even whether it's possible?
An overview of what I want to do:
I want to be able to calculate how long an application has been "doing" a certain function.
So for example I want to be able to calculate how long the application has been trying to register with a server
I am trying to port it from windows to run on the linux based Android, here is the windows code:
int TimeoutTimer::GetSpentTime() const
{
    if (m_On)
    {
        if (m_Freq > 1)
        {
            LARGE_INTEGER now;
            QueryPerformanceCounter(&now);   // writes 64 bits, so don't pass a 32-bit int here
            return (int)((1000 * (now.QuadPart - m_Start)) / m_Freq);
        }
        else
        {
            return (GetTickCount() - (int)m_Start);
        }
    }
    return -1;
}
On Android NDK you can use the POSIX clock_gettime() call, which is part of libc. This function is where various Android timer calls end up.
For example, java.lang.System.nanoTime() is implemented with:
struct timespec now;
clock_gettime(CLOCK_MONOTONIC, &now);    // declared in <time.h>
return (u8)now.tv_sec*1000000000LL + now.tv_nsec;   // u8 is Android's typedef for a 64-bit unsigned integer
This example uses the monotonic clock, which is what you want when computing durations. Unlike the wall clock (available through gettimeofday()), it won't skip forward or backward when the device's clock is changed by the network provider.
The Linux man page for clock_gettime() describes the other clocks that may be available, such as the per-thread elapsed CPU time.
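So a GetTickCount-style millisecond counter for the timer class in the question can be built on the same call; a minimal sketch (the function name is mine, not part of the NDK):

#include <time.h>
#include <stdint.h>

// Milliseconds from the monotonic clock: a rough stand-in for GetTickCount().
uint32_t getTickCountMs()
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (uint32_t)((int64_t)now.tv_sec * 1000 + now.tv_nsec / 1000000);
}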
clock() works very similarly to Windows's GetTickCount(). The units may be different. GetTickCount() returns milliseconds. clock() returns CLOCKS_PER_SEC ticks per second. Both have a maximum that will roll over (for Windows, that's about 49.7 days).
GetTickCount() starts at zero when the OS starts. From the docs, it looks like clock() starts when the process does. Thus you can compare times between processes with GetTickCount(), but you probably can't do that with clock().
If you're trying to compute how long something has been happening, within a single process, and you're not worried about rollover:
#include <ctime>

const clock_t start = clock();
// do stuff here
clock_t now = clock();
clock_t delta = now - start;
double seconds_elapsed = static_cast<double>(delta) / CLOCKS_PER_SEC;
Clarification: There seems to be uncertainty about whether clock() returns elapsed wall time or processor time. The first several references I checked say wall time. For example:
Returns the number of clock ticks elapsed since the program was launched.
which admittedly is a little vague. MSDN is more explicit:
The elapsed wall-clock time since the start of the process....
User darron convinced me to dig deeper, so I found a draft copy of the C standard (ISO/IEC 9899:TC2), and it says:
... returns the implementation’s best approximation to the processor time used ...
I believe every implementation I've ever used gives wall-clock time (which I suppose is an approximation to the processor time used).
Conclusion: If you're trying to time some code so you can benchmark various optimizations, then my answer is appropriate. If you're trying to implement a timeout based on actual wall-clock time, then you have to check your local implementation of clock() or use another function that is documented to give elapsed wall-clock time.
Update: With C++11, there is also the <chrono> portion of the standard library, which provides a variety of clocks and types to capture times and durations. While standardized and widely available, it's not clear whether the Android NDK fully supports it yet.
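If it is available, a minimal sketch with std::chrono::steady_clock (the monotonic choice for measuring durations):

#include <chrono>
#include <iostream>

int main()
{
    auto start = std::chrono::steady_clock::now();
    // ... do stuff here ...
    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
        std::chrono::steady_clock::now() - start);
    std::cout << "Elapsed: " << elapsed.count() << " ms" << std::endl;
}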
This is platform dependent so you just have to write a wrapper and implement the specifics for each platform.
It's not possible. The C++ standard and, as a consequence, the standard library know nothing about processors or 'ticks'. This may or may not change in C++0x with the threading support, but at least for now, it's not possible.
Do you have access to a vblank interrupt function (or hblank) on the Android? If so, increment a global, volatile var there for a timer.