Benchmarking an application in a fully loaded machine - c++

I need to "time" or benchmark a number crunching application written in C/C++. The problem is that the machine where I run the program is usually full of people doing similar things, so the CPUs are always at full load.
I thought about using functions from time.h like gettimeofday() (don't remember the exact name, sorry) and similar, but I am afraid they would not be good for this case, am I right?
And the "time" builtin from bash gave me some errors a long time ago.
Also, the problem is that sometimes I need timings in the range of 0.5 seconds or so.
Does anybody have a hint?
P.S.: the compiler is gcc, and in some cases nvcc (NVIDIA)
P.S.2: in my benchmarks I just want to measure the execution time between two points in the main function

You didn't mention which compiler you are using, but with GNU's g++ I usually set the -pg flag to build with profiling information.
Each time you run the application, it will create an output file which, parsed with the gprof tool, will give you lots of information about the performance.
See this for starters.

From your other recent questions, you seem to be using MPI for parallelisation. Assuming this question is within the same context, then the simplest way to time your application would be to use MPI_Wtime().
From the man page:
This subroutine returns the current value of time as a double precision floating point number of seconds. This value represents elapsed time since some point in the past. This time in the past will not change during the life of the task. You are responsible for converting the number of seconds into other units if you prefer.
Example usage:
#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv)
{
    int taskid;
    double t_start, t_end;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &taskid);

    t_start = MPI_Wtime();
    /* .... your computation kernel .... */
    t_end = MPI_Wtime();

    /* make sure all processes have completed */
    MPI_Barrier(MPI_COMM_WORLD);

    if (taskid == 0) {
        printf("Elapsed time: %1.2f seconds\n", t_end - t_start);
    }

    MPI_Finalize();
    return 0;
}
The advantage of this is that we let the underlying MPI library handle platform specific ways of handling time, although you might want to use MPI_Wtick() to determine the resolution of the timer used on each platform.

It's hard to meaningfully compare timings from programs running for such a short time. Usually the solution is to run multiple times.
The time builtin in bash (or /usr/bin/time) will report time actually used by the processor, which will be more useful on a loaded machine than wall-clock time, but there is too much going on to really compare timings on a fine grain – huge differences of orders of magnitude will still be apparent.
You can also use clock to get a rough estimate:
#include <ctime>
#include <iostream>
struct Timer {
    std::clock_t _start, _stop;
    Timer() : _start(std::clock()) {}
    void restart() { _start = std::clock(); }
    void stop() { _stop = std::clock(); }
    std::clock_t clocks() const { return _stop - _start; }
    double secs() const { return double(clocks()) / CLOCKS_PER_SEC; }
};

int main() {
    Timer t;
    // run_some_code();
    t.stop();
    std::cout << "That took " << t.secs() << " seconds.\n";
    return 0;
}

Related

C/C++ elapsed process cycles, not including at breakpoints

See the following code, which is my attempt to print the time elapsed between loops.
#include <stdio.h>
#include <time.h>

int main(void)
{
    while (1)
    {
        static clock_t timer;
        clock_t t = clock();
        clock_t elapsed = t - timer;
        float elapsed_sec = (float)elapsed / (float)CLOCKS_PER_SEC;
        timer = t;
        printf("[dt:%d ms].\n", (int)(elapsed_sec * 1000));
    }
}
However, if I set a breakpoint and sit there for 10 seconds, when I continue execution the elapsed time includes those 10 seconds -- and I don't want it to, for my intended usage.
I assume clock() is the wrong function, but what is the correct one?
Note that if there IS no standard C or C++ single call for this -- well, how do you compute it? Is there a POSIX way?
I suspect that this is actually information only knowable with platform-specific calls. If that is the case, I'd like to at least know how to do so on Windows (MSVC).
Yes, trying to measure the CPU time of your process would be dependent on support from your operating system. Rather than look up what support is available from various operating systems, though, I would propose that your approach is flawed.
Debugging typically uses a debug build that has most optimizations turned off (to make it easier to do things like set breakpoints accurately). Timings on a non-optimized build lack practical value. Hence any timings of your program should usually be ignored when you are using breakpoints, even if the breakpoint is outside the timed section.
To combine using breakpoints with timings, I would do the debugging in two phases. In one phase, you set breakpoints and look at what is happening in the debug build. In the other phase, you use identical input (redirect a file into std::cin if it helps) and time the process in the release build. There may be some back-and-forth between the stages as you work out what is going on. That's fine; the point is not to have exactly two phases, but to keep breakpoints and timings separate.
Although JaMit gives a better answer (here), doing this is possible, but it depends entirely on your compiler, and the overhead it creates will probably slow your program down too much to get an accurate result. You can use whatever time-recording function you please, but either way you would have to:
Record the start of the loop
Record the start of the breakpoint
Programmatically cause a breakpoint
Record the end of the breakpoint
Record the end of the loop.
If you're looking for speed though, you really need to be testing in an optimized release mode without breakpoints and writing the output to the console or a file. Nonetheless, it is possible to do what you're trying to do, and here's a sample solution.
#include <chrono>
#include <intrin.h> // for __debugbreak() (Visual Studio)
#include <iostream>

int main(void) {
    using hr_clock = std::chrono::high_resolution_clock;
    for (int c = 0; c < 100; c++) {
        // Start of loop
        auto start = std::chrono::duration_cast<std::chrono::milliseconds>(hr_clock::now().time_since_epoch()).count();
        /* Do stuff here */
        auto startBreak = std::chrono::duration_cast<std::chrono::milliseconds>(hr_clock::now().time_since_epoch()).count();
        __debugbreak(); // Visual Studio only
        auto endBreak = std::chrono::duration_cast<std::chrono::milliseconds>(hr_clock::now().time_since_epoch()).count();
        /* Do more stuff here */
        auto end = std::chrono::duration_cast<std::chrono::milliseconds>(hr_clock::now().time_since_epoch()).count();
        /* Time for 1 pass of the loop, minus the break, including the records of getting the time */
        std::cout << end - start - (endBreak - startBreak) << "\n";
    }
    return 0;
}
Given your objective ("my attempt to print the time elapsed between loops"), in the C language:
Note that clock() returns the number of clock ticks since the program started.
This code measures the elapsed time for each loop by showing the start/end times for each loop:
#include <stdio.h>
#include <time.h>
int main(void)
{
    clock_t loopStart = clock();
    clock_t loopEnd;
    for (int i = 0; i < 10000; i++)
    {
        /* something here to be timed */
        loopEnd = clock();
        printf("[%lf : %lf s].\n",
               (double)loopStart / CLOCKS_PER_SEC,
               (double)loopEnd / CLOCKS_PER_SEC);
        loopStart = loopEnd;
    }
}
Of course, if you want to display the actual number of clock ticks per loop, remove the division by CLOCKS_PER_SEC, calculate the difference, and display only the difference.

How to get the time elapsed running a function in C++

I tried some codes by googling :
clock_t start, end;
start = clock();
// CODE GOES HERE
end = clock();
std::cout << end - start << "\n";
std::cout << (double)(end - start) / CLOCKS_PER_SEC;
but the result elapsed time always was 0, even with
std::cout << (double) (end-start)/ (CLOCKS_PER_SEC/1000.0 );
I don't know why, but when I do something similar in Java with getCurrentTimeMillis(), it works well. I want it to show the milliseconds; maybe the computer just computes too fast.
I don't think it's guaranteed that clock has a high enough resolution to profile your function. If you want to know how fast a function executes, you should run it maybe a few thousand times instead of once, measure the total time it takes, and take the average.
#include <boost/progress.hpp>
int main()
{
    boost::progress_timer timer;
    // code to time goes here
}
This will print out the time it took to run main. You can place your code in scopes to time several parts, i.e. { boost::progress_timer timer; ... }.
This question is somehow similar to yours: Timing a function in a C++ program that runs on Linux
Take a look at this answer!

c++ get milliseconds since some date

I need some way in c++ to keep track of the number of milliseconds since program execution. And I need the precision to be in milliseconds. (In my googling, I've found lots of folks that said to include time.h and then multiply the output of time() by 1000 ... this won't work.)
clock has been suggested a number of times. This has two problems. First of all, it often doesn't have a resolution even close to a millisecond (10-20 ms is probably more common). Second, some implementations of it (e.g., Unix and similar) return CPU time, while others (E.g., Windows) return wall time.
You haven't really said whether you want wall time or CPU time, which makes it hard to give a really good answer. On Windows, you could use GetProcessTimes. That will give you the kernel and user CPU times directly. It will also tell you when the process was created, so if you want milliseconds of wall time since process creation, you can subtract the process creation time from the current time (GetSystemTime). QueryPerformanceCounter has also been mentioned. This has a few oddities of its own: for example, in some implementations it retrieves time from the CPU's cycle counter, so its frequency varies when/if the CPU speed changes. Other implementations read from the motherboard's 1.024 MHz timer, which does not vary with the CPU speed (and the conditions under which each is used aren't entirely obvious).
On Unix, you can use gettimeofday() to just get the wall time with (at least the possibility of) relatively high precision. If you want time for a process, you can use times() or getrusage() (the latter is newer and gives more complete information that may also be more precise).
Bottom line: as I said in my comment, there's no way to get what you want portably. Since you haven't said whether you want CPU time or wall time, even for a specific system, there's not one right answer. The one you've "accepted" (clock()) has the virtue of being available on essentially any system, but what it returns also varies just about the most widely.
See std::clock()
Include time.h, and then use the clock() function. It returns the number of clock ticks elapsed since the program was launched. Divide it by CLOCKS_PER_SEC to obtain the number of seconds; you can then multiply by 1000 to obtain the number of milliseconds.
Some cross platform solution. This code was used for some kind of benchmarking:
#ifdef WIN32
#include <windows.h>
LARGE_INTEGER g_llFrequency = {0};
BOOL g_bQueryResult = QueryPerformanceFrequency(&g_llFrequency);
#else
#include <sys/time.h>
#endif

//...

long long osQueryPerfomance()
{
#ifdef WIN32
    LARGE_INTEGER llPerf = {0};
    QueryPerformanceCounter(&llPerf);
    return llPerf.QuadPart * 1000ll / (g_llFrequency.QuadPart / 1000ll);
#else
    struct timeval stTimeVal;
    gettimeofday(&stTimeVal, NULL);
    return stTimeVal.tv_sec * 1000000ll + stTimeVal.tv_usec;
#endif
}
The most portable way is using the clock function. It usually reports the time that your program has been using the processor, or an approximation thereof. Note however the following:
The resolution is not very good on GNU systems. That's really a pity.
Take care of casting everything to double before doing divisions and assignments.
The counter is held as a 32-bit number on 32-bit GNU systems, which can be pretty annoying for long-running programs.
There are alternatives using "wall time" which give better resolution, both in Windows and Linux. But as the libc manual states: If you're trying to optimize your program or measure its efficiency, it's very useful to know how much processor time it uses. For that, calendar time and elapsed times are useless because a process may spend time waiting for I/O or for other processes to use the CPU.
Here is a C++11 solution (written back when it was still called C++0x) and an example of why clock() might not do what you think it does.

#include <chrono>
#include <iostream>
#include <ctime>
#include <unistd.h> // for sleep()

int main()
{
    // monotonic_clock in the C++0x drafts; renamed steady_clock in C++11
    auto start1 = std::chrono::steady_clock::now();
    auto start2 = std::clock();

    sleep(1);
    for (int i = 0; i < 100000000; ++i);

    auto end1 = std::chrono::steady_clock::now();
    auto end2 = std::clock();

    auto delta1 = end1 - start1;
    auto delta2 = end2 - start2;

    std::cout << "chrono: " << std::chrono::duration_cast<std::chrono::duration<float>>(delta1).count() << std::endl;
    std::cout << "clock: " << static_cast<float>(delta2) / CLOCKS_PER_SEC << std::endl;
}
On my system this outputs:
chrono: 1.36839
clock: 0.36
You'll notice the clock() method is missing a second. An astute observer might also notice that clock() looks to have less resolution. On my system it's ticking by in 12 millisecond increments, terrible resolution.
If you are unable or unwilling to use C++0x, take a look at Boost.DateTime's ptime microsec_clock::universal_time().
This isn't C++ specific (nor portable), but you can do:
SYSTEMTIME systemDT;
In Windows.
From there, you can access each member of the systemDT struct.
You can record the time when the program started and compare the current time to the recorded time (systemDT versus systemDTtemp, for instance).
To refresh, you can call GetLocalTime(&systemDT);
To access each member, you would do systemDT.wHour, systemDT.wMinute, systemDT.wMilliseconds.
To get more information on SYSTEMTIME.
Do you want wall clock time, CPU time, or some other measurement? Also, what platform is this? There is no universally portable way to get more precision than time() and clock() give you, but...
on most Unix systems, you can use gettimeofday() and/or clock_gettime(), which give at least microsecond precision and access to a variety of timers;
I'm not nearly as familiar with Windows, but one of these functions probably does what you want.
You can try this code (get from StockFish chess engine source code (GPL)):
#include <iostream>
#include <cstdio>   // getchar
#include <cstdint>  // uint64_t

#if !defined(_WIN32) && !defined(_WIN64) // Linux - Unix
#  include <sys/time.h>
typedef timeval sys_time_t;
inline void system_time(sys_time_t* t) {
    gettimeofday(t, NULL);
}
inline long long time_to_msec(const sys_time_t& t) {
    return t.tv_sec * 1000LL + t.tv_usec / 1000;
}
#else // Windows and MinGW
#  include <sys/timeb.h>
typedef _timeb sys_time_t;
inline void system_time(sys_time_t* t) { _ftime(t); }
inline long long time_to_msec(const sys_time_t& t) {
    return t.time * 1000LL + t.millitm;
}
#endif

struct Time {
    void restart() { system_time(&t); }
    uint64_t msec() const { return time_to_msec(t); }
    long long elapsed() const {
        return (long long)(current_time().msec() - time_to_msec(t));
    }
    static Time current_time() { Time t; t.restart(); return t; }
private:
    sys_time_t t;
};

int main() {
    sys_time_t t;
    system_time(&t);
    long long currentTimeMs = time_to_msec(t);
    std::cout << "currentTimeMs:" << currentTimeMs << std::endl;

    Time time = Time::current_time();
    for (int i = 0; i < 1000000; i++) {
        // Do something
    }
    long long e = time.elapsed();
    std::cout << "time elapsed:" << e << std::endl;
    getchar(); // wait for keyboard input
}

Windows: How do I calculate the time it takes a c/c++ application to run?

I am doing a performance comparison test. I want to record the run time of my C++ test application and compare it under different circumstances. The two cases to be compared are: 1) a file system driver is installed and active, and 2) that same file system driver is not installed and active.
A series of tests will be conducted on several operating systems, and the two runs described above will be done for each operating system and its setup. Results will only be compared between the two cases for a given operating system and setup.
I understand that when running a c/c++ application within an operating system that is not a real-time system there is no way to get the real time it took for the application to run. I don't think this is a big concern as long as the test application runs for a fairly long period of time, therefore making the scheduling, priorities, switching, etc of the CPU negligible.
Edited: For Windows platform only
How can I generate some accurate application run time results within my test application?
If you're on a POSIX system you can use the time command, which will give you the total "wall clock" time as well as the actual CPU times (user and system).
Edit: Apparently there's an equivalent for Windows systems in the Windows Server 2003 Resource Kit called timeit.exe (not verified).
I think what you are asking is "How do I measure the time it takes for the process to run, irrespective of the 'external' factors, such as other programs running on the system?" In that case, the easiest thing would be to run the program multiple times, and get an average time. This way you can have a more meaningful comparison, hoping that various random things that the OS spends the CPU time on will average out. If you want to get real fancy, you can use a statistical test, such as the two-sample t-test, to see if the difference in your average timings is actually significant.
You can put this
#if _DEBUG
time_t start = time(NULL);
#endif
and finish with this
#if _DEBUG
time_t end = time(NULL);
#endif
in your int main() method. Naturally you'll have to write the difference (end - start) either to a log or cout it.
Just to expand on ezod's answer.
You run the program with the time command to get the total time - there are no changes to your program
If you're on a Windows system you can use the high-performance counters by calling QueryPerformanceCounter():
#include <windows.h>
#include <string>
#include <iostream>

std::string format_elapsed(double d); // defined below

int main()
{
    LARGE_INTEGER li = {0}, li2 = {0};
    QueryPerformanceFrequency(&li);
    __int64 freq = li.QuadPart;

    QueryPerformanceCounter(&li);
    // run your app here...
    QueryPerformanceCounter(&li2);

    __int64 ticks = li2.QuadPart - li.QuadPart;
    std::cout << "Reference Implementation Ran In " << ticks << " ticks"
              << " (" << format_elapsed((double)ticks / (double)freq) << ")"
              << std::endl;
    return 0;
}
...and just as a bonus, here's a function that converts the elapsed time (in seconds, floating point) to a descriptive string:
#include <cstdio> // sprintf
#include <cmath>  // floor, fmod

std::string format_elapsed(double d)
{
    char buf[256] = {0};
    if (d < 0.00000001) {
        // show in ps with 4 digits
        sprintf(buf, "%0.4f ps", d * 1000000000000.0);
    } else if (d < 0.00001) {
        // show in ns
        sprintf(buf, "%0.0f ns", d * 1000000000.0);
    } else if (d < 0.001) {
        // show in us
        sprintf(buf, "%0.0f us", d * 1000000.0);
    } else if (d < 0.1) {
        // show in ms
        sprintf(buf, "%0.0f ms", d * 1000.0);
    } else if (d <= 60.0) {
        // show in seconds
        sprintf(buf, "%0.2f s", d);
    } else if (d < 3600.0) {
        // show in min:sec
        sprintf(buf, "%01.0f:%02.2f", floor(d / 60.0), fmod(d, 60.0));
    } else {
        // show in h:min:sec
        sprintf(buf, "%01.0f:%02.0f:%02.2f", floor(d / 3600.0),
                floor(fmod(d, 3600.0) / 60.0), fmod(d, 60.0));
    }
    return buf;
}
Download Cygwin and run your program by passing it as an argument to the time command. When you're done, spend some time to learn the rest of the Unix tools that come with Cygwin. This will be one of the best investments for your career you'll ever make; the Unix toolchest is a timeless classic.
QueryPerformanceCounter can have problems on multicore systems, so I prefer to use timeGetTime(), which gives the result in milliseconds.
You need a timeBeginPeriod(1) before and a timeEndPeriod(1) afterwards to reduce the granularity as far as you can, but I find it works nicely for my purposes (regulating timesteps in games), so it should be okay for benchmarking.
You can also use the program Very Sleepy to get a bunch of runtime information about your program. Here's a link: http://www.codersnotes.com/sleepy

How to get system time in C++?

In fact I am trying to calculate the time a function takes to complete in my program.
So my approach is to get the system time when I call the function and the time when the function returns a value, then subtract the two values to get the time it took to complete.
So if anyone can tell me a better approach, or just how to get the system time at an instant, it would be quite a help.
The approach I use when timing my code is the time() function. It returns a single numeric value representing the seconds since the epoch, which makes the subtraction part easier for calculation.
Relevant code:
#include <time.h>
#include <iostream>

int main(int argc, char *argv[]) {
    time_t startTime, endTime, totalTime;

    startTime = time(NULL);
    /* relevant code to benchmark in here */
    endTime = time(NULL);

    totalTime = endTime - startTime;
    std::cout << "Runtime: " << totalTime << " seconds.";
    return 0;
}
Keep in mind this is wall-clock time, with one-second resolution. For CPU time, see Ben's reply.
Your question is totally dependent on WHICH system you are using. Each system has its own functions for getting the current time. For finding out how long the system has been running, you'd want to access one of the "high resolution performance counters". If you don't use a performance counter, you are usually limited to millisecond accuracy (or worse), which is almost useless in profiling the speed of a function.
In Windows, you can access the counter via the 'QueryPerformanceCounter()' function. This returns an arbitrary number that is different on each processor. To find out how many ticks in the counter == 1 second, call 'QueryPerformanceFrequency()'.
If you're coding under a platform other than windows, just google performance counter and the system you are coding under, and it should tell you how you can access the counter.
Edit (clarification)
This is C++; just include windows.h and link against "Kernel32.lib" (check out the documentation at: http://msdn.microsoft.com/en-us/library/ms644904.aspx). For C#, you can use the "System.Diagnostics.PerformanceCounter" class.
You can use time_t
Under Linux, try gettimeofday() for microsecond resolution, or clock_gettime() for nanosecond resolution.
(Of course the actual clock may have a coarser resolution.)
On some systems you don't have access to newer facilities like the chrono header. In that case, you can use the following time.h-based snippet to find out how long your program takes to run, with an accuracy of seconds.
#include <time.h>
#include <iostream>

void function()
{
    time_t currentTime;
    time(&currentTime);
    time_t startTime = currentTime;

    /* Your program starts from here */

    time(&currentTime);
    int timeElapsed = (int)(currentTime - startTime);
    std::cout << "It took " << timeElapsed << " seconds to run the program" << std::endl;
}
You can use the solution with std::chrono described here: Getting an accurate execution time in C++ (micro seconds). You will have much better accuracy in your measurement; usually we measure code execution on the order of milliseconds (ms) or even microseconds (us).
#include <chrono>
#include <iostream>
...
[YOUR METHOD/FUNCTION STARTING HERE]
auto start = std::chrono::high_resolution_clock::now();
[YOUR TEST CODE HERE]
auto elapsed = std::chrono::high_resolution_clock::now() - start;
long long microseconds = std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
std::cout << "Elapsed time: " << microseconds << " us" << std::endl;