Global values inside template function not changing [c++] - c++

I need help with a project. Basically, I need to measure the clock ticks for some sorting algorithms. Since they all use comparison and, sometimes, swapping functions, I designed them to accept these as callback functions.
To measure the clock ticks I wrote:
static clock_t t1, total;
template<typename T>
bool less_default(T & left, T & right){
t1 = clock();
bool v = left < right;
t1 = clock() - t1;
total += t1;
return v;
}
When I actually run the algorithms, neither total nor t1 reflects any change whatsoever. It is as if the lines of code referring to them were never written.
Nothing works. Not even an increment of a simple integer on function call.
Is it that static global variables can't be changed inside a template function?
I don't understand what I'm doing wrong here.

nothing works. Not even an increment of a simple integer on function call.
I suspect that the following appears in a header file:
static clock_t t1, total;
If that's the case, each translation unit will get its own separate instance of the two variables (thanks to static).
To fix, change static to extern in the header, and add the following to the .cpp file:
clock_t t1, total;
EDIT Sample to follow that demonstrates this:
Per the OP's request, this is a short example that uses a template comparator and the recipe in this answer to declare and manage a running clock total.
main.h
#ifndef PROJMAIN_DEFINED
#define PROJMAIN_DEFINED
#include <ctime> // clock_t, clock()
extern clock_t total;
template<typename T>
bool less_default(const T& left, const T& right)
{
clock_t t1 = clock();
bool res = (left < right);
total += (clock() - t1);
return res;
}
#endif
main.cpp
#include <iostream>
#include <algorithm>
#include <iterator>
#include <vector>
#include <cstdlib> // srand, EXIT_SUCCESS
#include <ctime> // time
#include "main.h"
using namespace std;
clock_t total = 0;
int main()
{
static const size_t N = 2048;
vector<int> values;
values.reserve(N);
std::srand((unsigned)time(0));
cout << "Generating..." << endl;
generate_n(back_inserter(values), N, [](){ static int i=0; return ++i;});
for (int i=0;i<5;++i)
{
random_shuffle(values.begin(), values.end());
cout << "Sorting ..." << endl;
total = 0;
std::sort(values.begin(), values.end(), less_default<int>);
cout << "Finished! : Total = " << total << endl;
}
return EXIT_SUCCESS;
}
Output
Generating...
Sorting ...
Finished! : Total = 13725
Sorting ...
Finished! : Total = 13393
Sorting ...
Finished! : Total = 15400
Sorting ...
Finished! : Total = 13830
Sorting ...
Finished! : Total = 15789

There appears to be a bug with how you are setting up the globals. (NPE's answer covers this.)
However, another thing to keep in mind is that you are trying to measure the performance of a single comparison. It depends on what T is, but for most simple types, this will be one or two CPU instructions, which is far too small to be measured accurately with a technique like this.
You would be much better off using a sampling profiler. With the code you have here, your instrumentation is much, much more expensive than the work being done, which makes the profiling data useless.
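One way to act on this advice, as a minimal sketch rather than a replacement for a real profiler: time the whole sort once and only count comparisons inside the callback, so the per-call instrumentation shrinks to a single increment. The names `SortStats` and `measure_sort` are hypothetical, and `std::chrono::steady_clock` is swapped in for `clock()`.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: duration of the whole sort plus comparison count.
struct SortStats {
    std::chrono::steady_clock::duration elapsed{};
    std::size_t comparisons = 0;
};

// Times one std::sort call as a whole; the comparator only bumps a counter.
template <typename T>
SortStats measure_sort(std::vector<T>& values) {
    SortStats stats;
    const auto start = std::chrono::steady_clock::now();
    std::sort(values.begin(), values.end(),
              [&stats](const T& left, const T& right) {
                  ++stats.comparisons;  // cheap compared with two clock() calls
                  return left < right;
              });
    stats.elapsed = std::chrono::steady_clock::now() - start;
    return stats;
}
```

Calling `measure_sort(values)` on each algorithm's input gives one duration and one comparison count per run, which is usually enough to compare sorting algorithms against each other.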

Related

Idiom for data aggregation and post processing in C++

A common task in programming is to process data on the fly and, when all data are collected, do some post processing. A simple example for this would be the computation of the average (and other statistics), where you can have a class like this
class Statistic {
public:
Statistic() : nr(0), sum(0.0), avg(0.0) {}
void add(double x) { sum += x; ++nr; }
void process() { avg = sum / nr; }
private:
int nr;
double sum;
double avg;
};
A disadvantage of this approach is that we always have to remember to call the process() function after adding all the data. Since in C++ we have things like RAII, this seems like a less than ideal solution.
In Ruby, for example, we can write code like this
class Avg
attr_reader :avg
def initialize
@nr = 0
@sum = 0.0
@avg = nil
if block_given?
yield self
process
end
end
def add(x)
@nr += 1
@sum += x.to_f
end
def process
@avg = @sum / @nr
end
end
which we then can call like this
avg = Avg.new do |a|
data.each {|x| a.add(x)}
end
and the process method is automatically called when exiting the block.
Is there an idiom in C++ that can provide something similar?
For clarification: this question is not about computing the average. It is about the following pattern: feeding data to an object and then, when all the data is fed, triggering a processing step. I am interested in context-based ways to automatically trigger the processing step - or reasons why this would not be a good idea in C++.
"Idiomatic average"
I don't know Ruby, but you can't translate idioms directly anyhow. I know that calculating the average is just an example, so let's see what we can get from that example...
The idiomatic way to calculate the sum and average of the elements in a container is std::accumulate:
std::vector<double> data;
// ... fill data ...
auto sum = std::accumulate( data.begin(), data.end() , 0.0);
auto avg = sum / data.size();
The building blocks are container, iterator and algorithms.
If you do not have the elements to be processed readily available in a container, you can still use the same algorithms, because algorithms only care about iterators. Writing your own iterators requires a bit of boilerplate. The following is just a toy example that calculates the average of the results of calling the same function a certain number of times:
#include <cstddef>
#include <iostream>
#include <numeric>
template <typename F>
struct my_iter {
F f;
size_t count;
my_iter(size_t count, F f) : count(count),f(f) {}
my_iter& operator++() {
--count;
return *this;
}
auto operator*() { return f(); }
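bool operator==(const my_iter& other) const { return count == other.count;}
// std::accumulate compares iterators with !=, which has to be written out before C++20
bool operator!=(const my_iter& other) const { return !(*this == other); }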
};
int main()
{
auto f = [](){return 1.;};
auto begin = my_iter{5,f};
auto end = my_iter{0,f};
auto sum = std::accumulate( begin, end, 0.0);
auto avg = sum / 5;
std::cout << sum << " " << avg;
}
Output is:
5 1
Suppose you have a vector of parameters for a function to be called; then calling std::accumulate is straightforward:
#include <iostream>
#include <vector>
#include <numeric>
int main()
{
auto f = [](int x){return x;};
std::vector<int> v = {1,2,5,10};
auto sum = std::accumulate( v.begin(), v.end(), 0.0, [f](double accu,int add) {
return accu + f(add);
});
auto avg = sum / v.size(); // divide by the actual number of elements (4)
std::cout << sum << " " << avg;
}
The last argument to std::accumulate specifies how the elements are added up. Instead of adding them up directly I add up the result of calling the function. Output is:
18 4.5
For your actual question
Taking your question more literally and to answer also the RAII part, here is one way you can make use of RAII with your statistic class:
struct StatisticCollector {
private:
Statistic& s;
public:
StatisticCollector(Statistic& s) : s(s) {}
~StatisticCollector() { s.process(); }
};
int main()
{
Statistic stat;
{
StatisticCollector sc{stat};
//for (...)
// stat.add( x );
} // <- destructor is called here
}
PS: Last but not least there is the alternative to just keep it simple. Your class definition is kinda broken, because all results are private. Once you fix that, it is kinda obvious that you need no RAII to make sure process gets called:
class Statistic {
public:
Statistic() : nr(0), sum(0.0), avg(0.0) {}
void add(double x) { sum += x; ++nr; }
double process() { return sum / nr; }
private:
int nr;
double sum;
};
This is the right interface in my opinion. The user cannot forget to call process because to get the result they need to call it. If the only purpose of the class is to accumulate numbers and process the result it should not encapsulate the result. The result is for the user of the class to store.
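For the context-based variant asked about in the question, the closest analogue to the Ruby block is probably a function that takes a callable, hands it the accumulator, and runs the processing step itself. This is only a sketch: `average_of` is a hypothetical name, the class repeats the simplified `Statistic` from above, and returning 0.0 when no data was added is an assumption of this sketch.

```cpp
#include <utility>

// Same shape as the simplified Statistic class above.
class Statistic {
public:
    void add(double x) { sum += x; ++nr; }
    // Assumption of this sketch: an empty data set yields 0.0.
    double process() const { return nr ? sum / nr : 0.0; }
private:
    int nr = 0;
    double sum = 0.0;
};

// Hypothetical helper: gives a fresh Statistic to `feed`, then always runs
// the processing step and returns its result, like the Ruby block version.
template <typename Feed>
double average_of(Feed&& feed) {
    Statistic stat;
    std::forward<Feed>(feed)(stat);
    return stat.process();
}
```

A call then reads much like the Ruby version: `double avg = average_of([&](Statistic& s) { for (double x : data) s.add(x); });`. The processing step cannot be forgotten because the caller never sees the unprocessed object.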

C++ clock() function time.h returns unstable values [duplicate]

I want to find out how much time a certain function in my C++ program takes to execute on Linux. Afterwards, I want to make a speed comparison. I saw several time functions but ended up with this one from Boost.Chrono:
process_user_cpu_clock, captures user-CPU time spent by the current process
Now, it is not clear to me whether, if I use the above function, I will get only the time that the CPU spent on that function.
Secondly, I could not find any example of using the above function. Can anyone please help me with how to use it?
P.S.: Right now, I am using std::chrono::system_clock::now() to get the time in seconds, but this gives me different results every time because of varying CPU load.
It is a very easy-to-use method in C++11. You have to use std::chrono::high_resolution_clock from <chrono> header.
Use it like so:
#include <chrono>
/* Only needed for the sake of this example. */
#include <iostream>
#include <thread>
void long_operation()
{
/* Simulating a long, heavy operation. */
using namespace std::chrono_literals;
std::this_thread::sleep_for(150ms);
}
int main()
{
using std::chrono::high_resolution_clock;
using std::chrono::duration_cast;
using std::chrono::duration;
using std::chrono::milliseconds;
auto t1 = high_resolution_clock::now();
long_operation();
auto t2 = high_resolution_clock::now();
/* Getting number of milliseconds as an integer. */
auto ms_int = duration_cast<milliseconds>(t2 - t1);
/* Getting number of milliseconds as a double. */
duration<double, std::milli> ms_double = t2 - t1;
std::cout << ms_int.count() << "ms\n";
std::cout << ms_double.count() << "ms\n";
return 0;
}
This will measure the duration of the function long_operation.
Possible output:
150ms
150.068ms
Working example: https://godbolt.org/z/oe5cMd
Here's a function that will measure the execution time of any function passed as argument:
#include <chrono>
#include <utility>
typedef std::chrono::high_resolution_clock::time_point TimeVar;
#define duration(a) std::chrono::duration_cast<std::chrono::nanoseconds>(a).count()
#define timeNow() std::chrono::high_resolution_clock::now()
template<typename F, typename... Args>
double funcTime(F func, Args&&... args){
TimeVar t1=timeNow();
func(std::forward<Args>(args)...);
return duration(timeNow()-t1);
}
Example usage:
#include <iostream>
#include <algorithm>
typedef std::string String;
//first test function doing something
int countCharInString(String s, char delim){
int count=0;
String::size_type pos = s.find_first_of(delim);
while ((pos = s.find_first_of(delim, pos)) != String::npos){
count++;pos++;
}
return count;
}
//second test function doing the same thing in different way
int countWithAlgorithm(String s, char delim){
return std::count(s.begin(),s.end(),delim);
}
int main(){
std::cout<<"norm: "<<funcTime(countCharInString,"precision=10",'=')<<"\n";
std::cout<<"algo: "<<funcTime(countWithAlgorithm,"precision=10",'=');
return 0;
}
Output:
norm: 15555
algo: 2976
In Scott Meyers' book I found an example of a universal generic lambda expression that can be used to measure function execution time. (C++14)
auto timeFuncInvocation =
[](auto&& func, auto&&... params) {
// get time before function invocation
const auto& start = std::chrono::high_resolution_clock::now();
// function invocation using perfect forwarding
std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
// get time after function invocation
const auto& stop = std::chrono::high_resolution_clock::now();
return stop - start;
};
The problem is that you are measuring only one execution, so the results can differ a lot between runs. To get a reliable result you should measure a large number of executions.
According to Andrei Alexandrescu's lecture at the code::dive 2015 conference, Writing Fast Code I:
Measured time: tm = t + tq + tn + to
where:
tm - measured (observed) time
t - the actual time of interest
tq - time added by quantization noise
tn - time added by various sources of noise
to - overhead time (measuring, looping, calling functions)
According to what he said later in the lecture, you should take the minimum over this large number of executions as your result.
I encourage you to look at the lecture in which he explains why.
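The "take the minimum" advice can be sketched as follows. This is an illustration under assumptions, not code from the lecture: `min_time` is a hypothetical helper, and `std::chrono::steady_clock` is used as the clock.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>

// Hypothetical helper: runs `func(args...)` `runs` times and returns the
// smallest observed duration. With runs == 0 it returns duration::max().
template <typename F, typename... Args>
std::chrono::steady_clock::duration min_time(std::size_t runs, F&& func, Args&&... args) {
    auto best = std::chrono::steady_clock::duration::max();
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        func(args...);  // arguments are reused on every run, so they are not forwarded
        const auto elapsed = std::chrono::steady_clock::now() - start;
        best = std::min(best, elapsed);  // noise only ever adds time, so keep the smallest
    }
    return best;
}
```

The reasoning behind the minimum is visible in the formula above: tq, tn and to are all non-negative, so the smallest tm is the one closest to t.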
Also there is a very good library from google - https://github.com/google/benchmark.
This library is very simple to use and powerful. You can checkout some lectures of Chandler Carruth on youtube where he is using this library in practice. For example CppCon 2017: Chandler Carruth “Going Nowhere Faster”;
Example usage:
#include <iostream>
#include <chrono>
#include <vector>
auto timeFuncInvocation =
[](auto&& func, auto&&... params) {
// get time before function invocation
const auto& start = std::chrono::high_resolution_clock::now();
// invoke the function many times; params are reused, so they are not moved from
for(auto i = 0; i < 100000/*largeNumber*/; ++i) {
std::forward<decltype(func)>(func)(std::forward<decltype(params)>(params)...);
}
// get time after function invocation
const auto& stop = std::chrono::high_resolution_clock::now();
return (stop - start)/100000/*largeNumber*/;
};
void f(std::vector<int>& vec) {
vec.push_back(1);
}
void f2(std::vector<int>& vec) {
vec.emplace_back(1);
}
int main()
{
std::vector<int> vec;
std::vector<int> vec2;
std::cout << timeFuncInvocation(f, vec).count() << std::endl;
std::cout << timeFuncInvocation(f2, vec2).count() << std::endl;
std::vector<int> vec3;
vec3.reserve(100000);
std::vector<int> vec4;
vec4.reserve(100000);
std::cout << timeFuncInvocation(f, vec3).count() << std::endl;
std::cout << timeFuncInvocation(f2, vec4).count() << std::endl;
return 0;
}
EDIT:
Of course, you always need to remember that your compiler may or may not optimize something out. Tools like perf can be useful in such cases.
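As a hedged illustration of the optimization problem: if the result of the timed work is never used, an optimizing compiler may delete the work entirely. One common workaround is to store the result in a volatile object. The names `sum_to`, `benchmark_sink` and `time_sum_to` are hypothetical, and this sketch does not stop the compiler from computing the value in a cleverer way than the loop suggests.

```cpp
#include <chrono>
#include <cstdint>

// Work whose result would otherwise be unused; an optimizer may remove the loop.
inline std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) total += i;
    return total;
}

// Writing to a volatile object is an observable side effect, so the value
// has to be produced before the store.
volatile std::uint64_t benchmark_sink = 0;

inline std::chrono::steady_clock::duration time_sum_to(std::uint64_t n) {
    const auto start = std::chrono::steady_clock::now();
    benchmark_sink = sum_to(n);
    return std::chrono::steady_clock::now() - start;
}
```

Benchmark libraries typically ship a helper for the same purpose, so a hand-rolled sink like this is mostly useful for quick experiments.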
A simple program to find the execution time of a function:
#include <iostream>
#include <ctime> // time_t
#include <cstdio>
void function()
{
for(long int i=0;i<1000000000;i++)
{
// do nothing
}
}
int main()
{
time_t begin,end; // time_t is a datatype to store time values.
time (&begin); // note time before execution
function();
time (&end); // note time after execution
double difference = difftime (end,begin);
printf ("time taken for function() %.2lf seconds.\n", difference );
return 0;
}
Easy way for older C++, or C:
#include <time.h> // includes clock_t and CLOCKS_PER_SEC
int main() {
clock_t start, end;
start = clock();
// ...code to measure...
end = clock();
double duration_sec = double(end-start)/CLOCKS_PER_SEC;
return 0;
}
Timing precision in seconds is 1.0/CLOCKS_PER_SEC
#include <iostream>
#include <chrono>
void function()
{
// code here;
}
int main()
{
auto t1 = std::chrono::high_resolution_clock::now();
function();
auto t2 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>( t2 - t1 ).count();
std::cout << duration<<"\n";
return 0;
}
This Worked for me.
Note:
The high_resolution_clock is not implemented consistently across different standard library implementations, and its use should be avoided. It is often just an alias for std::chrono::steady_clock or std::chrono::system_clock, but which one it is depends on the library or configuration. When it is a system_clock, it is not monotonic (e.g., the time can go backwards).
For example, for gcc's libstdc++ it is system_clock, for MSVC it is steady_clock, and for clang's libc++ it depends on configuration.
Generally one should just use std::chrono::steady_clock or std::chrono::system_clock directly instead of std::chrono::high_resolution_clock: use steady_clock for duration measurements, and system_clock for wall-clock time.
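A minimal sketch of that recommendation, with hypothetical helper names (`elapsed_ms`, `wall_clock_now`): `steady_clock` measures the duration, and `system_clock` is used only where calendar time is wanted.

```cpp
#include <chrono>
#include <ctime>

// The standard requires steady_clock to be monotonic.
static_assert(std::chrono::steady_clock::is_steady, "steady_clock must be steady");

// Elapsed time of one call in milliseconds, measured with steady_clock.
template <typename F>
double elapsed_ms(F&& func) {
    const auto start = std::chrono::steady_clock::now();
    func();
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// system_clock is the clock that maps to calendar (wall-clock) time.
inline std::time_t wall_clock_now() {
    return std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
}
```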
Here is an excellent header only class template to measure the elapsed time of a function or any code block:
#ifndef EXECUTION_TIMER_H
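#define EXECUTION_TIMER_H
#include <chrono>
#include <iostream>
#include <sstream>
#include <type_traits> // std::conditional_t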
template<class Resolution = std::chrono::milliseconds>
class ExecutionTimer {
public:
using Clock = std::conditional_t<std::chrono::high_resolution_clock::is_steady,
std::chrono::high_resolution_clock,
std::chrono::steady_clock>;
private:
const Clock::time_point mStart = Clock::now();
public:
ExecutionTimer() = default;
~ExecutionTimer() {
const auto end = Clock::now();
std::ostringstream strStream;
strStream << "Destructor Elapsed: "
<< std::chrono::duration_cast<Resolution>( end - mStart ).count()
<< std::endl;
std::cout << strStream.str() << std::endl;
}
inline void stop() {
const auto end = Clock::now();
std::ostringstream strStream;
strStream << "Stop Elapsed: "
<< std::chrono::duration_cast<Resolution>(end - mStart).count()
<< std::endl;
std::cout << strStream.str() << std::endl;
}
}; // ExecutionTimer
#endif // EXECUTION_TIMER_H
Here are some uses of it:
int main() {
{ // empty scope to display ExecutionTimer's destructor's message
// displayed in milliseconds
ExecutionTimer<std::chrono::milliseconds> timer;
// function or code block here
timer.stop();
}
{ // same as above
ExecutionTimer<std::chrono::microseconds> timer;
// code block here...
timer.stop();
}
{ // same as above
ExecutionTimer<std::chrono::nanoseconds> timer;
// code block here...
timer.stop();
}
{ // same as above
ExecutionTimer<std::chrono::seconds> timer;
// code block here...
timer.stop();
}
return 0;
}
Since the class is a template we can specify real easily in how we want our time to be measured & displayed. This is a very handy utility class template for doing bench marking and is very easy to use.
If you want to save time and lines of code, you can make measuring the function execution time a one-line macro:
a) Implement a time measuring class as already suggested above ( here is my implementation for android):
class MeasureExecutionTime{
private:
const std::chrono::steady_clock::time_point begin;
const std::string caller;
public:
MeasureExecutionTime(const std::string& caller):caller(caller),begin(std::chrono::steady_clock::now()){}
~MeasureExecutionTime(){
const auto duration=std::chrono::steady_clock::now()-begin;
LOGD("ExecutionTime")<<"For "<<caller<<" is "<<std::chrono::duration_cast<std::chrono::milliseconds>(duration).count()<<"ms";
}
};
b) Add a convenience macro that uses the current function name as the tag (using a macro here is important; otherwise __FUNCTION__ would evaluate to MeasureExecutionTime instead of the function you want to measure):
#ifndef MEASURE_FUNCTION_EXECUTION_TIME
#define MEASURE_FUNCTION_EXECUTION_TIME const MeasureExecutionTime measureExecutionTime(__FUNCTION__);
#endif
c) Write your macro at the beginning of the function you want to measure. Example:
void DecodeMJPEGtoANativeWindowBuffer(uvc_frame_t* frame_mjpeg,const ANativeWindow_Buffer& nativeWindowBuffer){
MEASURE_FUNCTION_EXECUTION_TIME
// Do some time-critical stuff
}
Which will result in the following output:
ExecutionTime: For DecodeMJPEGtoANativeWindowBuffer is 54ms
Note that this (like all the other suggested solutions) will measure the time between when your function was called and when it returned, not necessarily the time your CPU spent executing the function. However, if you don't give the scheduler any chance to suspend your running code by calling sleep() or similar, there is no difference between the two.
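To see the difference between the two kinds of time, a sketch like the following can record both around the same call. `Timing` and `time_wall_and_cpu` are hypothetical names. `std::clock()` is specified to report approximate processor time used by the process, but on some implementations it tracks elapsed time instead, so the CPU figure should be treated as an assumption about the platform.

```cpp
#include <chrono>
#include <ctime>

// Hypothetical result type holding both measurements in seconds.
struct Timing {
    double wall_seconds;
    double cpu_seconds;
};

// Measures wall-clock time with steady_clock and processor time with
// std::clock() around a single call of `func`.
template <typename F>
Timing time_wall_and_cpu(F&& func) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    func();
    const std::clock_t cpu_end = std::clock();
    const std::chrono::duration<double> wall =
        std::chrono::steady_clock::now() - wall_start;
    return {wall.count(),
            static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC};
}
```

Where `std::clock()` does report processor time, a function that mostly sleeps or waits on I/O should show a wall time well above its CPU time.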
It is a very easy-to-use method in C++11.
We can use std::chrono::high_resolution_clock from the <chrono> header.
We can write a function to print the execution time in a much more readable form.
For example, to find the all the prime numbers between 1 and 100 million, it takes approximately 1 minute and 40 seconds.
So the execution time get printed as:
Execution Time: 1 Minutes, 40 Seconds, 715 MicroSeconds, 715000 NanoSeconds
The code is here:
#include <iostream>
#include <chrono>
using namespace std;
using namespace std::chrono;
typedef high_resolution_clock Clock;
typedef Clock::time_point ClockTime;
void findPrime(long n, string file);
void printExecutionTime(ClockTime start_time, ClockTime end_time);
int main()
{
long n = long(1E+8); // N = 100 million
ClockTime start_time = Clock::now();
// Write all the prime numbers from 1 to N to the file "prime.txt"
findPrime(n, "C:\\prime.txt");
ClockTime end_time = Clock::now();
printExecutionTime(start_time, end_time);
}
void printExecutionTime(ClockTime start_time, ClockTime end_time)
{
auto execution_time_ns = duration_cast<nanoseconds>(end_time - start_time).count();
auto execution_time_ms = duration_cast<microseconds>(end_time - start_time).count();
auto execution_time_sec = duration_cast<seconds>(end_time - start_time).count();
auto execution_time_min = duration_cast<minutes>(end_time - start_time).count();
auto execution_time_hour = duration_cast<hours>(end_time - start_time).count();
cout << "\nExecution Time: ";
if(execution_time_hour > 0)
cout << "" << execution_time_hour << " Hours, ";
if(execution_time_min > 0)
cout << "" << execution_time_min % 60 << " Minutes, ";
if(execution_time_sec > 0)
cout << "" << execution_time_sec % 60 << " Seconds, ";
if(execution_time_ms > 0)
cout << "" << execution_time_ms % long(1E+3) << " MicroSeconds, ";
if(execution_time_ns > 0)
cout << "" << execution_time_ns % long(1E+6) << " NanoSeconds, ";
}
I recommend using steady_clock, which is guaranteed to be monotonic, unlike high_resolution_clock.
#include <iostream>
#include <chrono>
using namespace std;
unsigned int stopwatch()
{
static auto start_time = chrono::steady_clock::now();
auto end_time = chrono::steady_clock::now();
auto delta = chrono::duration_cast<chrono::microseconds>(end_time - start_time);
start_time = end_time;
return delta.count();
}
int main() {
stopwatch(); //Start stopwatch
std::cout << "Hello World!\n";
cout << stopwatch() << endl; //Time to execute last line
for (int i=0; i<1000000; i++)
string s = "ASDFAD";
cout << stopwatch() << endl; //Time to execute for loop
}
Output:
Hello World!
62
163514
Since none of the provided answers are very accurate or give reproducible results, I decided to add a link to my code, which has sub-nanosecond precision and scientific statistics.
Note that this only works for code that takes a (very) short time to run (from a few clock cycles to a few thousand). If the code runs long enough that it is likely to be interrupted by some (heh) interrupt, it is not possible to give a reproducible and accurate result, and the measurement never finishes: it keeps measuring until it is statistically 99.9% sure it has the right answer, which never happens on a machine that has other processes running when the code takes too long.
https://github.com/CarloWood/cwds/blob/master/benchmark.h#L40
You can have a simple class which can be used for this kind of measurements.
class duration_printer {
public:
duration_printer() : __start(std::chrono::high_resolution_clock::now()) {}
~duration_printer() {
using namespace std::chrono;
high_resolution_clock::time_point end = high_resolution_clock::now();
duration<double> dur = duration_cast<duration<double>>(end - __start);
std::cout << dur.count() << " seconds" << std::endl;
}
private:
std::chrono::high_resolution_clock::time_point __start;
};
The only thing you need to do is create an object at the beginning of the function you want to measure:
void veryLongExecutingFunction() {
duration_printer dc;
for(int i = 0; i < 100000; ++i) std::cout << "Hello world" << std::endl;
}
int main() {
veryLongExecutingFunction();
return 0;
}
and that's it. The class can be modified to fit your requirements.
C++11 cleaned up version of Jahid's response:
#include <chrono>
#include <iostream>
#include <utility>
#include <thread>
void long_operation(int ms)
{
/* Simulating a long, heavy operation. */
std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
template<typename F, typename... Args>
double funcTime(F func, Args&&... args){
std::chrono::high_resolution_clock::time_point t1 =
std::chrono::high_resolution_clock::now();
func(std::forward<Args>(args)...);
return std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::high_resolution_clock::now()-t1).count();
}
int main()
{
std::cout<<"expect 150: "<<funcTime(long_operation,150)<<"\n";
return 0;
}
This is a very basic timer class which you can expand on depending on your needs. I wanted something straightforward which can be used cleanly in code. You can mess with it at coding ground with this link: http://tpcg.io/nd47hFqr.
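#include <chrono>
class local_timer {
private:
// Use the clock's own time_point type instead of the libstdc++-internal _V2 namespace.
std::chrono::high_resolution_clock::time_point start_time;
std::chrono::high_resolution_clock::time_point stop_time;
std::chrono::high_resolution_clock::time_point stop_time_temp;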
std::chrono::microseconds most_recent_duration_usec_chrono;
double most_recent_duration_sec;
public:
local_timer() {
};
~local_timer() {
};
void start() {
this->start_time = std::chrono::high_resolution_clock::now();
};
void stop() {
this->stop_time = std::chrono::high_resolution_clock::now();
};
double get_time_now() {
this->stop_time_temp = std::chrono::high_resolution_clock::now();
this->most_recent_duration_usec_chrono = std::chrono::duration_cast<std::chrono::microseconds>(stop_time_temp-start_time);
this->most_recent_duration_sec = (long double)most_recent_duration_usec_chrono.count()/1000000;
return this->most_recent_duration_sec;
};
double get_duration() {
this->most_recent_duration_usec_chrono = std::chrono::duration_cast<std::chrono::microseconds>(stop_time-start_time);
this->most_recent_duration_sec = (long double)most_recent_duration_usec_chrono.count()/1000000;
return this->most_recent_duration_sec;
};
};
The use for this being
#include <iostream>
#include "timer.hpp" // if the class is kept in an .hpp file in the same folder; it can also be defined before your main function
int main() {
//create two timers
local_timer timer1 = local_timer();
local_timer timer2 = local_timer();
//set start time for timer1
timer1.start();
//wait 1 second
while(timer1.get_time_now() < 1.0) {
}
//save time
timer1.stop();
//print time
std::cout << timer1.get_duration() << " seconds, timer 1\n" << std::endl;
timer2.start();
for(long int i = 0; i < 100000000; i++) {
//do something
if(i%1000000 == 0) {
//return time since loop started
std::cout << timer2.get_time_now() << " seconds, timer 2\n"<< std::endl;
}
}
return 0;
}

Timing in an elegant way in c++

I am interested in timing the execution time of a free function or a member function (template or not). Call TheFunc the function in question, its call being
TheFunc(/*parameters*/);
or
ReturnType ret = TheFunc(/*parameters*/);
Of course I could wrap these function calls as follows :
double duration = 0.0 ;
std::clock_t start = std::clock();
TheFunc(/*parameters*/);
duration = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
or
double duration = 0.0 ;
std::clock_t start = std::clock();
ReturnType ret = TheFunc(/*parameters*/);
duration = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
but I would like to do something more elegant than this, namely (and from now on I will stick to the void return type) as follows :
Timer thetimer ;
double duration = 0.0;
thetimer(*TheFunc)(/*parameters*/, duration);
where Timer is some timing class that I would like to design and that would allow me to write the previous code, in such a way that after the execution of the last line of the previous code the double duration will contain the execution time of
TheFunc(/*parameters*/);
but I don't see how to do this, nor if the syntax/solution I aim for is optimal...
With variadic template, you may do:
template <typename F, typename ... Ts>
double Time_function(F&& f, Ts&&...args)
{
std::clock_t start = std::clock();
std::forward<F>(f)(std::forward<Ts>(args)...);
return static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
}
I really like boost::timer::auto_cpu_timer, and when I cannot use boost I simply hack my own:
#include <cmath>
#include <string>
#include <chrono>
#include <iostream>
class AutoProfiler {
public:
AutoProfiler(std::string name)
: m_name(std::move(name)),
m_beg(std::chrono::high_resolution_clock::now()) { }
~AutoProfiler() {
auto end = std::chrono::high_resolution_clock::now();
auto dur = std::chrono::duration_cast<std::chrono::microseconds>(end - m_beg);
std::cout << m_name << " : " << dur.count() << " musec\n";
}
private:
std::string m_name;
std::chrono::time_point<std::chrono::high_resolution_clock> m_beg;
};
void foo(std::size_t N) {
long double x {1.234e5};
for(std::size_t k = 0; k < N; k++) {
x += std::sqrt(x);
}
}
int main() {
{
AutoProfiler p("N = 10");
foo(10);
}
{
AutoProfiler p("N = 1,000,000");
foo(1000000);
}
}
This timer works thanks to RAII. When you build the object within a scope, you store the timepoint at that point in time. When you leave the scope (that is, at the corresponding }), the timer first stores the timepoint, then calculates the number of ticks (which you can convert to a human-readable duration), and finally prints it to the screen.
Of course, boost::timer::auto_cpu_timer is much more elaborate than my simple implementation, but I often find my implementation more than sufficient for my purposes.
Sample run in my computer:
$ g++ -o example example.cpp -std=c++14 -Wall -Wextra
$ ./example
N = 10 : 0 musec
N = 1,000,000 : 10103 musec
EDIT
I really liked the implementation suggested by @Jarod42. I modified it a little bit to offer some flexibility on the desired "units" of the output.
It defaults to returning the number of elapsed microseconds (an integer of type Duration::rep), but you can request the output to be in any duration of your choice.
I think it is a more flexible approach than the one I suggested earlier because now I can do other stuff like taking the measurements and storing them in a container (as I do in the example).
Thanks to @Jarod42 for the inspiration.
#include <cmath>
#include <string>
#include <chrono>
#include <algorithm>
#include <iostream>
#include <numeric> // std::accumulate
#include <vector>
template<typename Duration = std::chrono::microseconds,
typename F,
typename ... Args>
typename Duration::rep profile(F&& fun, Args&&... args) {
const auto beg = std::chrono::high_resolution_clock::now();
std::forward<F>(fun)(std::forward<Args>(args)...);
const auto end = std::chrono::high_resolution_clock::now();
return std::chrono::duration_cast<Duration>(end - beg).count();
}
void foo(std::size_t N) {
long double x {1.234e5};
for(std::size_t k = 0; k < N; k++) {
x += std::sqrt(x);
}
}
int main() {
std::size_t N { 1000000 };
// profile in default mode (microseconds)
std::cout << "foo(1E6) takes " << profile(foo, N) << " microseconds" << std::endl;
// profile in custom mode (e.g, milliseconds)
std::cout << "foo(1E6) takes " << profile<std::chrono::milliseconds>(foo, N) << " milliseconds" << std::endl;
// To create an average of `M` runs we can create a vector to hold
// `M` values of the type used by the clock representation, fill
// them with the samples, and take the average
std::size_t M {100};
std::vector<typename std::chrono::microseconds::rep> samples(M);
for(auto & sample : samples) {
sample = profile(foo, N);
}
// start from a value of the sample type so the sum is not narrowed to int
auto avg = std::accumulate(samples.begin(), samples.end(), std::chrono::microseconds::rep{0}) / static_cast<long double>(M);
std::cout << "average of " << M << " runs: " << avg << " microseconds" << std::endl;
}
Output (compiled with g++ example.cpp -std=c++14 -Wall -Wextra -O3):
foo(1E6) takes 10073 microseconds
foo(1E6) takes 10 milliseconds
average of 100 runs: 10068.6 microseconds
You can do it the MatLab way. It's very old-school but simple is often good:
tic();
a = f(c);
toc(); //print to stdout, or
auto elapsed = toc(); //store in variable
tic() and toc() can work on a global variable. If that's not sufficient, you can create local variables with some macro-magic:
tic(A);
a = f(c);
toc(A);
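The global-variable version can be sketched in C++ roughly as follows. This is an assumption about what such an implementation might look like, not code from the answer: the `tictoc` namespace is hypothetical, `toc()` returns seconds instead of printing, and the shared start point makes it unsuitable for use from several threads at once.

```cpp
#include <chrono>

// Hypothetical C++ take on tic()/toc(), using one shared start point.
namespace tictoc {

inline std::chrono::steady_clock::time_point& start_point() {
    static std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    return start;
}

// Restart the stopwatch.
inline void tic() { start_point() = std::chrono::steady_clock::now(); }

// Seconds elapsed since the last tic(); call tic() first.
inline double toc() {
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start_point();
    return elapsed.count();
}

}  // namespace tictoc
```

Usage then mirrors the snippet above: `tictoc::tic(); a = f(c); auto elapsed = tictoc::toc();`.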
I'm a fan of using RAII wrappers for this type of stuff.
The following example is a little verbose but it's more flexible in that it works with arbitrary scopes instead of being limited to a single function call:
class timing_context {
public:
std::map<std::string, double> timings;
};
class timer {
public:
timer(timing_context& ctx, std::string name)
: ctx(ctx),
name(name),
start(std::clock()) {}
~timer() {
ctx.timings[name] = static_cast<double>(std::clock() - start) / static_cast<double>(CLOCKS_PER_SEC);
}
timing_context& ctx;
std::string name;
std::clock_t start;
};
timing_context ctx;
int main() {
timer total(ctx, "total");
{
timer t(ctx, "foo");
// Do foo
}
{
timer t(ctx, "bar");
// Do bar
}
// Access ctx.timings
}
The downside is that you might end up with a lot of scopes that only serve to destroy the timing object.
This might or might not satisfy your requirements as your request was a little vague but it illustrates how using RAII semantics can make for some really nice reusable and clean code. It can probably be modified to look a lot better too!

Speed of associative array (map) in STL [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 8 years ago.
I wrote a simple program to measure the speed of the STL. The following code took 1.49 sec on my Core i7-2670QM PC (2.2 GHz, turbo 3.1 GHz). If I remove the Employees[buf] = i%1000; part of the loop, it takes only 0.0132 sec. So the hashing part took 1.48 sec. Why is it that slow?
#include <string.h>
#include <iostream>
#include <map>
#include <utility>
#include <stdio.h>
#include <sys/time.h>
using namespace std;
extern "C" {
int get(map<string, int> e, char* s){
return e[s];
}
int set(map<string, int> e, char* s, int value) {
e[s] = value;
}
}
double getTS() {
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec + tv.tv_usec/1000000.0;
}
int main()
{
map<string, int> Employees;
char buf[10];
int i;
double ts = getTS();
for (i=0; i<1000000; i++) {
sprintf(buf, "%08d", i);
Employees[buf] = i%1000;
}
printf("took %f sec\n", getTS() - ts);
cout << Employees["00001234"] << endl;
return 0;
}
Here's a C++ version of your code. Note that you should obviously take the maps by reference when passing them in get/set.
UPDATE Taking things a bit further and seriously optimizing for the given test case:
Live On Coliru
#include <iostream>
#include <boost/container/flat_map.hpp>
#include <chrono>
using namespace std;
using Map = boost::container::flat_map<string, int>;
int get(Map &e, char *s) { return e[s]; }
int set(Map &e, char *s, int value) { return e[s] = value; }
using Clock = std::chrono::high_resolution_clock;
template <typename F, typename Reso = std::chrono::microseconds, typename... Args>
Reso measure(F&& f, Args&&... args) {
auto since = Clock::now();
std::forward<F>(f)(std::forward<Args>(args)...);
return chrono::duration_cast<Reso>(Clock::now() - since);
}
#include <boost/iterator/iterator_facade.hpp>
using Pair = std::pair<std::string, int>;
struct Gen : boost::iterators::iterator_facade<Gen, Pair, boost::iterators::single_pass_traversal_tag, Pair>
{
int i;
Gen(int i = 0) : i(i) {}
value_type dereference() const {
char buf[10];
std::sprintf(buf, "%08d", i);
return { buf, i%1000 };
}
bool equal(Gen const& o) const { return i==o.i; }
void increment() { ++i; }
};
int main() {
Map Employees;
const auto n = 1000000;
auto elapsed = measure([&] {
Employees.reserve(n);
Employees.insert<Gen>(boost::container::ordered_unique_range, {0}, {n});
});
std::cout << "took " << elapsed.count() / 1000000.0 << " sec\n";
cout << Employees["00001234"] << endl;
}
Prints
took 0.146575 sec
234
Old answer
This just used C++ where appropriate
Live On Coliru
#include <iostream>
#include <map>
#include <chrono>
#include <cstdio>
using namespace std;
int get(map<string, int>& e, char* s){
return e[s];
}
int set(map<string, int>& e, char* s, int value) {
return e[s] = value;
}
using Clock = std::chrono::high_resolution_clock;
template <typename Reso = std::chrono::microseconds>
Reso getElapsed(Clock::time_point const& since) {
return chrono::duration_cast<Reso>(Clock::now() - since);
}
int main()
{
map<string, int> Employees;
std::string buf(10, '\0');
auto ts = Clock::now();
for (int i=0; i<1000000; i++) {
buf.resize(std::sprintf(&buf[0], "%08d", i));
Employees[buf] = i%1000;
}
std::cout << "took " << getElapsed(ts).count()/1000000.0 << " sec\n";
cout << Employees["00001234"] << endl;
}
Prints:
took 0.470009 sec
234
Whether this is "slow" depends, of course, on what you compare it to.
I ran your benchmark (using the standard chrono::high_resolution_clock instead of gettimeofday()) on MSVC2013 with the release configuration on a Core i7-920 at 2.67 GHz and found very similar results (1.452 s).
In your code, you basically perform 1 million of each of the following:
insertion in the map: Employees[buf]
update in the map (copying a new value to the existing element): = i%1000
So I tried to understand better where the time is spent:
first, the map needs to store the keys in order, which is typically implemented with a binary tree. So I tried an unordered_map, which uses a flatter hash table, and gave it a very large bucket count to avoid collisions and rehashing. The result is then 1.198 s.
So roughly 20% of the time (here) is needed to make sorted access to the map data possible (i.e. you can iterate through your map in key order: do you need this?)
next, the order of insertion can significantly influence the timing. As Thomas Matthews pointed out in the comments, for benchmarking purposes you should use a random order.
then, doing only an optimised insertion of the data (no search, no update) using emplace_hint() brings us to a time of 1.100 s.
So 75% of the time is needed to allocate and insert the data.
finally, elaborating on the previous test, if you add an additional search and update after emplace_hint(), the time goes up to slightly above the original time (1.468 s). This confirms that access to the map is only a fraction of the time and that most of the execution time is needed for the insertion.
Here is the test for the last point above:
chrono::high_resolution_clock::time_point ts = chrono::high_resolution_clock::now();
for (i = 0; i<1000000; i++) {
sprintf(buf, "%08d", i);
Employees.emplace_hint(Employees.end(), buf, 0);
Employees[buf] = i % 1000; // this search and update accounts for roughly 300 ms
}
chrono::high_resolution_clock::time_point te = chrono::high_resolution_clock::now();
cout << "took " << chrono::duration_cast<chrono::milliseconds>(te - ts).count() << " millisecs\n";
Now, your benchmark does not depend only on the performance of the map: you make 1 million calls to sprintf() to fill your buffer, and 1 million conversions to a string. If you used a map that is not keyed on strings instead, you'd notice that the whole test takes only 0.950 s instead of 1.450 s:
About 30% of your benchmark time is caused not by the map, but by the many strings you handle!
Of course, all this is much slower than a vector. But a vector doesn't sort its elements and cannot provide associative storage.
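The map versus unordered_map comparison described above could be reproduced with something like the following sketch. It is not the code that produced the numbers in this answer: the function names fill_and_time and compare_maps are made up, reserve() stands in for "a very large bucket count", and the timings will differ by compiler and machine:

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <map>
#include <string>
#include <unordered_map>

// Fill `container` with n keys of the form "%08d" and return elapsed seconds.
template <typename Container>
double fill_and_time(Container& container, int n) {
    assert(n >= 0 && n <= 99999999);  // keys must fit in 8 digits
    char buf[16];
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i) {
        std::snprintf(buf, sizeof buf, "%08d", i);
        container[buf] = i % 1000;  // insert, then assign the value
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Compare an ordered map with a pre-sized hash table on the same workload.
void compare_maps(int n) {
    std::map<std::string, int> ordered;
    std::unordered_map<std::string, int> hashed;
    hashed.reserve(static_cast<std::size_t>(n));  // avoid rehashing during the fill
    std::printf("map:           %f sec\n", fill_and_time(ordered, n));
    std::printf("unordered_map: %f sec\n", fill_and_time(hashed, n));
}
```

Calling `compare_maps(1000000)` matches the size of the original benchmark. Note that the keys are still inserted in ascending order here, so the caveat about insertion order applies to this sketch as well.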

Timing the execution of statements - C++ [duplicate]

Closed 10 years ago.
Possible Duplicate:
How to Calculate Execution Time of a Code Snippet in C++
How can I get the time spent by a particular set of statements in some C++ code?
Something like the time utility under Linux but only for some particular statements.
You can use the <chrono> header in the standard library:
#include <chrono>
#include <iostream>
unsigned long long fib(unsigned long long n) {
return (0==n || 1==n) ? 1 : fib(n-1) + fib(n-2);
}
int main() {
unsigned long long n = 0;
while (true) {
auto start = std::chrono::high_resolution_clock::now();
fib(++n);
auto finish = std::chrono::high_resolution_clock::now();
auto microseconds = std::chrono::duration_cast<std::chrono::microseconds>(finish-start);
std::cout << microseconds.count() << "µs\n";
if (microseconds > std::chrono::seconds(1))
break;
}
}
You need to measure the time yourself. The little stopwatch class I usually use looks like this:
#include <chrono>
#include <iostream>
template <typename Clock = std::chrono::steady_clock>
class stopwatch
{
typename Clock::time_point last_;
public:
stopwatch()
: last_(Clock::now())
{}
void reset()
{
*this = stopwatch();
}
typename Clock::duration elapsed() const
{
return Clock::now() - last_;
}
typename Clock::duration tick()
{
auto now = Clock::now();
auto elapsed = now - last_;
last_ = now;
return elapsed;
}
};
template <typename T, typename Rep, typename Period>
T duration_cast(const std::chrono::duration<Rep, Period>& duration)
{
return duration.count() * static_cast<T>(Period::num) / static_cast<T>(Period::den);
}
int main()
{
stopwatch<> sw;
// ...
std::cout << "Elapsed: " << duration_cast<double>(sw.elapsed()) << '\n';
}
duration_cast may not be an optimal name for the function, since a function with this name already exists in the standard library. Feel free to come up with a better one. ;)
Edit: Note that chrono is from C++11.
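As an alternative to a hand-written conversion function, the standard library can do the same conversion through a floating-point duration type. The sketch below assumes only `<chrono>`; the helper names to_seconds, to_milliseconds and check_conversions are illustrative and chosen to avoid clashing with std::chrono::duration_cast:

```cpp
#include <cassert>
#include <chrono>

// Convert any std::chrono duration to fractional seconds.
// (to_seconds is an illustrative name, not a standard function.)
template <typename Rep, typename Period>
double to_seconds(const std::chrono::duration<Rep, Period>& d) {
    // Converting to a floating-point duration is allowed implicitly.
    return std::chrono::duration<double>(d).count();
}

// Same idea with milliseconds as the unit.
template <typename Rep, typename Period>
double to_milliseconds(const std::chrono::duration<Rep, Period>& d) {
    return std::chrono::duration<double, std::milli>(d).count();
}

// Small self-check that also shows how the helpers are called.
void check_conversions() {
    assert(to_seconds(std::chrono::milliseconds(1500)) == 1.5);
    assert(to_milliseconds(std::chrono::seconds(2)) == 2000.0);
}
```

With the stopwatch class above, `to_seconds(sw.elapsed())` would take the place of `duration_cast<double>(sw.elapsed())`.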
std::chrono, or boost::chrono (in case your compiler does not support C++11), can be used for this.
std::chrono::high_resolution_clock::time_point start(
std::chrono::high_resolution_clock::now() );
....
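// A duration cannot be streamed directly before C++20, so print its count.
std::cout << std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now() - start).count() << " µs\n";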
You need to write a simple timing system. There is no built-in way in C++.
#include <sys/time.h>
class Timer
{
private:
struct timeval start_t;
public:
// Record the current time as the new starting point.
void start() { gettimeofday(&start_t, NULL); }
// Milliseconds elapsed since the last start().
double get_ms() {
struct timeval now;
gettimeofday(&now, NULL);
return (now.tv_usec-start_t.tv_usec)/(double)1000.0 +
(now.tv_sec-start_t.tv_sec)*(double)1000.0;
}
// Return the elapsed milliseconds and restart the timer.
double get_ms_reset() {
double res = get_ms();
start();
return res;
}
Timer() { start(); }
};
int main()
{
Timer t; // not "Timer t();", which would declare a function
double used_ms;
// run slow code..
used_ms = t.get_ms_reset();
// run slow code..
used_ms += t.get_ms_reset();
return 0;
}
Note that the measurement itself can affect the runtime significantly.
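To get a feeling for how large that effect is, one can time the clock call itself. The following is a rough sketch, assuming std::chrono::steady_clock; the function name clock_overhead_ns is made up for illustration, and the result varies by platform:

```cpp
#include <cassert>
#include <chrono>

// Estimate the average cost of one steady_clock::now() call, in nanoseconds.
double clock_overhead_ns(int calls) {
    assert(calls > 0);
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    auto last = start;
    for (int i = 0; i < calls; ++i) {
        last = Clock::now();  // the call whose cost is being estimated
    }
    const std::chrono::duration<double, std::nano> total = last - start;
    return total.count() / calls;
}
```

If the code being timed takes about as long as the value returned by, say, `clock_overhead_ns(1000000)`, it is usually better to run that code many times inside a single measurement and divide by the iteration count.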
You can use the time.h C standard library (explained in more detail at http://www.cplusplus.com/reference/clibrary/ctime/). The following program does what you want:
#include <iostream>
#include <time.h>
using namespace std;
int main()
{
clock_t t1,t2;
t1=clock();
//code goes here
t2=clock();
float diff = ((float)t2-(float)t1)/CLOCKS_PER_SEC;
cout << "Running time: " << diff << endl;
return 0;
}
You can also do this:
clock_t start_s=clock();
// the code you wish to time goes here
clock_t stop_s=clock();
cout << "time: " << (stop_s-start_s)/double(CLOCKS_PER_SEC)*1000 << " ms" << endl;
If you are using GNU gcc/g++:
Try recompiling with --coverage, rerun the program and analyse the resulting files with the gprof utility. It will also print execution times of functions.
Edit: Compile and link with -pg, not with --coverage, --coverage is for gcov (which lines are actually executed).
Here's a very fine snippet of code that works well on Windows and Linux: https://stackoverflow.com/a/1861337/1483826
To use it, call it and save the result as the "start time", then call it again after the action and save that as the "end time". Subtract the two and divide to whatever precision you need.
You can use the <ctime> header (#include <ctime>). Its functions and their uses are here. Suppose you want to measure how much time a piece of code takes. You have to take the current time just before that part starts and again just after it ends, then take the difference of the two times. Ready-made functions for all of this are declared in ctime. Just check out the above link.