Total time in different parts of recursive function - c++

I am new to C++ and I need to measure the total time spent in different parts of a recursive function. A simple example showing how far I have got:
#include <iostream>
#include <unistd.h>
#include <chrono>
using namespace std;
using namespace std::chrono;
int recursive(int);
void foo();
void bar();
int main() {
int n = 5; // this value is known only at runtime
int result = recursive(n);
return 0;
}
int recursive(int n) {
auto start = high_resolution_clock::now();
if (n > 1) { recursive(n - 1); n = n - 1; }
auto stop = high_resolution_clock::now();
auto duration_recursive = duration_cast<microseconds>(stop - start);
cout << "time in recursive: " << duration_recursive.count() << endl;
//
// .. calls to other functions and computations parts I don't want to time
//
start = high_resolution_clock::now();
foo();
stop = high_resolution_clock::now();
auto duration_foo = duration_cast<seconds>(stop - start);
cout << "time in foo: " << duration_foo.count() << endl;
//
// .. calls to other functions and computations parts I don't want to time
//
start = high_resolution_clock::now();
bar();
stop = high_resolution_clock::now();
auto duration_bar = duration_cast<seconds>(stop - start);
cout << "time in bar: " << duration_bar.count() << endl;
return 0;
}
void foo() { // a complex function
sleep(1);
}
void bar() { // another complex function
sleep(2);
}
I want the total time for each function across all calls. For instance, foo() should total 5 seconds, but I always get 1 second printed. The number of recursive calls is known only at runtime (n = 5 is fixed here just for simplicity).
To compute the totals I tried making the duration variables static and accumulating the results, but that didn't work.

You can store the times in a container, pass it by reference, and accumulate into it as you recurse. For example, a std::map<std::string, unsigned> lets you label each total:
int recursive(int n, std::map<std::string, unsigned>& times) {
if (n <= 0) return 0; // base case: stop recursing
// measure time of foo (as in the question) and add its count to the running total
times["foo"] += duration_foo.count();
// measure time of bar
times["bar"] += duration_bar.count();
// recurse
return recursive(n - 1, times);
}
Then
std::map<std::string,unsigned> times;
recursive(200,times);
for (const auto& t : times) {
std::cout << t.first << " took total : " << t.second << "\n";
}
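A fuller sketch of the same idea, assuming you keep the totals as chrono durations instead of unsigned counts. The names Totals and timed are made up for illustration, and the short sleeps only stand in for the real foo() and bar():

```cpp
#include <chrono>
#include <map>
#include <string>
#include <thread>

// Hypothetical alias: one accumulated duration per label.
using Totals = std::map<std::string, std::chrono::steady_clock::duration>;

// Placeholders for the real work.
void foo() { std::this_thread::sleep_for(std::chrono::milliseconds(2)); }
void bar() { std::this_thread::sleep_for(std::chrono::milliseconds(4)); }

// Runs f once and adds its elapsed time to totals[label].
// operator[] value-initializes a missing entry, so each total starts at zero.
template <typename F>
void timed(Totals& totals, const std::string& label, F f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    totals[label] += std::chrono::steady_clock::now() - start;
}

void recursive(int n, Totals& totals) {
    if (n <= 0) return;
    timed(totals, "foo", foo);
    // ... untimed work goes here ...
    timed(totals, "bar", bar);
    recursive(n - 1, totals);
}
```

After recursive(5, totals), each entry holds the sum over all five calls. Convert with std::chrono::duration_cast<std::chrono::milliseconds>(t.second).count() only at the point where you print.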

Related

Cannot exit While loop

I want to exit the while loop when the variable "duration" reaches 5 seconds, but the loop never ends.
The variable "duration" is always 0.
#include <iostream>
#include <Windows.h>
#include <chrono>
using namespace std;
DOUBLE duration;
int main()
{
using clock = std::chrono::system_clock;
using sec = std::chrono::duration <double>;
const auto before = clock::now();
while (duration < 5.0)
{
const auto after = clock::now();
const sec duration = after - before;
std::cout << "It took " << duration.count() << "s" << std::endl;
}
std::cout << "After While Loop "; //Never reaches this line!!!!!
return 0;
}
Actual output:
...
It took 9.50618s
It took 9.50642s
...
I expected the while loop to exit at 5.0 or higher.
The duration variable always shows 0.
You are using two separate variables named duration.
DOUBLE duration; // number 1: global, never modified
// ...
using sec = std::chrono::duration <double>;
const auto before = clock::now();
while (duration < 5.0) // checks number 1, which does not change while looping
{
const auto after = clock::now();
const sec duration = after - before; // number 2, created afresh in each iteration
std::cout << "It took " << duration.count() << "s" << std::endl;
}
So the loop condition checks a variable that the loop body never changes.
You have two variables called duration:
DOUBLE duration;
and in while loop:
const sec duration = after - before;
When the while condition is evaluated, it checks the first one, which is never set or changed.
Inside the loop body you create a new variable that shadows the global one. See https://en.wikipedia.org/wiki/Variable_shadowing
Instead of declaring a new variable, assign to one that was declared before entering the while loop.
There are two issues here:
1. Declaring a local variable with the same name as a global variable creates two separate entities, which can get quite confusing. This is called variable shadowing.
2. Once duration has type sec, it is a chrono duration object rather than an integer or float, so it cannot be compared with 5.0 directly. Use duration.count() to get a plain number to compare.
In your code the global duration variable is never changed, because the loop declares a new variable instead of assigning to it. The while condition checks the global duration, but you expect it to check the one inside the loop.
There are two possible solutions:
1. Make the two duration variables the same variable.
2. Or use a while(1) loop and break out of it when the duration variable holds a value >= 5.
Solution 1 code looks like this:
#include <iostream>
#include <Windows.h>
#include <chrono>
using namespace std;
int main()
{
using clock = std::chrono::system_clock;
using sec = std::chrono::duration <double>;
sec duration{}; // value-initialize to zero so the first loop check does not read an indeterminate value
const auto before = clock::now();
while (duration.count() < 5.0)
{
const auto after = clock::now();
duration = after - before;
std::cout << "It took " << duration.count() << "s" << std::endl;
}
std::cout << "After While Loop "; // now reached once 5 seconds have elapsed
return 0;
}
Solution 2 code looks like this:
#include <iostream>
#include <Windows.h>
#include <chrono>
using namespace std;
int main()
{
using clock = std::chrono::system_clock;
using sec = std::chrono::duration <double>;
const auto before = clock::now();
while (1) // Solution 2: loop forever and break from inside
{
const auto after = clock::now();
const sec duration = after - before;
std::cout << "It took " << duration.count() << "s" << std::endl;
if(duration.count() >= 5)
break;
}
std::cout << "After While Loop "; // now reached once 5 seconds have elapsed
return 0;
}
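As an alternative to calling .count(), two chrono durations can be compared with each other directly. A minimal sketch, assuming a steady_clock and a helper name (wait_until_elapsed) that is not from the original posts:

```cpp
#include <chrono>

// Busy-waits until at least `limit` has elapsed and returns the measured time.
std::chrono::duration<double> wait_until_elapsed(std::chrono::duration<double> limit) {
    using clock = std::chrono::steady_clock;
    const auto before = clock::now();
    std::chrono::duration<double> elapsed{};  // starts at zero
    while (elapsed < limit) {                 // duration compared with duration
        elapsed = clock::now() - before;      // assigns to the outer variable, no shadowing
    }
    return elapsed;
}
```

Calling wait_until_elapsed(std::chrono::seconds(5)) reproduces the 5 second loop from the question. There is only one variable holding the elapsed time, so the condition and the body cannot disagree.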

Why is the C++ thread/future overhead so big

I have a worker routine (code below), which is running slower when I run it in a separate thread. As far as I can tell, the worker code and data is completely independent of other threads. All the worker does is to append nodes to a tree. The goal is having multiple workers growing trees in parallel.
Can someone help me understand why there is (significant) overhead when running the worker in a separate thread?
Edit:
Initially I was testing WorkerFuture twice. After correcting that, I get the same (better) performance in the no-thread and deferred-async cases, and considerable overhead when an extra thread is involved.
The command to compile (linux): g++ -std=c++11 main.cpp -o main -O3 -pthread
Here is the output (time in milliseconds):
Thread : 4000001 size in 1861 ms
Async : 4000001 size in 1836 ms
Defer async: 4000001 size in 1423 ms
No thread : 4000001 size in 1455 ms
Code:
#include <iostream>
#include <vector>
#include <random>
#include <chrono>
#include <thread>
#include <future>
struct Data
{
int data;
};
struct Tree
{
Data data;
long long total;
std::vector<Tree *> children;
long long Size()
{
long long size = 1;
for (auto c : children)
size += c->Size();
return size;
}
~Tree()
{
for (auto c : children)
delete c;
}
};
int
GetRandom(long long size)
{
static long long counter = 0;
return counter++ % size;
}
void
Worker_(Tree *root)
{
std::vector<Tree *> nodes = {root};
Tree *it = root;
while (!it->children.empty())
{
it = it->children[GetRandom(it->children.size())];
nodes.push_back(it);
}
for (int i = 0; i < 100; ++i)
nodes.back()->children.push_back(new Tree{{10}, 1, {}});
for (auto t : nodes)
++t->total;
}
long long
Worker(long long iterations)
{
Tree root = {};
for (long long i = 0; i < iterations; ++i)
Worker_(&root);
return root.Size();
}
void ThreadFn(long long iterations, long long &result)
{
result = Worker(iterations);
}
long long
WorkerThread(long long iterations)
{
long long result = 0;
std::thread t(ThreadFn, iterations, std::ref(result));
t.join();
return result;
}
long long
WorkerFuture(long long iterations)
{
std::future<long long> f = std::async(std::launch::async, [iterations] {
return Worker(iterations);
});
return f.get();
}
long long
WorkerFutureSameThread(long long iterations)
{
std::future<long long> f = std::async(std::launch::deferred, [iterations] {
return Worker(iterations);
});
return f.get();
}
int main()
{
long long iterations = 40000;
auto t1 = std::chrono::high_resolution_clock::now();
auto total = WorkerThread(iterations);
auto t2 = std::chrono::high_resolution_clock::now();
std::cout << "Thread : " << total << " size in " << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << " ms\n";
t1 = std::chrono::high_resolution_clock::now();
total = WorkerFuture(iterations);
t2 = std::chrono::high_resolution_clock::now();
std::cout << "Async : " << total << " size in " << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << " ms\n";
t1 = std::chrono::high_resolution_clock::now();
total = WorkerFutureSameThread(iterations);
t2 = std::chrono::high_resolution_clock::now();
std::cout << "Defer async: " << total << " size in " << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << " ms\n";
t1 = std::chrono::high_resolution_clock::now();
total = Worker(iterations);
t2 = std::chrono::high_resolution_clock::now();
std::cout << "No thread : " << total << " size in " << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << " ms\n";
}
The problem seems to be caused by dynamic memory management. Once multiple threads are involved (even if the main thread does nothing), the C++ runtime must synchronize access to the heap, which adds overhead. I ran some experiments with GCC, and one fix is to use a scalable memory allocator library. For instance, when I preloaded tbbmalloc, e.g.,
export LD_LIBRARY_PATH=$TBB_ROOT/lib/intel64/gcc4.7:$LD_LIBRARY_PATH
export LD_PRELOAD=libtbbmalloc_proxy.so.2
the whole problem disappeared.
The reason is simple: you are not doing anything in parallel.
While the extra thread is working, the main thread does nothing but wait for it to finish.
With a thread you have extra work to do (creating the thread and synchronizing with it), so you pay a cost without getting anything back.
To see any gain, you have to do at least two things at the same time.
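To illustrate "two things at the same time", here is a small sketch in which the calling thread keeps working while a second thread handles the other half. The function parallel_sum is a made-up example, not the tree workload from the question, and whether it is actually faster depends on the data size and on allocator and scheduling costs:

```cpp
#include <future>
#include <numeric>
#include <vector>

// Sums the upper half on a second thread while the calling thread sums the lower half.
long long parallel_sum(const std::vector<int>& v) {
    const auto mid = v.begin() + v.size() / 2;
    auto upper = std::async(std::launch::async, [&v, mid] {
        return std::accumulate(mid, v.end(), 0LL);
    });
    const long long lower = std::accumulate(v.begin(), mid, 0LL);  // runs concurrently
    return lower + upper.get();  // wait for the second thread only at the end
}
```

The difference from WorkerThread and WorkerFuture above is that get() is called after the caller has done its own share, rather than immediately after the task is launched.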

Program seems to silently terminate after for-loop - C++

I have created a program that prints out all of the permutations of the characters provided through command-line arguments and decided I wanted to compare execution time to an equivalent program written in Java.
The program worked until I decided to find the permutations multiple times in order to get an average execution time.
void avgTime(char**& argv, int times) {
if (sizeof(argv) > 1) {
long permutationAmnt;
clock_t s_start, s_end, t_start, t_end;
float s_runms, t_runms;
long avg_time;
for (int count = 0; count < times; count++) {
t_start = clock();
for (int i = 1; i < sizeof(argv); i++) {
s_start = clock();
permutationAmnt = permutations(std::string(argv[i]));
s_end = clock();
s_runms = ((float)s_end - s_start) / CLOCKS_PER_SEC * 1000;
std::cout << "SUCCESS (" << s_runms << "ms for " << permutationAmnt << " permutations)" << std::endl << std::endl;
}
t_end = clock();
t_runms = ((float) t_end - t_start) / CLOCKS_PER_SEC * 1000;
std::cout << std::endl << "TOTAL RUNTIME: " << t_runms << "ms" << std::endl;
avg_time += t_runms;
}
std::cout << "AVERAGE RUNTIME: " << avg_time / times << "ms" << std::endl;
}
}
int main(int argc, char** argv) {
avgTime(argv, 10);
return 0;
}
The first for-loop in avgTime() only executes a single time (putting a cout inside of it only prints one time) and the program appears to terminate after the nested for-loop breaks.
I am not sure if the problem is with some of the code from avgTime() or if it comes from one of the helper functions, like permute(). Either way here is the code for each of the helper functions as well as the includes (p.s. num is declared outside of any functions).
/*
* Calls the recursive permute() function then
* returns the total amount of permutations possible
* for the given input.
*
* NOTE: the num variable is used in the permute() function
* for numbering the permutations printed as output (see next function
* for clarification)
*/
long permutations(const std::string& arg) {
long totalPermutations = factorial(arg.size()); //self-explanatory
num = 1;
permute(arg, 0);
return totalPermutations;
}
/*
* Recursively prints out each permutation
* of the characters in the argument, str
*/
void permute(const std::string& str, int place) {
if (place == str.size() - 1) std::cout << ((num <= 10) ? "0" : "") << num++ << ". " << str << std::endl;
for (int i = place; i < str.size(); i++) {
permute(swap(place, i, str), place + 1); //self-explanatory
}
}
long factorial(int num) {
if (num < 2) {
return 1;
}
return factorial(num - 1) * num;
}
std::string swap(int i, int j, const std::string& str) {
std::string s(str);
s[i] = s[j];
s[j] = str[i];
return s;
}
NOTE: the permute() function appears before the permutations() function in the source code and is visible to all the necessary callers of it.
//Includes and namespace stuff
#include <iostream>
#include <string>
#include <time.h>
I would appreciate any help that you guys can offer, if there is any additional information that you would like me to provide just let me know. Thanks again for any help.
P.S. No, this isn't a homework assignment :P
EDIT: Removed using namespace std; and adjusted the code accordingly to avoid confusion between the function std::swap() and my own swap() function. Also, added the swap() and factorial() functions to avoid any ambiguity. I apologize for the confusion this caused.
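I was using sizeof(argv) rather than argc. sizeof(argv) is the size of the pointer itself (typically 8 bytes on a 64-bit system), not the number of arguments, so the inner loop read past the end of argv. Switching to argc fixed the issue.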
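A minimal sketch of looping over the arguments with argc as the bound. The helper name user_args is hypothetical:

```cpp
#include <string>
#include <vector>

// Copies the user-supplied arguments (everything after the program name).
// argc is the element count; sizeof(argv) would only give the size of a pointer.
std::vector<std::string> user_args(int argc, char** argv) {
    std::vector<std::string> args;
    for (int i = 1; i < argc; ++i) {
        args.emplace_back(argv[i]);
    }
    return args;
}
```

In avgTime() this would mean passing argc along with argv and using it in both the if check and the inner for loop.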

Proper method of using std::chrono

While I realize this is probably one of many identical questions, I can't seem to figure out how to properly use std::chrono. This is the solution I cobbled together.
#include <stdlib.h>
#include <iostream>
#include <chrono>
typedef std::chrono::high_resolution_clock Time;
typedef std::chrono::milliseconds ms;
float startTime;
float getCurrentTime();
int main () {
startTime = getCurrentTime();
std::cout << "Start Time: " << startTime << "\n";
while(true) {
std::cout << getCurrentTime() - startTime << "\n";
}
return EXIT_SUCCESS;
}
float getCurrentTime() {
auto now = Time::now();
return std::chrono::duration_cast<ms>(now.time_since_epoch()).count() / 1000;
}
For some reason, this only ever returns integer values as the difference, which increments upwards at rate of 1 per second, but starting from an arbitrary, often negative, value.
What am I doing wrong? Is there a better way of doing this?
Don't escape the chrono type system until you absolutely have to. That means don't use .count() except for I/O or for interacting with a legacy API.
This translates to: don't use float as a time_point.
Don't bother with high_resolution_clock. In practice it is an alias for either system_clock or steady_clock. For more portable code, choose one of the latter.
#include <iostream>
#include <chrono>
using Time = std::chrono::steady_clock;
using ms = std::chrono::milliseconds;
To start, you're going to need a duration with a representation of float and the units of seconds. This is how you do that:
using float_sec = std::chrono::duration<float>;
Next you need a time_point which uses Time as the clock, and float_sec as its duration:
using float_time_point = std::chrono::time_point<Time, float_sec>;
Now your getCurrentTime() can just return Time::now(). No fuss, no muss:
float_time_point
getCurrentTime() {
return Time::now();
}
Your main, because it has to do the I/O, is responsible for unpacking the chrono types into scalars so that it can print them:
int main () {
auto startTime = getCurrentTime();
std::cout << "Start Time: " << startTime.time_since_epoch().count() << "\n";
while(true) {
std::cout << (getCurrentTime() - startTime).count() << "\n";
}
}
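The whole-second steps in the question's getCurrentTime() come from integer division: .count() on a milliseconds duration is an integer, so dividing it by 1000 drops the fraction before the result is converted to float. A short sketch of the difference, with illustrative function names:

```cpp
#include <chrono>

// Truncating version: integer count divided by an integer, fraction discarded.
float seconds_truncated(std::chrono::milliseconds d) {
    return static_cast<float>(d.count() / 1000);
}

// Let chrono convert into a float-based duration of seconds instead.
float seconds_exact(std::chrono::milliseconds d) {
    return std::chrono::duration<float>(d).count();
}
```

For 1500 ms the first function yields 1 and the second 1.5. This is the same conversion that float_sec performs in the answer above.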
This program does a similar thing. Hopefully it shows some of the capabilities (and methodology) of std::chrono:
#include <iostream>
#include <chrono>
#include <thread>
int main()
{
using namespace std::literals;
namespace chrono = std::chrono;
using clock_type = chrono::high_resolution_clock;
auto start = clock_type::now();
for(;;) {
auto first = clock_type::now();
// note use of literal - this is c++14
std::this_thread::sleep_for(500ms);
// c++11 would be this:
// std::this_thread::sleep_for(chrono::milliseconds(500));
auto last = clock_type::now();
auto interval = last - first;
auto total = last - start;
// integer cast
std::cout << "we just slept for " << chrono::duration_cast<chrono::milliseconds>(interval).count() << "ms\n";
// another integer cast
std::cout << "also known as " << chrono::duration_cast<chrono::nanoseconds>(interval).count() << "ns\n";
// floating point cast
using seconds_fp = chrono::duration<double, chrono::seconds::period>;
std::cout << "which is " << chrono::duration_cast<seconds_fp>(interval).count() << " seconds\n";
std::cout << " total time wasted: " << chrono::duration_cast<chrono::milliseconds>(total).count() << "ms\n";
std::cout << " in seconds: " << chrono::duration_cast<seconds_fp>(total).count() << "s\n";
std::cout << std::endl;
}
return 0;
}
example output:
we just slept for 503ms
also known as 503144616ns
which is 0.503145 seconds
total time wasted: 503ms
in seconds: 0.503145s
we just slept for 500ms
also known as 500799185ns
which is 0.500799 seconds
total time wasted: 1004ms
in seconds: 1.00405s
we just slept for 505ms
also known as 505114589ns
which is 0.505115 seconds
total time wasted: 1509ms
in seconds: 1.50923s
we just slept for 502ms
also known as 502478275ns
which is 0.502478 seconds
total time wasted: 2011ms
in seconds: 2.01183s

Running time of functions

I want to print the running time of my functions, but for some reason my timer always returns 0. Can anyone tell me why?
double RunningTime(clock_t time1, clock_t time2)
{
double t=time1 - time2;
double time = (t*1000)/CLOCKS_PER_SEC;
return time;
}
int main()
{
clock_t start_time = clock();
// some code.....
clock_t end_time = clock();
std::cout << "Time elapsed: " << double(RunningTime(end_time, start_time)) << " ms";
return 0;
}
I attempted to use gettimeofday and it still returned 0.
double get_time()
{
struct timeval t;
gettimeofday(&t, NULL);
double d = t.tv_sec + (double) t.tv_usec/100000;
return d;
}
int main()
{
double time_start = get_time();
//Some code......
double time_end = get_time();
std::cout << time_end - time_start;
return 0;
}
Also tried using chrono and it gave me all kinds of build errors:
error: #error This file requires compiler and library support for the
upcoming ISO C++ standard, C++0x. This support is currently
experimental, and must be enabled with the -std=c++0x or -std=gnu++0x
compiler options.
warning: 'auto' will change meaning in C++0x; please remove it
error: ISO C++ forbids declaration of 't1' with no type error:
'std::chrono' has not been declared
error: request for member 'count' in '(t2 - t1)', which is of
non-class type 'int'
int main()
{
auto t1 = std::chrono::high_resolution_clock::now();
//Some code......
auto t2 = std::chrono::high_resolution_clock::now();
std::cout << "Time elapsed: " << std::chrono::duration_cast<std::chrono::milliseconds>(t2-t1).count() << " milliseconds\n";
return 0;
}
A timer tick of clock() is approximately 1/CLOCKS_PER_SEC seconds, which on some platforms is only millisecond resolution. To see a non-zero number, either time a much longer-running piece of code or use a facility with higher time resolution:
the C++11 <chrono> library (available from Visual Studio 2012; GCC needs -std=c++11, or -std=c++0x on older versions, as the error message in the question says)
boost::chrono (unfortunately, it pulls in a lot of other Boost libraries)
the POSIX function gettimeofday, which gives you 1 microsecond resolution
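After lots of trial and error I went with gettimeofday. Here is the code that I finally got to work properly. The change from my earlier attempt is dividing tv_usec by 1000000 (microseconds per second) instead of 100000.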
double get_time()
{
struct timeval t;
gettimeofday(&t, NULL);
double d = t.tv_sec + (double) t.tv_usec/1000000;
return d;
}
int main()
{
double time_start = get_time();
//Some code.........
double time_end = get_time();
std::cout << time_end - time_start;
return 0;
}
A solution I have been using lately uses C++11's lambda functionality to time any arbitrary function call or series of actions.
#include <ctime>
#include <iostream>
#include <functional>
void timeit(std::function<void()> func) {
std::clock_t start = std::clock();
func();
int ms = (std::clock() - start) / (double) (CLOCKS_PER_SEC / 1000);
std::cout << "Finished in " << ms << "ms" << std::endl;
}
int main() {
timeit([] {
for (int i = 0; i < 10; ++i) {
std::cout << "i = " << i << std::endl;
}
});
return 0;
}