How can I measure the execution time of one thread? - c++

I'm running a couple of threads in parallel, and I want to measure the time it takes to execute one thread and the time it takes to execute the whole program. I'm using VC++ on Windows 7.
I tried to measure it while debugging but then I saw this question: https://stackoverflow.com/questions/38971267/improving-performance-using-parallelism-in-c?noredirect=1#comment65299718_38971267 and in the answer given by Schnien it says:
Debugging of multiple threads is somehow "special" - when your Debugger halts at a breakpoint, the other threads will not be stopped - they will go on
Is this true? And if so, how else can I measure the time?
Thanks

That statement is indeed true, only the thread that hits a breakpoint will be paused.
However, to measure execution times you do not have to use the debugger at all. More information on measuring execution time can be found in the question below:
Measure execution time in C (on Windows)
What you want to do is measure the time inside the threads' functions (by taking the time at the beginning and at the end of each function and subtracting the two). You can do the same for the whole program: call std::thread::join on every thread to make sure they have all finished before taking the time one last time.

Use a simple timer class as a stopwatch, then capture the time within each thread. Also, creating system threads directly is slower than using std::async (at least on my machine; see the measurements below), and std::async can both return values and propagate exceptions. With std::thread, an exception that is not caught within the thread terminates the program.
#include <thread>
#include <iostream>
#include <chrono>
#include <future>
// stopwatch. Returns time in seconds
class timer {
public:
    std::chrono::time_point<std::chrono::high_resolution_clock> lastTime;
    timer() : lastTime(std::chrono::high_resolution_clock::now()) {}
    // seconds since construction or since the previous call; also restarts the stopwatch
    inline double elapsed() {
        std::chrono::time_point<std::chrono::high_resolution_clock> thisTime = std::chrono::high_resolution_clock::now();
        double deltaTime = std::chrono::duration<double>(thisTime - lastTime).count();
        lastTime = thisTime;
        return deltaTime;
    }
};
// globals are used for exposition clarity; generally avoid global variables.
const int count = 1000000;
double timerResult1;
double timerResult2;
void f1() {
    volatile int i = 0; // volatile keeps the optimizer from removing the loop
    timer stopwatch;
    while (i++ < count);
    timerResult1 = stopwatch.elapsed();
}
void f2() {
    volatile int i = 0; // volatile keeps the optimizer from removing the loop
    timer stopwatch;
    while (i++ < count);
    timerResult2 = stopwatch.elapsed();
}
int main()
{
    std::cout.precision(6); std::cout << std::fixed;
    f1(); std::cout << "f1 execution time " << timerResult1 << std::endl;
    timer stopwatch;
    {
        std::thread thread1(f1);
        std::thread thread2(f2);
        thread1.join();
        thread2.join();
    }
    double elapsed = stopwatch.elapsed();
    std::cout << "f1 with f2 execution time " << elapsed << std::endl;
    std::cout << "thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "thread f2 execution time " << timerResult2 << std::endl;
    {
        stopwatch.elapsed(); // reset stopwatch
        auto future1 = std::async(std::launch::async, f1); // spins up a thread; the future's destructor joins automatically
        auto future2 = std::async(std::launch::async, f2);
    }
    elapsed = stopwatch.elapsed();
    std::cout << "async f1 with f2 execution time " << elapsed << std::endl;
    std::cout << "async thread f1 execution time " << timerResult1 << std::endl;
    std::cout << "async thread f2 execution time " << timerResult2 << std::endl;
}
On my machine creating threads adds about .3 ms per thread, whereas async adds only about .05 ms per thread, as the Visual C++ implementation of std::async uses a thread pool.
f1 execution time 0.002076
f1 with f2 execution time 0.002791
thread f1 execution time 0.002018
thread f2 execution time 0.002035
async f1 with f2 execution time 0.002131
async thread f1 execution time 0.002028
async thread f2 execution time 0.002018
[EDIT] Had incorrect f calls in front of statements (cut and paste error)

Related

How to do a task in background of c++ code without changing the runtime

I was trying to create a bank system that has features for credit, deposit, transaction history, etc. I wanted to add an interest rate as well, applied after a delay of 10 seconds, but when I use a delay (like the sleep() function) my whole program is delayed by 10 seconds. Is there a way for the interest to be calculated in the background so that the runtime of the rest of the code is not affected?
If you need just a single task to be run, there is std::async, which runs a task (a function call) in a separate thread.
As you need to delay this task, use std::this_thread::sleep_for or std::this_thread::sleep_until to add the delay inside the async call. Use sleep_for if you want to wait for a certain number of seconds, and sleep_until if you want to wait until some point in time, e.g. until 11:32:40 is reached.
In the code below, Doing Something 1 runs before the async thread starts. Then the thread starts and waits for 2 seconds, while Doing Something 2 is called at the same time. After that you may wish to wait for the delayed task to finish; for that you call res.get(), which blocks the main thread until the async thread has fully finished. Afterwards Doing Something 3 is called.
If you don't call res.get() explicitly, the async thread still runs to completion on its own. The destructor of a future returned by std::async blocks until the task has finished, so if the program is about to exit while the async thread is still running, it waits for that thread to finish.
#include <future>
#include <chrono>
#include <thread>
#include <iostream>
#include <iomanip>
int main() {
    int some_value = 123;
    auto const tb = std::chrono::system_clock::now();
    auto Time = [&]{
        return std::chrono::duration_cast<std::chrono::duration<double>>(
            std::chrono::system_clock::now() - tb).count();
    };
    std::cout << std::fixed << std::setprecision(3);
    std::cout << "Doing Something 1... at "
        << Time() << " sec" << std::endl;
    auto res = std::async(std::launch::async, [&]{
        std::this_thread::sleep_for(std::chrono::seconds(2));
        std::cout << "Doing Delayed Task... at "
            << Time() << " sec, value " << some_value << std::endl;
    });
    std::cout << "Doing Something 2... at "
        << Time() << " sec" << std::endl;
    res.get();
    std::cout << "Doing Something 3... at "
        << Time() << " sec" << std::endl;
}
Output:
Doing Something 1... at 0.000 sec
Doing Something 2... at 0.000 sec
Doing Delayed Task... at 2.001 sec, value 123
Doing Something 3... at 2.001 sec

C++ call a function every x seconds

I am trying to run the run() function every 5 seconds without stopping the while() loop (that is, in parallel). How can I do that? Thanks in advance.
#include <iostream>
#include <thread>
#include <chrono>
using namespace std;
void run()
{
    this_thread::sleep_for(chrono::milliseconds(5000));
    cout << "good morning" << endl;
}
int main()
{
    thread t1(run);
    t1.detach();
    while(1)
    {
        cout << "hello" << endl;
        this_thread::sleep_for(chrono::milliseconds(500));
    }
    return 0;
}
In your main function, it is important to understand what each thread is doing.
1. The main thread creates a std::thread called t1
2. The main thread continues and detaches the thread
3. The main thread executes your while loop in which it:
   - prints hello
   - sleeps for 0.5 seconds
4. The main thread returns 0, your program is finished.
Any time after point 1, thread t1 sleeps for 5 seconds and then prints good morning. This happens only once! Also, as pointed out by @Fareanor, writes to std::cout from the main thread and from thread t1 are not synchronized with each other, so the output of the two threads may interleave.
When the main thread reaches point 4 (it actually never does because your while loop is infinite), your thread t1 might or might not have finished its task. Imagine the potential problems that could occur. In most cases, you'll want to use std::thread::join().
To solve your problem, there are several alternatives. In the following, we will assume that the execution time of the function run without the std::this_thread::sleep_for is insignificant compared to 5 seconds, as per the comment of @Landstalker. The execution time of run will then be 5 seconds plus some insignificant time.
As suggested in the comments, instead of executing the function run every 5 seconds, you could simply execute the body of run every 5 seconds by placing a while loop inside of that function:
#include <chrono>
#include <iostream>
#include <thread>
void run()
{
    while (true)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(5000));
        std::cout << "good morning" << std::endl;
    }
}
int main()
{
    std::thread t(run);
    t.join();
    return 0;
}
If, for some reason, you really need to execute the run function every 5 seconds as stated in your question, you could launch a wrapper function or lambda which contains the while loop:
#include <chrono>
#include <iostream>
#include <thread>
void run()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(5000));
    std::cout << "good morning" << std::endl;
}
int main()
{
    auto exec_run = [](){ while (true) run(); };
    std::thread t(exec_run);
    t.join();
    return 0;
}
As a side note, it's better to avoid using namespace std.
Just call your run function from a separate thread function like below. Is this OK for you?
#include <iostream>
#include <thread>
#include <chrono>
using namespace std;
// run() is defined first so that ThreadFunction can call it
void run()
{
    cout << "good morning" << endl;
}
void ThreadFunction()
{
    while(true) {
        run();
        this_thread::sleep_for(chrono::milliseconds(5000));
    }
}
int main()
{
    thread t1(ThreadFunction);
    t1.detach();
    while(1)
    {
        cout << "hello" << endl;
        this_thread::sleep_for(chrono::milliseconds(500));
    }
    return 0;
}

C++ How to make timer accurate in Linux

Consider this code:
#include <iostream>
#include <vector>
#include <functional>
#include <map>
#include <atomic>
#include <memory>
#include <chrono>
#include <thread>
#include <boost/asio.hpp>
#include <boost/thread.hpp>
#include <boost/asio/high_resolution_timer.hpp>
static const uint32_t FREQUENCY = 5000; // Hz
static const uint32_t MKSEC_IN_SEC = 1000000;
std::chrono::microseconds timeout(MKSEC_IN_SEC / FREQUENCY);
boost::asio::io_service ioservice;
boost::asio::high_resolution_timer timer(ioservice);
static std::chrono::system_clock::time_point lastCallTime = std::chrono::high_resolution_clock::now();
static uint64_t deviationSum = 0;
static uint64_t deviationMin = 100000000;
static uint64_t deviationMax = 0;
static uint32_t counter = 0;
void timerCallback(const boost::system::error_code &err) {
    auto actualTimeout = std::chrono::high_resolution_clock::now() - lastCallTime;
    std::chrono::microseconds actualTimeoutMkSec = std::chrono::duration_cast<std::chrono::microseconds>(actualTimeout);
    long timeoutDeviation = actualTimeoutMkSec.count() - timeout.count();
    deviationSum += abs(timeoutDeviation);
    if(abs(timeoutDeviation) > deviationMax) {
        deviationMax = abs(timeoutDeviation);
    } else if(abs(timeoutDeviation) < deviationMin) {
        deviationMin = abs(timeoutDeviation);
    }
    ++counter;
    //std::cout << "Actual timeout: " << actualTimeoutMkSec.count() << "\t\tDeviation: " << timeoutDeviation << "\t\tCounter: " << counter << std::endl;
    timer.expires_from_now(timeout);
    timer.async_wait(timerCallback);
    lastCallTime = std::chrono::high_resolution_clock::now();
}
using namespace std::chrono_literals;
int main() {
    std::cout << "Frequency: " << FREQUENCY << " Hz" << std::endl;
    std::cout << "Callback should be called each: " << timeout.count() << " mkSec" << std::endl;
    std::cout << std::endl;
    ioservice.reset();
    timer.expires_from_now(timeout);
    timer.async_wait(timerCallback);
    lastCallTime = std::chrono::high_resolution_clock::now();
    auto thread = new std::thread([&] { ioservice.run(); });
    std::this_thread::sleep_for(1s);
    std::cout << std::endl << "Messages posted: " << counter << std::endl;
    std::cout << "Frequency deviation: " << FREQUENCY - counter << std::endl;
    std::cout << "Min timeout deviation: " << deviationMin << std::endl;
    std::cout << "Max timeout deviation: " << deviationMax << std::endl;
    std::cout << "Avg timeout deviation: " << deviationSum / counter << std::endl;
    return 0;
}
It runs a timer that calls timerCallback(..) periodically at the specified frequency. In this example, the callback must be called 5000 times per second. One can play with the frequency and see that the actual (measured) frequency of calls differs from the desired one. In fact, the higher the frequency, the higher the deviation. I did some measurements with different frequencies and here is a summary:
https://docs.google.com/spreadsheets/d/1SQtg2slNv-9VPdgS0RD4yKRnyDK1ijKrjVz7BBMSg24/edit?usp=sharing
When the desired frequency is 10000 Hz, the system misses 10% (~1000) of the calls.
When the desired frequency is 100000 Hz, the system misses 40% (~40000) of the calls.
Question: Is it possible to achieve better accuracy in a Linux / C++ environment? How? I need it to work without significant deviation at a frequency of 500000 Hz.
P.S. My first idea was that the body of the timerCallback(..) method itself causes the delay. I measured it. It consistently takes less than 1 microsecond to execute, so it does not affect the process.
I have no experience with this problem myself, but I guess (as the references below explain) that the OS scheduler interferes with your callback somehow.
So you could try to use the real-time scheduler and raise the priority of your task.
I hope this gives you a direction in which to look for your answer.
Scheduler:
http://gumstix.8.x6.nabble.com/High-resolution-periodic-task-on-overo-td4968642.html
Priority:
https://linux.die.net/man/3/setpriority
If you need to achieve one call every two microseconds, you had better anchor the calls to absolute time positions, so that the time each call takes is not added to the interval. You do, however, run into the problem that the processing required in each timeslot could need more CPU time than the timeslot provides.
If you have a multicore CPU, I'd divide the timeslots between the cores (in a multithreaded approach) so that each core gets a longer interval. Suppose you have these requirements on a four-core CPU: then each thread can execute 1 call per 8usec, which is probably more affordable. In this case you use absolute timers (an absolute timer is one that waits until the clock reaches a specific absolute time, rather than for a delay measured from the moment you called it) and offset each one by its thread number times the 2usec delay. In this case (4 cores) you start thread #1 at time T, thread #2 at time T + 2usec, thread #3 at time T + 4usec, ... and thread #N at time T + 2*(N-1)usec. Each thread then starts itself again at time oldT + 8usec (2usec times the number of threads), instead of doing some kind of nsleep(3) call. This does not add the processing time to the delay, which is most probably what you are experiencing. The pthread library timers are all absolute time timers, so you can use them. I think this is the only way you'll be able to meet such a hard spec (and be prepared to see the battery suffer, assuming you're in an Android environment).
NOTE
In this approach, the external bus can be a bottleneck, so even if you get it working, it would probably be better to synchronize several machines with NTP (this can be done to the usec level at the speed of actual GBit links) and use different processors running in parallel. As you don't describe anything about the process you have to repeat so densely, I cannot provide more help with the problem.

Why time cost is 0 seconds after call std::async?

See the following code.
#include <chrono>
#include <future>
#include <iostream>
#include <thread>
#include <ctime>
int main()
{
    std::future<int> future = std::async(std::launch::deferred, [](){
        std::this_thread::sleep_for(std::chrono::seconds(5));
        return 100;
    });
    std::cout << "waiting...\n";
    clock_t start = clock();
    std::future_status status = future.wait_for(std::chrono::seconds(20));
    std::cout << "result is " << future.get() << std::endl;
    clock_t end = clock();
    std::cout<<"Time Cost : "<< (double)(end-start)/CLOCKS_PER_SEC <<" seconds."<< std::endl;
}
The execution result is very confusing. Yes, the main thread waits for only about 5 seconds and then prints "100". But why does "Time Cost" show 0? The test environment is Cygwin with g++ 4.9.3.
Then I tested it in VS2013. The result is 25 seconds. Strange!
It doesn't show 0 on my machine but a very small value: 0.000156 s. As clock() measures processor time and your main thread does not consume any CPU (the wait is not an active loop), the result is almost 0.
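clock() returns the processor time spent. It gives no guarantee of advancing while the program is idle: if your thread sleeps, the value it returns does not advance. (Microsoft's implementation of clock() measures elapsed wall-clock time instead, which is why the Visual Studio result is not close to 0.) To measure intervals properly, use clocks from std::chrono, for example std::chrono::steady_clock.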

c++ threads execution time and executing thread in another thread

I have some code, and I'm testing how much time it takes to execute 10 threads.
#include <iostream>
#include <thread>
#include <chrono>
#include <cstdlib>
#include <time.h>
using namespace std;
void pause_thread(int n){
    this_thread::sleep_for(chrono::seconds(n));
    cout << "pause of " << n << " seconds ended\n";
}
int main(){
    clock_t EndTime = clock();
    thread threads[10];
    cout << "Spawning 10 threads...\n";
    for (int i = 0; i<10; ++i)
        threads[i] = thread(pause_thread, i + 1);
    cout << "Done spawning threads. Now waiting for them to join:\n";
    for (int i = 0; i<10; ++i)
        threads[i].join();
    cout << "All threads joined!\n";
    cout << "==================================================\n";
    cout << "Time of executing threads: " << (double)(clock() - EndTime) / CLOCKS_PER_SEC << endl;
    system("pause");
    return 0;
}
The output is this:
Spawning 10 threads...
Done spawning threads. Now waiting for them to join:
pause of 1 seconds ended
pause of 2 seconds ended
pause of 3 seconds ended
pause of 4 seconds ended
pause of 5 seconds ended
pause of 6 seconds ended
pause of 7 seconds ended
pause of 8 seconds ended
pause of 9 seconds ended
pause of 10 seconds ended
All threads joined!
==================================================
Time of executing threads: 10.041
First question: why does the program take 10.041 seconds to execute if the pause between each thread is 1 second? What happened in the program that made it take an additional 0.041 s?
Second question: is this the right way to execute a thread in another thread?
threads[i] = thread(...);
Does this mean that the thread is inside another thread?
If not, how can it be done (executing a thread in another thread)?
First question is well answered by Brandon Haston's comment.
Answer to second question doesn't fit in a comment.
threads[i] = thread(...);
means that a std::thread has been created and its representative std::thread object has been assigned to a slot in your std::thread array. This raises a question I'm going to have to look into on my own later when I have a compiler to play with: What happened to the thread that was just overwritten?
Anyway, that new thread isn't inside another thread. Threads have no concept of owning one another. A process owns threads, but a thread does not own other threads. A thread can, however, start another thread. For example,
void pause_thread(int n){
    this_thread::sleep_for(chrono::seconds(n));
    cout << "pause of " << n << " seconds ended\n";
    if (! cows_are_home)
    {
        thread newthread(pause_thread, 1);
        newthread.detach();
    }
}
Each new thread will wait about 1 second, then create a thread which will wait a second and create another thread, and this will go on until the cows come home.