Measuring execution time when using threads


I would like to measure the execution time of some code. The code starts in the main() function and finishes in an event handler.
I have C++11 code that looks like this:
#include <iostream>
#include <time.h>
...
volatile clock_t t;
void EventHandler()
{
// when this function called is the end of the part that I want to measure
t = clock() - t;
std::cout << "time in seconds: " << ((float)t)/CLOCKS_PER_SEC;
}
int main()
{
MyClass* instance = new MyClass(EventHandler); // this function starts a new std::thread
instance->start(...); // this function only passes some data to the thread working data, later the thread will call EventHandler()
t = clock();
return 0;
}
It is guaranteed that EventHandler() will be called only once, and only after the instance->start() call.
It works and gives me some output, but it is horrible code: it uses a global variable that is accessed from different threads. However, I can't change the API being used (the constructor, or the way the thread calls EventHandler).
I would like to ask if a better solution exists.
Thank you.

A global variable is unavoidable as long as MyClass expects a plain function and there's no way to pass a context pointer along with the function.
You could write the code in a slightly more tidy way, though:
#include <future>
#include <thread>
#include <chrono>
#include <iostream>
struct MyClass
{
typedef void (CallbackFunc)();
constexpr explicit MyClass(CallbackFunc* handler)
: m_handler(handler)
{
}
void Start()
{
std::thread(&MyClass::ThreadFunc, this).detach();
}
private:
void ThreadFunc()
{
std::this_thread::sleep_for(std::chrono::seconds(5));
m_handler();
}
CallbackFunc* m_handler;
};
std::promise<std::chrono::time_point<std::chrono::high_resolution_clock>> gEndTime;
void EventHandler()
{
gEndTime.set_value(std::chrono::high_resolution_clock::now());
}
int main()
{
MyClass task(EventHandler);
auto trigger = gEndTime.get_future();
auto startTime = std::chrono::high_resolution_clock::now();
task.Start();
trigger.wait();
std::chrono::duration<double> diff = trigger.get() - startTime;
std::cout << "Duration = " << diff.count() << " secs." << std::endl;
return 0;
}

clock() will not filter out work done by other threads that the scheduler runs alongside the event-handler thread. Alternatives such as times() and getrusage() report the CPU time of the process. Their documentation does not clearly describe per-thread behaviour; on Linux, threads are treated much like processes, but whether that yields per-thread figures has to be investigated.

clock() is the wrong tool here because it does not measure the CPU time your operation actually needed. For example, time can still be counted while the thread is not running at all.
Instead you have to use platform-specific APIs, such as pthread_getcpuclockid on POSIX-compliant systems (check whether _POSIX_THREAD_CPUTIME is defined), which counts the actual time spent by a specific thread.
You can take a look at a benchmarking library I wrote for C++ that supports thread-aware measuring (see struct thread_clock implementation).
Or, you can use the code snippet from the man page:
/* Link with "-lrt" */
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <string.h>
#include <errno.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
#define handle_error_en(en, msg) \
do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)
static void *
thread_start(void *arg)
{
printf("Subthread starting infinite loop\n");
for (;;)
continue;
}
static void
pclock(char *msg, clockid_t cid)
{
struct timespec ts;
printf("%s", msg);
if (clock_gettime(cid, &ts) == -1)
handle_error("clock_gettime");
printf("%4ld.%03ld\n", ts.tv_sec, ts.tv_nsec / 1000000);
}
int
main(int argc, char *argv[])
{
pthread_t thread;
clockid_t cid;
int j, s;
s = pthread_create(&thread, NULL, thread_start, NULL);
if (s != 0)
handle_error_en(s, "pthread_create");
printf("Main thread sleeping\n");
sleep(1);
printf("Main thread consuming some CPU time...\n");
for (j = 0; j < 2000000; j++)
getppid();
pclock("Process total CPU time: ", CLOCK_PROCESS_CPUTIME_ID);
s = pthread_getcpuclockid(pthread_self(), &cid);
if (s != 0)
handle_error_en(s, "pthread_getcpuclockid");
pclock("Main thread CPU time: ", cid);
/* The preceding 4 lines of code could have been replaced by:
pclock("Main thread CPU time: ", CLOCK_THREAD_CPUTIME_ID); */
s = pthread_getcpuclockid(thread, &cid);
if (s != 0)
handle_error_en(s, "pthread_getcpuclockid");
pclock("Subthread CPU time: 1 ", cid);
exit(EXIT_SUCCESS); /* Terminates both threads */
}

Related

wait() for thread made via clone?

I plan on rewriting this in assembly, so I can't use the C or C++ standard library. The code below runs perfectly. However, I want a thread instead of a second process. If you uncomment /*CLONE_THREAD|*/ on line 25, waitpid will return -1. I would like to have a blocking function that resumes when my thread is complete. I couldn't figure out from the man pages what it expects me to do.
#include <sys/wait.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
int globalValue=0;
static int childFunc(void*arg)
{
printf("Global value is %d\n", globalValue);
globalValue += *(int*)&arg;
return 31;
}
int main(int argc, char *argv[])
{
auto stack_size = 1024 * 1024;
auto stack = (char*)mmap(NULL, stack_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
if (stack == MAP_FAILED) { perror("mmap"); exit(EXIT_FAILURE); }
globalValue = 5;
auto pid = clone(childFunc, stack + stack_size, /*CLONE_THREAD|*/CLONE_VM|CLONE_SIGHAND|SIGCHLD, (void*)7);
sleep(1); //So main and child printf don't collide
if (pid == -1) { perror("clone"); exit(EXIT_FAILURE); }
printf("clone() returned %d\n", pid);
int status;
int waitVal = waitpid(-1, &status, __WALL);
printf("Expecting 12 got %d. Expecting 31 got %d. ID=%d\n", globalValue, WEXITSTATUS(status), waitVal);
return 0;
}

If you want to call functions asynchronously with threads, I recommend using std::async. Here is an example:
#include <iostream>
#include <future>
#include <mutex>
#include <condition_variable>
int globalValue = 0; // could also have been std::atomic<int> but I choose a mutex (to also serialize output to std::cout)
std::mutex mtx; // to protect access to data in multithreaded applications you can use mutexes
int childFunc(const int value)
{
std::unique_lock<std::mutex> lock(mtx);
globalValue = value;
std::cout << "Global value set to " << globalValue << "\n";
return 31;
}
int getValue()
{
std::unique_lock<std::mutex> lock(mtx);
return globalValue;
}
int main(int argc, char* argv[])
{
// shared memory stuff is not needed for threads
// launch childFunc asynchronously
// using a lambda function : https://en.cppreference.com/w/cpp/language/lambda
// to call a function asynchronously : https://en.cppreference.com/w/cpp/thread/async
// note I didn't use the C++ thread class, it can launch things asynchronously
// however async is both a better abstraction and you can return values (and exceptions)
// to the calling thread if you need to (which you do in this case)
std::future<int> future = std::async(std::launch::async, []
{
return childFunc(12);
});
// wait until asynchronous function call is complete
// and get its return value;
int value_from_async = future.get();
std::cout << "Expected global value 12, value = " << getValue() << "\n";
std::cout << "Expected return value from asynchronous process is 31, value = " << value_from_async << "\n";
return 0;
}

How to terminate a function call after a timeout?

Let's say I have a foo() function. I want it to run for, say, 5 seconds; after that, it has to be cancelled and the rest of the program should continue.
Code snippets:
int main() {
// Blah blah
foo(); // Running in 5 sec only
// After 5 sec, came here and finished
}
References: after searching StackOverflow for a while, I found that this is what I need, but written in Python: Timeout on a function call.
signal.h and unistd.h may be relevant.

This is possible with threads. Since C++20, it will be fairly simple:
{
std::jthread t([](std::stop_token stoken) {
while(!stoken.stop_requested()) {
// do things that are not infinite, or are interruptible
}
});
using namespace std::chrono_literals;
std::this_thread::sleep_for(5s);
}
Note that many interactions with the operating system cause the thread to be "blocked". An example is the POSIX function accept, which waits for incoming connections. If the thread is blocked, it will not be able to proceed to the next iteration and check the stop token.
Unfortunately, the C++ standard doesn't specify whether such platform-specific calls should be interrupted by a request to stop. You need to use platform-specific methods to make sure that happens. Typically, signals can be configured to interrupt blocking system calls. In the case of accept, another option is to connect to the waiting socket.

There is no way to do that uniformly in C++. There are ways to do this with some degree of success when you use OS specific APIs, however it all becomes extremely cumbersome.
The basic idea which you can use on *nix is a combination of the alarm() system call and the setjmp/longjmp C functions.
A (pseudo) code:
#include <csetjmp>
#include <csignal>
#include <unistd.h> // alarm
std::jmp_buf jump_buffer;
void alarm_handle(int) {
std::longjmp(jump_buffer, 1); // the second argument becomes setjmp's return value
}
int main() {
std::signal(SIGALRM, alarm_handle);
alarm(5);
if (setjmp(jump_buffer) == 0) {
// first pass: setjmp returned 0, so run foo() under the 5-second alarm
foo();
alarm(0); // foo finished in time, cancel the pending alarm
} else {
// reached via longjmp from the handler: foo timed out
}
}
This is all extremely fragile and shaky (e.g. long jumps do not play nicely with C++ object lifetimes), but if you know what you are doing this might work.

Perfectly standard C++11
#include <iostream>
#include <thread> // std::this_thread::sleep_for
#include <chrono> // std::chrono::seconds
using namespace std;
// stop flag
bool stopfoo;
// function to run until stopped
void foo()
{
while( ! stopfoo )
{
// replace with something useful
std::this_thread::sleep_for (std::chrono::seconds(1));
std::cout << "still working!\n";
}
std::cout << "stopped\n";
}
// function to call a stop after 5 seconds
void timer()
{
std::this_thread::sleep_for (std::chrono::seconds( 5 ));
stopfoo = true;
}
int main()
{
// initialize stop flag
stopfoo = false;
// start timer in its own thread
std::thread t (timer);
// start worker in main thread
foo();
// join the timer thread; destroying a joinable std::thread calls std::terminate
t.join();
return 0;
}
Here is the same thing with a thread-safe stop flag. Strictly speaking, reading and writing a plain bool from two threads without synchronization is a data race, so this is good practice even in simple cases.
#include <iostream>
#include <thread> // std::this_thread::sleep_for
#include <chrono> // std::chrono::seconds
#include <mutex>
using namespace std;
class cFlagThreadSafe
{
public:
void set()
{
lock_guard<mutex> l(myMtx);
myFlag = true;
}
void unset()
{
lock_guard<mutex> l(myMtx);
myFlag = false;
}
bool get()
{
lock_guard<mutex> l(myMtx);
return myFlag;
}
private:
bool myFlag;
mutex myMtx;
};
// stop flag
cFlagThreadSafe stopfoo;
// function to run until stopped
void foo()
{
while( ! stopfoo.get() )
{
// replace with something useful
this_thread::sleep_for (std::chrono::seconds(1));
cout << "still working!\n";
}
cout << "stopped\n";
}
// function to call a stop after 5 seconds
void timer()
{
this_thread::sleep_for (chrono::seconds( 5 ));
stopfoo.set();
}
int main()
{
// initialize stop flag
stopfoo.unset();
// start timer in its own thread
thread t (timer);
// start worker in main thread
foo();
t.join();
return 0;
}
And if it is OK to do everything in the main thread, things can be greatly simplified.
#include <iostream>
#include <thread> // std::this_thread::sleep_for
#include <chrono> // std::chrono::seconds
using namespace std;
void foo()
{
auto t1 = chrono::steady_clock ::now();
while( chrono::duration_cast<chrono::seconds>(
chrono::steady_clock ::now() - t1 ).count() < 5 )
{
// replace with something useful
this_thread::sleep_for (std::chrono::seconds(1));
cout << "still working!\n";
}
cout << "stopped\n";
}
int main()
{
// start worker in main thread
foo();
return 0;
}
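The mutex-wrapped flag from the second listing can also be written with std::atomic<bool>, which is standard C++11 as well. The sketch below assumes the work can be split into short steps that each return promptly; run_with_timeout is a name made up for this example.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical helper: calls `step` repeatedly on the calling thread until
// `limit` has elapsed, then returns the number of completed steps.
// A step that never returns cannot be interrupted this way.
template <class Step>
int run_with_timeout(std::chrono::milliseconds limit, Step step) {
    std::atomic<bool> stop{false};

    // Timer thread: sets the flag once the time limit has passed.
    std::thread timer([&stop, limit] {
        std::this_thread::sleep_for(limit);
        stop.store(true);
    });

    int steps = 0;
    while (!stop.load()) {
        step();
        ++steps;
    }

    timer.join();  // the timer has already fired, so this returns at once
    return steps;
}
```

For example, `run_with_timeout(std::chrono::milliseconds(5000), [] { /* one unit of work */ });` keeps working for about five seconds and then falls through to the rest of the program.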

Can properly written code using mutex be still volatile?

I've been doing some basic things with std::thread, for no particular reason other than to learn it. I thought my simple example, in which a few threads operate on the same data and lock a mutex before doing so, worked just fine. Then I noticed that the returned value is different on every run. The values are very close to each other, but I am pretty sure they should be equal. Some of the values I have received:
21.692524
21.699258
21.678871
21.705947
21.685744
Am I doing something wrong or maybe there is underlying reason for that behaviour?
#include <string>
#include <iostream>
#include <thread>
#include <math.h>
#include <time.h>
#include <windows.h>
#include <mutex>
using namespace std;
mutex mtx;
mutex mtx2;
int currentValue = 1;
double suma = 0;
int assignPart() {
mtx.lock();
int localValue = currentValue;
currentValue+=10000000;
mtx.unlock();
return localValue;
}
void calculatePart()
{
int value;
double sumaLokalna = 0;
while(currentValue<1500000000){
value = assignPart();
for(double i=value;i<(value+10000000);i++){
sumaLokalna = sumaLokalna + (1/(i));
}
mtx2.lock();
suma+=sumaLokalna;
mtx2.unlock();
sumaLokalna = 0;
}
}
int main()
{
clock_t startTime = clock();
// Constructs the new thread and runs it. Does not block execution.
thread watek(calculatePart);
thread watek2(calculatePart);
thread watek3(calculatePart);
thread watek4(calculatePart);
while(currentValue<1500000000){
Sleep(100);
printf("%-12d %-12lf \n",currentValue, suma);
}
watek.join();
watek2.join();
watek3.join();
watek4.join();
cout << double( clock() - startTime ) / (double)CLOCKS_PER_SEC<< " seconds." << endl;
//Makes the main thread wait for the new thread to finish execution, therefore blocks its own execution.
}

Your loop
while(currentValue<1500000000){
Sleep(100);
printf("%-12d %-12lf \n",currentValue, suma);
}
is printing intermediate results, but you're not printing the final result.
To print the final result, add the line
printf("%-12d %-12lf \n",currentValue, suma);
after joining the threads.
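The "read the result only after join()" pattern can be sketched as follows. This is not the asker's program: it sums integers instead of 1/i so the result is exactly reproducible, it checks the loop bound under the same lock that hands out chunks, and the function name parallel_sum is invented for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: sums the integers 1..limit using n_threads workers
// that each grab chunks of `chunk` consecutive values.
std::uint64_t parallel_sum(std::uint64_t limit, std::uint64_t chunk, unsigned n_threads) {
    std::mutex mtx;
    std::uint64_t next = 1;   // first value not yet handed out
    std::uint64_t total = 0;  // shared result, guarded by mtx

    auto worker = [&] {
        for (;;) {
            std::uint64_t begin, end;
            {
                // Bound check and chunk assignment happen under one lock.
                std::lock_guard<std::mutex> lock(mtx);
                if (next > limit) return;
                begin = next;
                end = std::min(limit, begin + chunk - 1);
                next = end + 1;
            }
            std::uint64_t local = 0;
            for (std::uint64_t i = begin; i <= end; ++i) local += i;

            std::lock_guard<std::mutex> lock(mtx);
            total += local;
        }
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < n_threads; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();

    // Every worker has finished, so this is the final value.
    return total;
}
```

Reading `total` before the joins would give an intermediate value that depends on timing, which is what the monitoring loop in the question prints.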

how to reduce the latency from one boost strand to another boost strand

Suppose several boost strand shared_ptrs are stored in a vector m_poStrands, and tJobType is an enum indicating the type of job.
I found that the time difference between posting a job in one strand (JOBA) and the call to onJob in another strand (JOBB) is around 50 milliseconds.
I want to know if there is any way to reduce this time difference.
void postJob(tJobType oType, UINT8* pcBuffer, size_t iSize)
{
//...
m_poStrands[oType]->post(boost::bind(&onJob, this, oType, pcDestBuffer, iSize));
}
void onJob(tJobType oType, UINT8* pcBuffer, size_t iSize)
{
if (oType == JOBA)
{
//....
struct timeval sTV;
gettimeofday(&sTV, 0);
memcpy(pcDestBuffer, &sTV, sizeof(sTV));
pcDestBuffer += sizeof(sTV);
iSize += sizeof(sTV);
memcpy(pcDestBuffer, pcBuffer, iSize);
m_poStrands[JOBB]->post(boost::bind(&onJob, this, JOBB, pcDestBuffer, iSize));
}
else if (oType == JOBB)
{
// get the time from buffer
// and calculate the time diff
struct timeval eTV;
gettimeofday(&eTV, 0);
}
}
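The JOBB branch above leaves the actual time-difference calculation as a comment. One way it might look is sketched below, assuming the timeval sits at the very start of the buffer as the JOBA branch suggests; elapsed_us and latency_from_buffer_us are names invented for this example.

```cpp
#include <sys/time.h>  // timeval, gettimeofday (POSIX)
#include <cstring>     // std::memcpy

// Hypothetical helper: microseconds elapsed from `start` to `end`.
long long elapsed_us(const timeval& start, const timeval& end) {
    return (static_cast<long long>(end.tv_sec) - start.tv_sec) * 1000000LL
         + (static_cast<long long>(end.tv_usec) - start.tv_usec);
}

// Hypothetical helper: reads the timeval stored at the front of `buffer`
// and returns the delay in microseconds between that timestamp and now.
long long latency_from_buffer_us(const unsigned char* buffer) {
    timeval sent;
    std::memcpy(&sent, buffer, sizeof(sent));  // avoids unaligned access
    timeval now;
    gettimeofday(&now, nullptr);
    return elapsed_us(sent, now);
}
```

Note that gettimeofday reports wall-clock time, which can jump if the system clock is adjusted; a monotonic clock such as std::chrono::steady_clock is usually the safer choice for measuring intervals.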

Your latency is probably coming from the memcpys between your gettimeofday calls. Here's an example program I ran on my machine (2 GHz Core 2 Duo). I'm getting thousands of nanoseconds, so a few microseconds. I doubt that your system is running 4 orders of magnitude slower than mine. The worst I ever saw was 100 microseconds for one of the two tests. I tried to keep the code as close to the posted code as possible.
#include <boost/asio.hpp>
#include <boost/chrono.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
#include <iostream>
struct Test {
boost::shared_ptr<boost::asio::strand>* strands;
boost::chrono::high_resolution_clock::time_point start;
int id;
Test(int i, boost::shared_ptr<boost::asio::strand>* strnds)
: id(i),
strands(strnds)
{
strands[0]->post(boost::bind(&Test::callback,this,0));
}
void callback(int i) {
if (i == 0) {
start = boost::chrono::high_resolution_clock::now();
strands[1]->post(boost::bind(&Test::callback,this,1));
} else {
boost::chrono::nanoseconds sec = boost::chrono::high_resolution_clock::now() - start;
std::cout << "test " << id << " took " << sec.count() << " ns" << std::endl;
}
}
};
int main() {
boost::asio::io_service io_service_;
boost::shared_ptr<boost::asio::strand> strands[2];
strands[0] = boost::shared_ptr<boost::asio::strand>(new boost::asio::strand(io_service_));
strands[1] = boost::shared_ptr<boost::asio::strand>(new boost::asio::strand(io_service_));
boost::thread t1 (boost::bind(&boost::asio::io_service::run, &io_service_));
boost::thread t2 (boost::bind(&boost::asio::io_service::run, &io_service_));
Test test1 (1, strands);
Test test2 (2, strands);
t1.join();
t2.join();
}

Implementing an event timer using boost::asio

The sample code looks long, but actually it's not so complicated :-)
What I'm trying to do is, when a user calls EventTimer.Start(), it will execute the callback handler (which is passed into the ctor) every interval milliseconds for repeatCount times.
You just need to look at the function EventTimer::Stop()
#include <iostream>
#include <string>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>
#include <boost/function.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <ctime>
#include <sys/timeb.h>
#include <Windows.h>
std::string CurrentDateTimeTimestampMilliseconds() {
double ms = 0.0; // Milliseconds
struct timeb curtime;
ftime(&curtime);
ms = (double) (curtime.millitm);
char timestamp[128];
time_t now = time(NULL);
struct tm *tp = localtime(&now);
sprintf(timestamp, "%04d%02d%02d-%02d%02d%02d.%03.0f",
tp->tm_year + 1900, tp->tm_mon + 1, tp->tm_mday, tp->tm_hour, tp->tm_min, tp->tm_sec, ms);
return std::string(timestamp);
}
class EventTimer
{
public:
static const int kDefaultInterval = 1000;
static const int kMinInterval = 1;
static const int kDefaultRepeatCount = 1;
static const int kInfiniteRepeatCount = -1;
static const int kDefaultOffset = 10;
public:
typedef boost::function<void()> Handler;
EventTimer(Handler handler = NULL)
: interval(kDefaultInterval),
repeatCount(kDefaultRepeatCount),
handler(handler),
timer(io),
exeCount(-1)
{
}
virtual ~EventTimer()
{
}
void SetInterval(int value)
{
// if (value < 1)
// throw std::exception();
interval = value;
}
void SetRepeatCount(int value)
{
// if (value < 1)
// throw std::exception();
repeatCount = value;
}
bool Running() const
{
return exeCount >= 0;
}
void Start()
{
io.reset(); // I don't know why I have to put io.reset here,
// since it's already been called in Stop()
exeCount = 0;
timer.expires_from_now(boost::posix_time::milliseconds(interval));
timer.async_wait(boost::bind(&EventTimer::EventHandler, this));
io.run();
}
void Stop()
{
if (Running())
{
// How to reset everything when stop is called???
//io.stop();
timer.cancel();
io.reset();
exeCount = -1; // Reset
}
}
private:
virtual void EventHandler()
{
// Execute the requested operation
//if (handler != NULL)
// handler();
std::cout << CurrentDateTimeTimestampMilliseconds() << ": exeCount = " << exeCount + 1 << std::endl;
// Check if one more time of handler execution is required
if (repeatCount == kInfiniteRepeatCount || ++exeCount < repeatCount)
{
timer.expires_at(timer.expires_at() + boost::posix_time::milliseconds(interval));
timer.async_wait(boost::bind(&EventTimer::EventHandler, this));
}
else
{
Stop();
std::cout << CurrentDateTimeTimestampMilliseconds() << ": Stopped" << std::endl;
}
}
private:
int interval; // Milliseconds
int repeatCount; // Number of times to trigger the EventHandler
int exeCount; // Number of executed times
boost::asio::io_service io;
boost::asio::deadline_timer timer;
Handler handler;
};
int main()
{
EventTimer etimer;
etimer.SetInterval(1000);
etimer.SetRepeatCount(1);
std::cout << CurrentDateTimeTimestampMilliseconds() << ": Started" << std::endl;
etimer.Start();
// boost::thread thrd1(boost::bind(&EventTimer::Start, &etimer));
Sleep(3000); // Keep the main thread active
etimer.SetInterval(2000);
etimer.SetRepeatCount(1);
std::cout << CurrentDateTimeTimestampMilliseconds() << ": Started again" << std::endl;
etimer.Start();
// boost::thread thrd2(boost::bind(&EventTimer::Start, &etimer));
Sleep(5000); // Keep the main thread active
}
/* Current Output:
20110520-125506.781: Started
20110520-125507.781: exeCount = 1
20110520-125507.781: Stopped
20110520-125510.781: Started again
*/
/* Expected Output (timestamp might be slightly different with some offset)
20110520-125506.781: Started
20110520-125507.781: exeCount = 1
20110520-125507.781: Stopped
20110520-125510.781: Started again
20110520-125512.781: exeCount = 1
20110520-125512.781: Stopped
*/
I don't know why my second call to EventTimer::Start() does not work at all. My questions are:
What should I do in EventTimer::Stop() to reset everything so that the next call to Start() works?
Is there anything else I have to modify?
If I use another thread to run EventTimer::Start() (see the commented-out code in the main function), when does the thread actually exit?
Thanks.
Peter

As Sam hinted, depending on what you're attempting to accomplish, most of the time it is considered a design error to stop an io_service. You do not need to stop()/reset() the io_service in order to reschedule a timer.
Normally you would leave a thread or thread pool running, attached to an io_service, and then schedule whatever events you need with the io_service. With that machinery in place, leave it to the io_service to dispatch your scheduled work as requested; you then only have to deal with the events or work requests that you schedule.

It's not entirely clear to me what you are trying to accomplish, but there are a couple of things that are incorrect in the code you have posted.
io_service::reset() should only be invoked after a previous invocation of io_service::run() was stopped or ran out of work, as the documentation describes.
You should not need explicit calls to Sleep(); the call to io_service::run() will block as long as it has work to do.

I figured it out, but I don't know why I have to put io.reset() in Start(), since it has already been called in Stop().
See the updated code in the post.
