This question already has answers here:
How do I profile C++ code running on Linux?
(19 answers)
Measuring execution time of a function in C++
(14 answers)
Closed 7 years ago.
I'm debugging a large C++ project (Linux environment) and one binary appears to be taking more time to run than I expect. How can I see a breakdown of how much time each function call in each source file takes, so I can find the problem(s)?
There's another way to find the problem(s) besides getting a breakdown of function times.
Run it under a debugger, and manually interrupt it several times, and each time examine the call stack.
If you look at each level of the call stack that's in your code, you can see exactly what the program is doing at that moment and why that time is being spent.
Suppose you have a speed problem that, when fixed, will save some fraction of time, like 30%.
That means each stack sample that you examine has at least a 30% chance of happening during the problem.
So, turning it around, if you see it doing something that could be eliminated, and you see it on more than one sample, you've found your problem! (or at least one of them) **
That's the random-pausing technique.
It will find any problem that timers will, and problems that they won't.
** You might have to think about it a bit. If you see it doing something on a single sample, that doesn't mean much.
Even if the code is only doing a thousand completely different things, none of them taking significant time, it has to stop somewhere.
But if you see it doing something, and you see it on more than one sample, and you haven't taken many samples, the probability of hitting the same insignificant thing twice is very small.
So it is far more likely that what you caught accounts for a significant fraction of the time.
In fact, a reasonable guess of its probability is the number of samples in which you saw it, divided by the total number of samples.
#include <iostream>
#include <ctime>

int main() {
    std::clock_t start = std::clock();

    // code to measure goes here

    double duration = (std::clock() - start) / (double) CLOCKS_PER_SEC;
    std::cout << duration << std::endl;
}
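Note that std::clock measures CPU time used by the process, not wall-clock time. If elapsed wall-clock time is what you want, a minimal sketch using std::chrono (C++11) could look like this; the block being timed is left as a placeholder:

#include <chrono>
#include <iostream>

int main() {
    auto start = std::chrono::steady_clock::now();   // wall-clock start

    // code to measure goes here

    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double> elapsed = end - start;   // seconds, as a double
    std::cout << elapsed.count() << " s" << std::endl;
}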
You can create your own timer class. At the start of each block, call a method to reset the timer to zero, and read the timer at the end of the block. You can do this in various blocks of the code. Once you have identified the block that takes the most time, you can add internal timers inside it as well. If you want to try a standard tool, I would recommend gprof: http://www.thegeekstuff.com/2012/08/gprof-tutorial/
Related
I have a C++ code that generates random 3D network structures. It works well, and if I run it manually (from the terminal), I get two different structures, as expected.
However, if I create a small loop to launch it 10 successive times, it produces the exact same structure all 10 times, which is not normal. If I add a sleep(1) line at the end of the code, it works again, so I guess it has something to do with C++ releasing the memory (I am absolutely not an expert, so I could be completely wrong).
The problem is that, by adding the sleep(1) command, it takes much more time to run (10x more). This is of course not an issue for 10 runs, but the aim is to make thousands of them.
Is there a way to force C++ to release the memory at the end of the code?
C++ does not release memory automatically at all (except for code in destructors), so that is not the cause.
But random number generators are usually seeded from a system clock counter (I may be wrong here).
In Pascal, you have to call the randomize procedure to initialize the random generator with a seed. Without doing so, the random number generator produces the same results on every run, which is very much like your situation.
In C++ there is the srand function, which is typically seeded with the current time, as in the example here: http://en.cppreference.com/w/cpp/numeric/random/rand
I don't know how you initialize your random generator, but if you do so with a time value at one-second resolution, and your code is fast enough to do 10 loops in one second, this can be the cause. It also explains why the 1-second delay fixes the situation.
If that's the case, you can try a time function with finer resolution. Also, the C++11 standard library has a much more powerful random module (as do the Boost libraries, if you don't have C++11). The documentation is here: http://www.cplusplus.com/reference/random/
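To make the seeding point concrete, here is a minimal sketch (assuming your structures are driven by the standard generator; your actual seeding code may differ): either seed rand with something finer than whole seconds, or use the C++11 <random> facilities.

#include <chrono>
#include <cstdlib>
#include <random>

int main() {
    // Option 1: seed rand() with a value that changes even when several
    // runs start within the same second.
    unsigned seed = static_cast<unsigned>(
        std::chrono::high_resolution_clock::now().time_since_epoch().count());
    std::srand(seed);

    // Option 2 (C++11): a dedicated engine seeded from the OS entropy source.
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_real_distribution<double> coord(0.0, 1.0);

    double x = coord(gen);   // example draw; the network-building code would go here
    (void)x;
}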
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance.
Closed 9 years ago.
To compare C++ and Java on certain tasks I have made two similar programs, one in Java and one in C++. When I run the Java one it takes 25% CPU without fluctuation, which you would expect as I'm using a quad core. However, the C++ version only uses about 8% and fluctuates heavily. I run both programs on the same computer, on the same OS, with the same programs active in the background. How do I make C++ use one full core? These are two programs, neither interrupted by anything. They both ask for some info and then enter an infinite loop until you exit the program, giving feedback on how many calculations per second they perform.
The code:
http://pastebin.com/5rNuR9wA
http://pastebin.com/gzSwgBC1
http://pastebin.com/60wpcqtn
To answer some questions:
I'm basically looping a bunch of code and seeing how often per second it loops. The problem is that it doesn't use all the CPU it can use. The whole point is to have the same processor do the same task in Java and C++ and compare the number of loops per second. But if one is using irregular amounts of CPU time and the other one is looping stably at a certain percentage, they are hard to compare. By the way, if I ask it to execute this:
while(true){}
it takes 25%, why doesn't it do that with my code?
----edit:----
After some experimenting it seems that my code starts to use less than 25% if I use a cout statement. It isn't clear to me why a cout would cause the program to use less CPU (I guess it pauses until the statement is written, which apparently takes a while?).
With this knowledge I will reprogram both programs (to keep them comparable) and just let it report the results after 60 seconds instead of every time it completed a loop.
Thanks for all the help, some of the tips were really helpful. After I discovered the answer someone also turned out to give this as an answer, so even if I wouldn't have found it myself I would have gotten the answer. Thanks!
(though I would like to know why a std::cout takes such an amount of time)
Your main loop has a cout in it, which will call out to the OS to write the accumulated output at some point. Either OS time is not counted against your app, or it causes some disk IO or other activity that forces your program to wait.
It's probably not accurate to compare both of these running at the same time without considering the fact that they will compete for cpu time. The OS will automatically choose the scheduling for these two tasks which can be affected by which one started first and a multitude of other criteria.
Running them both at the same time would require configuring the scheduling so that each one is confined to one (or two) CPUs, and each application uses different CPUs. This can be done by having each main function execute a separate thread that performs all the work, and then setting the CPU where this thread will run. In C++11 this can be done using a std::thread and then setting the underlying CPU affinity by getting the native_handle and setting it there.
I'm not sure how to do this in Java but I'm sure the process is similar.
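As a rough sketch of the affinity idea on Linux (pthread_setaffinity_np is Linux-specific, and the busy loop is only a stand-in for the actual benchmark; compile with g++ -pthread):

#include <pthread.h>
#include <sched.h>
#include <thread>

int main() {
    std::thread worker([] {
        volatile unsigned long long counter = 0;
        for (;;) ++counter;   // stand-in for the benchmark loop
    });

    // Pin the worker thread to CPU 0 so it competes with nothing else on that core.
    cpu_set_t cpus;
    CPU_ZERO(&cpus);
    CPU_SET(0, &cpus);
    pthread_setaffinity_np(worker.native_handle(), sizeof(cpu_set_t), &cpus);

    worker.join();   // never returns in this sketch
}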
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What's the usual way of controlling frame rate?
I'm an amateur when it comes to programming, but I want to ask if this is an efficient way of handling things.
Right now, my program updates itself on every step, but I am looking to limit it to a lower frame rate. My current idea is to set a clock in main, so that every 30 ticks or so (for example), the game will update itself. However, I am looking to update the different parts of the program separately within that slot of time (for instance, every 10 seconds), with the program updating the screen at the end of that period. I figured that this will help to alleviate some of the "pressure" (assuming that there is any).
I wouldn't go that way. It's going to be much better/easier/cleaner, especially when starting, to update the game/screen as often as possible (e.g. put it in a while(true) loop). Then each iteration, figure out the elapsed time and use that accordingly. (e.g. Move an object 1 pixel for every elapsed 20ms) or something
The reason this is a better starting point is that you'll be hard-pressed to guarantee exactly 30fps, and the game will behave weirdly otherwise (e.g. if a slow computer can only pull 15fps, you don't want objects moving at the wrong speed), not to mention drift, individual slow frames, etc.
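A minimal sketch of that elapsed-time loop using std::chrono (the update and render functions are placeholders for the game's own code):

#include <chrono>

// Placeholders for the game's own logic and drawing.
void update(double dtSeconds) { /* move objects by speed * dtSeconds */ }
void render()                 { /* draw the current frame */ }

int main() {
    using clock = std::chrono::steady_clock;
    auto previous = clock::now();

    for (;;) {                                    // the while(true) main loop
        auto now = clock::now();
        std::chrono::duration<double> dt = now - previous;
        previous = now;

        update(dt.count());   // e.g. an object moving at 50 px/s advances 50 * dt pixels
        render();
    }
}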
I have a program I want to profile with gprof. The problem (seemingly) is that it uses sockets. So I get things like this:
::select(): Interrupted system call
I hit this problem a while back, gave up, and moved on. But I would really like to be able to profile my code, using gprof if possible. What can I do? Is there a gprof option I'm missing? A socket option? Is gprof totally useless in the presence of these types of system calls? If so, is there a viable alternative?
EDIT: Platform:
Linux 2.6 (x64)
GCC 4.4.1
gprof 2.19
The socket code needs to handle interrupted system calls regardless of the profiler, but under a profiler it's unavoidable. This means having code like:
if ( errno == EINTR ) { ...
after each system call.
Take a look, for example, here for the background.
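To spell it out, a retry wrapper along these lines would work (just a sketch; error handling beyond EINTR is omitted, and gprof's SIGPROF sampling is exactly what keeps interrupting the call):

#include <cerrno>
#include <sys/select.h>

// Keep retrying select() when it is interrupted by a signal.
int select_retry(int nfds, fd_set* readfds, fd_set* writefds,
                 fd_set* exceptfds, timeval* timeout) {
    int rc;
    do {
        rc = select(nfds, readfds, writefds, exceptfds, timeout);
    } while (rc == -1 && errno == EINTR);
    return rc;
}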
gprof (here's the paper) is reliable, but it was only ever intended to measure changes, and even for that, it only measures CPU-bound issues. It was never advertised to be useful for locating problems. That is an idea that other people layered on top of it.
Consider this method.
Another good option, if you don't mind spending some money, is Zoom.
Added: If I can just give you an example. Suppose you have a call hierarchy where Main calls A some number of times, A calls B some number of times, B calls C some number of times, and C waits for some I/O with a socket or file, and that's basically all the program does. Now, further suppose that the number of times each routine calls the next one down is 25% more than it really needs to be. Since 1.25^3 is about 2, that means the entire program takes twice as long to run as it really needs to.
In the first place, since all the time is spent waiting for I/O, gprof will tell you nothing about how that time is spent, because it only looks at "running" time.
Second, suppose (just for argument) it did count the I/O time. It could give you a call graph, basically saying that each routine takes 100% of the time. What does that tell you? Nothing more than you already know.
However, if you take a small number of stack samples, you will see on every one of them the lines of code where each routine calls the next.
In other words, it's not just giving you a rough percentage time estimate, it is pointing you at specific lines of code that are costly.
You can look at each line of code and ask if there is a way to do it fewer times. Assuming you do this, you will get the factor of 2 speedup.
People get big factors this way. In my experience, the number of call levels can easily be 30 or more. Every call seems necessary, until you ask if it can be avoided. Even small numbers of avoidable calls can have a huge effect over that many layers.
This is really annoying me as I have done it before, about a year ago and I cannot for the life of me remember what library it was.
Basically, the problem is that I want to be able to call a method a certain number of times or for a certain period of time at a specified interval.
One example would be I would like to call a method "x" starting from now, 10 times, once every 0.5 seconds. Alternatively, call method "x" starting from now, 10 times, until 5 seconds have passed.
Now I thought I used a boost library for this functionality but I can't seem to find it now and feeling a bit annoyed. Unfortunately I can't look at the code again as I'm not in possession of it any more.
Alternatively, I could have dreamt this all up and it could have been proprietary code. Assuming there is nothing out there that does what I would like, what is currently the best way of producing this behaviour? It would need to be high-resolution, up to a millisecond.
It doesn't matter if it blocks the thread that it is executed from or not.
Thanks!
Maybe you are talking about boost::asio. It is mainly used for networking, but it can be used for scheduling timers.
It can be used in conjunction with boost::threads.
A combination of boost::this_thread::sleep and time duration found in boost::datetime?
It's probably bad practice to answer your own question, but I wanted to add something more to what Nikko suggested, as I have now implemented the functionality with the two suggested libraries. Someone might find this useful at some point.
void SleepingExampleTest::sleepInterval(int frequency, int cycles, boost::function<void()> method) {
    // Interval between calls, derived from the requested frequency (calls per second).
    boost::posix_time::time_duration interval(boost::posix_time::microseconds(1000000 / frequency));

    // Absolute time of the next call; sleeping until an absolute time avoids accumulating drift.
    boost::posix_time::ptime timer = boost::posix_time::microsec_clock::local_time() + interval;
    boost::this_thread::sleep(timer - boost::posix_time::microsec_clock::local_time());

    while (cycles--) {
        method();
        timer = timer + interval;
        boost::this_thread::sleep(timer - boost::posix_time::microsec_clock::local_time());
    }
}
Hopefully people can understand this simple example that I have knocked up. Using a bound function just to allow flexibility.
It appears to work with about 50-microsecond accuracy on my machine. Before accounting for the skew from the time it takes to execute the method being called, the accuracy was a couple of hundred microseconds, so it is definitely worth it.
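For completeness, a usage sketch under the assumption that sleepInterval is called from within the same test class; the doWork function and the 20 Hz / 10-cycle values are made up for illustration:

#include <boost/function.hpp>
#include <iostream>

void doWork() {
    std::cout << "tick" << std::endl;   // placeholder for the real work
}

// Inside SleepingExampleTest: call doWork 10 times at 20 Hz (every 50 ms).
//     sleepInterval(20, 10, boost::function<void()>(&doWork));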