What is profiling?

I am new to this and am trying to learn.
What is profiling?
What are the various free tools for profiling .NET and Java EE?
Can Javascript be profiled?
If so, by which tool?
And lastly, how do these profilers work?

Profiling measures how long various parts of the code take to run. JavaScript can be profiled with Firebug: http://getfirebug.com/js.html

Profiling is measuring execution times and correlating them with various classes/methods/functions (see the link I gave to the Wikipedia page for some commentary on how profilers can work).

Think of profilers as debuggers for execution duration bugs.
Profilers are implemented a lot like debuggers too, except that rather than allowing you to stop the program and poke around, they simply let it run and keep track of how much time gets spent in every part of the program. This is particularly useful if you have some code that is running slower than you need it to run, as you can figure out exactly where all the time is going, and concentrate your efforts on fixing just that bottleneck.
Many developers believe you should never hand-optimize code without using a profiler.

The way you would usually use your profiler is as follows:
Start the profiler and launch your application under it.
Use your application for some time, or exercise just the features you have identified as bottlenecks and would like to optimize.
Once your application is closed (or sometimes even before that), the profiler can present you with a breakdown of execution times per function. Some will also give you a breakdown of execution times per line, or per function within one of these functions, so you can see where most CPU time was spent, using a top-down approach.
Usually some functions in your application will take an unusually long time to execute. After looking at your profiling results, you should be able to identify them and eliminate performance problems.
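To make that concrete, here is a toy C++ program (my own illustration, not tied to any particular profiler) where a per-function breakdown would immediately point at the bottleneck:

```cpp
#include <cmath>
#include <cstdio>

// Called often, but cheap.
double cheap(int i) { return i * 0.5; }

// The bottleneck: a per-function breakdown would attribute almost
// all of the run time to this function.
double expensive(int i) {
    double x = 0;
    for (int j = 0; j < 2000; ++j)
        x += std::sqrt((double)(i + j));
    return x;
}

int main() {
    double total = 0;
    for (int i = 0; i < 100000; ++i)
        total += cheap(i) + expensive(i);
    std::printf("%f\n", total);
    // A profiler's per-function view would read roughly
    // (illustrative numbers, not measured):
    //   expensive()  ~99% of time
    //   cheap()       ~1% of time
    return 0;
}
```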

Here are some .NET profilers for you to try (free):
Prof-it
NProf
CLR Profiler
I am not a big fan of these. I would recommend one of the commercial products to get the best results:
dotTrace
Ants
Other than that, take a look at Brad Adams' blog posts Profilers for the CLR and .NET Application Profiler.
I personally like dotTrace.

Profiling is a technique for measuring execution times and numbers of invocations of procedures.
It is not however the only or even necessarily the best way to locate things that cause time to be wasted in your code. Look here.
For a different Wikipedia article, try http://en.wikipedia.org/wiki/Performance_tuning#Bottlenecks
For a simple how-to, try http://www.wikihow.com/Optimize-Your-Program%27s-Performance

Wikipedia says:
In software engineering, performance analysis, more commonly today known as profiling, is the investigation of a program's behavior using information gathered as the program executes
Continue reading here http://en.wikipedia.org/wiki/Performance_analysis.
So, regarding JavaScript tools: Firebug (http://getfirebug.com/index.html#install) is an excellent option.

Profiling is the measurement of execution time at the method level (functional statistics), as well as the collection of run-time information such as consumption of memory, processor, and threads, and the number of classes loaded (non-functional statistics), over the period the application is running. It falls under performance analysis (functional and non-functional statistics collection) of the application in question, as run by one user. JConsole is one of the built-in tools for profiling Java applications.

Profiling (program profiling) is a form of dynamic program analysis that measures, for example, a program's memory usage or time complexity, the use of particular instructions, or the frequency and duration of function calls, to mention a few cases. Typically, profiling information is used to aid program optimization and, more specifically, performance engineering. Profiling can be accomplished by instrumenting either the program's source code or its binary executable. Profilers employ different methods, such as event-based, statistical, instrumented, and simulation methods.

Related

Locate the most costly methods and evaluate/profile them

I would like to know if there is a technique or a tool available that can tell you how much time is required to execute a particular method.
Big-O notation in math/computer science gives you an idea of the complexity of an algorithm; I wonder if there is something similar for code analysis.
Profiling is a means of analysing a program to determine the relative amount of time spent in particular functions or methods. It is useful for discovering performance issues in a program empirically. Using GCC, for example, you can:
Compile the program with the -pg option to enable profiling.
Run the executable to produce a file called gmon.out, which contains information about the runtime characteristics of your program as it actually ran.
Run gprof to display the information generated by the instrumented executable.
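Putting those steps together, a minimal end-to-end sketch (assuming g++ and gprof are installed; the file names are illustrative):

```cpp
// Build and run (commands are illustrative):
//   g++ -pg main.cpp -o main    # -pg adds gprof instrumentation
//   ./main                      # writes gmon.out to the working directory
//   gprof main gmon.out         # prints the flat profile and call graph
#include <cstdio>

// A deliberately slow function, so it stands out in the flat profile.
long burn(long n) {
    long sum = 0;
    for (long i = 0; i < n; ++i)
        sum += i % 7;
    return sum;
}

int main() {
    std::printf("%ld\n", burn(200000000L));
    return 0;
}
```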
Generally speaking, human analysis is the only way to discover the asymptotic (i.e., big-O) complexity of a particular algorithm—there is, so far as I know, no mechanical way to do it.
If you want to know how much time is spent in a function, use a so-called "profiler".
Complexity analysis is outside the scope of a profiler, though, since a profiler tells you what happens when you run a program once, whereas complexity tells you the limit behavior of what happens when you run an infinite sequence of programs with bigger and bigger input.
So: do you want to know which functions are most costly in your program (in which case find a profiler for your C++ implementation and follow its documentation), or do you want to know about time complexity (in which case you pretty much need a human to analyze your code)?
You should try running your code under callgrind; it will record which functions are called and how many times, but the code will run about 20x slower. After you get your callgrind output, you should open it with kcachegrind to view the tree structure of the calls. There you can browse around and see where your bottlenecks are.
Useful links:
kcachegrind http://kcachegrind.sourceforge.net/html/Home.html
callgrind documentation http://valgrind.org/docs/manual/cl-manual.html
(valgrind is the framework, callgrind is a component)
how to start callgrind if the program spawns processes and you want to profile those too (replace sage bench.py with your program) https://github.com/titusnicolae/pynac-callgrind/blob/master/run.sh
edit: "--instr-atstart=no" should be removed from the parameter list if you don't plan on enabling the instrumentation later
How is this different from the halting problem?
Please note that I could trivially solve the halting problem using your automatic complexity analyzer (any tool that reports a finite complexity for an arbitrary program has, in particular, decided that the program halts) -- so your problem is HARDER. And the halting problem is already undecidable.

C and C++ source code profiling tools [duplicate]

Possible Duplicate:
What's your favorite profiling tool (for C++)
Are there any good tools to profile source code that is a mix of C and C++? What are the pros and cons of each, which ones have you used, and which would you recommend? Please do not just get me a list of tools from Google; I can do that too. What I want is to leverage the personal experience of someone who has used these tools and knows their pros and cons.
Thanks in advance.
I've found gprof to be the best CPU hotspot profiler, and Google Performance Tools to be the best sampling profiler. Both work for C and C++.
In my opinion there are no good profiling tools on Windows.
GNU gprof pros and cons
GCC only
Works with C and C++
Only measures CPU time, and only code inside the binary; you need everything you wish to profile statically linked in
Very accurate
Adds a small overhead to execution
Google Performance Tools pros and cons
I think it requires the GNU tool chain
Occasionally fails to identify symbols
Very customizable
Outputs to a huge variety of formats, including the Callgrind format, and automatically loads KCacheGrind for you
Has various memory profiling tools also
Is a sampling profiler, with minimal overhead
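If you would rather trigger gperftools' CPU profiler from code rather than from the environment, a minimal sketch (assumes gperftools is installed; link with -lprofiler):

```cpp
#include <gperftools/profiler.h>
#include <cstdio>

// Something hot enough to collect samples from.
double hot_loop() {
    double x = 0;
    for (int i = 0; i < 100000000; ++i)
        x += i * 0.5;
    return x;
}

int main() {
    ProfilerStart("cpu.prof");      // begin collecting stack samples
    std::printf("%f\n", hot_loop());
    ProfilerStop();                 // flush samples to cpu.prof
    // Inspect with pprof (sometimes installed as google-pprof):
    //   pprof --text ./a.out cpu.prof
    return 0;
}
```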
Related useful questions and answers
Alternative to -pg with Clang?
What's your favorite profiling tool (for C++)
Alternatives to gprof
C++ Code Profiler
Confusing gprof output
I would respectfully disagree with Matt.
The tool I use all the time on Windows is the random-pausing technique, and it works with all languages that the IDE supports.
As an example of using it to do performance tuning, this case shows how a speedup of 43 times was achieved through a series of steps.
Gprof has a lot of problems, listed here, and according to the google-perftools manual, some of the same issues are repeated there, such as reporting procedures, not lines, emphasizing self (local) time, emphasizing the graph, etc. (I can't tell from the doc if it samples while blocked.)
As software systems become ever larger, self time becomes less and less relevant. The program counter spends most of its time in library routines or blocked in the system.
Graphs become gigantic nests.
People ask "I know function X is costly, but where in function X is the problem?"
What's more, the "bottlenecks" get bigger and bigger, because the stack gets deeper on average, and every layer of the stack is a fresh opportunity to do more function calls than necessary.
An example of a stack-sampler that reports percent by line, and samples while blocked, and allows user control of sampling so as not to dilute the sample set during user input, is Zoom.
EDIT: Sorry, can't leave well enough alone. Here's a new explanation:
The way programs work, they trace out a call tree, which is a lot like the oak tree outside my window. It has a trunk (main) which sprouts branches (call sites) which sprout further branches for several levels out to leaves (instructions) and acorns (blocking calls).
When the tree surgeon comes to prune (optimize) it, does he look only where the leaves are (hotspots)? Does he ignore acorns (no samples during blocking)?
No, he looks for branches (call sites) that are both heavy (on the stack a lot) and unhealthy (unnecessary). Those are what he prunes.
That's what random-pausing and Zoom do: they help find those call sites.
You can use Callgrind to create profiling output. It is part of Valgrind.
Callgrind's output can be used with KCacheGrind, which is probably worth a look if you're using Linux.
AMD CodeAnalyst is pretty nice. It's also cross platform which is nice when one finds a platform specific bottleneck.

How do I profile code beyond per-function level?

AFAIK profilers can only tell how much time is spent in each function. But since C++ compilers tend to inline code aggressively, and some functions are not that short, it's often useful to know more detail: how much time each construct consumes.
How can this be achieved except restructuring code into smaller functions?
If you use a sampling profiler (e.g. Zoom or Shark), rather than an instrumented profiler (e.g. gprof) then you can get much finer grained profiles, down to the statement and instruction level.
If you can use callgrind then you can get a summary of which methods are taking most of the processing time. Then you can use kcachegrind to view the results. It gives a very nice graph, through which you can easily browse and find bottlenecks.

What are your favorite features of a C/C++ performance profiler / analyzer? [closed]

I'm trying to pick a performance analyzer to use. I'm a beginner developer and am not sure what to look for in a performance analyzer. What are the most important features?
If you use valgrind, I can highly recommend KCacheGrind to visualize performance bottlenecks.
I would like to have the following features/output information shown in a profiler:
1.) Should be able to show the total clock cycles consumed, and also per function.
2.) If not cycles, it should tell the total time consumed and the time spent in each function.
3.) Also, it should be able to tell how many times each function is called.
4.) It would be nice to know memory reads, memory writes, cache misses, and cache hits.
5.) Code memory for each function.
6.) Data memory used: global constants, stack, heap usage.
The two classical answers (assuming you are in *nix world) are valgrind and gprof. You want something that will let you (at least) check how much time you are spending inside each procedure or function.
Stability - being able to profile your process for long durations without crashing or running out of memory. It's surprising how many commercial profilers fail at that.
goldenmean has it right, I would add that line execution counts are sometimes handy as well.
My preference is for sampling profilers rather than instrumented profilers. The profiler should be able to map sample data back to the source code, ideally in a GUI. The two best examples of this that I am aware of are:
Mac OS X: Shark developer.apple.com
Linux: Zoom www.rotateright.com
All you need is a debugger or IDE that has a "pause" button. It is not only the simplest and cheapest tool, but in my experience, the best. This is a complete explanation why. Note the 2nd-to-last comment.
EDIT because I thought of a better answer:
As an aside, I studied A.I. in the 70s, and an idea very much in the air was automatic programming, and a number of people tried to accomplish it.
(I took my crack at it.)
The idea is to try to automate the process of having a knowledge structure of a domain, plus desired functional requirements, to generate (and debug) a program that would accomplish those requirements.
It would be a tour-de-force in automated reasoning about the domain of programming.
There were some tantalizing demonstrations, but in a practical sense the field didn't go very far.
Nevertheless, it did contribute a lot of ideas to programming languages, like contracts and logical verification techniques.
To build an ideal profiler, for the purpose of optimizing programs, it would get a sample of the program's state every nanosecond.
Either on-the-fly or later (ideal, remember?) it would carefully examine each sample, to see if, knowing the reasons for which the program is executing, that particular nanosecond of work was actually necessary or could be somehow eliminated.
That would be billions of samples and a lot of reasoning, but of course there would be tremendous duplication, because any wastage costing, say, 10% of time would be evident in 10% of samples.
That wastage could be recognized on a lot fewer than a billion samples.
In fact, 100 samples or even fewer could spot it, provided they were randomly chosen in time, or at least in the time interval the user cares about.
This is assuming the purpose is to find the wastage so we can get rid of it, as opposed to measuring it with much precision.
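(To put a rough number on that, using the 10% example above: each random sample independently has a 10% chance of landing in the wastage, so the chance that 20 samples all miss it is 0.9^20 ≈ 12%. You will see it at least once with probability of about 88%, and on average twice.)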
Why would it be helpful to apply all that reasoning power to each sample?
Well, if the programs were little, and it were only looking for things like O(n^2) code, it shouldn't be too hard.
But suppose the state of the program consisted of a procedure stack 20-30 levels deep, possibly with some recursive function calls appearing more than once, possibly with some of the functions being calls to external processors to do IO, possibly with the program's action being driven by some data in a table.
Then, to decide if the particular sample is wasteful requires potentially examining all or at least some of that state information, and using reasoning power to see if it is truly necessary in accomplishing the functional requirements.
What the profiler is looking for is nanoseconds being spent for dubious reasons.
To see the reason it is being spent requires examining every function call site on the stack, and the code surrounding it, or at least some of those sites.
The necessity of the nanosecond being spent requires the logical AND of the necessity of every statement being executed on the stack.
It only takes one such function call site to have a dubious justification for the entire sample to have a dubious justification.
So, if the entire purpose is to find nanoseconds being spent for dubious reasons, the more complicated the samples are, the better,
and the more reasoning power brought to bear on each sample, the better.
(That's why bigger programs have more room for speedup - they have deeper stacks, hence more calls, hence more likelihood of poorly justified calls.)
OK, that's in the future.
However, since we don't need a huge number of samples (10 or 20 is very useful), and since we already have highly intelligent automatic programmers (powered by pizza and soda),
we can do this now.
Compare that to the tools we call profilers today.
The very best of them take stack samples, but what's their output?
Measurements. "Hot paths". Rat's nest graphs. Eye-candy.
From those, even an artificially intelligent programmer would easily miss large inefficiencies, except for the ones that are exposed by those outputs.
After you fix the ones you do find, the ones you don't find are the ones that make all the difference.
One of the things one learns studying A.I. is, don't expect to be able to program a computer to do something if a human, in principle, can't also do it.

C++ code performance

When it comes to writing C++ code in VS2005, how can you measure the performance of your code?
Is there any default tool in VS for that? Can I find out which function or class slows down my application?
Are there other external tools that can be integrated into VS in order to measure the slow spots in my code?
If you have the Team System edition of Visual Studio 2005, you can use the built-in profiler.
http://msdn.microsoft.com/en-gb/library/z9z62c29(VS.80).aspx
AMD CodeAnalyst is available for free for both Windows and Linux and works on most x86 or x64 CPUs (including Intel's).
It has extra features available when you have an AMD processor, of course. It also integrates into Visual Studio.
I've had pretty good luck with it.
Note that there are generally at least two common forms of profiler:
instrumenting: alters your build to record information at the beginning and end of certain areas (usually per function)
sampling: periodically looks at what code is running to record information
The types of information recorded can include (but are not limited to): elapsed time, # of CPU cycles, cache hits/misses, etc.
Instrumenting can be specific to certain areas of the code (just certain files or just code you compile, not libraries you link to). The overhead is much higher (you're adding code to the project, which takes time to execute, so you're altering timing; you may change program behavior for e.g. interrupt handlers or other timing-dependent code). You're guaranteed that you will get information about the functions/areas you instrument, though.
Sampling can miss very small or very sporadic functions, but modern machines have hardware help to allow you to sample much more thoroughly. Note that some sampling systems may still inject timing differences, although they generally will be much much smaller.
Some profiling tools support a mixture of the above, depending on how you use them.
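To illustrate the instrumenting style by hand (a hand-rolled sketch, not what any particular profiler actually injects), here is an RAII timer that records the time spent in a scope:

```cpp
#include <chrono>
#include <cstdio>

// Hand-rolled instrumentation: an RAII timer that reports how long the
// enclosing scope took. Instrumenting profilers inject equivalent hooks
// at function entry/exit for you; as noted above, this extra code itself
// costs time and can perturb the measurement.
struct ScopeTimer {
    const char* name;
    std::chrono::steady_clock::time_point start;
    explicit ScopeTimer(const char* n)
        : name(n), start(std::chrono::steady_clock::now()) {}
    ~ScopeTimer() {
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start);
        std::printf("%s: %lld us\n", name, (long long)us.count());
    }
};

double work() {
    ScopeTimer t("work");           // instrumentation overhead lives here
    double x = 0;
    for (int i = 0; i < 5000000; ++i)
        x += i;
    return x;
}

int main() {
    std::printf("%f\n", work());
    return 0;
}
```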
You could also use Intel VTune.
You want a tool called a profiler. For a free one that covers most simple cases, I recommend Very Sleepy. It works by sampling the application's current call stack at regular intervals.
You can always measure the time and performance of your code yourself. Consult MSDN about the following functions: QueryPerformanceCounter() and QueryPerformanceFrequency().
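A minimal sketch of that approach (Win32 only; the loop body is just a stand-in for the code you want to time):

```cpp
#include <windows.h>
#include <cstdio>

int main() {
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);   // counter ticks per second
    QueryPerformanceCounter(&t0);

    double x = 0;                       // the code under test
    for (int i = 0; i < 10000000; ++i)
        x += i * 0.5;

    QueryPerformanceCounter(&t1);
    double ms = (t1.QuadPart - t0.QuadPart) * 1000.0 / (double)freq.QuadPart;
    std::printf("elapsed: %.3f ms (x=%f)\n", ms, x);
    return 0;
}
```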
For more in-depth analysis of memory allocation and execution times, we use Memory Validator and Performance Validator from Software Verify. They have support for several languages other than C++.
I think measuring performance, and locating code to optimize, are different problems, and require different methods.
To locate code to optimize, I swear by this simple method, which is orthogonal to accepted wisdom about profiling, and does not require you to buy or install any tools.
To measure performance, I'm content with the simple process of running the subject code in a loop and timing it.
EDIT: BTW, I just looked at Very Sleepy, and it appears to be on the right track. It samples the entire call stack, and retains each stack. What I can't tell is if it gives you, for each call instruction or regular instruction, the fraction of stack samples containing that instruction. In my opinion, that is the most valuable statistic, and it does not need to be very precise.
dotTrace, on the other hand, also looks like maybe it retains stack samples, but its UI presentation of call-stack info seems to be a call-tree. What I would look for is something that shows the stack-residence percentage of individual instructions (or statements), because they could be in different branches of the call-tree, and thus the call-tree could miss their importance.
For intrusive measurement, use the performance counters. Since you're using C++, you should use a facade over this slightly painful API. STLSoft has a family of such things, with different pros and cons. I suggest winstl::performance_counter for highest resolution, or winstl::threadtimes_counter if you want to monitor the performance of a particular thread regardless of other activity in your process(es). There was an article about this in Dr Dobb's several years ago, in which the design rationale behind the facades was described in detail.
For non-intrusive measurement, you can't go past VTune.
We use Rational Quantify, which comes as part of the Rational PurifyPlus set of tools.
It's an excellent tool for profiling application performance.
I've recently tried the JetBrains dotTrace profiler and it looks very good. It helped me locate a number of "black holes" in existing C++ code quite easily.
It works fine in Visual Studio 2005 Professional in a solution which mixes C# and C++ - it uses the right function names for both pieces of code and does an integrated analysis. You can trace for time or memory.
It will be a pity when the evaluation period expires :)
We've had good results from AQTime. It's not free but is cheaper than Visual Studio ;-)