Profiling embedded application - c++

I have an application that runs on an embedded processor (ARM), and I'd like to profile it to get an idea of where it's using system resources such as CPU, memory, and IO. The application runs on top of Linux, so I'm assuming there are a number of profiling tools available. Does anyone have any suggestions?
Thanks!
edit: I should also add the version of Linux we're using is somewhat old (2.6.18). Unfortunately I don't have a lot of control over that right now.

As bobah said, gprof and valgrind are useful. You might also want to try OProfile. If your application is in C++ (as indicated by the tags), you might want to consider disabling exceptions (if your compiler lets you) and avoiding dynamic casts, as mentioned above by sashang. See also Embedded C++.

If your Linux is not very limited, then you may find gprof and valgrind useful.

On a related note, the C++ working group published a technical report on the performance cost of various C++ language features. For example, they analyze the cost of a dynamic_cast one or two levels deep. The report is here: http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf and it might give you some insight into where the pain points in your embedded application might be.
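If you want a rough feel for the dynamic_cast cost on your own target rather than relying on the report's numbers, something like the following might do. It is only a sketch: the Base/Mid/Leaf hierarchy and the function names are made up for illustration, and it assumes a C++11 toolchain, which an older embedded setup may not have.

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical two-level hierarchy, only here to give dynamic_cast some work.
struct Base { virtual ~Base() = default; };
struct Mid : Base {};
struct Leaf : Mid { int value = 1; };

struct CastResult {
    std::size_t hits;   // number of objects that really were a Leaf
    double seconds;     // wall time spent in the cast loop
};

// Build n objects, alternating Leaf (even index) and Mid (odd index).
std::vector<std::unique_ptr<Base>> make_objects(std::size_t n) {
    std::vector<std::unique_ptr<Base>> objects;
    objects.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        if (i % 2 == 0) {
            objects.push_back(std::unique_ptr<Base>(new Leaf()));
        } else {
            objects.push_back(std::unique_ptr<Base>(new Mid()));
        }
    }
    return objects;
}

// Cast every Base* two levels down to Leaf* and time the whole loop
// with a monotonic clock.
CastResult count_leaves(const std::vector<std::unique_ptr<Base>>& objects) {
    const auto start = std::chrono::steady_clock::now();
    std::size_t hits = 0;
    for (const auto& obj : objects) {
        if (dynamic_cast<const Leaf*>(obj.get()) != nullptr) {
            ++hits;
        }
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {hits, elapsed.count()};
}
```

Call count_leaves(make_objects(N)) with a large N on the actual device and compare against a loop that avoids the cast. Treat the timing as indicative only, since it depends heavily on the compiler, the optimization level, and the hierarchy depth.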

gprof may disappoint you.
Assuming the program you are testing is big enough to be useful, then chances are the call tree could be pruned, so the best opportunities for optimization are function/method calls that you can remove or avoid. That link shows a good way to find them.
Many people approach this as sort of a hierarchical sleuthing process of measuring times.
Or you can simply catch it in the act, which is what I do.

Related

What's the best multi-thread application debugger for C++ apps?

I'm looking for a good multi-thread-aware debugger, capable of showing performance charts of application threads on Linux. I don't know if such a thing exists, perhaps as an Eclipse plugin.
The idea would be to track per-thread memory allocation and CPU usage, as well as being able to interrupt a thread and examine its stack trace, local vars, etc.
It does not have to be an Eclipse plugin or a free tool. Have any of you heard of something similar?
Qt Creator does provide information on a per-thread basis. It also has the features you would expect from any standard debugger. (Watches, breakpoints, etc.)
Although designed for compiling Qt applications, it can be used for just about any C++ project. (I have used it for compiling/editing a non-Qt app before.)
TotalView (and MemoryScape) doesn't do precisely what you're asking for in its default presentation, but it provides the data you need. It costs money, but a better C++ debugger for Linux cannot be found.
Free trials are available, and there are a number of cool and useful videos on their support site.
If you're on Linux, you've got access to one of the most powerful debugging tools in the trade: Valgrind. Read about it, especially about its additional tools like Helgrind.
Sure, the visualisation is lacking compared to commercial tools, but you can't beat its level of detail.

Good c++ profiler for GCC

I tried to find a related question, but all previous questions are about profilers for native C++ on Windows. I googled for a while and learned about gprof, but the output of gprof actually contained a lot of obscure internal functions. Is there a good open-source C++ profiler with good documentation?
Valgrind
I totally recommend this
http://en.wikipedia.org/wiki/Valgrind
Don't use gprof, for the reasons given here.
What you need are stackshots, explained here. One way to take stackshots is the pstack utility. Another way is to use "Pause" or ctrl-break under the debugger. Also lsstack, if you can get a copy.
If you want to spend money, RotateRight makes a nice tool based on stack sampling called Zoom.
Compile using the flag -pg and use gprof.
If you don't mind the KDE library dependencies, KCachegrind is very useful with the added visualization. It depends on Callgrind and Valgrind, as one could have guessed, so no special compiler flags required during compile-time.
I've heard oprofile is really, really good for real time apps. Linux only though, AFAIK.
How much detail do you need in your profile reports? If you just want to do some really simple time profiling for a few functions, then the functionality available via the C++11 chrono classes makes it easy to profile in a cross-platform, cross-compiler way.
See this article for some simple profiling code that works similarly to Matlab's super easy to use tic and toc functions.
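The linked article's code isn't reproduced here, but a tic/toc-style timer on top of std::chrono could look roughly like this. The Stopwatch class and the sum_to workload are illustrative names of my own, not taken from the article.

```cpp
#include <chrono>

// Hypothetical tic/toc stopwatch built on std::chrono::steady_clock,
// which is monotonic and so unaffected by wall-clock adjustments.
class Stopwatch {
public:
    Stopwatch() : start_(Clock::now()) {}

    // Restart the measurement ("tic").
    void tic() { start_ = Clock::now(); }

    // Seconds elapsed since construction or the last tic() ("toc").
    double toc() const {
        return std::chrono::duration<double>(Clock::now() - start_).count();
    }

private:
    using Clock = std::chrono::steady_clock;
    Clock::time_point start_;
};

// Example workload so there is something to time: sum of 0..n-1.
unsigned long long sum_to(unsigned long long n) {
    unsigned long long total = 0;
    for (unsigned long long i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Typical use:
//   Stopwatch sw;                      // implicit tic
//   unsigned long long s = sum_to(100000000ULL);
//   double seconds = sw.toc();         // time spent in sum_to
```

This only measures wall time around the code you wrap yourself, so it suits checking a handful of suspect functions, not finding hotspots you don't already know about.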

Performance profiling on Linux

What are the best tools for profiling C/C++ applications on *nix?
(I'm hoping to profile a server that is a mix of (blocking) file IO, epoll for network and fork()/execv() for some heavy lifting; but general help and more general tools are all also appreciated.)
Can you get the big system picture of RAM, CPU, network and disk all in one overview, and drill into it?
There's been a lot of talk on the kernel lists about things like perf timechart, but I haven't found anything turning up in Ubuntu yet.
I recommend taking stackshots, for which pstack is useful. Here's some more information:
Comments on gprof.
How stackshots work.
A blow-by-blow example.
A very short explanation.
If you want to spend money, Zoom looks like a pretty good tool.
For performance, you can try Callgrind, a Valgrind tool. Here is a nice article showing it in action.
Compile with -pg, run the program, and then use gprof
Compiling (and linking) with -pg adds profiling code and the profiling libraries to the executable, which then produces a file called gmon.out that contains the timing information. gprof displays call graphs and their (absolute and relative) timings.
See man gprof for details.
The canonical example of a full system profiling tool (for Solaris, OS X, FreeBSD) is DTrace. But it is not yet fully available on Linux (you can try here but the site is down for me at the moment, and I haven't tried it myself). There are many tools, in various states of usefulness, for doing full system profiling and kernel profiling on Linux.
You might consider investigating:
oprofile
SystemTap
bootchart
strace (e.g. this SO answer)
Allinea MAP is a profiler for C++ and other native languages on Linux. It is commercially supported by my employer. It has a graphical interface and source-line level profiling and profiles code with almost no slowdown which makes it very accurate where timing of other subsystems is relevant - such as for IO.
Callgrind has been useful and accurate - but the slowdown was ~5x so I could only do smaller runs. It can actually count the number of times a function is called which is useful for understanding asymptotic behavior.
A description of using -pg and gprof is here: http://www.ibm.com/developerworks/library/l-gnuprof.html
oprofile might interest you. Ubuntu should have all the packages you need.
If you can take your application to FreeBSD, OS X, or Solaris, you can use DTrace, although DTrace is an analyst-oriented tool, i.e., you need to drive it (read: script it). Nothing else can give you the level of granularity you need; DTrace can not only profile the latencies of function calls in user-land, it can also follow a context switch into the kernel.
As mentioned in the accepted answer, Zoom can do some amazing things. I've used it to understand thread behavior all the way down to optimizing the assembly generated by the compiler.
The FOSS answer, as already mentioned, is to build with -pg and then use gprof to analyse the output. If it's a product/project that justifies throwing some money at it, I would also be tempted to use IBM/Rational's Quantify profiler, as that makes it easier to view the profiling data, drill down to the line level, or look at it at a '10000ft' level.
Of course there might be a viewer for gprof available that can do the same thing, but I am not aware of any.

C++ Code Profiler

Can anybody recommend a good code profiler for C++?
I came across Shiny - any good? http://sourceforge.net/projects/shinyprofiler/
Callgrind for Unix/Linux
DevPartner for Windows
Not C++ specific, but AMD's CodeAnalyst software is free and is feature-packed.
http://developer.amd.com/cpu/codeanalyst/codeanalystwindows/Pages/default.aspx
Gprof if you use gcc. It may not be user friendly but still useful.
You will probably be interested in Intel VTune. It is rather useful and lets you collect low-level events like cache misses, which helps a lot in tuning.
Quantify (part of the IBM/Rational PurifyPlus package) is a very good profiler, but not exactly cheap. It is available on several platforms, too - I've used it on Solaris, Windows and Linux.
Depends on what you need to do:
1. Measure, so you can do regression testing to see if changes in performance happened.
2. Find reasons for suboptimal performance and optimize them.
These are not the same.
For 1, use one of the recommended profilers.
For 2, the profiler I much prefer is one you already have:
http://www.wikihow.com/Optimize-Your-Program%27s-Performance
To see how this goes, check this out.
For C++, as for C# and any language that encourages layers of abstraction, those layers may or may not be good from a software engineering standpoint, but they can kill performance. Every method call is a detour in the execution of your program, and the style encourages you to nest those things, sometimes needlessly. Also the style discourages you from knowing or caring what goes on inside them. You may find them creating and deleting objects underneath at a rate and level of generality far beyond what your application really needs.
AQtime (for Windows)
If you are running a Premium version of VS 2010 then you get a profiler with it.
I've also used a couple of other free ones, but they don't compare to the one MS ships. Useful as a second opinion though.
If you have access to a Mac, then I recommend using Shark from the CHUD tools.
You can use the analyzer that's in Sun Studio 12 on Linux or Solaris. It's free. http://developers.sun.com/sunstudio/index.jsp
If you cannot locate DevPartner it is because we've moved under new ownership. Check us out on the Micro Focus website: http://www.microfocus.com/products/micro-focus-developer/devpartner/index.aspx. Shameless plug: I work on the DevPartner team. Our long awaited 64-bit versions of BoundsChecker and C++/.NET profilers ship on February 4, 2011. We've changed our pricing model so you can choose either the whole suite or just the performance profiler if that's what you need. Please check out the new DPS 10.5 release when it goes live!

Richer logging/tracing status for C++ applications

There are plenty of logging/trace systems for letting your program output data or strings or state as it runs. Most of these let you print arbitrary strings which you can view live or after your program runs.
I noticed an ad here on SO for Smartinspect which seems to take this to a higher level, giving stack traces for each log, fancier options like plotting graphs and data values which change over time, and a lot of polish to the basic idea of a simple list of output text strings.
Since I use C++, Smartinspect won't even work for me.
A little googling finds tons of logging frameworks, but nothing that seems to do anything more than text dumps. Are there fancier tools (similar to Smartinspect?) that do more? Commercial or open source is fine, and multiplatform is a big plus.
I know this is probably not the answer you are looking for, but I would suggest that such a framework will be very hard (if not impossible) to find for C++. Something like dumping the stack cannot be done in a portable way as it can in a language like Java, which not only shares a common runtime across all platforms, but provides powerful introspection capabilities too.
I don't program in Java, but my guess is that it can provide a stack-trace in the same way as Python: the stack is probably just another object in the runtime which can be inspected and manipulated.
C++, on the other hand, has none of these niceties: it's meant to be a close-to-the-metal language that basically adds object orientation to C (I'm sure others will come up with much more elaborate explanations of C++'s benefits over C, but that's another discussion).
In short, C++ is not rich enough at the level required to provide the kind of features you require in a generic way. There may be some platform-specific code out there that could get some of this info at defined points for you, but it certainly won't be standards-compliant, cross-platform C++.
With regards to graphs etc, that sounds much more like post-processing, which you should either be able to find something for, or more likely, you can perhaps output your log messages in a format which can be interpreted by some of these existing tools.
Another thing you could look at is integrating with syslogd, for which there may also be richer analysis tools (this would give you a capability along the lines of the one advertised for SmartInspect, that is, TCP/IP-based logging).
NB: a lot of what I said here about C++ comes from previous experiences trying to find decent frameworks in C++ to do tweaky, introspective type things (such as proper mock objects etc).
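To make the syslog suggestion a bit more concrete, here is a minimal sketch using the POSIX openlog/syslog/closelog calls from <syslog.h>. The SyslogSession class, the format_field and log_field helpers, and the key=value layout are illustrative assumptions of mine; pick whatever format your analysis tool actually parses.

```cpp
#include <syslog.h>
#include <string>

// Hypothetical RAII wrapper around openlog()/closelog().
// Note: openlog() may keep the ident pointer rather than copying it, so pass
// a string that outlives the session (a string literal is the easy choice).
class SyslogSession {
public:
    explicit SyslogSession(const char* ident) {
        openlog(ident, LOG_PID, LOG_USER);
    }
    ~SyslogSession() { closelog(); }

    SyslogSession(const SyslogSession&) = delete;
    SyslogSession& operator=(const SyslogSession&) = delete;
};

// Format one key=value pair so that post-processing tools can split it easily.
std::string format_field(const std::string& key, long value) {
    return key + "=" + std::to_string(value);
}

// Send the formatted pair to the system logger at INFO priority.
// The fixed "%s" format keeps user data out of the format string.
void log_field(const std::string& key, long value) {
    syslog(LOG_INFO, "%s", format_field(key, value).c_str());
}

// Typical use:
//   SyslogSession session("myapp");
//   log_field("queue_depth", 17);
```

Whether messages are forwarded over the network depends on how the syslog daemon on the machine is configured, not on this code.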
I wrote an article about dumping the stack in C/C++ on Windows and Unix/Linux for DDJ some years ago. Maybe it helps you:
See http://www.ddj.com/architect/185300443
If you can restrict yourself to a certain platform, you can add stack traces to your logs manually. For example, we use the glibc functionality for getting stack traces on Linux to attach them to our exception class. There is similar functionality available on Windows, but as mentioned, these infrastructures are not portable.
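As a rough idea of what that can look like (this is not our actual class, just a sketch with a made-up TracedError name), glibc's backtrace() and backtrace_symbols() from <execinfo.h> can be called in the exception's constructor:

```cpp
#include <execinfo.h>   // glibc-specific: backtrace(), backtrace_symbols()
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type that records the call stack at the point
// where it is constructed. Linux/glibc only, not portable C++.
class TracedError : public std::runtime_error {
public:
    explicit TracedError(const std::string& what_arg)
        : std::runtime_error(what_arg), frames_(capture()) {}

    // One human-readable line per captured stack frame.
    const std::vector<std::string>& frames() const { return frames_; }

private:
    static const int kMaxFrames = 64;

    static std::vector<std::string> capture() {
        void* addresses[kMaxFrames];
        const int count = backtrace(addresses, kMaxFrames);

        std::vector<std::string> lines;
        // backtrace_symbols() returns one malloc'ed array that the caller frees.
        char** symbols = backtrace_symbols(addresses, count);
        if (symbols == nullptr) {
            return lines;
        }
        for (int i = 0; i < count; ++i) {
            lines.push_back(symbols[i]);
        }
        std::free(symbols);
        return lines;
    }

    std::vector<std::string> frames_;
};

// Typical use:
//   try { throw TracedError("bad state"); }
//   catch (const TracedError& e) { /* write e.what() and e.frames() to the log */ }
```

Linking with -rdynamic usually helps backtrace_symbols() show function names instead of bare addresses, and the names are still mangled unless you demangle them yourself. Capturing allocates memory, so this is not something to do on an out-of-memory path.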