Callgrind: How to use the Callgrind tool to evaluate function speed - C++

I am interested in testing the speed of some function calls in code written in C/C++. I searched and was directed to use the Valgrind platform with the Callgrind tool.
I have briefly read the manual, but I am still wondering how I can use the tool to measure the runtime of my functions.
I was wondering if I could get some pointers on how to achieve my goal.
Any help would be appreciated.

Compile your program with debug symbols (e.g. GDB symbols work fine; they are enabled with the -ggdb flag).
If you are executing your program like this:
./program
Then run it under Valgrind's Callgrind tool with this command:
valgrind --tool=callgrind ./program
Callgrind will then produce a file called callgrind.out.1234 (1234 is the process ID and will probably be different when you run it). Open this file with:
callgrind_annotate callgrind.out.1234
You may want to use grep to extract your function name. The left column shows the number of instructions executed for each function. Note that functions which account for a comparatively small number of instructions are omitted from the output by default.
If you want to see the output with some nice graphics, I would recommend installing KCachegrind.
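As a concrete end-to-end sketch (the program and the sum_of_squares function below are made up for illustration), suppose you want to measure one hot function:

#include <cstdio>

// Hypothetical hot function whose cost we want to measure.
long sum_of_squares(long n) {
    long total = 0;
    for (long i = 0; i < n; ++i)
        total += i * i;
    return total;
}

int main() {
    std::printf("%ld\n", sum_of_squares(100000000));
}

Build with debug symbols, run under Callgrind, and filter the annotation for the function:
g++ -ggdb program.cpp -o program
valgrind --tool=callgrind ./program
callgrind_annotate callgrind.out.1234 | grep sum_of_squares
The last line shows the instruction count (Ir) attributed to sum_of_squares, which you can compare between implementations.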

Related

Visualize the call graph output of trace-cmd

I am interested in learning about KVM. I started trace-cmd as follows:
trace-cmd record -p function_graph -g kvm_vcpu_ioctl -o nofuncgraph-irqs qemu-system-x86_64 ***
I analyzed the output file using KernelShark. It is my first time using it, so I might not know how to use it properly. In any case, I found it hard to find the start and end of function calls.
What I am interested in is seeing the functions as a flame graph, though I am not interested in timing at all, only in the hierarchy. Alternatively, I would like to see them the way an IDE does, where I can collapse the blocks of functions I am not interested in.
Is there a way to generate a flame graph from the trace-cmd output, or a tool to help visualize the call graph it generates?

Can you profile a single call of a function with perf?

I have a C++ function that I want to profile, and only that function. One possible way is to use std::chrono to measure the time it takes to run that function and print it, run the program a few times, and then do statistics on the samples.
I am wondering if I can skip having to explicitly code the time measurements and just ask perf to focus on the time spent in a specified function.
Have a look at Google's benchmark library to micro-benchmark the function of interest.
You can then profile the resulting executable as usual using perf.
For example, let's say that, following the basic usage, you generated an executable named mybenchmark. Then you can run perf on the binary as usual:
$ perf stat ./mybenchmark
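For reference, a minimal benchmark following the library's basic usage could look like this (my_function is a stand-in for whatever you want to profile):

#include <benchmark/benchmark.h>

void my_function();  // stand-in for the function under test

// The library runs the loop body repeatedly, choosing the iteration
// count itself so the per-call timing becomes statistically stable.
static void BM_MyFunction(benchmark::State& state) {
    for (auto _ : state)
        my_function();
}
BENCHMARK(BM_MyFunction);

BENCHMARK_MAIN();

Build it against the library (e.g. g++ bench.cpp -lbenchmark -lpthread -o mybenchmark) and the perf invocation above works unchanged.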
You can build a flame graph of the whole application in SVG format. With a flame graph you can quickly see which functions take most of the CPU time. The SVG flame graph is interactive: you can click any function and see a detailed flame graph for just that function. From the description of flame graphs:
It is also interactive: mouse over the SVGs to reveal details, and click to zoom.
You can try it in action with a sample bash flame graph:
http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg
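To build such a flame graph for the benchmark above, assuming you have cloned Brendan Gregg's FlameGraph repository (which provides stackcollapse-perf.pl and flamegraph.pl), the usual recipe is:
perf record -g ./mybenchmark
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flame.svg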

Non-instrumenting debugger for Tcl scripts using the Tcl library and/or Tcl internals?

I would like to know if it is possible to build a Tcl script debugger using the Tcl library API and/or Tcl internal interfaces (that is, whether they contain sufficient data to do so). I've noticed that existing Tcl debuggers instrument Tcl scripts and work with this additional layer. My idea was to use Tcl_CreateObjTrace to trace every evaluated command and use it as a point to retrieve the call stack, locals, etc. The problem is that it seems not all information is accessible from the API at the time of evaluation. For example, I would like to know which line is currently being evaluated, but the Interp has such info only for top-level evaluations (iPtr->cmdFramePtr->line is empty for procedures' bodies). Has anyone tried such an approach? Does it make any sense? Maybe I should look into the hashed entries in the Interp? Any clues and opinions would be appreciated (ideally for Tcl 8.5).
Your best bet for a non-intrusive debugging system might be to try using an execution step trace (invoked for each command executed while the command to which the trace is attached is running) together with info frame to actually get the information. Here's a simple version, attached to source so that you can watch an entire script:
proc traceinfo args {
    puts [dict get [info frame -2] cmd]
}
trace add execution source enterstep traceinfo
source yourscript.tcl
Be prepared for plenty of output. The dictionary from info frame can have all sorts of relevant entries, such as the line number of the command and the source file it came from; the cmd entry is the unsubstituted source of the command called (if you want the substituted version, see the arguments passed to the trace callback, traceinfo above).

Combine several google-pprof files (pprof CPU profiler)

I want to use the CPU profiler from google-perftools (gperftools' libprofiler.so), which is described here:
http://gperftools.googlecode.com/svn/trunk/doc/cpuprofile.html
In my setup I want to run the program to be profiled several times (up to 1500; each run is different) and then combine the pprof outputs from all runs (or from some subset of runs) into a single pprof file.
How can I do this?
PS: My program uses almost no shared libraries, so only a single binary (ELF) file will be analyzed.
PPS: Thanks to Chris, pprof can read several profiles at once:
pprof ./program first.pprof.out second.pprof.out ...
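On the collection side, a sketch of the multi-run workflow (the run*.prof file names are arbitrary): link the program against libprofiler, point the CPUPROFILE environment variable at a different file for each run, and then hand all the files to pprof together:
for i in $(seq 1 1500); do
    CPUPROFILE=run$i.prof ./program   # each run may use different inputs
done
pprof --text ./program run*.prof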

How can I find out what functions have been called from what classes?

I'm using a physics toolkit (Geant4) which consists of several thousand C++ header and class files. In order to use the toolkit, you have to write a series of your own class files that give the toolkit some basic information about what you are trying to model. You then write a main() file which registers these files with the toolkit, run 'make', and execute the final program. I am using Ubuntu 10.10 as the platform.
I'd like to better understand how the toolkit operates. Specifically, I'd like to find out what functions in what class files, throughout the toolkit, are called and in what order, when the program runs.
One somewhat brute-force method would be to label each function in each file, e.g. insert cout << "File Name, Function Name" << endl; as the first statement in each function body, and have all of this output to a text file. However, there are some 3000 files I'd need to go through, which would be somewhat... time-consuming.
Is there an easier way of finding out what functions have been called? I've searched through the toolkit's manual and, unless I have missed something, I see no way of doing this via the toolkit itself. I guess I need some terminal command or an external program?
Any help, suggestions or advice would be greatly appreciated!
On Ubuntu you'll have a choice of profilers.
I personally love
valgrind --tool=callgrind ./myprogram
kcachegrind
for this, because it creates very good call graphs and statistics (tree-map visualizations).
The big FAQ profiler topic is here: How can I profile C++ code running in Linux?
Off the top of my head: gprof (needs instrumentation), oprofile, and perf record -g are easy to get started with as well.
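If what you want is an actual call trace (which function called which, and in what order) rather than aggregate statistics, GCC's -finstrument-functions flag can automate the "print at the top of every function" idea from the question without editing 3000 files. A minimal sketch; the two hook names are fixed by GCC, everything else is illustrative:

#include <cstdio>

// GCC inserts calls to these two hooks on entry to and exit from every
// function compiled with -finstrument-functions. The attribute stops
// the hooks from instrumenting (and thus recursing into) themselves.
extern "C" {

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* fn, void* call_site) {
    // Raw addresses; resolve them to names with addr2line afterwards.
    std::fprintf(stderr, "enter %p (from %p)\n", fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* fn, void* call_site) {
    std::fprintf(stderr, "exit  %p\n", fn);
}

}

Compile your code (and as much of the toolkit as you rebuild) with g++ -finstrument-functions -g, link this file in, and run the program; addr2line -f -e ./program <address> turns the logged addresses back into function names.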