Systemtap simple userspace example (function tracing, Ubuntu)? - c++

(I've spent quite some time getting this to work, so I thought I'd document it - first, to put it formally as a question):
Is there a simple example of systemtap probing/tracing functions in a user-space application, preferably in C++? My system is Ubuntu 14.04:
$ uname -a
Linux mypc 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ g++ --version
g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4 ...
$ stap --version
Systemtap translator/driver (version 2.3/0.158, Debian version 2.3-1ubuntu1.4 (trusty))

OK, so this didn't turn out to be trivial - first of all, I somehow ended up with a (newer) kernel 4.2.0 on Ubuntu 14.04; and apparently the systemtap that comes with Ubuntu 14.04 is too old for that kernel (see below). That means that I had to build systemtap from source - this was my procedure:
cd /path/to/src
git clone git://sourceware.org/git/elfutils.git elfutils_git
git clone git://sourceware.org/git/systemtap.git systemtap_git
cd systemtap_git
./configure --with-elfutils=/path/to/src/elfutils_git --prefix=/path/to/src/systemtap_git/local --enable-docs=no
make
make install
# after this, there are `stap` executables in:
# /path/to/src/systemtap_git/stap
# /path/to/src/systemtap_git/local/bin/stap
Two things matter here:
You shouldn't build elfutils separately and then systemtap - instead, pass the elfutils source directory to --with-elfutils of systemtap's configure, which will then configure and build elfutils as well.
You MUST run make install for systemtap, even if the prefix is a non-system/private (local) directory! - otherwise, some errors occur (unfortunately, I didn't log them).
After building, stap reports version:
$ ./stap --version
Systemtap translator/driver (version 3.2/0.170, commit release-3.1-331-g0efba6fc74c8 + changes) ...
Ok, so I found a basic Fibonacci C++ example for analysis, which I slightly modified, and called /tmp/fibo.cpp:
// based on: http://www.cplusplus.com/articles/LT75fSEw/
#include <iostream>
using namespace std;
class Fibonacci{
public:
int a, b, c;
void generate(int);
void doFibonacciStep(int);
};
void Fibonacci::doFibonacciStep(int istep){
c = a + b;
cout << " istep: " << istep << " c: " << c << endl;
a = b;
b = c;
}
void Fibonacci::generate(int n){
a = 0; b = 1;
cout << " Start: a "<< a << " b " << b << endl;
for(int i=1; i<= n-2; i++){
doFibonacciStep(i);
}
}
int main()
{
cout << "Hello world! Fibonacci series" << endl;
cout << "Enter number of items you need in the series: ";
int n;
cin >> n;
Fibonacci fibonacci;
fibonacci.generate(n);
return 0;
}
First I tried compiling it like this:
cd /tmp
g++ -g fibo.cpp -o fibo.exe
Now, the first thing that we want to do, is to figure out which functions are available for probing in our executable; for that, we can use stap -L (note, here I'm still using the old, Ubuntu 14.04 system stap):
$ stap -L 'process("/tmp/fibo.exe").function("*").call'
process("/tmp/fibo.exe").function("_GLOBAL__sub_I__ZN9Fibonacci15doFibonacciStepEi").call
process("/tmp/fibo.exe").function("__static_initialization_and_destruction_0").call $__initialize_p:int $__priority:int
process("/tmp/fibo.exe").function("doFibonacciStep#/tmp/fibo.cpp:13").call $this:class Fibonacci* const $istep:int
process("/tmp/fibo.exe").function("generate#/tmp/fibo.cpp:20").call $this:class Fibonacci* const $n:int
process("/tmp/fibo.exe").function("main#/tmp/fibo.cpp:28").call
Nice - so I'd like to probe/trace the doFibonacciStep and its input argument, istep. So I try from the command line:
$ sudo stap -e 'probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call { printf("stap do step: %d\n", $istep) }' -c /tmp/fibo.exe
WARNING: "__tracepoint_sched_process_fork" [/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko] undefined!
WARNING: "__tracepoint_sys_exit" [/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko] undefined!
WARNING: "__tracepoint_sys_enter" [/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko] undefined!
WARNING: "__tracepoint_sched_process_exec" [/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko] undefined!
WARNING: "__tracepoint_sched_process_exit" [/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko] undefined!
ERROR: Couldn't insert module '/tmp/stap51A5tV/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko': Unknown symbol in module
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed. [man error::pass5]
Tip: /usr/share/doc/systemtap/README.Debian should help you get started.
$ sudo stap -e 'probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call { printf("stap do step: %d\n", $istep) }' -c /tmp/fibo.exe
ERROR: Couldn't insert module '/tmp/stapmo60OW/stap_ab5b824c79b38b5207910696c49c4e22_1760.ko': Unknown symbol in module
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed. [man error::pass5]
Ouch, errors like these - not good. The post "__tracepoint_sched_process_fork undefined" when run systemstap script explains that basically the stap version is too old for the kernel that I have - which required the building from source (above). So let's see now how the new stap -L works:
$ /path/to/src/systemtap_git/stap -L 'process("/tmp/fibo.exe").function("*").call'
process("/tmp/fibo.exe").function("_GLOBAL__sub_I__ZN9Fibonacci15doFibonacciStepEi#/tmp/fibo.cpp:37").call
process("/tmp/fibo.exe").function("__do_global_dtors_aux").call
process("/tmp/fibo.exe").function("__libc_csu_fini").call
process("/tmp/fibo.exe").function("__libc_csu_init").call
process("/tmp/fibo.exe").function("__static_initialization_and_destruction_0#/tmp/fibo.cpp:37").call $__initialize_p:int $__priority:int
process("/tmp/fibo.exe").function("_fini").call
process("/tmp/fibo.exe").function("_init").call
process("/tmp/fibo.exe").function("_start").call
process("/tmp/fibo.exe").function("deregister_tm_clones").call
process("/tmp/fibo.exe").function("doFibonacciStep#/tmp/fibo.cpp:13").call $this:class Fibonacci* const $istep:int
process("/tmp/fibo.exe").function("frame_dummy").call
process("/tmp/fibo.exe").function("generate#/tmp/fibo.cpp:20").call $this:class Fibonacci* const $n:int
process("/tmp/fibo.exe").function("main#/tmp/fibo.cpp:28").call
process("/tmp/fibo.exe").function("register_tm_clones").call
Nice, this is already a bit more verbose than the old version. Anyways, I'd like to probe the doFibonacciStep function, and its input argument, here $istep. So I write this on the command line:
$ sudo /path/to/src/systemtap_git/stap -e 'probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call { printf("stap do step: %d\n", $istep) }' -c /tmp/fibo.exe
semantic error: while processing probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep#/tmp/fibo.cpp:13").call from: process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call
semantic error: No cfa_ops supplied, but needed by DW_OP_call_frame_cfa: identifier '$istep' at <input>:1:107
source: probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call { printf("stap do step: %d\n", $istep) }
Pass 2: analysis failed. [man error::pass2]
Ouch - a nasty error, and doesn't really tell me anything - there are very few bug reports on this error (and mostly from 2010). So I was about to get stuck here, when for some reason, I remembered that the other day, I compiled some programs with -gdwarf-2 (for reasons I've forgotten by now); so I thought I'd try it - and whaddayaknow, it actually started working now:
$ g++ -gdwarf-2 fibo.cpp -o fibo.exe
$ sudo /path/to/src/systemtap_git/stap -e 'probe process("/tmp/fibo.exe").function("Fibonacci::doFibonacciStep").call { printf("stap do step: %d\n", $istep) }' -c /tmp/fibo.exe
Hello world! Fibonacci series
Enter number of items you need in the series: 5
Start: a 0 b 1
istep: 1 c: 1
istep: 2 c: 2
istep: 3 c: 3
stap do step: 1
stap do step: 2
stap do step: 3
Nice! Note that the stap prints are actually printed after the program has finished (that is, they are not interleaved with the actual program output at the points where they occurred).
Instead of specifying the probe points and behavior directly on the command line, we could write a script instead - so here is check-do-step.stp - here with some extra stuff:
#!/usr/bin/env stap
global stringone = "Testing String One"
global stringtwo = "Testing String Two"
probe begin {
printf("begin: %s\n", stringone)
#exit() # must have; else probe end runs only upon Ctrl-C if we only have `begin` and `end` probes!
}
probe process(
"/tmp/fibo.exe"
).function(
"Fibonacci::doFibonacciStep"
).call {
printf("stap do step: %d\n", $istep)
}
probe end {
newstr = "We Are " . stringtwo . " And We're Done" # string concat
printf("%s\n", newstr)
}
... and with this script, our call and results look like this:
$ sudo /path/to/src/systemtap_git/stap check-do-step.stp -c /tmp/fibo.exe
Hello world! Fibonacci series
Enter number of items you need in the series: begin: Testing String One
6
Start: a 0 b 1
istep: 1 c: 1
istep: 2 c: 2
istep: 3 c: 3
istep: 4 c: 5
stap do step: 1
stap do step: 2
stap do step: 3
stap do step: 4
We Are Testing String Two And We're Done
Notice again - the begin: Testing ... string does not hit at the very start as we'd otherwise expect, but only after the program already started generating output.
Well, I guess this is it - certainly good enough for me, for a simple example...

Related

How to sort the output of C++ program (stdout) via "| sort"

I have a working program in C++ that creates a List and makes it possible to fill that list with items (add), remove items, and print items.
I want to test that the add function works, so I create and run test.cc:
#include "List.h"
#include <string>
using namespace std;
int main()
{
List s;
s.add("OMG Milk Factory", "Milk", 140, 2);
s.add("Just Milk", "Milk", 80, 4);
s.print(cout);
return 0;
}
Because the print function shuffles the items before printing, the output might be:
140 2 Milk OMG Milk Factory
80 4 Milk Just Milk
or
80 4 Milk Just Milk
140 2 Milk OMG Milk Factory
I created a bash script and I want to sort the output of test.cc by piping it to sort, but I do not know how. This is what I have, and it doesn't work:
compile_and_run() {
rm -f ./a.out
LANG=C run -C Build "g++ -std=c++17 -Wall -I. ~/Documents/testcase/$1 libhw2.a && ./a.out"
}
compile_and_run test.cc | sort
test "Add 2 Element Function Test" exact '140 2 Milk "OMG Milk Factory"\n80 4 Milk Just Milk\n' stdout
How to correctly use | sort ?
In your case you might just have to do
compile_and_run test.cc | sort -k1,1n
so that it sorts numerically in ascending order on the first column.

Out of bounds with find and replace in C++

I seem to be running into an odd situation in C++ that I do not understand. When I execute a function that parses and replaces strings (Roman numerals), I end up going out of bounds if the string is not present:
Terminal output:
Mac Shell: CPP/>$ ./Roman2Num
Retrieving input:
------------------
Enter a number: 24
input: XXIV
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
Abort trap: 6
Mac Shell: CPP/>$ ./Roman2Num
Retrieving input:
------------------
Enter a number: 29
input: XXVIV
Roman: XXIX
Mac Shell: CPP/>$ ./Roman2Num
Retrieving input:
------------------
Enter a number: 1999
input: MDCDLXLVIV
Roman: MDCDLXLIX
Mac Shell: CPP/>$ ./Roman2Num
Retrieving input:
------------------
Enter a number: 1998
input: MDCDLXLVIII
libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: basic_string
Abort trap: 6
Mac Shell: CPP/>$
Code written:
string Cleanup(string Roman){
int count = 0;
printf("input: %s\n", Roman.c_str());
size_t w = Roman.find("VIV");
Roman.replace(w, std::string("VIV").length(), "IX");
/* size_t x = Roman.find("LIX");
Roman.replace(x, std::string("LIX").length(), "IL");
size_t y = Roman.find("VIV");
Roman.replace(y, std::string("VIV").length(), "IX");
size_t z = Roman.find("VIV");
Roman.replace(z, std::string("VIV").length(), "IX");*/
return Roman;
}
I have been doing some reading here:
http://www.cplusplus.com/reference/string/string/replace/
Does anyone see what I am doing wrong?
Am I making this way harder than it needs to be?
You need to check for and protect against the "Not Found" condition.
size_t w = Roman.find("VIV");
if (w != string::npos) {
Roman.replace(w, string("VIV").length(), "IX");
}
When the string is not found, Roman.find() returns string::npos, which equals (std::string::size_type)-1. Passing that value to replace() as the position is what throws std::out_of_range.
See here: http://www.cplusplus.com/reference/string/string/find/

How to use boost program_options to read an integer array?

I am using Ubuntu and boost v1.50.
Previously I used boost program_options to feed a set of options into a program like so:
#!/bin/bash
./prog --arg1 1 --arg2 "2" --arg3 {1,2,3} --arg4 {1,2} --arg5 5
So I am dealing with a mix of single integers, strings and integer arrays.
This worked fine.
However, after "improving" the code by introducing local variables in bash, I have:
#!/bin/bash
a1=1
a2="2"
a3={1,2,3}
a4={1,2}
a5=5
./prog --arg1 $a1 --arg2 $a2 --arg3 $a3 --arg4 $a4 --arg5 $a5
Executing this results in an error:
error: the argument ('{1,2,3}') for option '--arg3' is invalid
I have set up the boost program_options like this:
namespace po = boost::program_options;
using namespace std;
try{
po::options_description desc("Allowed options");
desc.add_options()
("help", "produce help message")
("arg1", po::value<int>(&arg1)->required(), "doc1")
("arg2", po::value<string>(&arg2)->default_value("test"), "doc2")
("arg3", po::value<vector<int> >(&arg3)->multitoken(), "doc3")
("arg4", po::value<vector<int> >(&arg4)->multitoken(), "doc4")
("arg5", po::value<int>(&arg5)->default_value(1), "doc5")
;
po::variables_map vm;
po::store(po::parse_command_line(ac, av, desc), vm);
po::notify(vm);
if(vm.count("help")) cout << desc << "\n";
}
catch(exception& e){
cerr << "error: " << e.what() << "\n";
errorflag=1;
}
catch(...){
cerr << "Exception of unknown type!\n";
errorflag=1;
}
Where did I go wrong? Is multitoken not appropriate in this context? What can I use instead? Is it not possible to read integer arrays?
I tried omitting multitoken but then it fails also.
Using quotation marks around the local variable in the bash script does not help either.
If I change the array input from {a,b,c} to "a b c", it is ok. However, I already have a large number of entries in the other format and I would like to continue using it, as other programs depend on it too.
I think it should be doable, since it worked without the local variables. Does someone know how?
EDIT: I have been mistaken. "a b c" does NOT work as input :(
EDIT 2: I came up with a little workaround:
I convert {a,b,c} -> a b c using
argnew=`echo ${arg:1:-1} | tr ',' ' '`
and then feeding it to the program works fine. Is that the best solution?
Changing your original script to add the -x bash debugging option, like this:
#!/bin/bash -x
./prog --arg1 1 --arg2 "2" --arg3 {1,2,3} --arg4 {1,2} --arg5 5
and then running it shows this output:
+ ./prog --arg1 1 --arg2 2 --arg3 1 2 3 --arg4 1 2 --arg5 5
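So your curly brace groupings aren't working the way you think they're working, because bash expands them during command line processing, before invoking ./prog. In your second script no such expansion happens (bash does not brace-expand the result of a variable expansion), so ./prog receives the literal string {1,2,3}.
You can get it working if, in your second script, you change the assignments for a3 and a4 to be like this: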
a3='1 2 3'
a4='1 2'
and then double-quote all your variables when you call ./prog:
./prog --arg1 "$a1" --arg2 "$a2" --arg3 "$a3" --arg4 "$a4" --arg5 "$a5"

dtrace execute action only when the function returns to a specific module

I'm tracking some libc functions with dtrace. I want to make a predicate that only executes the action when the function returns to an address inside a specific module given in the parameters.
copyin(uregs[R_ESP],1) on the return probe should give the return address, I think; I'm not entirely sure of it, so it would be nice if someone could confirm.
But then I need a way to resolve that address to a module. Is this possible, and how?
There is a ucaller variable which will give you the
saved program counter as a uint64_t and umod() will
translate it into the corresponding module name, e.g.
# dtrace -n 'pid$target:::entry {@[umod(ucaller)]=count()}' -p `pgrep -n xscreensaver`
dtrace: description 'pid$target:::entry ' matched 14278 probes
^C
xscreensaver 16
libXt.so.4 73
libX11.so.4 92
libxcb.so.1 141
libc.so.1 144
^C#
However, umod() is an action (as opposed to a subroutine); it
cannot be assigned to an lvalue and therefore cannot be used in
an expression (because the translation is deferred until the address
is received by the dtrace(1) user-land program).
Fortunately, there's nothing stopping you from finding the address
range occupied by libc in your process and comparing it with ucaller.
Here's an example on Solaris (where a hardware-specific libc is
mounted at boot time):
# mount | fgrep libc
/lib/libc.so.1 on /usr/lib/libc/libc_hwcap1.so.1 read/write/setuid/devices/rstchown/dev=30d0002 on Sat Jul 13 20:27:32 2013
# pmap `pgrep -n gedit` | fgrep libc_hwcap1.so.1
FEE10000 1356K r-x-- /usr/lib/libc/libc_hwcap1.so.1
FEF73000 44K rwx-- /usr/lib/libc/libc_hwcap1.so.1
FEF7E000 4K rwx-- /usr/lib/libc/libc_hwcap1.so.1
#
I'll assume that the text section is the one with only
read & execute permissions, but note that in some
circumstances the text section will be writeable.
# cat Vision.d
/*
* self->current is a boolean indicating whether or not execution is currently
* within the target range.
*
* self->next is a boolean indicating whether or not execution is about to
* return to the target range.
*/
BEGIN
{
self->current = 1;
}
pid$target:::entry
{
self->current = (uregs[R_PC] >= $1 && uregs[R_PC] < $2);
}
syscall:::return
/pid==$target/
{
self->next = self->current;
self->current = 0;
}
pid$target:::return
{
self->next = (ucaller >= $1 && ucaller < $2);
}
pid$target:::return,syscall:::return
/pid==$target && self->next && !self->current/
{
printf("Returning to target from %s:%s:%s:%s...\n",
probeprov, probemod, probefunc, probename);
ustack();
printf("\n");
}
pid$target:::return,syscall:::return
/pid==$target/
{
self->current = self->next;
}
# dtrace -qs Vision.d 0xFEE10000 0xFEF73000 -p `pgrep -n gedit`
This produces results like
Returning to target from pid2095:libcairo.so.2.10800.10:cairo_bo_event_compare:return...
libcairo.so.2.10800.10`cairo_bo_event_compare+0x158
libc.so.1`qsort+0x51c
libcairo.so.2.10800.10`_cairo_bo_event_queue_init+0x122
libcairo.so.2.10800.10`_cairo_bentley_ottmann_tessellate_bo_edges+0x2d
libcairo.so.2.10800.10`_cairo_bentley_ottmann_tessellate_polygon+0
.
.
.
Returning to target from syscall::pollsys:return...
libc.so.1`__pollsys+0x15
libc.so.1`poll+0x81
libxcb.so.1`_xcb_conn_wait+0xb5
libxcb.so.1`_xcb_out_send+0x3b
libxcb.so.1`xcb_writev+0x65
libX11.so.4`_XSend+0x17c
libX11.so.4`_XFlush+0x30
libX11.so.4`XFlush+0x37

Is it possible to print only user defined variables by NM?

There are a lot of system variables in output of nm it looks like this
N
_CRT_MT
_CRT_fmode
_CRT_glob
Dictionary::variable4
namespace1::variable1
__cpu_features
__crt_xc_end__
__crt_xc_start__
__crt_xi_end__
__crt_xi_start__
__crt_xl_start__
__crt_xp_end__
__crt_xp_start__
__crt_xt_end__
__crt_xt_start__
__tls_end__
__tls_start__
__xl_a
__xl_c
__xl_d
__xl_z
_argc
_argv
_bss_end__
_bss_start__
_data_end__
_data_start__
_end__
_fmode
_tls_end
_tls_index
_tls_start
_tls_used
mingw_initltsdrot_force
mingw_initltsdyn_force
mingw_initltssuo_force
variable0
variable10
Is it possible to print only user defined variables - in this case variable10, variable0, namespace1::variable1, Dictionary::variable4, N?
Not that I know of. But at least you can safely filter all variables starting with double underscore or underscore + uppercase letter, as these are reserved for the implementation:
$ nm -j foo | grep -v '^_[A-Z]\|^__\+[A-Za-z]'
N
Dictionary::variable4
namespace1::variable1
_argc
_argv
_bss_end__
_bss_start__
_data_end__
_data_start__
_end__
_fmode
_tls_end
_tls_index
_tls_start
_tls_used
mingw_initltsdrot_force
mingw_initltsdyn_force
mingw_initltssuo_force
variable0
variable10
You can probably filter more by adding additional patterns that reliably denote implementation-defined identifiers.
Alternatively, create an empty executable (i.e. one which contains no user-defined symbols) and compute the difference of the output of nm on each executable via comm:
$ # Preparation
$ echo 'int main() { }' > mt.cpp
$ g++ -o mt.out mt.cpp
$ nm -j mt.out > mt.symbols
$
$ nm -j your_exe > your_exe.symbols
$ comm -23 your_exe.symbols mt.symbols
N
Dictionary::variable4
namespace1::variable1
variable0
variable10