I'm trying to profile some C/C++ code inside Docker using pprof from gperftools (not pprof for golang).
When running it locally I have no problem. However, when running it in a Docker environment, pprof is unable to find the files/lines of the functions and instructions.
I'm installing gperftools (including pprof) the following way:
FROM debian:11
# Add dependencies
RUN apt-get update && export DEBIAN_FRONTEND=noninteractive && apt-get -y install \
wget git cmake build-essential lsb-release software-properties-common && \
rm -rf /var/lib/apt/lists/*
# Add libunwind
RUN wget -q https://github.com/libunwind/libunwind/releases/download/v1.6.2/libunwind-1.6.2.tar.gz && \
tar -xzf libunwind-1.6.2.tar.gz && \
cd libunwind-1.6.2 && \
./configure && \
make -j $(cat /proc/cpuinfo | grep "cpu cores" | uniq | awk '{print $NF}') && \
make install && \
cd .. && \
rm -rf libunwind-1.6.2 libunwind-1.6.2.tar.gz
# Add pprof
RUN wget -q https://github.com/gperftools/gperftools/releases/download/gperftools-2.10/gperftools-2.10.tar.gz && \
tar -xzf gperftools-2.10.tar.gz && \
cd gperftools-2.10 && \
./configure && \
make -j $(cat /proc/cpuinfo | grep "cpu cores" | uniq | awk '{print $NF}') && \
make install && \
cd .. && \
rm -rf gperftools-2.10 gperftools-2.10.tar.gz
Note that I'm installing libunwind manually because the apt installation does not install properly in system paths on Debian.
When I profile a program (compiled without optimization or inlining, and with debug symbols) and run:
pprof --no_strip_temp --text --files myexec prof.out
I get:
Total: 802 samples
802 100.0% 100.0% 802 100.0% ?
whereas outside Docker I get the proper list of files. Listing samples per function does work, though: the functions are properly named and profiled.
When profiling at line level, the file and line that should appear at the end of each line are replaced by a question mark:
Total: 802 samples
0 0.0% 0.0% 802 100.0% __libc_start_main@@GLIBC_2.2.5 ?
0 0.0% 0.0% 802 100.0% _start ?
0 0.0% 0.0% 802 100.0% main ?
140 17.5% 17.5% 802 100.0% my_func ?
130 16.2% 33.7% 130 16.2% my_namespace::Calculator::set_a ?
130 16.2% 49.9% 130 16.2% my_namespace::Calculator::set_b ?
47 5.9% 55.7% 114 14.2% my_namespace::Calculator::mul ?
20 2.5% 58.2% 85 10.6% my_namespace::Calculator::add ?
70 8.7% 67.0% 70 8.7% recurse ?
65 8.1% 75.1% 65 8.1% add ?
65 8.1% 83.2% 65 8.1% mul ?
28 3.5% 86.7% 62 7.7% my_namespace::Calculator::sub ?
34 4.2% 90.9% 34 4.2% sub ?
31 3.9% 94.8% 31 3.9% my_namespace::Calculator::get_a ?
18 2.2% 97.0% 18 2.2% double addT<double> ?
14 1.7% 98.8% 14 1.7% double addT<int> ?
8 1.0% 99.8% 8 1.0% my_namespace::Calculator::get_b ?
2 0.2% 100.0% 2 0.2% _init ?
However, the file names and lines are present in the symbols:
> objdump -d -l ./myexec
[...]
0000000000001735 <sub>:
sub():
/path/to/my/project/libs/sub/sub.c:4
1735: 55 push %rbp
1736: 48 89 e5 mov %rsp,%rbp
1739: f2 0f 11 45 f8 movsd %xmm0,-0x8(%rbp)
173e: f2 0f 11 4d f0 movsd %xmm1,-0x10(%rbp)
/path/to/my/project/libs/sub/sub.c:5
1743: f2 0f 10 45 f8 movsd -0x8(%rbp),%xmm0
1748: f2 0f 5c 45 f0 subsd -0x10(%rbp),%xmm0
174d: 66 48 0f 7e c0 movq %xmm0,%rax
/path/to/my/project/libs/sub/sub.c:6
1752: 66 48 0f 6e c0 movq %rax,%xmm0
1757: 5d pop %rbp
1758: c3 retq
1759: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
I tried the recent, maintained Go version of pprof: it works, and I get the file paths and lines.
However, it does not handle template functions properly (it is impossible to distinguish between two instantiations of the same C++ template function).
Any clue about this problem?
Or is there a way, with the Go version of pprof, to get function template arguments?
It's possible to loop mount a binary file that contains a filesystem image. I'd like to put that binary file into a C static variable, and then mount that. Is this possible? If so, what C API magic do I need?
There are several steps we'll want to take:
1. Create a file system image.
2. Embed this image into the binary.
3. Mount the embedded image as a read-only file system.
It sounds like you already know how to perform steps 1 and 2, but not how to do step 3.
I prepared an ab.sqfs image and an a.out which contains that image at offset 0x3010. Here are the commands to mount this filesystem:
# optional, look at the bytes of the filesystem from step 1
xxd -l 16 -g1 ab.sqfs
00000000: 68 73 71 73 07 00 00 00 6c 61 ce 60 00 00 02 00 hsqs....la.`....
# optional: confirm that we have the correct file offset to the start of FS image
xxd -l 16 -g1 -s 0x3010 a.out
00003010: 68 73 71 73 07 00 00 00 6c 61 ce 60 00 00 02 00 hsqs....la.`....
# create a loop device which "points" into the file:
sudo losetup -r -o 0x3010 loop0 a.out
losetup: a.out: Warning: file does not fit into a 512-byte sector; the end of the file will be ignored.
# optional: confirm that (just created) /dev/loop0 contains expected bytes
sudo xxd -l 16 -g1 /dev/loop0
00000000: 68 73 71 73 07 00 00 00 6c 61 ce 60 00 00 02 00 hsqs....la.`....
# create directory on which the FS will be mounted
mkdir /tmp/mnt
# finally mount the FS:
sudo mount -oro /dev/loop0 /tmp/mnt
# optional: verify contents of /tmp/mnt
ls -lR /tmp/mnt
... has exactly the files I've put into it.
what C API magic do I need?
You can run the losetup and mount commands under strace to observe what they do. The key steps for losetup are:
openat(AT_FDCWD, "/tmp/a.out", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/dev/loop0", O_RDONLY|O_CLOEXEC) = 4
ioctl(4, LOOP_SET_FD, 3) = 0
ioctl(4, LOOP_SET_STATUS64, {lo_offset=0x3010, lo_number=0, lo_flags=LO_FLAGS_READ_ONLY, lo_file_name="/tmp/a.out", ...}) = 0
And for mount:
mount("/dev/loop0", "/tmp/mnt", "squashfs", MS_RDONLY, NULL) = 0
These calls can be performed by the application itself, or by "shelling out" to external losetup and mount commands.
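The syscall sequence above can also be issued directly from a program. Below is a minimal sketch, written in Python rather than C only to keep it short; the ioctl numbers, the flag value and the struct layout are taken from <linux/loop.h>, so a C program would use the same ones. The function names pack_loop_info64 and attach_and_mount are hypothetical, the code must run as root, and it assumes the chosen loop device is currently free.

```python
import ctypes
import fcntl
import os
import struct

# Values from <linux/loop.h> and <sys/mount.h>
LOOP_SET_FD = 0x4C00
LOOP_SET_STATUS64 = 0x4C04
LO_FLAGS_READ_ONLY = 1
MS_RDONLY = 1

# struct loop_info64: 5 x u64, 4 x u32, two 64-byte names,
# a 32-byte key and 2 x u64 (232 bytes in total)
LOOP_INFO64 = struct.Struct("=5Q4I64s64s32s2Q")


def pack_loop_info64(offset, file_name, flags=LO_FLAGS_READ_ONLY):
    """Build a loop_info64 buffer with only offset, flags and name set."""
    return LOOP_INFO64.pack(
        0, 0, 0,          # lo_device, lo_inode, lo_rdevice
        offset, 0,        # lo_offset, lo_sizelimit
        0, 0, 0, flags,   # lo_number, lo_encrypt_type, lo_encrypt_key_size, lo_flags
        file_name[:63],   # lo_file_name (NUL-padded by struct)
        b"", b"",         # lo_crypt_name, lo_encrypt_key
        0, 0,             # lo_init
    )


def attach_and_mount(image_path, offset, loop_dev, mount_point, fstype):
    """Bind loop_dev to image_path at offset, then mount it read-only."""
    backing_fd = os.open(image_path, os.O_RDONLY | os.O_CLOEXEC)
    loop_fd = os.open(loop_dev, os.O_RDONLY | os.O_CLOEXEC)
    try:
        fcntl.ioctl(loop_fd, LOOP_SET_FD, backing_fd)
        info = pack_loop_info64(offset, os.fsencode(image_path))
        fcntl.ioctl(loop_fd, LOOP_SET_STATUS64, info)
    finally:
        os.close(loop_fd)
        os.close(backing_fd)

    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.mount(os.fsencode(loop_dev), os.fsencode(mount_point),
                    fstype.encode(), ctypes.c_ulong(MS_RDONLY), None)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

With the values used above, the call would be attach_and_mount("/tmp/a.out", 0x3010, "/dev/loop0", "/tmp/mnt", "squashfs"). In a real program the offset would come from the address of the embedded image rather than a hard-coded constant.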
I'm having trouble with encrypted, base64-encoded values that I'm using in Google Deployment Manager via runtimeconfig.v1beta1.config resource declarations.
After I perform the deployment, the value that I stored using Deployment Manager appears to be quite different from what I retrieve using gcloud beta runtime-config. As a result, I can't decrypt the value.
First I encrypted and base64 encoded some secret text:
$ echo "secret"|gcloud kms encrypt --key my-crypto-key \
--keyring my-keyring --location australia-southeast1 \
--plaintext-file - --ciphertext-file - | base64 -w0
CiQAsOSNmVXBs2ayUjRePnE5+Oi5dUPuVvjn6UKKUXgxMTA56koSMABDkVUGnXlocFgdUEsQ5qLCF3PVIz5zit+ZCSXjSvNzEAO5XRv6WBRkxBJMjVcheg==
Which I then store in a deployment manager YAML file:
resources:
- name: my-config
type: runtimeconfig.v1beta1.config
properties:
config: my-config
description: "A demo configuration"
- name: dummy-secret
type: runtimeconfig.v1beta1.variable
properties:
parent: $(ref.my-config.name)
variable: 'dummy/secret'
value: "CiQAsOSNmVXBs2ayUjRePnE5+Oi5dUPuVvjn6UKKUXgxMTA56koSMABDkVUGnXlocFgdUEsQ5qLCF3PVIz5zit+ZCSXjSvNzEAO5XRv6WBRkxBJMjVcheg=="
Then I create the deployment (which completes without errors or warnings):
$ gcloud deployment-manager deployments create my-config \
--config my-config.yaml
But when I try extracting the variable value, it is completely different from what I stored:
$ gcloud beta runtime-config configs variables \
get-value 'dummy/secret' --config-name my-config|base64 -w0
CiQAPz8/P1U/P2Y/UjRePnE5Pz8/dUM/Vj8/P0I/UXgxMTA5P0oSMABDP1UGP3locFgdUEsQPz8/F3M/Iz5zPz8/CSU/Sj9zEAM/XRs/WBRkPxJMP1cheg==
This is repeatable / reproducible and I haven't a clue what I'm doing wrong. I don't have this problem using gcloud beta runtime-config variables set followed by get-value.
Looking at the decoded base64 binary of your content, we notice that all the bytes with values >= 0x80 have been changed to 0x3F, ASCII '?'. We suspect you're passing the binary data through the shell or some other pipe which isn't binary-clean.
Corrupted value:
dierks@dierks:~$ base64 -d | hexdump -C
CiQAPz8/P1U/P2Y/UjRePnE5Pz8/dUM/Vj8/P0I/UXgxMTA5P0oSMABDP1UGP3locFgdUEsQPz8/F3M/Iz5zPz8/CSU/Sj9zEAM/XRs/WBRkPxJMP1cheg==
00000000 0a 24 00 3f 3f 3f 3f 55 3f 3f 66 3f 52 34 5e 3e |.$.????U??f?R4^>|
00000010 71 39 3f 3f 3f 75 43 3f 56 3f 3f 3f 42 3f 51 78 |q9???uC?V???B?Qx|
00000020 31 31 30 39 3f 4a 12 30 00 43 3f 55 06 3f 79 68 |1109?J.0.C?U.?yh|
00000030 70 58 1d 50 4b 10 3f 3f 3f 17 73 3f 23 3e 73 3f |pX.PK.???.s?#>s?|
00000040 3f 3f 09 25 3f 4a 3f 73 10 03 3f 5d 1b 3f 58 14 |??.%?J?s..?].?X.|
00000050 64 3f 12 4c 3f 57 21 7a |d?.L?W!z|
00000058
Original value:
dierks@dierks:~$ base64 -d | hexdump -C
CiQAsOSNmVXBs2ayUjRePnE5+Oi5dUPuVvjn6UKKUXgxMTA56koSMABDkVUGnXlocFgdUEsQ5qLCF3PVIz5zit+ZCSXjSvNzEAO5XRv6WBRkxBJMjVcheg==
00000000 0a 24 00 b0 e4 8d 99 55 c1 b3 66 b2 52 34 5e 3e |.$.....U..f.R4^>|
00000010 71 39 f8 e8 b9 75 43 ee 56 f8 e7 e9 42 8a 51 78 |q9...uC.V...B.Qx|
00000020 31 31 30 39 ea 4a 12 30 00 43 91 55 06 9d 79 68 |1109.J.0.C.U..yh|
00000030 70 58 1d 50 4b 10 e6 a2 c2 17 73 d5 23 3e 73 8a |pX.PK.....s.#>s.|
00000040 df 99 09 25 e3 4a f3 73 10 03 b9 5d 1b fa 58 14 |...%.J.s...]..X.|
00000050 64 c4 12 4c 8d 57 21 7a |d..L.W!z|
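The substitution visible in the two hexdumps can also be checked mechanically. A minimal Python sketch, using only the two base64 strings from the question; the helper name clobber_high_bytes is hypothetical:

```python
import base64

ORIGINAL_B64 = (
    "CiQAsOSNmVXBs2ayUjRePnE5+Oi5dUPuVvjn6UKKUXgxMTA56koSMABDkVUGnXlocFgd"
    "UEsQ5qLCF3PVIz5zit+ZCSXjSvNzEAO5XRv6WBRkxBJMjVcheg=="
)
RETRIEVED_B64 = (
    "CiQAPz8/P1U/P2Y/UjRePnE5Pz8/dUM/Vj8/P0I/UXgxMTA5P0oSMABDP1UGP3locFgd"
    "UEsQPz8/F3M/Iz5zPz8/CSU/Sj9zEAM/XRs/WBRkPxJMP1cheg=="
)


def clobber_high_bytes(data: bytes) -> bytes:
    """Replace every byte >= 0x80 with ASCII '?' (0x3F)."""
    return bytes(0x3F if b >= 0x80 else b for b in data)


original = base64.b64decode(ORIGINAL_B64)
retrieved = base64.b64decode(RETRIEVED_B64)

# Is the retrieved value just the original with its non-ASCII bytes replaced?
print(clobber_high_bytes(original) == retrieved)

# Offsets at which the two values differ
print([i for i, (a, b) in enumerate(zip(original, retrieved)) if a != b])
```

If the first line prints True, the ciphertext was not altered arbitrarily: some step between storing and retrieving treated the binary data as ASCII text.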
Suppose I have an ELF binary prog and suppose objdump -d prog produces output along the following lines [snippet]:
0000000000400601 <.cstart_c941>:
400601: eb 01 jmp 400604 <.end_c941>
0000000000400603 <.cslot_c941>:
400603: 84 .byte 0x84
0000000000400604 <.end_c941>:
400604: 48 81 ec 80 00 00 00 sub $0x80,%rsp
40060b: 50 push %rax
40060c: 53 push %rbx
40060d: 56 push %rsi
40060e: 48 31 c0 xor %rax,%rax
400611: 48 c7 c6 41 06 40 00 mov $0x400641,%rsi
What I need is the file offset corresponding to .cslot_c941, since I need to modify the byte at this position.
How would I accomplish this task?
You can get objdump to print file offsets by using the -F option. From the objdump documentation:
objdump
..snip..
[-F|--file-offsets]
..snip..
Try using objdump -DF prog. You should see each label listed together with its file offset, like this:
0000000000400601 <.cstart_c941>: (File Offset: 0xXXXXXXXX)
0xXXXXXXXX should be the file offset of that label.
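For reference, the file offset of an address can also be computed by hand: for the PT_LOAD segment that contains the address, it is p_offset + (vaddr - p_vaddr). A sketch in Python under that assumption; the segment values below are hypothetical (the real ones come from readelf -l prog), and vaddr_to_file_offset and patch_byte are illustrative helpers, not part of any tool:

```python
def vaddr_to_file_offset(vaddr, load_segments):
    """Map a virtual address to a file offset.

    load_segments is an iterable of (p_vaddr, p_offset, p_filesz)
    tuples, one per PT_LOAD program header.
    """
    for p_vaddr, p_offset, p_filesz in load_segments:
        if p_vaddr <= vaddr < p_vaddr + p_filesz:
            return p_offset + (vaddr - p_vaddr)
    raise ValueError(f"address {vaddr:#x} is not backed by file contents")


def patch_byte(path, offset, value):
    """Overwrite a single byte of the file at the given offset."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(bytes([value]))


# Hypothetical segment: loaded at 0x400000 from file offset 0, 0x1000 bytes long
segments = [(0x400000, 0x0, 0x1000)]
print(hex(vaddr_to_file_offset(0x400603, segments)))  # 0x603 for these assumed values
```

Once the offset is known, patch_byte("prog", offset, new_value) would rewrite the byte at .cslot_c941; working on a copy of the binary is safer.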
I try to profile my program myprog using perf, and here's what I get:
#
# Overhead Symbol Shared Object
# ........ ................................................................... .....................................
#
7.71% 0x743a l [.] list_iter_next myprog
I use objdump -D to see which instruction the IP refers to.
The thing is, the 0x743a IP shown here falls in a .debug section of myprog:
$ grep -ne ' 743a' dump
418233: 743a: 65 gs
429445: 743a: 40 00 00 add %al,(%rax)
The hex value provided by perf could match several places in the dump, as shown by:
$ grep -ne 743a dump
7973: 40743a: 48 8b 00 mov (%rax),%rax
72861: 44743a: 66 0f f8 c8 psubb %xmm0,%xmm1
87650: 45743a: 41 d3 e9 shr %cl,%r9d
The correct IP is 0x40743a, as shown here:
$ grep -n4 40743a dump
7969-0000000000407430 <list_iter_next>:
7970- 407430: 48 8b 07 mov (%rdi),%rax
7971- 407433: 48 8b 40 08 mov 0x8(%rax),%rax
7972- 407437: 48 89 07 mov %rax,(%rdi)
7973: 40743a: 48 8b 00 mov (%rax),%rax
7974- 40743d: c3 retq
7975- 40743e: 66 90 xchg %ax,%ax
7976-
Does anybody know what's going on?
Have you compiled your program with debug options (-g with gcc)? It seems that debug information is missing, as explained in the perf tutorial at https://perf.wiki.kernel.org/index.php/Tutorial:
When the symbol is printed as an hexadecimal address, this is because the ELF image does not have a symbol table. This happens when binaries are stripped.
As for the symbol value you get, I don't know where it comes from or whether it can be interpreted the way you did.
As per http://www.vmware.com/support/esx15/doc/esx15_runvm5.html, how would one go about generating the UUID format that is specified in the docs?
I'd like this to be a command line utility so I can re-use this in an automation script.
Command:
uuidgen | perl -ne '{ s/-//g; s/(.{2})/\1 /g; substr($_,23,1,"-"); print ; }'
Outputs:
CB 7B E9 47 F7 55 42 42-AC 16 46 C1 E9 08 35 53
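For an automation script that already uses Python, the same formatting can be done without uuidgen and perl. A small sketch; the function name vmware_uuid is hypothetical, and the digits are uppercased only to match the sample output above:

```python
import uuid


def vmware_uuid(u: uuid.UUID) -> str:
    """Format a UUID as 16 space-separated hex pairs with a dash in the middle."""
    h = u.hex.upper()
    pairs = [h[i:i + 2] for i in range(0, 32, 2)]
    return " ".join(pairs[:8]) + "-" + " ".join(pairs[8:])


print(vmware_uuid(uuid.uuid4()))
```

Unlike the perl one-liner, this version does not leave a trailing space after the last pair.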