How to profile an app running inside a KVM guest

Is there any way to profile an application running inside a KVM guest using a tool like perf_events?
I've tried to do that using
perf kvm --guestkallsyms=.. --guestmodules=.. --guest record -a
but the information in the report is not useful:
# ========
#
# Samples: 627 of event 'cache-misses'
# Event count (approx.): 295421
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
73.18% :15661 [x_tables] [g] 0xffffffff8176bc80
26.82% :15661 [unknown] [u] 0x00000000004004fe
#
# (For a higher level overview, try: perf report --sort comm,dso)
#

No.
The perf tool runs in the host and does not have any way to get information about the applications in the guest. I think the attribution of samples to guest-kernelspace or guest-userspace is based on the cpu-mode at the time the sample was taken (not on higher-level information about what the guest is doing).
You can get some profiling information by running perf directly in the guest. Use perf list to see the options (they are probably all in the 'software' category).
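Whether the guest exposes hardware counters can be checked from inside the guest before choosing events. The sketch below assumes that perf stat marks unusable events with <not supported> or <not counted> in its output; event_supported is a hypothetical helper name, not part of perf.

```shell
# Hypothetical helper: succeeds when the captured `perf stat` text
# does not flag the event as unusable.
# $1: output captured from something like
#     perf stat -e cycles true 2>&1
event_supported() {
    case "$1" in
        *"<not supported>"*|*"<not counted>"*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example use inside the guest (commented out so that sourcing
# this file does not start perf):
#   out=$(perf stat -e cycles true 2>&1)
#   if event_supported "$out"; then
#       perf record -e cycles -a sleep 10
#   else
#       perf record -e cpu-clock -a sleep 10
#   fi
```

If the check fails for cycles, a software event such as cpu-clock is the likely fallback inside the guest.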

Yes, you probably can. The host can see the guest. You can use raw hardware events to do so (just check that the event number is available on your system).
For me this works as an example:
sudo perf kvm stat -I 1000 -e r1a8 -a
To confirm that you are really monitoring the guest, shut down the KVM machine after a while; the counts should drop to zero.

Yes. What about
sudo perf kvm stat record -p appPID
According to the perf kvm help this should work, but it does not. It works fine in system-wide mode with -a.

Statistics for DPDK interfaces

I want to get RX/TX statistics like bytes or packets sent/received for DPDK enabled interfaces. Similar to data present in the /proc/net/dev file. How can I get this?
I tried the command
./dpdk-procinfo -- --stats
But I get the following error.
The command that I use for the primary application.
./tas --ip-addr=10.0.0.1/24 --shm-len=1073741824 --dpdk-extra="-w 01:00.1" --fp-cores-max=4
I get the following output on ldd
[EDIT] Based on a debug session with Ashwin, it was found that the primary application is compiled against DPDK 19.11 while procinfo was run with DPDK 17.11.4. Running the matching version for primary and secondary works with l2fwd. The application still needs its CFLAGS and LDFLAGS cleaned up; I suggested the same.
Solution: always run dpdk-procinfo with the same DPDK version as the primary.
Please have a look at http://doc.dpdk.org/api/rte__ethdev_8h.html. The APIs rte_eth_stats_get and rte_eth_xstats_get do the job for you. They can be invoked in both the primary and the secondary application of DPDK.
But if you are looking for a ready-made solution, please take a look at the dpdk-procinfo application. The binary for the target is present in the target folder/app while the source code is present in dpdk-root/app/procinfo.
A quick way to test this is to refer to https://doc.dpdk.org/guides-18.08/tools/proc_info.html. The sample command lines are ./dpdk-procinfo -- --stats and ./dpdk-procinfo -- --xstats.
[EDIT]
As per the comment, if the primary is run with whitelisted PCIe devices, pass the same whitelist to dpdk-procinfo.

profiling linux application with perf record

I've been trying to profile my C++ application in Linux by following this article on perf record. My understanding is all I need to do is run perf record program [program_options], where program is the program executable and [program options] are the arguments I want to pass to the program. However, when I try to profile my application like this:
perf record ./csvJsonTransducer -enable-AVX-deletion test.csv testout.json
perf returns almost immediately with a report. It takes nearly 30 seconds to run ./csvJsonTransducer -enable-AVX-deletion test.csv testout.json without perf, though, and I want perf to monitor my program for the entirety of its execution, not return immediately. Why is perf returning so quickly? How can I make it take the entire run of my program into account?
Your command seems OK. Try changing the paranoid level in /proc/sys/kernel/perf_event_paranoid. Setting this parameter to -1 (as root) should solve permission issues:
echo "-1" > /proc/sys/kernel/perf_event_paranoid
You can also try to set the event that you want to monitor with perf record. The default event is cycles (if supported). Check man perf-list.
Try the command:
perf record -e cycles ./csvJsonTransducer -enable-AVX-deletion test.csv testout.json
to force the monitoring of cycles.
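Before changing the setting it can help to see what the current level means. The sketch below maps a perf_event_paranoid level to the restriction it adds for unprivileged users, as described in the kernel's sysctl documentation; describe_paranoid is a hypothetical helper, and the exact semantics can differ between kernel versions.

```shell
# Hypothetical helper: print the restriction that a given
# perf_event_paranoid level adds for unprivileged users.
# Levels are cumulative: a higher level keeps the lower restrictions.
describe_paranoid() {
    level=$1
    if [ "$level" -le -1 ]; then
        echo "no restrictions for unprivileged users"
    elif [ "$level" -eq 0 ]; then
        echo "raw tracepoint access is restricted"
    elif [ "$level" -eq 1 ]; then
        echo "CPU-wide event access is restricted"
    else
        echo "kernel profiling is restricted"
    fi
}

# Example use (commented out; the file only exists on Linux):
#   describe_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
```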

perf.data file has no samples

I am using perf 3.0.4 on Ubuntu 11.10. Its record command works well and reports 256 collected samples on the terminal. But when I run perf report, it gives me the following error:
perf.data file has no samples
I have searched a lot for a solution, but with no success yet.
This thread has some useful information: http://www.spinics.net/lists/linux-perf-users/msg01436.html
It seems that if you are running in a VM that does not expose the PMU to the guest, the default collection (-e cycles) won't work. Try running with -e cpu-clock. According to that thread, the OP had the same problem also in a real host running Ubuntu 10.04, so it might solve it for you too...
The number of samples reported by the perf record command is an approximation and not the exact number of events (see the perf wiki).
To get the accurate number of events, dump the raw file and use wc -l to count the number of results:
perf report -D -i perf.data | grep RECORD_SAMPLE | wc -l
This command should report 0 in your case where perf report says it can't find events.
Please tell us more about how you use perf record: which event you are sampling, on which hardware, and with which program.
EDIT: You can first try to increase the sampling period or frequency with the -c or -F options.
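The counting step and the cpu-clock suggestion from the answers above can be combined into one small script. This is only a sketch: count_samples and record_with_fallback are hypothetical helper names, the output file name perf.data is assumed, and the wrapper runs the profiled program a second time when the first recording is empty.

```shell
# Print the number of PERF_RECORD_SAMPLE entries found on stdin.
# Expects the text produced by `perf report -D`.
count_samples() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'PERF_RECORD_SAMPLE' || true
}

# Hypothetical wrapper: record with the cycles event and retry with
# cpu-clock when the raw dump contains no samples.
# Usage: record_with_fallback ./my_program arg1 arg2
record_with_fallback() {
    perf record -o perf.data -e cycles -- "$@"
    samples=$(perf report -D -i perf.data | count_samples)
    if [ "$samples" -eq 0 ]; then
        echo "no samples with cycles, retrying with cpu-clock" >&2
        perf record -o perf.data -e cpu-clock -- "$@"
    fi
}
```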
Whenever I run into this on a machine where perf record has worked in the past, it is because I have left something else running that uses the performance counters, e.g., I have perf top running in another terminal tab.
In this case, it seems that perf record simply doesn't record any PMU related samples.

Increasing shared memory on OSX to properly install PostgreSQL

This is my first stackoverflow post. I am trying to set up PostgreSQL to use with Django. Very new to all of this (took one course in Python in college, now trying to teach myself a little web development).
The installation guide for PostgreSQL says:
"Before running the installation, please ensure that your system is
configured to allow the use of larger amounts of shared memory. Note that
this does not 'reserve' any memory so it is safe to configure much higher
values than you might initially need. You can do this by editting the
file /etc/sysctl.conf - e.g.
% sudo vi /etc/sysctl.conf
On a MacBook Pro with 2GB of RAM, the author's sysctl.conf contains:
kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.maxprocperuid=512
kern.maxproc=2048
Note that (kern.sysv.shmall * 4096) should be greater than or equal to
kern.sysv.shmmax. kern.sysv.shmmax must also be a multiple of 4096.
Once you have edited (or created) the file, reboot before continuing with
the installation. If you wish to check the settings currently being used by
the kernel, you can use the sysctl utility:
% sysctl -a
The database server can now be installed."
I am running a fresh-out-of-the-box MBA with 4GB of RAM. How do I set this up properly? Thanks in advance.
Just download the installer and click "ok" to get started. When everything is running, you can always increase memory settings and edit postgresql.conf to get better performance.
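If you do decide to edit /etc/sysctl.conf, the two constraints quoted from the installation guide (shmall * 4096 must be at least shmmax, and shmmax must be a multiple of 4096) can be checked before rebooting. The sketch below does only that arithmetic; check_shm is a hypothetical helper, and the page size of 4096 is taken from the guide rather than queried from the system.

```shell
# Hypothetical helper: validate a shmmax/shmall pair against the two
# rules quoted from the installation guide.
# $1: kern.sysv.shmmax (bytes), $2: kern.sysv.shmall (pages),
# $3: page size in bytes (defaults to 4096, as assumed by the guide)
check_shm() {
    shmmax=$1
    shmall=$2
    page=${3:-4096}
    if [ $(( shmmax % page )) -ne 0 ]; then
        echo "shmmax is not a multiple of $page"
        return 1
    fi
    if [ $(( shmall * page )) -lt "$shmmax" ]; then
        echo "shmall * $page is smaller than shmmax"
        return 1
    fi
    echo "ok"
}

# Example with the values from the guide:
#   check_shm 1610612736 393216
# Example with the values currently used by the kernel:
#   check_shm "$(sysctl -n kern.sysv.shmmax)" "$(sysctl -n kern.sysv.shmall)"
```

With the guide's values, 393216 * 4096 is exactly 1610612736, so both rules hold.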

oprofile on Linux running in a virtual machine

I'm running an Ubuntu 10.04 Linux VM using VirtualBox. I'm trying to use oprofile to profile an application in the virtual machine. I've installed oprofile 0.9.6 but I cannot get it to work. When I try to start it I get the following error:
opcontrol --start
/usr/local/bin/opcontrol: line 323: /usr/local/bin/ophelp: cannot execute binary file
/usr/local/bin/opcontrol: line 1483: /usr/local/bin/oprofiled: cannot execute binary file
Couldn't start oprofiled.
Check the log file "/var/lib/oprofile/samples/oprofiled.log" and kernel syslog
I'm not sure whether VirtualBox can provide access to the performance counters (if you have any pointers on this, that would be great), so I switched oprofile to the timer interrupt like so:
opcontrol --deinit
/usr/local/bin/opcontrol: line 323: /usr/local/bin/ophelp: cannot execute binary file
Unloading oprofile module
root@dev-ubuntu-10:/usr/local/bin# /sbin/modprobe oprofile timer=1
root@dev-ubuntu-10:/usr/local/bin# opcontrol --init
But it is still not working and I'm getting the same error. Is it even possible to run oprofile in a VM?
Thanks
I've tried something similar in the past, only with VMware Fusion and a different profiler, and run into the same problem. It seems that access to the performance registers and other low level stuff that profilers need is just not feasible in a VM. You'll need a real machine for profiling, I'm afraid.
This error:
/usr/local/bin/ophelp: cannot execute binary file
usually means that you are attempting to execute an x86_64 binary on a 32-bit kernel.
What do file /usr/local/bin/ophelp and uname -a print?
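The comparison behind that question can be sketched as a small check. It assumes an x86 system, that file describes the binary with the text "ELF 64-bit", and that uname -m prints something like i686 on a 32-bit kernel; is_64bit_binary_on_32bit_kernel is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds when a 64-bit ELF binary is paired
# with a 32-bit x86 kernel, the situation described above.
# $1: output of `file <binary>`, $2: output of `uname -m`
is_64bit_binary_on_32bit_kernel() {
    case "$1" in
        *"ELF 64-bit"*) ;;      # 64-bit binary, keep checking
        *) return 1 ;;          # anything else is not this problem
    esac
    case "$2" in
        i?86) return 0 ;;       # i386, i586, i686: 32-bit x86 kernel
        *) return 1 ;;
    esac
}

# Example use (commented out so that nothing runs when sourced):
#   if is_64bit_binary_on_32bit_kernel \
#       "$(file /usr/local/bin/ophelp)" "$(uname -m)"; then
#       echo "ophelp is a 64-bit binary but the kernel is 32-bit"
#   fi
```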
A couple of years ago I had some problems running oprofile inside VMware. I wrote up my experience in this post: http://blogs.epfl.ch/category/3239
You could try installing older versions like oprofile-0.9.7.
Extract it anywhere, then follow these steps:
Install it with: 1. ./configure 2. make 3. make install
Then try using it; it works fine for me. You might want to turn on virtual CPU performance counters in VMware and disable the nmi_watchdog in Linux, as the counters might otherwise be in use.
Using HPCs (hardware performance counters) requires hardware support. If you install cpuid in the VirtualBox guest, you will see:
Architecture Performance Monitoring Features (0xa/ebx):
core cycle event not available = false
instruction retired event not available = false
reference cycles event not available = false
last-level cache ref event not available = false
last-level cache miss event not avail = false
branch inst retired event not available = false
branch mispred retired event not avail = false
Architecture Performance Monitoring Features (0xa/edx):
number of fixed counters = 0x0 (0)
bit width of fixed counters = 0x0 (0)
It seems that only VMware and KVM can emulate the PMU, and VirtualBox cannot.