Stack trace running UNIX application - c++

How can I perform a live stack trace on a running UNIX application, and are there any utilities that are useful in digesting the stack trace once it's done?
I'm looking to see if any functions are getting called more often than I would have expected them to be - the application works fine, it just recently slowed down and it doesn't appear that anything else in the system is responsible (no other processes are running with unusual memory/processor usage).

Profiling tools will show which parts of the program are taking up the CPU time. If you have to dig deeper, you may need other tooling. The tools vary depending on which flavour of Unix you're on, as this is often quite platform specific. This article discusses process monitoring on Linux. Different versions of Unix may have different sets of utilities for functions that have to interact with the kernel (e.g. DTrace on Solaris). Some do work across platforms.

Have you tried using an actual profiler on your application? This will help you much more than just a stack trace. Usually you just have to compile your application with profile information. Then you can run it and use the information written to determine in which functions the most time is being spent, number of calls, etc.

How can I perform a live stack trace on a running UNIX application
gcore can be used to grab a core file for a live process. This will give you a snapshot of the stack traces for all threads. This approach may, however, be a little too heavyweight for your needs.
If the suspect functions are system calls, you could try using strace to see what's going on.
In any case, I think the first port of call should be a profiler, such as gprof.

I'm guessing you want this at runtime without any debugger involvement. For that you could use glibc's backtrace functions. Documentation here and here (assuming Linux), just to get you started. This Linux Journal article might also be of help.

Your debugger may have the functionality to attach to a running process. With gdb this looks like
$ gdb path/to/exec 1234
where 1234 is the PID of the running process.
(But consider those answers that direct you to a profiling utility.)
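For a one-off snapshot rather than an interactive session, the attach step above can be scripted. The following is a minimal sketch, assuming gdb is installed and you have permission to attach to the process; the helper names gdb_backtrace_command and capture_backtrace are made up for illustration. It relies on gdb's -p, -batch and -ex options and the "thread apply all bt" command.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches, dumps all stacks, and exits."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # run the -ex commands, then exit and detach
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def capture_backtrace(pid):
    """Return gdb's output for one snapshot (hypothetical helper)."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


print(" ".join(gdb_backtrace_command(1234)))
```

Calling capture_backtrace(1234) a few times in a row gives several snapshots to compare. The process is paused only while gdb is attached.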

Related

How to monitor processes on linux

When an executable is running on Linux, it creates processes and threads, performs I/O, and so on, and it uses libraries from languages like C/C++. Sometimes there might be timers involved as well. Is it possible to monitor all this? How can I get a deep dive into this software and these processes, and into what is going on in the background?
I know this stuff is abstracted away from me because I shouldn't be worrying about it as a regular user, but I'm curious what I would see.
What I need to see are:
System calls for this process/thread.
Open/closed sockets.
Memory management and utilization, what block is being accessed.
Memory instructions.
If a process is depending on the results of another one.
If a process/thread terminates, why, and was it successful?
I/O operations and DB read/write if any.
The different things you wanted to monitor may require different tools. All tools I will mention below have extensive manual pages where you can find exactly how to use them.
System calls for this process/thread.
The strace command does exactly this - it lists exactly which system calls are invoked by your program. The ltrace tool is similar, but focuses on calls to library functions - not just system calls (which involve the kernel).
Open/closed sockets.
The strace/ltrace commands will list socket creation, among other things, but if you want to know which sockets are open right now (connected, listening, and so on), there is the netstat utility. It lists all the connected sockets in the system (with "-a", the listening ones too) and, on Linux with "-p", the process each one belongs to.
Memory management and utilization, what block is being accessed.
Memory instructions.
Again, ltrace will let you see all malloc()/free() calls, but to see exactly what memory is being accessed where, you'll need a debugger such as gdb. The thing is that almost everything your program does is a "memory instruction", so you'll need to know exactly what you are looking for and narrow it down with breakpoints, tracepoints, single-stepping, and so on; you usually don't want to see every memory access in your program.
If you don't want to find all memory accesses but are instead hunting for bugs in this area, such as accessing memory after it's freed, there are tools that find those more easily. One of them, AddressSanitizer (ASan), is built into GCC and Clang: build with -fsanitize=address and you get reports on bad accesses at run time. Another one you can use is valgrind.
Finally, if by "memory utilization" you meant to just check how much memory your process or thread is using, well, both ps and top can tell you that.
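If you want that number from a script rather than from ps or top, on Linux the same figure is exposed as the VmRSS field of /proc/<pid>/status. This is a small sketch assuming a Linux /proc filesystem; the function name resident_memory_kb is made up, and it returns None where the file or the field is not available.

```python
import os


def resident_memory_kb(pid):
    """Return VmRSS in kB from /proc/<pid>/status, or None if unavailable."""
    try:
        with open(f"/proc/{pid}/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    # The line looks like "VmRSS:    12345 kB".
                    return int(line.split()[1])
    except OSError:
        return None
    return None


# Example: report this script's own resident memory.
print(resident_memory_kb(os.getpid()))
```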
If a process is depending on the results of another one.
If a process/thread terminates, why, and was it successful?
Various tools I mentioned like strace/ltrace will let you know when the process they follow exits. Any process can print the exit code of one of its sub-processes, but I'm not aware of a tool which can print the exit status of all processes in the system.
I/O operations
There is iostat that can give you periodic summaries of how much IO was done to each disk. netstat -s gives you network statistics so you can see how many network operations were done. vmstat gives you, among other things, statistics on IO caused by swap in/out (in case this is a problem in your case).
and DB read/write if any.
This depends on your DB, I guess, and how you monitor it.

How can I limit memory usage for a C++ code via command line using gcc?

For context, I'm implementing a code judge, so I need to run every script students submit. I was able to do the same for Java with the following command:
java -Xmx<memoryLimit> Main
So far no luck with gcc, any ideas?
Thank you.
There is not much that the compiler can do to limit the memory use of a program.
Programs are generally not run within a "C++ virtual machine" that would be analogous to the JVM, so there are no comparable command-line options for an executable.
Typically, however, operating systems do support specifying resource limits for processes. To find out how, see the documentation of the operating system on which you run the program.
If you use a POSIX operating system, there is the ulimit command, which sets resource limits for the shell and the processes it starts.
If you use Linux, there are cgroups, which can be used to set limits for process groups. cgroups can be a bit intimidating to use, and there is a higher level way to manage them: Containers. Other operating systems have similar features such as jails in FreeBSD.
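For a judge that launches submissions from a script, the limit that ulimit sets can also be applied per child process through the underlying setrlimit call. The sketch below is one possible way to do it in Python on a POSIX system, not a complete sandbox; run_with_memory_limit is a made-up helper name, and the 1 GiB cap and the stand-in "submission" are placeholders.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(argv, limit_bytes, timeout=10):
    """Run argv with its address space capped at limit_bytes (POSIX only)."""

    def apply_limit():
        # Runs in the child between fork() and exec(), so only the child is limited.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(
        argv,
        preexec_fn=apply_limit,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Stand-in for a compiled submission: the child just reports the limit it received.
report = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
result = run_with_memory_limit([sys.executable, "-c", report], 1 << 30)
print(result.stdout.strip())
```

For a real submission, argv would be the path to the compiled binary. An allocation beyond the cap then fails inside the child (for C++, typically as std::bad_alloc or a failed malloc), and the judge sees a non-zero exit status.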

substitution for googleperf tools that works on 64 bit Linux

As some of you might know, Google provides a great free collection of tools for analyzing C++ code:
http://code.google.com/p/google-perftools/
The problem is that there is apparently a libunwind problem on 64-bit systems, and the authors can't do anything on their side to fix it:
"But I don't expect a fix anytime soon: it depends on the libc folks and the libunwind folks working out some locking issues. There's unfortunately not much we ourselves can do."
So I'm searching for a replacement.
Is there any similar tool that provides a cool graphical representation of profiling data?
EDIT: paste from README that explains the problem:
2) On x86-64 64-bit systems, while tcmalloc itself works fine, the
cpu-profiler tool is unreliable: it will sometimes work, but sometimes
cause a segfault. I'll explain the problem first, and then some
workarounds.
Note that this only affects the cpu-profiler, which is a
google-perftools feature you must turn on manually by setting the
CPUPROFILE environment variable. If you do not turn on cpu-profiling,
you shouldn't see any crashes due to perftools.
The gory details: The underlying problem is in the backtrace()
function, which is a built-in function in libc. Backtracing is fairly
straightforward in the normal case, but can run into problems when
having to backtrace across a signal frame. Unfortunately, the
cpu-profiler uses signals in order to register a profiling event, so
every backtrace that the profiler does crosses a signal frame.
In our experience, the only time there is trouble is when the signal
fires in the middle of pthread_mutex_lock. pthread_mutex_lock is
called quite a bit from system libraries, particularly at program
startup and when creating a new thread.
The solution: The dwarf debugging format has support for 'cfi
annotations', which make it easy to recognize a signal frame. Some OS
distributions, such as Fedora and gentoo 2007.0, already have added
cfi annotations to their libc. A future version of libunwind should
recognize these annotations; these systems should not see any
crashses.
Workarounds: If you see problems with crashes when running the
cpu-profiler, consider inserting ProfilerStart()/ProfilerStop() into
your code, rather than setting CPUPROFILE. This will profile only
those sections of the codebase. Though we haven't done much testing,
in theory this should reduce the chance of crashes by limiting the
signal generation to only a small part of the codebase. Ideally, you
would not use ProfilerStart()/ProfilerStop() around code that spawns
new threads, or is otherwise likely to cause a call to
pthread_mutex_lock!
--- 17 May 2011
Valgrind has a collection of great tools, including callgrind for profiling the code. The GUI front end for callgrind and cachegrind output is KCachegrind.

check the performance of an exe through code

I want to check the performance of an application (I have the exe but no source code) by running it multiple times and possibly comparing the results. I didn't find much on the internet regarding this topic.
Since I have to do it for multiple inputs, I thought doing it through code (no bar on the language used) could make things easier, as I may have to repeat the runs many times.
Can anyone help me get started?
Note: by performance I mean the memory usage, CPU usage, and possibly the time taken.
(I'm currently using perfmon on Windows with the necessary counters to check these parameters, and noting the values down manually.)
Thanks
It strongly depends upon your operating system. On Linux, you could use the time utility. And strace might help you understand the system calls that are used.
I have no idea of the equivalent on Windows systems.
I think that you could create a bash/batch script to call your program as many times as you need and with different inputs.
You could then have your script create a CSV file that contains the time it took to execute your program (start date and end date for example). CSV files are usually compatible with most spreadsheet programs like Excel, so I think that can make it easier for you to process your data, like creating means and standard deviations.
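The same idea works in any scripting language. Here is a rough sketch in Python of the loop-and-CSV approach described above; time_runs is a made-up helper name, and a do-nothing Python child stands in for the real executable. It records wall-clock time only, not memory or CPU usage.

```python
import csv
import io
import subprocess
import sys
import time


def time_runs(argv, runs, out):
    """Run argv `runs` times and write one CSV row per run to the file object `out`."""
    writer = csv.writer(out)
    writer.writerow(["run", "exit_code", "seconds"])
    for run in range(1, runs + 1):
        start = time.perf_counter()
        completed = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        elapsed = time.perf_counter() - start
        writer.writerow([run, completed.returncode, f"{elapsed:.6f}"])


# Stand-in for the real executable: a Python child that does nothing.
buffer = io.StringIO()
time_runs([sys.executable, "-c", "pass"], 3, buffer)
print(buffer.getvalue())
```

To keep the results, pass a file opened with open("timings.csv", "w", newline="") instead of the in-memory buffer, and replace the argv list with the path to your program plus the input for that run.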
I don't have much to say regarding the memory and CPU usage, but if you are on Windows it wouldn't hurt to take a look at Process Explorer and Process Monitor (you can find them on this page). I think that they might help you in your task.
Finally, if you are on Linux, I think that you might be able to use grep with the top command to gather some statistics.
Regards,
Felipe
If you want exact results, Rational Purify (on Windows) or valgrind (on Linux) are the best tools. Valgrind runs your application on a synthetic CPU, so a tool such as callgrind can count the instructions executed exactly rather than relying on timing.
In another post, a utility named timethis.exe was mentioned for measuring time under Windows. Maybe it is useful for your purposes.
I automated what I had been noting down manually with perfmon.
That is, I used the PerformanceCounter class available in .NET to sample the particular application at regular intervals, and generated a graph from those values.
Thanks :)

How to profile multi-threaded C++ application on Linux?

I used to do all my Linux profiling with gprof.
However, with my multi-threaded application, its output appears to be inconsistent.
Now, I dug this up:
http://sam.zoy.org/writings/programming/gprof.html
However, it's from a long time ago and in my gprof output, it appears my gprof is listing functions used by non-main threads.
So, my questions are:
In 2010, can I easily use gprof to profile multi-threaded Linux C++ applications? (Ubuntu 9.10)
What other tools should I look into for profiling?
Edit: added another answer on poor man's profiler, which IMHO is better for multithreaded apps.
Have a look at oprofile. The profiling overhead of this tool is negligible and it supports multithreaded applications, as long as you don't want to profile mutex contention (which is a very important part of profiling multithreaded applications).
Have a look at poor man's profiler. Surprisingly, few other tools do both CPU profiling and mutex contention profiling for multithreaded applications. PMP does both, and it doesn't even require installing anything (as long as you have gdb).
Try the modern Linux profiling tool perf (perf_events): https://perf.wiki.kernel.org/index.php/Tutorial and http://www.brendangregg.com/perf.html:
perf record ./application
# generates profile file perf.data
perf report
Have a look at Valgrind.
As Paul R said, have a look at Zoom. You can also use lsstack, which is a low-tech approach but surprisingly effective compared to gprof.
Added: Since you clarified that you are running OpenGL at 33ms, my prior recommendation stands. In addition, what I personally have done in situations like that is both effective and non-intuitive. Just get it running with a typical or problematic workload, and just stop it, manually, in its tracks, and see what it's doing and why. Do this several times.
Now, if it only occasionally misbehaves, you would like to stop it only while it's misbehaving. That's not easy, but I've used an alarm-clock interrupt set for just the right delay. For example, if one frame out of 100 takes more than 33ms, at the start of a frame, set the timer for 35ms, and at the end of a frame, turn it off. That way, it will interrupt only when the code is taking too long, and it will show you why. Of course, one sample might miss the guilty code, but 20 samples won't miss it.
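The arm-at-start, disarm-at-end pattern can be illustrated in a few lines. This sketch swaps in a different mechanism from the alarm-clock interrupt described above: it uses Python's faulthandler watchdog, which dumps the current stacks if it is not cancelled in time. The names run_frame_with_watchdog, fast_frame and slow_frame are made up, and the 0.3 s sleep is a stand-in for a stalled frame.

```python
import faulthandler
import tempfile
import time


def run_frame_with_watchdog(frame, budget_seconds):
    """Run frame(); if it overruns the budget, return a dump of where it was."""
    with tempfile.TemporaryFile() as log:
        # Arm the timer at the start of the frame...
        faulthandler.dump_traceback_later(budget_seconds, file=log)
        try:
            frame()
        finally:
            # ...and turn it off at the end, so fast frames leave no trace.
            faulthandler.cancel_dump_traceback_later()
        log.seek(0)
        return log.read().decode()


def fast_frame():
    pass


def slow_frame():
    time.sleep(0.3)  # hypothetical stall, well past the budget below


print(repr(run_frame_with_watchdog(fast_frame, 0.05)))
print(run_frame_with_watchdog(slow_frame, 0.05))
```

Only the slow frame produces a dump, and the dump names the function that was running when the budget expired. In a C++ program the equivalent would be a timer signal whose handler stops in the debugger or records a backtrace.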
I tried valgrind and gprof. It is a crying shame that none of them work well with multi-threaded applications. Later, I found Intel VTune Amplifier. The good thing is, it handles multi-threading well, works with most of the major languages, works on Windows and Linux, and has many great profiling features. Moreover, the application itself is free. However, it only works with Intel processors.
You can run pstack at random moments, e.g. 10 or 20 times, to see what the stack looks like at each point.
The most typical stack is where the application spends most of the time (according to experience, we can assume a Pareto distribution).
You can combine that knowledge with strace or truss (Solaris) to trace system calls, and pmap for the memory map.
If the application runs on a dedicated system, you also have sar to measure CPU, memory, I/O, etc., to profile the overall system.
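Once you have a handful of pstack samples, finding the most typical stack is just a matter of counting identical ones. A minimal sketch, assuming each sample has already been reduced to a list of function names; the sample data and the name rank_stacks are invented for illustration:

```python
from collections import Counter


def rank_stacks(samples):
    """Count identical stack samples and return them most frequent first."""
    counts = Counter(tuple(sample) for sample in samples)
    return counts.most_common()


# Hypothetical samples: each is one captured stack, innermost frame first.
samples = [
    ["parse_row", "load_file", "main"],
    ["parse_row", "load_file", "main"],
    ["render", "main"],
    ["parse_row", "load_file", "main"],
]

for stack, hits in rank_stacks(samples):
    print(f"{hits:3d}  {' <- '.join(stack)}")
```

With only 10 or 20 samples the counts are coarse, but a stack that shows up in most of them is where the time is going.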
Since you didn't mention non-commercial, may I suggest Intel's VTune. It's not free but the level of detail is very impressive (and the overhead is negligible).
Putting a slightly different twist on matters, you can actually get a pretty good idea of what's going on in a multithreaded application using ftrace and KernelShark. Collect the right trace and press the right buttons, and you can see the scheduling of individual threads.
Depending on your distro's kernel you may have to build a kernel with the right configuration (but I think that a lot of them have it built in these days).
Microprofile is another possible answer to this. It requires hand-instrumentation of the code, but it seems like it handles multi-threaded code pretty well. And it also has special hooks for profiling graphics pipelines, including what's going on inside the card itself.