Does the DTrace pid provider only work for debug mode compiled programs?

I want to use DTrace to track FFI usage in a wide range of programs.
DTrace is designed to work in production environments, but most software is compiled and distributed with compiler optimisations.
Do these optimisations prevent using the pid provider to track the entry and return probes?
E.g.
pid$target::example_ffi_fn_x:entry
pid$target::example_ffi_fn_x:return
http://dtrace.org/guide/dtrace-ebook.pdf#page=291
When the compiler inlines a function, the pid provider’s probe does
not fire. To avoid inlining a function at compile time, consult the
documentation for your compiler.
Does this mean that you can only use DTrace on programs that publish USDT probes or are compiled in debug mode (without optimisations)?
Is there another way to track FFI usage in programs where I do not have the original source code?

Compiler optimizations don't impede DTrace's ability to trace functions for the most part. Yes, aggressive (or explicit) inlining will render a function invisible, but that's mercifully uncommon. A more typical challenge in binaries is the lack of symbolic information that lets you find the functions you might be interested in.
I'm pretty sure I documented this, but I can't find it; it's certainly in the code: the pid provider lets you specify instruction addresses within a process's address space. It seems to be subtly broken in macOS 10.15.2 (where I'm testing it out), but it would look something like this:
dtrace -n 'pid11118::-:0x103c87a19'
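As for tracking FFI usage without source: calls into shared libraries go through exported symbols, which survive optimization and even stripping of the main binary, so you can usually anchor pid probes on the library side. A sketch (the PID and library name are hypothetical):
dtrace -p 1234 -n 'pid$target:libexample.so::entry { @calls[probefunc] = count(); }'
This counts every call into libexample.so by function name, which is typically what an FFI boundary looks like from DTrace's point of view.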

Related

Best practice to build C++ program with optimization while keeping debuggability in production env

I am writing a C++ server program that will be deployed to *nix systems (Linux/macOS). The program sometimes runs into a segfault in the production environment, and I would like to collect some information, like a core dump file. But I have no idea what the best practice is for doing this:
I would like to make the program perform best.
I would like to analyze the core dump in production env offline if it really happens.
I have learned that there are some things I could try:
There are RelWithDebInfo and Release options for CMAKE_BUILD_TYPE, but it seems they use different optimization levels, so I assume a RelWithDebInfo build does not perform as well as a Release build. ("RelWithDebInfo uses -O2, but Release uses -O3" according to What are CMAKE_BUILD_TYPE: Debug, Release, RelWithDebInfo and MinSizeRel?)
Tools like objcopy/strip allow you to strip debug information from a binary (How to generate gcc debug symbol outside the build target?)
Printing stack trace when handling SIGSEGV signal (How to automatically generate a stacktrace when my program crashes)
I am new to deploying a production C++ server program, and I would like to know the answers to the following questions:
What is the recommended build type to use in this case, RelWithDebInfo or Release?
Compared to choosing a different build type, when do I need to use tools like strip?
If I create a Release build binary for production deployment, and that build generates a core dump in the production environment, can I later build a RelWithDebInfo binary from the same revision of the source code and use (gdb + RelWithDebInfo binary + Release build core dump) for core dump analysis?
Is it common to turn on core dumps in a production env? If it is not good practice, what would be the recommended approach for collecting troubleshooting information, such as printing the stack trace when crashing?
In general, I would like to know how C++ programs are recommended to be built for production, allowing them to be optimized as much as possible while I am still able to troubleshoot them. Thanks so much.
This will be a rather general answer.
If you are still having reliability issues then go with RelWithDebInfo. Alternatively, you can override RelWithDebInfo's -O2 with -O3 to get the compiler to optimize all the way.
You need to strip when debug info is present. This doesn't change any of the stuff that actually gets executed, but removes things that make debugging much easier. You can still debug stripped executables, but it is more difficult to understand what is going on.
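A common middle ground, sketched below with a hypothetical binary name, is to build optimized with debug info, split the debug info into a separate file that you keep, and ship only the stripped binary; gdb will find the debug file via the debuglink when you later analyze a core:
objcopy --only-keep-debug myserver myserver.debug
strip --strip-debug myserver
objcopy --add-gnu-debuglink=myserver.debug myserver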
No, due to different optimization levels. If the only difference between the two was that one was stripped and the other not, then yes. But with different optimization levels the resulting assembly will actually be different.
Enabling core dumps in production environments is usually advised against, mostly for security reasons. For example, a core dump may contain plain-text passwords, session tokens, etc. These are all things that others should not see. If you have total control of the machine where this is running then this concern is somewhat smaller. Another concern is disk space usage. Core dumps can be huge, depending on what your program is doing. If you have a fixed core file name then at least there will never be more than one file, but if you have a name that includes a timestamp and/or PID, then you can have multiple files, each of them taking a lot (meaning gigabytes) of space. This can again lead to problems.
The general rule is (or should be) that you treat the release environment as hostile. Sometimes that is true, sometimes it is not; here generality can't apply, because only you know your specific situation.
I always deploy my stuff fully optimized. I will only include debug info if a program is particularly problematic, because debug info makes it easy to either run the program under gdb or attach to it with gdb.
The downside of full optimization is that things sometimes look a bit different from the code you wrote. The order of things may change, some things may not happen at all, and you may observe that some functions don't really exist as proper standalone functions because the compiler decides they are better off being inlined. These are the changes I have observed, but there are probably others as well.
Recently I learned that there are tools like Google Breakpad which generate crash reports in minidump format that can be collected and analyzed from the production environment. I have not given it a try yet, but it could be useful for this exact purpose.

Getting address of caller in C++

At the moment I'm working on an anticheat. I added a way to detect any hooking of the DirectX functions, since that is what most cheats do.
The problem comes in when a lot of programs that legitimately hook DirectX, such as OBS, Fraps and many others, get their hooks detected too.
To be able to hook DirectX, you will most probably have to call VirtualProtect. If I could determine what address this is being called from, then I could loop through all DLLs in memory, find what module it has been called from, and send the information to the server, perhaps even taking an MD5 hash and sending it to the server for validation.
I could also hook the DirectX functions that the cheats hook and check where those get called from (since most of them use MS Detours).
I looked it up, and apparently you can check the call stack, but none of the examples I found seemed to help me.
This (getting the caller's address) is not possible in standard C++, and many C++ compilers might optimize some calls (e.g. by inlining them even when you don't specify inline, or because there is no longer any frame pointer, e.g. the compiler option -fomit-frame-pointer for 32-bit x86 with GCC, or by optimizing a tail call...) to the point that the question might not make any sense.
With some implementations, some C or C++ standard libraries, and some (but not all) compiler options (in particular, don't ask the compiler to optimize too much*), you might get it: e.g. (on Linux) use backtrace from GNU glibc, or I. Taylor's libbacktrace (from inside the GCC implementation), or the GCC return address builtins.
I don't know how difficult it would be to port these to Windows (perhaps Cygwin did it). The GCC builtins might somehow work, if you don't optimize too much.
Read also about continuations. See also this answer to a related question.
Note *: on Linux, it is better to compile all the code (including external libraries!) with at most g++ -Wall -g -O1: you don't want too much optimization, and you want the debug information (in particular for libbacktrace).
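A minimal sketch of the GCC return address builtin mentioned above (the function names are made up for illustration; only argument 0 is reliably supported, and the attribute syntax is GCC/Clang-specific):

#include <cstdio>

// noinline matters: if the call is inlined, there is no distinct
// caller, and the question loses its meaning, as discussed above.
__attribute__((noinline))
static void report_caller() {
    // Address this function will return to, i.e. just past the call site.
    void *ret = __builtin_return_address(0);
    std::printf("called from %p\n", ret);
}

int main() {
    report_caller();  // prints an address inside main
    return 0;
}

The raw address can then be mapped back to a symbol with a tool such as addr2line.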
Raymond Chen's blog 'The Old New Thing' covers using return addresses to make security decisions and why it's a pretty pointless thing:
https://devblogs.microsoft.com/oldnewthing/20060203-00/?p=32403
https://devblogs.microsoft.com/oldnewthing/20040101-00/?p=41223
Basically it's pretty easy to fake (by injecting code or using a manually constructed fake stack to trick you). It's Windows-centric, but the basic concepts are generally applicable.

Debugging: Tracing (and diff-ing) function call tree of two versions of the same program

I'm working on rewriting some code in a C++ command-line program. I changed the low-level data structure that it uses, and the new version passes all the tests (quite a lot) without any problem, and I get the correct output from both the new and the old version... Still, when given certain input, they give different behaviour.
Getting to the point: this being a somewhat big project, I don't have a clue how to track down where the execution flow diverges, so... is there a way to trace the function call tree (possibly excluding std calls), along with, I don't know, the line number and name of the source file?
Maybe some gcc or macro kung fu?
I would need a Linux solution, since that's where the program runs.
Still, when given certain input, they give different behaviour
I would expand the logging in your old and new versions in order to better understand how your algorithms work on that input. When things become clearer you can, for example, use gdb if you still need it.
Update
OK, so logging would be my choice, but you do not want to add it.
Another method is tracing. Actually, I have used it only on Solaris, but I see that it also exists on Linux. I have not used it on Linux, so it is just an idea that you can test.
You can use SystemTap
User-Space Probing: SystemTap initially focused on kernel-space probing. However, there are many instances where user-space probing can help diagnose a problem. SystemTap 0.6 added support to allow probing user-space processes. SystemTap includes support for probing the entry into and return from a function in user-space processes, probing predefined markers in user-space code, and monitoring user-process events.
I can't guarantee that it will work, but why not give it a try?
There is even an example in the doc:
If you want to see how the xmalloc function is being called by the command ls, you could use the user-space backtrace functions to provide that information.
stap -d /bin/ls --ldd \
-e 'probe process("ls").function("xmalloc") {print_ustack(ubacktrace())}' \
-c "ls /"

Instrumentation (diagnostic) library for C++

I'm thinking about adding code to my application that would gather diagnostic information for later examination. Is there any C++ library created for such a purpose? What I'm trying to do is similar to profiling, but it's not the same, because the gathered data will be used more for debugging than profiling.
EDIT:
Platform: Linux
Diagnostic information to gather: information resulting from application logic, various asserts and statistics.
You might also want to check out libcwd:
Libcwd is a thread-safe, full-featured debugging support library for C++ developers. It includes ostream-based debug output with custom debug channels and devices, powerful memory allocation debugging support, as well as run-time support for printing source file:line number information and demangled type names.
List of features
Tutorial
Quick Reference
Reference Manual
Also, another interesting logging library is pantheios:
Pantheios is an Open Source C/C++ Logging API library, offering an optimal combination of 100% type-safety, efficiency, genericity and extensibility. It is simple to use and extend, highly portable (platform and compiler-independent) and, best of all, it upholds the C tradition of you only pay for what you use.
I tend to use logging for this purpose. Log4cxx works like a charm.
If debugging is what you're doing, perhaps use a debugger. GDB scripts are pretty easy to write up and use. Maintaining them in parallel to your code might be challenging.
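As a sketch of that approach (the breakpoint location and field name are hypothetical), a gdb command file can log state at a breakpoint and keep the program running:

# trace.gdb: run with  gdb -x trace.gdb ./myprog
break Connection::handle_request
commands
  silent
  printf "handle_request: fd=%d\n", this->fd
  continue
end
run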
Edit - Appending Anecdote:
The software I maintain includes a home-grown instrumentation system. Macros are used to queue log messages, and configuration options control which classes of messages are logged and the level of detail to be logged. A thread processes the logging queue, flushing messages to file and rotating files as they become too large (which they commonly do). The system provides a lot of detail, but all too often it provides huge files our support engineers must wade through for hours to find anything useful.
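A minimal sketch of that kind of macro front end (all names are hypothetical, and the real system hands messages to a queue-draining thread rather than writing directly):

#include <cstdio>

// Hypothetical severity levels; the real system also filters by message class.
enum LogLevel { LOG_ERROR, LOG_INFO, LOG_DEBUG };
static LogLevel g_log_level = LOG_INFO;  // set from configuration at startup

// Captures file and line at the call site; nearly free when filtered out.
// ##__VA_ARGS__ is a GNU extension supported by GCC and Clang.
#define LOG(level, fmt, ...)                                        \
    do {                                                            \
        if ((level) <= g_log_level)                                 \
            std::fprintf(stderr, "[%s:%d] " fmt "\n",               \
                         __FILE__, __LINE__, ##__VA_ARGS__);        \
    } while (0)

int main() {
    LOG(LOG_INFO, "starting, %d workers", 4);  // logged
    LOG(LOG_DEBUG, "per-request detail");      // dropped at LOG_INFO
    return 0;
}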
Now, I've only used GDB to diagnose bugs a few times, but for those issues it had a few nice advantages over the logging system. GDB scripting allowed me to gather new instrumentation data without adding new instrumentation lines and deploying a new build of my software to the client. GDB can generate messages from third-party libraries (needed to debug into openssl at one point). GDB adds no run-time impact to the software when not in use. GDB does a pretty good job of printing the contents of objects; the code-level logging system requires new macros to be written when new objects need to have their states logged.
One of the drawbacks was that the gdb scripts I generated had no explicit relationship to the source code; the source file and the gdb script were developed independently. Ideally, changes to the source file should impact and update the gdb script. One thought is to put specially-formatted comments in code and have a scripting language make a pass on the source files to generate the debugger script file for the source file. Finally, have the makefile execute this script during the build cycle.
It's a fun exercise to think about the potential of using GDB for this purpose, but I must admit that there are probably better code-level solutions out there.
If you run your application on Linux, you can use "ulimit -c unlimited" to allow a core to be generated when your application crashes (or hits assert(false), or receives kill -6); later, you can debug with gdb (gdb -c core_file binary_file) and analyze the stack.
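Step by step, that could look like this (binary name hypothetical; the core file's name depends on your kernel.core_pattern setting):
ulimit -c unlimited        # allow cores of any size in this shell
./my_server                # crashes and writes a core file
gdb -c core ./my_server    # then inspect the stack with: bt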
Regards.
P.S. For profiling, use gprof.

Finding very similar program executions

I was wondering if it's possible, or if anyone knows any tools out there, to compare the execution of two related programs (for example, assignments in a class) to see how similar they are. For example, not comparing the names of functions, but how they use syscalls. One silly case of this would be testing whether, in two separate programs, a C string is printed as
printf("%s",str)
Or as
for (i=0;i<len;i++) printf("%c",str[i]);
I haven't put much thought into this, but I would imagine that strace / ltrace (maybe even oprofile) would be a good starting point. In particular, this is for UNIX C / C++ programs.
Thanks.
If you have access to the source code of the two programs, you may build a graph of the functions (each function is a node, and there is an edge from A to B if A calls B()), and compute some graph similarity metrics. This will catch a source code copy made by renaming and reorganizing.
An initial idea would be to use ltrace and strace to log the calls and then use diff on the logs. This would obviously only cover the library and system calls. If you need more fine-grained logging, oprofile might help.
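Concretely, that could look like this (program names hypothetical; strace logs system calls, ltrace library calls):
strace -o v1.log ./prog_v1 < input
strace -o v2.log ./prog_v2 < input
diff v1.log v2.log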
If you have access to the source code you could instrument your code by compiling it with profiling information and then parse the gcov output after the runs. A pure static source code analysis may be sufficient if your code is not taking different routes depending on external data/state.
I think you can do this kind of thing using valgrind.
A finer-grained version (depending on what access you have to the program source and what exactly you want to compare) would be to use kprobes.
Kernel Dynamic Probes (Kprobes) provides a lightweight interface for kernel modules to implant probes and register corresponding probe handlers. A probe is an automated breakpoint that is implanted dynamically in executing (kernel-space) modules without the need to modify their underlying source. Probes are intended to be used as an ad hoc service aid where minimal disruption to the system is required. They are particularly advocated in production environments where the use of interactive debuggers is undesirable. Kprobes also has substantial applicability in test and development environments. During test, faults may be injected or simulated by the probing module. In development, debugging code (for example a printk) may be easily inserted without having to recompile the module under test.