Instrumenting C/C++ code using LLVM - c++

I just read about the LLVM project and that it can be used to do static analysis on C/C++ code using Clang, the front end of LLVM. I want to know whether it is possible to extract all accesses to memory (variables, local as well as global) in the source code using LLVM.
Is there any built-in library in LLVM which I could use to extract this information?
If not, please suggest how to write functions to do the same (existing source code, references, tutorials, examples...).
What I have thought of so far is to first compile the source code to LLVM bitcode and then instrument that to do the analysis, but I don't know exactly how.
I tried to figure out which IR I should use for my purpose (Clang's Abstract Syntax Tree (AST) or LLVM's SSA Intermediate Representation (IR)), but couldn't really decide which one to use.
Here is what I'm trying to do.
Given any C/C++ program (like the one given below), I am trying to insert calls to some function, before and after every instruction that reads/writes to/from memory. For example, consider the C++ program below (Account.cpp):
#include <stdio.h>

class Account {
    int balance;

public:
    Account(int b) {
        balance = b;
    }
    int read() {
        int r;
        r = balance;
        return r;
    }
    void deposit(int n) {
        balance = balance + n;
    }
    void withdraw(int n) {
        int r = read();
        balance = r - n;
    }
};

int main() {
    Account* a = new Account(10);
    a->deposit(1);
    a->withdraw(2);
    delete a;
}
So after instrumentation my program should look like this:
#include <stdio.h>

class Account {
    int balance;

public:
    Account(int b) {
        balance = b;
    }
    int read() {
        int r;
        foo();
        r = balance;
        foo();
        return r;
    }
    void deposit(int n) {
        foo();
        balance = balance + n;
        foo();
    }
    void withdraw(int n) {
        foo();
        int r = read();
        foo();
        foo();
        balance = r - n;
        foo();
    }
};

int main() {
    Account* a = new Account(10);
    a->deposit(1);
    a->withdraw(2);
    delete a;
}
where foo() may be any function, for example one that gets the current system time or increments a counter, and so on. I understand that to insert calls like the above I will first have to get the IR and then run an instrumentation pass on the IR which inserts such calls, but I don't really know how to achieve it. Please suggest, with examples, how to go about it.
I also understand that once I compile the program into the IR, it will be really difficult to get a 1:1 mapping between my original program and the instrumented IR. So, is it possible to reflect the changes made in the IR (because of the instrumentation) back into the original program?
In order to get started with LLVM passes and how to write one on my own, I looked at an example of a pass that adds run-time checks to LLVM IR loads and stores, SAFECode's load/store instrumentation pass (http://llvm.org/viewvc/llvm-project/safecode/trunk/include/safecode/LoadStoreChecks.h?view=markup and http://llvm.org/viewvc/llvm-project/safecode/trunk/lib/InsertPoolChecks/LoadStoreChecks.cpp?view=markup). But I couldn't figure out how to run this pass. Please give me the steps to run this pass on some program, say the Account.cpp above.

First off, you have to decide whether you want to work with clang or LLVM. They both operate on very different data structures which have advantages and disadvantages.
From your sparse description of your problem, I'll recommend going for optimization passes in LLVM. Working with the IR will make it much easier to sanitize, analyze and inject code because that's what it was designed to do. The downside is that your project will be dependent on LLVM which may or may not be a problem for you. You could output the result using the C backend but that won't be usable by a human.
Another important downside when working with optimization passes is that you also lose all symbols from the original source code. Even if the Value class (more on that later) has a getName method, you should never rely on it to contain anything meaningful. It's meant to help you debug your passes and nothing else.
You will also have to have a basic understanding of compilers. For example, it's a bit of a requirement to know about basic blocks and static single assignment form. Fortunately they're not very difficult concepts to learn or understand (the Wikipedia articles should be adequate).
Before you can start coding, you first have to do some reading so here's a few links to get you started:
Architecture Overview: A quick architectural overview of LLVM. Will give you a good idea of what you're working with and whether LLVM is the right tool for you.
Documentation Head: Where you can find all the links below and more. Refer to this if I missed anything.
LLVM's IR reference: This is the full description of the LLVM IR which is what you'll be manipulating. The language is relatively simple so there isn't too much to learn.
Programmer's manual: A quick overview of basic stuff you'll need to know when working with LLVM.
Writing Passes: Everything you need to know to write transformation or analysis passes.
LLVM Passes: A comprehensive list of all the passes provided by LLVM that you can and should use. These can really help clean up the code and make it easier to analyze. For example, when working with loops, the lcssa, loop-simplify and indvars passes will save your life.
Value Inheritance Tree: This is the doxygen page for the Value class. The important bit here is the inheritance tree that you can follow to get the documentation for all the instructions defined in the IR reference page. Just ignore the ungodly monstrosity that they call the collaboration diagram.
Type Inheritance Tree: Same as above but for types.
Once you understand all that then it's cake. To find memory accesses? Search for store and load instructions. To instrument? Just create what you need using the proper subclass of the Value class and insert it before or after the store or load instruction. Because your question is a bit too broad, I can't really help you more than this. (See correction below)
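To make the load/store part concrete, here is a minimal sketch of such a transformation pass. It is written against a reasonably recent LLVM using the legacy pass manager; the pass and function names are purely illustrative, and it assumes foo is an external void foo() that you link in yourself:

#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
struct InstrumentMemOps : public FunctionPass {
    static char ID;
    InstrumentMemOps() : FunctionPass(ID) {}

    bool runOnFunction(Function &F) override {
        Module *M = F.getParent();
        // Declare (or reuse) "void foo()" in this module.
        FunctionCallee Foo =
            M->getOrInsertFunction("foo", Type::getVoidTy(F.getContext()));

        // Collect the loads and stores first so inserting doesn't disturb iteration.
        SmallVector<Instruction *, 16> MemOps;
        for (BasicBlock &BB : F)
            for (Instruction &I : BB)
                if (isa<LoadInst>(I) || isa<StoreInst>(I))
                    MemOps.push_back(&I);

        for (Instruction *I : MemOps) {
            IRBuilder<> Before(I);                // insertion point: just before I
            Before.CreateCall(Foo);
            IRBuilder<> After(I->getNextNode());  // and just after I
            After.CreateCall(Foo);
        }
        return !MemOps.empty();
    }
};
} // namespace

char InstrumentMemOps::ID = 0;
static RegisterPass<InstrumentMemOps> X("instr-mem", "Instrument loads and stores");

Built into a shared library, it would be run roughly like this (again with the legacy pass manager, so a recent opt may need -enable-new-pm=0; the library and file names are made up):

clang++ -emit-llvm -c Account.cpp -o Account.bc
opt -load ./libInstrMem.so -instr-mem Account.bc -o Account.instr.bc
clang++ Account.instr.bc foo.cpp -o account_instrumented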
By the way, I had to do something similar a few weeks ago. In about 2-3 weeks I was able to learn all I needed about LLVM, create an analysis pass to find memory accesses (and more) within a loop, and instrument them with a transformation pass I created. There were no fancy algorithms involved (except the ones provided by LLVM) and everything was pretty straightforward. Moral of the story: LLVM is easy to learn and work with.
Correction: I made an error when I said that all you have to do is search for load and store instructions.
The load and store instructions will only give you accesses that are made to the heap through pointers. In order to get all memory accesses you also have to look at the values, which can represent a memory location on the stack. Whether a value is written to the stack or kept in a register is determined during the register allocation phase, which occurs in an optimization pass of the backend. That means it's platform dependent and shouldn't be relied on.
Now unless you provide more information about what kind of memory accesses you're looking for, in what context, and how you intend to instrument them, I can't help you much more than this.

Since there has been no answer to your question after two days, I will offer this one, which is slightly but not completely off-topic.
As an alternative to LLVM, for static analysis of C programs, you may consider writing a Frama-C plug-in.
The existing plug-in that computes a list of inputs for a C function needs to visit every lvalue in the function's body. This is implemented in file src/inout/inputs.ml. The implementation is short (the complexity is in other plug-ins that provide their results to this one, e.g. resolving pointers) and can be used as a skeleton for your own plug-in.
A visitor for the Abstract Syntax Tree is provided by the framework. In order to do something special for lvalues, you simply define the corresponding method. The heart of the inputs plug-in is the method definition:
method vlval lv = ...
Here is an example of what the inputs plug-in does:
int a, b, c, d, *p;

main(){
    p = &a;
    b = c + *p;
}
The inputs of main() are computed thus:
$ frama-c -input t.c
...
[inout] Inputs for function main:
a; c; p;
More information about writing Frama-C plug-ins in general can be found here.

Related

C++ function instrumentation via clang++'s -finstrument-functions : how to ignore internal std library calls?

Let's say I have a function like:
template<typename It, typename Cmp>
void mysort( It begin, It end, Cmp cmp )
{
    std::sort( begin, end, cmp );
}
When I compile this using -finstrument-functions-after-inlining with clang++ (clang++ --version reports):
clang version 11.0.0 (...)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: ...
The instrumentation explodes the execution time, because my entry and exit functions are called for every call of
void std::__introsort_loop<...>(...)
void std::__move_median_to_first<...>(...)
I'm sorting a really big array, so my program doesn't finish: without instrumentation it takes around 10 seconds; with instrumentation I cancelled it after 10 minutes.
I've tried adding __attribute__((no_instrument_function)) to mysort (and the function that calls mysort), but this doesn't seem to have an effect as far as these standard library calls are concerned.
Does anyone know if it is possible to ignore function instrumentation for the internals of a standard library function like std::sort? Ideally, I would only have mysort instrumented, so a single entry and a single exit!
I see that clang++ sadly does not yet support anything like -finstrument-functions-exclude-function-list or -finstrument-functions-exclude-file-list, but g++ does not yet support -finstrument-functions-after-inlining, which I would ideally have, so I'm stuck!
EDIT: After playing around more, it appears the effect on execution time is actually less than described above, so this isn't the end of the world. The problem still remains, however, because most people doing function instrumentation in clang only care about their application code, not the functions linked in from (for example) the standard library.
EDIT2: To further highlight the problem now that I've got it running in a reasonable time frame: the trace produced from the instrumented code with those two standard library functions included is 15 GB. When I hard-code my tracing to ignore the two function addresses, the resulting trace is 3.7 MB!
I've run into the same problem. It looks like support for these flags was once proposed, but never merged into the main branch.
https://reviews.llvm.org/D37622
This is not a direct answer, since the tool doesn't support what you want to do, but I think I have a decent work-around. What I wound up doing was creating a "skip list" of sorts. In the instrumentation hooks (__cyg_profile_func_enter and __cyg_profile_func_exit), I would guess the part contributing most to your execution time is the printing. If you can come up with a way of short-circuiting the profile functions, that should help, even if it's not ideal. At the very least it will limit the size of the output file.
Something like
#include <stdint.h>
#include <stddef.h>   /* for size_t */

uintptr_t skipAddrs[] = {
    // assuming 64-bit addresses
    0x123456789abcdef, 0x2468ace2468ace24
};
size_t arrSize = 0;

int main(void)
{
    ...
    arrSize = sizeof(skipAddrs) / sizeof(skipAddrs[0]);
    // https://stackoverflow.com/a/37539/12940429
    ...
}

/* keep the hook itself uninstrumented so it doesn't recurse into itself */
void __attribute__((no_instrument_function))
__cyg_profile_func_enter(void *this_fn, void *call_site) {
    for (size_t idx = 0; idx < arrSize; idx++) {
        if ((uintptr_t) this_fn == skipAddrs[idx]) {
            return;
        }
    }
}
I use something like objdump -t binaryFile to examine the symbol table and find what the addresses are for each function.
If you specifically want to ignore library calls, something that might work is examining the symbol table of your object file(s) before linking against libraries, then ignoring all the ones that appear new in the final binary.
All this should be possible with things like grep, awk, or python.
You have to add the attribute __attribute__((no_instrument_function)) to the functions that should not be instrumented. Unfortunately it is not easy to make this work with the C/C++ standard library, because it would require adding the attribute to all the library functions.
There are some hacks you can do, like redefining existing macros from include/__config to add this attribute as well, e.g.:
-D_LIBCPP_INLINE_VISIBILITY=__attribute__((no_instrument_function,internal_linkage))
Make sure to append no_instrument_function to the existing macro definition to avoid unexpected errors.

LLVM Clang C++ code injection

I am a bit confused about implementing a code-injection function with LLVM/Clang. I basically want to add a call to a function before a variable or a pointer is created in the source code. Example:
#include <iostream>

int main() {
    int a;
    return 0;
}
to
#include <iostream>

int main() {
    foo();
    int a;
    return 0;
}
I read the LLVM docs but couldn't find an answer. Please help me.
Thank you in advance.
The first step is to decide whether you want to do this in Clang or LLVM. Although they are "connected", they are not the same thing. In Clang you can do it at the AST level, in which case you need to write a recursive AST visitor, and from that identify the function definitions that you want to instrument, inserting the AST nodes that call your foo function. This will only work for functions whose code the compiler actually compiles.
There is information on how to write such a visitor here:
https://clang.llvm.org/docs/RAVFrontendAction.html
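As a rough, illustrative sketch (the class name is mine, and the actual source rewriting, e.g. with clang::Rewriter, is left out), a visitor that finds the local variable declarations you would want to instrument could look like this:

#include "clang/AST/RecursiveASTVisitor.h"

class DeclVisitor : public clang::RecursiveASTVisitor<DeclVisitor> {
public:
    // Called for every variable declaration the traversal encounters.
    bool VisitVarDecl(clang::VarDecl *D) {
        if (D->isLocalVarDecl()) {
            // D->getBeginLoc() is where a clang::Rewriter could insert
            // "foo();" just before the declaration.
        }
        return true; // keep traversing
    }
};

You would drive this from an ASTFrontendAction/ASTConsumer as described in the tutorial linked above.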
In LLVM you could write a function-pass, that inserts code into each function. This obviously works for ANY functions, regardless of language.
How to write an LLVM pass:
http://llvm.org/docs/WritingAnLLVMPass.html
However, while this may seem trivial at the beginning, there are some interesting quirks. In an LLVM function, the alloca instructions should come first, so you would have to skip past those instructions. There may be functions that shouldn't be instrumented - for example, if your function foo prints something using cout << something;, it would be a rather terrible idea to insert foo into the operator<<(ostream&, ...) type functions... ;) And you obviously don't want to instrument foo itself, or any functions it calls.
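A minimal, hypothetical sketch of that LLVM side (legacy pass manager; names are illustrative, and it assumes a free function void foo() to call) which inserts the call after the entry block's leading allocas and leaves foo itself alone might look like:

#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Type.h"
#include "llvm/Pass.h"

using namespace llvm;

namespace {
struct InjectFoo : public FunctionPass {
    static char ID;
    InjectFoo() : FunctionPass(ID) {}

    bool runOnFunction(Function &F) override {
        if (F.isDeclaration() || F.getName() == "foo")  // don't instrument foo itself
            return false;
        Module *M = F.getParent();
        FunctionCallee Foo =
            M->getOrInsertFunction("foo", Type::getVoidTy(F.getContext()));

        // getFirstInsertionPt() gives a safe starting point; then step over the
        // leading allocas so they stay grouped at the top of the entry block.
        BasicBlock &Entry = F.getEntryBlock();
        BasicBlock::iterator IP = Entry.getFirstInsertionPt();
        while (isa<AllocaInst>(*IP))
            ++IP;
        IRBuilder<> B(&*IP);
        B.CreateCall(Foo);
        return true;
    }
};
} // namespace

char InjectFoo::ID = 0;
static RegisterPass<InjectFoo> X("inject-foo", "Call foo() at function entry");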
There are ways in Clang that you can determine if the source is the "main file" or some header-file - although that may not be enough in your case. It is much harder to determine "which function is this" in LLVM.

Dead virtual function elimination

Question
(Can I get clang, or perhaps some other optimizing tool shipped with LLVM, to identify unused virtual functions in a C++ program and mark them for dead code elimination? I guess not.)
If there is no such functionality shipped with LLVM, how would one go about implementing a thing like this? What's the most appropriate layer to achieve this, and where can I find examples on which I could build this?
Thoughts
My first thought was an optimizer working on LLVM bitcode or IR. After all, a lot of optimizers are written for that representation. Simple dead code elimination is easy enough: any function which is neither called nor has its address taken and stored somewhere is dead code and can be omitted from the final binary. But a virtual function has its address taken and stored in the virtual function table of the corresponding class. In order to identify whether that function has a chance of getting called, an optimizer would not only have to identify all virtual function calls, but also identify the type hierarchy to map these virtual function calls to all possible implementations.
This makes things look quite hard to tackle at the bitcode level. It might be better to handle this somewhere closer to the front end, at a stage where more type information is available, and where calls to a virtual function might be more readily associated with implementations of these functions. Perhaps the VirtualCallChecker could serve as a starting point.
One problem is probably the fact that while it's possible to combine the bitcode of several objects into a single unit for link time optimization, one hardly ever compiles all the source code of a moderately sized project as a single translation unit. So the association between virtual function calls and implementations might have to be somehow maintained till that stage. I don't know if any kind of custom annotation is possible with LLVM; I have seen no indication of this in the language specification.
But I'm having a bit of trouble with the language specification in any case. The only references to virtual in there are the virtuality and virtualIndex properties of MDSubprogram, but so far I have found no information at all about their semantics - no documentation, nor any useful places inside the LLVM source code. I might be looking at the wrong documentation for my use case.
Cross references
eliminate unused virtual functions asked about pretty much the same thing in the context of GCC, but I'm specifically looking for an LLVM solution here. There used to be a -fvtable-gc switch in GCC, but apparently it was too buggy and got dropped, and clang doesn't support it either.
Example:
struct foo {
    virtual ~foo() { }
    virtual int a() { return 12345001; }
    virtual int b() { return 12345002; }
};

struct bar : public foo {
    virtual ~bar() { }
    virtual int a() { return 12345003; }
    virtual int b() { return 12345004; }
};

int main(int argc, char** argv) {
    foo* p = (argc & 1 ? new foo() : new bar());
    int res = p->a();
    delete p;
    return res;
}
How can I write a tool to automatically get rid of foo::b() and bar::b() in the generated code?
clang++ -fuse-ld=gold -O3 -flto with clang 3.5.1 wasn't enough, as an objdump -d -C of the resulting executable showed.
Question focus changed
Originally I had been asking not only about how to use clang or LLVM to this effect, but possibly for third-party tools to achieve the same if clang and LLVM were not up to the task. Questions asking for tools are frowned upon here, though, so by now the focus has shifted from finding a tool to writing one. I guess the chances of finding one are slim in any case, since a web search revealed no hints in that direction.

IOS code has become very slow because of objc_msgSend

I have rewritten part of my code from very simple C arrays to using (or trying to use) objects in order to get more structure into it. Instead of passing arrays through the function headers I am now using a global array defined by a singleton. You can see an example of a function from my code below:
it was:
void calcdiv(int nx, int ny, float **u, float **v,
             float **divu, float dx, float dy, float **p,
             float dt, float rho, float **bp, float **lapp)
{
    int i, j;
    for (i = 2; i <= nx-3; ++i) {
        for (j = 2; j <= ny-3; ++j) {
            divu[i][j] = (u[i+1][j] - u[i-1][j]) * facu +
                         (v[i][j+1] - v[i][j-1]) * facv;
        }
    }
    ...
now it is:
void calcdiv()
{
    int i, j;
    SingletonClass* gV = [SingletonClass sharedInstance];
    for (i = 2; i <= gV.nx-3; ++i) {
        for (j = 2; j <= gV.ny-3; ++j) {
            gV.divu[i][j] = (gV.u[i+1][j] - gV.u[i-1][j]) * facu +
                            (gV.v[i][j+1] - gV.v[i][j-1]) * facv;
        }
    }
    ...
Before the restructuring I had been using the function call given above, that is, passing the pointers to the arrays directly. Now I access the arrays through the singleton call "SingletonClass* gV...". It works fine except that it is much slower than before. The profiler tells me that my program spends 41% of its time in objc_msgSend, which it did not before.
From reading through the posts I understand that this can happen when objc_msgSend is called very often. That is most likely the case here, because my program needs a lot of number crunching in order to display an animated flow with OpenGL.
This leads me to my question: what would you suggest? Should I stay with my simple C implementation, or is there a rather simple way to accelerate the Objective-C version? Please be patient with me, since I am new to Objective-C programming.
Any hints and recommendations are greatly appreciated! Thanks in advance.
If your straight C method works fine, and your Objective-C method puts you at a disadvantage due to method-call overhead, and you need the performance, then there's no reason not to use straight C. From looking at your code, I don't see any advantage to whatever "structure" you're adding, because the working code looks almost precisely the same. In other words, Obj-C doesn't buy you anything here, but straight C does, so go with what's best for your user; in terms of maintainability and readability, there's no difference between the two implementations.

Trace all user defined function calls using gdb

I want to trace all the user-defined functions that are called (in order, and preferably with their input parameters). Is there a way to do this using gdb? Or is there a better free/open-source application out there to do this job?
Please note that I want to print only the user-defined function calls.
for example:
int abc(int a, char b) {
    return xyz(a+b);
}

int xyz(int theta) {
    return theta * theta;
}
I need the following output:
abc(a, b);
xyz(theta);
My codebase is pretty huge and is compiled in various pieces, so I want to avoid using a tool that requires me to recompile my source code with some options enabled.
PS: I found that there are ways to define functions in gdb and pass function names to them as parameters to find their call trace. But in my case the code base is pretty huge and I'm just starting off with it, so I'm not even sure which functions get called. It wouldn't be practical to list all the functions here.
TIA,
You need to run some form of third-party tool against your binary, something like Quantify (IBM) or Callgrind (or, as @Paul R mentioned above, gprof). They will generate a call tree, which will give you the information you need; googling "call tree C functions", for example, will reveal lots of goodies you can link against your code...
If you want to roll your own, you'd need to add one line to the top of each of your functions which creates a stack-allocated object; you can catch the ctor/dtor sequence to know when you've entered and exited the function, and then maintain a "stack" of these to generate your own call tree... pretty easy to do (in single-threaded code, complex in multi-threaded)...
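A minimal sketch of that roll-your-own idea (single-threaded; the class and macro names are mine), applied to the asker's example:

#include <cstdio>

// Stack-allocated tracer: the constructor marks entry, the destructor marks
// exit, so a single line at the top of each function is enough.
class ScopeTracer {
public:
    explicit ScopeTracer(const char *name) : name_(name) {
        std::printf("enter %s\n", name_);
    }
    ~ScopeTracer() { std::printf("leave %s\n", name_); }
private:
    const char *name_;
};

#define TRACE_FUNCTION ScopeTracer tracer_here(__func__)

int xyz(int theta) {
    TRACE_FUNCTION;          // the one added line
    return theta * theta;
}

int abc(int a, char b) {
    TRACE_FUNCTION;
    return xyz(a + b);
}

int main() {
    abc(1, 'c');
    return 0;
}

Printing the arguments as well, or keeping a depth counter in the constructor/destructor to indent the output into a proper call tree, are straightforward extensions.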