Are there any helper methods for traversing the AST, basic blocks, etc. that the LLVM compiler generates for C code?
If you're trying to load a module (from a .bc file compiled from a .c file by clang -emit-llvm) and traverse its functions, basic blocks, etc., then you might want to start with the llvm::Module class. It has functions for iterating through global variables and functions. Then the llvm::Function class has functions for iterating through basic blocks. Then the llvm::BasicBlock class has functions for iterating through instructions.
Or if you'd prefer, you can traverse the AST structure created by Clang. Here's some example code: http://eli.thegreenplace.net/2012/06/08/basic-source-to-source-transformation-with-clang/.
Basically, it is impossible to do full operations on the AST in LLVM, because LLVM passes work at the bitcode (IR) level, not on the AST. I think what you want is an AST iterator.
You could refer to Chapter 3 of Artem Dergachev's "Clang Static Analyzer: A Checker Developer's Guide".
Clang now has a page for checker developers. You can find more by following the link.
I've read this link but still don't fully understand the difference between TraverseDecl and VisitDecl (and their use cases): http://clang.llvm.org/doxygen/classclang_1_1RecursiveASTVisitor.html
Which method should I be overriding when writing my RecursiveASTVisitor?
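TraverseDecl drives the recursion: it walks the AST starting from the given declaration and descends into its children. For each declaration it reaches, VisitDecl is called, and that is where you extract the relevant information. In most cases you only override the Visit* methods; override a Traverse* method only when you need to change how the walk itself proceeds, for example to skip a subtree.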
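To make the split between the two kinds of method concrete, here is a minimal toy sketch in plain C++. It does not use Clang at all: Node, ToyRecursiveVisitor, NameCollector, PruningCollector and makeSampleTree are hypothetical names invented for this illustration, and the real RecursiveASTVisitor has many more hooks. It only mirrors the general idea that Traverse owns the recursion while Visit is the per-node callback.

```cpp
#include <string>
#include <vector>

// Toy stand-in for an AST node (NOT a Clang type).
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Toy CRTP visitor, loosely modeled on the Traverse*/Visit* split.
template <typename Derived>
class ToyRecursiveVisitor {
public:
    // Traverse owns the recursion: it calls Visit on the node, then walks
    // the children. A false return from Visit or Traverse aborts the walk.
    bool Traverse(const Node &n) {
        if (!derived().Visit(n)) return false;
        for (const Node &child : n.children)
            if (!derived().Traverse(child)) return false;
        return true;
    }

    // Visit is the per-node hook; the default does nothing.
    bool Visit(const Node &) { return true; }

private:
    Derived &derived() { return static_cast<Derived &>(*this); }
};

// Typical use: override Visit only and let the base class do the walking.
struct NameCollector : ToyRecursiveVisitor<NameCollector> {
    std::vector<std::string> seen;
    bool Visit(const Node &n) { seen.push_back(n.name); return true; }
};

// Less common: override Traverse to change the walk itself.
struct PruningCollector : ToyRecursiveVisitor<PruningCollector> {
    std::vector<std::string> seen;
    bool Visit(const Node &n) { seen.push_back(n.name); return true; }
    bool Traverse(const Node &n) {
        // Skip this node and everything below it, but keep walking siblings.
        if (n.name == "detail") return true;
        return ToyRecursiveVisitor<PruningCollector>::Traverse(n);
    }
};

// Small sample tree: unit -> { detail -> { hidden }, api -> { f } }.
Node makeSampleTree() {
    return Node{"unit",
                {Node{"detail", {Node{"hidden", {}}}},
                 Node{"api", {Node{"f", {}}}}}};
}
```

Calling Traverse(makeSampleTree()) on a NameCollector records every node in pre-order, while the PruningCollector never sees "detail" or "hidden" because its Traverse override returns before the base class gets a chance to call Visit or descend.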
Follow these two links for more details and a simple checker example:
http://clang.llvm.org/docs/RAVFrontendAction.html
How to traverse clang AST manually?
I'm working on a library for which I'd like certain introspection features to be available. Let's say I'm compiling with clang, so I have access to libtooling or whatever.
What I'd like specifically is for someone to be able to view the LLVM IR of an already-compiled function as part of the program. I know that, when compiling, I can use -emit-llvm to get the IR. But that saves it to a file. What I'd like is for the LLVM IR to be embedded in and retrievable from the program itself -- e.g. my_function_object.llvm_ir()
Is such a thing possible? Thanks!
You're basically trying to add reflection to your program. Reflection requires the existence of metadata in your binary. This doesn't exist out of the box in LLVM, as far as I know.
To achieve an effect like this, you could create a global key-value dictionary in your program, exposed via an exported function - something like IRInstruction* retrieve_llvm_ir_stream(char* name).
This dictionary would map some kind of identifier (for example, the exported name) of a given function to an in-memory array that represents the IR stream of that function (each instruction represented as a custom IRInstruction struct, for example). The types and functions of the representation format (like the custom IRInstruction struct) will have to be included in your source.
At the step of the IR generation, this dictionary will be empty. Immediately after the IR generation step, you'll need to add a custom build step: open the IR file and populate the dictionary with the data - for each exported function of your program, inject its name as a key to the dictionary and its IR stream as a value. The IR stream would be generated from the definitions of your functions, as read by your custom build tool (which would leverage the LLVM API to read the generated IR and convert it to your format).
Then, proceed to the assembler and linker as before.
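A rough sketch of what the runtime side of that dictionary could look like is below. Everything in it is an assumption made for illustration: IRInstruction here just wraps the textual form of one instruction, register_llvm_ir and register_generated_ir are hypothetical names for what the custom build step would emit, and the lookup takes a std::string and returns a pointer to the whole stream rather than using the exact C-style signature mentioned above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical representation of one IR instruction: just its text.
struct IRInstruction {
    std::string text;
};

using IRStream = std::vector<IRInstruction>;

// Global dictionary: function name -> IR stream. A function-local static
// avoids static initialization order problems across translation units.
std::unordered_map<std::string, IRStream> &ir_registry() {
    static std::unordered_map<std::string, IRStream> registry;
    return registry;
}

// Called by the code that the custom build step generates.
void register_llvm_ir(const std::string &name, IRStream stream) {
    assert(!name.empty() && "functions are keyed by a non-empty name");
    ir_registry()[name] = std::move(stream);
}

// Exported lookup; returns nullptr when no IR was recorded for `name`.
const IRStream *retrieve_llvm_ir_stream(const std::string &name) {
    auto it = ir_registry().find(name);
    return it == ir_registry().end() ? nullptr : &it->second;
}

// Example of what the build step might generate for a function `add`.
// The IR lines are made up for the sketch, not taken from a real build.
void register_generated_ir() {
    register_llvm_ir("add", IRStream{
        IRInstruction{"%sum = add nsw i32 %a, %b"},
        IRInstruction{"ret i32 %sum"},
    });
}
```

The program would call register_generated_ir() once at startup (or from a static initializer), after which retrieve_llvm_ir_stream("add") hands back the recorded instructions. Storing text keeps the sketch small; a real tool would more likely embed the bitcode or a structured encoding.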
I am wondering how to convert Julia code into runnable LLVM IR (a *.ll file).
There is a function named code_llvm which can compile a Julia function into LLVM IR. But its result contains something like %jl_value_t*, which seems to be a (hidden?) object type, and it doesn't look like pure LLVM IR.
Is there a way to generate runnable LLVM IR from Julia, so that I can run it with lli xx.ll (or do something else)?
The code_llvm function just shows the Function by default, but you can also have it print out a complete module:
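using InteractiveUtils  # provides code_llvm when running outside the REPL
open("file.ll", "w") do io
    # dump_module=true prints the whole module, not just the function body
    code_llvm(io, +, (Int, Int); raw=true, dump_module=true, optimize=true)
end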
This output (file.ll) is now valid to use with other llvm tools, such as llc and opt. However, since it's just the code for this one function, and assumes the existence of all the other code and data, it's not necessarily going to work with lli, so buyer beware.
If you want a complete system, you might be interested in the --output-bc flag to Julia, which will dump a complete object file in LLVM format. This is used extensively internally to build and bootstrap Julia. It's also wrapped into a utility tool at https://github.com/JuliaLang/PackageCompiler.jl to automate some of these steps.
Based on the Kaleidoscope and Kaleidoscope with MCJIT tutorials, I have code to create a Module and function and call it using MCJIT. The function needs a prototype:
auto ft = llvm::FunctionType::get(llvm::Type::getInt32Ty(Context), argTypes, false);
However, the example only covers Double as parameters and return values (the above uses an int). To do anything advanced, you need to pass things like classes and containers.
How do you use existing C++ classes in the module?
Sure, you can link to any library you want, but you need to declare function prototypes to use them. If the library API has classes, how do you declare them?
What I want is something like this:
auto ft = llvm::FunctionType::get(llvm::Type::getStructTy("class.std::string"), argTypes, false);
where class.std::string has been imported from string.h.
The LLVM API only has primitive types. You can define structs to represent the classes, but this is way too hard to do manually (and not portable).
A way to do it might be to compile the class to bitcode and read it into a module, but I want to avoid temporary files if possible. Also, I'm not sure how to extract the type from the module, but it should be possible. I tried this on a header file of one of my classes (I renamed the header file to a cpp file, otherwise clang would make it into a .gch precompiled header) and the result was just a constant... maybe it was optimised out? I tried it on the cpp file and it resulted in 36000 lines of code...
Then I found this page. Instead of using the LLVM API, I should use the Clang API because Clang, as a compiler, can compile the code into a Module. Then I can use the LLVM API with the imported Modules. Is this the right way to go? Any working source code is appreciated because it took forever just to get function calling working (the tutorials are out of date and documentation is scarce).
The way I would do it is to compile the class to LLVM IR, and then link the two modules. Then, there are two options for extracting the type from the module:
First, you can use the llvm::TypeFinder. The way you use it is by creating it, and then calling run() on it with the module as an argument. This code snippet will print out all of the types in the module:
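#include "llvm/IR/TypeFinder.h"
#include <iostream>

// `module` is an llvm::Module; `true` restricts the search to named structs.
llvm::TypeFinder type_finder;
type_finder.run(module, true);
for (llvm::StructType *t : type_finder) {
    std::cout << t->getName().str() << '\n';
}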
Alternatively, it's possible to use Module's getIdentifiedStructTypes() method and iterate over the resulting vector in the same way as above.
I'm looking for free software, a tool, a library, or whatever to analyze C++ code.
As far as I know tools for 'static code analysis' like 'Cppcheck' are not helpful to me because I cannot define my own rules or output. A library which gives me an AST (Abstract Syntax Tree) of a C++ file would be the best, I guess.
My goal is to program a command line tool which generates an output containing something like:
Test.cpp:
The file contains 42 global Integers.
The Class Test has the following attributes:
String name,
Int size.
The Class Test contains the following global functions:
void Test(),
int getTestSize(),
String renameTest(String newName).
You can use clang and the existing analyzer or implement your own analyzer on top of the provided APIs.
As David suggests, Clang is a good choice. You just have to implement your own ASTConsumer; you can take the existing Clang ASTConsumers, such as ASTPrinter or ASTDumpXML, as examples.