How do I correctly implement an LLVM InstVisitor? - llvm

To learn LLVM I made a ModulePass that runs through the functions, basic blocks, and finally instructions. At some point I want to dig into the instructions and perform analysis. While reading the documentation I came across http://llvm.org/docs/doxygen/html/classllvm_1_1InstVisitor.html and the documentation recommends using these structures to efficiently traverse IR rather than do a lot of if(auto* I = dyn_cast<>()) lines.
I tried making a variation of the documentation example, but for BranchInst:
struct BranchInstVisitor : public InstVisitor<BranchInst> {
unsigned Count;
BranchInstVisitor() : Count(0) {}
void visitBranchInst(BranchInst &BI){
Count++;
errs() << "BI found! " << Count << "\n";
}
}; // End of BranchInstVisitor
Within my ModulePass, I created the visitor:
for(Module::iterator F = M.begin(), modEnd = M.end(); F != modEnd; ++F){
BranchInstVisitor BIV;
BIV.visit(F);
...
Unfortunately, my call to visit(F) fails when I compile:
error: invalid static_cast from type ‘llvm::InstVisitor<llvm::BranchInst>* const’ to type ‘llvm::BranchInst*’ static_cast<SubClass*>(this)->visitFunction(F);
How do I correctly implement an LLVM InstVisitor? Are InstVisitors supposed to be run outside of passes? If I missed documentation, please let me know where to go.

The template parameter should be the type you're declaring, not a type of instruction, like this:
struct BranchInstVisitor : public InstVisitor<BranchInstVisitor>
Each visitor can override as many visit* methods as you want -- it's not like each visitor is tied to one type of instruction. That wouldn't be very useful.
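To see why the template argument has to be the visitor itself, here is a minimal sketch of the curiously recurring template pattern that the static_cast in the error message hints at. It does not use LLVM at all: Kind, Inst, MiniVisitor, BranchCounter and countBranches are hypothetical stand-ins, and the dispatch is much simpler than in the real InstVisitor.

```cpp
#include <vector>

// Hypothetical stand-ins for instruction kinds (not LLVM types).
enum class Kind { Branch, Load };
struct Inst { Kind kind; };

// Simplified CRTP base: SubClass must be the derived visitor type,
// because the base casts `this` to SubClass* to reach the overrides.
template <typename SubClass>
struct MiniVisitor {
  void visit(const Inst &I) {
    SubClass *Self = static_cast<SubClass *>(this);
    switch (I.kind) {
    case Kind::Branch: Self->visitBranch(I); break;
    case Kind::Load:   Self->visitLoad(I);   break;
    }
  }
  void visit(const std::vector<Inst> &Insts) {
    for (const Inst &I : Insts) visit(I);
  }
  // Defaults do nothing; a derived visitor shadows only what it needs.
  void visitBranch(const Inst &) {}
  void visitLoad(const Inst &) {}
};

// The visitor passes itself as the template argument.
struct BranchCounter : MiniVisitor<BranchCounter> {
  unsigned Count = 0;
  void visitBranch(const Inst &) { ++Count; }
};

// Helper: count the branch-like entries in a list.
unsigned countBranches(const std::vector<Inst> &Insts) {
  BranchCounter V;
  V.visit(Insts);
  return V.Count;
}
```

With MiniVisitor<Inst> as the base instead, the static_cast to SubClass* would fail to compile in the same way as the error in the question, since Inst is unrelated to the visitor.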

how to correctly pass data structures between custom llvm passes

I have a Function pass, called firstPass, which does some analysis and populates:
A a;
where
typedef std::map< std::string, B* > A;
class firstPass : public FunctionPass {
A a;
}
typedef std::vector< C* > D;
class B {
D d;
}
class C {
// some class packing information about basic blocks;
}
Hence I have a map of vectors keyed by std::string.
I wrote associated destructors for these classes. This pass works successfully on its own.
I have another Function pass, called secondPass, needing this structure of type A to make some transformations. I used
bool secondPass::doInitialization(Module &M) {
errs() << "now running secondPass\n";
a = getAnalysis<firstPass>().getA();
return false;
}
void secondPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<firstPass>();
AU.setPreservesAll();
}
The whole code compiles fine, but I get a segmentation fault when printing this structure at the end of my first pass only if I call my second pass (since B* is null).
To be clear:
opt -load ./libCustomLLVMPasses.so -passA < someCode.bc
prints in doFinalization() and exits successfully
opt -load ./libCustomLLVMPasses.so -passA -passB < someCode.bc
gives a segmentation fault.
How should I wrap this data structure and pass it to the second pass without issues? I tried std::unique_ptr instead of raw ones but I couldn't make it work. I'm not sure if this is the correct approach anyway, so any help will be appreciated.
EDIT:
I solved the segmentation fault. The cause was my call to getAnalysis in doInitialization(). I wrote a ModulePass that combines firstPass and secondPass; its runOnModule is shown below.
bool MPass::runOnModule(Module &M) {
for(Function& F : M) {
errs() << "F: " << F.getName() << "\n";
if(!F.getName().equals("main") && !F.isDeclaration())
getAnalysis<firstPass>(F);
}
StringRef main = StringRef("main");
A& a = getAnalysis<firstPass>(*(M.getFunction(main))).getA();
return false;
}
This also let me control the order in which the functions are processed.
Now I can get the output of a pass but cannot use it as an input to another pass. I think this shows that the passes in llvm are self-contained.
I'm not going to comment on the quality of the data structures based on their C++ merit (it's hard to comment on that just by this minimal example).
Moreover, I wouldn't use the doInitialization method if the actual initialization is that simple, but this is a side comment too. (The documentation does not mention anything explicitly about it, but if it is run once per Module while the runOn method is run on every Function of that module, it might be an issue.)
I suspect the main issue is that A a in your firstPass is bound to the lifetime of the pass object, which ends once the pass is done. The simplest change would be to allocate that object on the heap (e.g. with new) and return a pointer to it when calling getAnalysis<firstPass>().getA();.
Please note that using this approach might require manual cleanup if you decide to use a raw pointer.
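One way to avoid the manual cleanup is shared ownership. The sketch below is plain C++ (C++14 or later assumed) and not LLVM API: Producer, Consumer, BlockInfo, ResultMap and makeConsumer are hypothetical names. It only illustrates that a heap-allocated result held through std::shared_ptr can outlive the object that produced it.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payload, standing in for the per-function block data.
struct BlockInfo { std::vector<int> blockIds; };
using ResultMap = std::map<std::string, std::unique_ptr<BlockInfo>>;

// Hypothetical producer: keeps its result on the heap via shared_ptr.
class Producer {
  std::shared_ptr<ResultMap> result = std::make_shared<ResultMap>();
public:
  void analyze(const std::string &fn, std::vector<int> ids) {
    auto info = std::make_unique<BlockInfo>();
    info->blockIds = std::move(ids);
    (*result)[fn] = std::move(info);
  }
  // Handing out a shared_ptr keeps the map alive for every holder.
  std::shared_ptr<const ResultMap> getA() const { return result; }
};

// Hypothetical consumer: keeps its own reference to the result.
class Consumer {
  std::shared_ptr<const ResultMap> a;
public:
  explicit Consumer(std::shared_ptr<const ResultMap> r) : a(std::move(r)) {}
  std::size_t blocksIn(const std::string &fn) const {
    auto it = a->find(fn);
    return it == a->end() ? 0 : it->second->blockIds.size();
  }
};

// Builds a consumer whose data outlives the producer that created it.
Consumer makeConsumer() {
  Producer p;
  p.analyze("main", {1, 2, 3});
  return Consumer(p.getA());
}  // p is destroyed here; the map survives through the shared_ptr.
```

Whether this maps cleanly onto a real pass depends on how the pass manager schedules and releases passes, so treat it as an illustration of the ownership idea only.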

LLVM unable to get a required analysis

I am writing a pass that needs information about loops. Therefore I am overriding getAnalysisUsage(AnalysisUsage&) to let the pass manager know that my pass depends on LoopInfoWrapperPass. However, when I try to get the result of that analysis, LLVM asserts that the analysis wasn't required by my pass. Here's a simple pass that I'm having trouble with:
#include <llvm/Pass.h>
#include <llvm/Support/raw_ostream.h>
#include <llvm/Analysis/LoopInfo.h>
struct Example : public ModulePass {
static char ID;
Example() : ModulePass(ID) {}
bool runOnModule(Module& M) override {
errs() << "what\n";
LoopInfo& loops = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
loops.print(errs());
return false;
}
virtual void getAnalysisUsage(AnalysisUsage& AU) const override {
errs() << "here\n";
AU.addRequired<LoopInfoWrapperPass>();
}
};
char Example::ID = 0;
static RegisterPass<Example> X("example", "an example", false, false);
When I run this pass, the two debug statements are printed in the correct order (here then what) but when getAnalysis<LoopInfoWrapperPass>() is called, I get this assertion error:
opt: /home/matt/llvm/llvm/include/llvm/PassAnalysisSupport.h:211: AnalysisType& llvm::Pass::getAnalysisID(llvm::AnalysisID) const [with AnalysisType = llvm::LoopInfoWrapperPass; llvm::AnalysisID = const void*]: Assertion `ResultPass && "getAnalysis*() called on an analysis that was not " "'required' by pass!"' failed.
This is the same method that is given in LLVM's documentation on writing passes, so I'm not quite sure what's going wrong here. Could anyone point me in the right direction?
LoopInfoWrapperPass is derived from FunctionPass. Your Example class, however, derives from ModulePass. It works at the module level, so you need to tell LoopInfoWrapperPass which function you want to analyze. Basically, you might want to loop over every function f in the module and use getAnalysis<LoopInfoWrapperPass>(f).
Alternatively, the easiest way to fix the code above is to replace ModulePass with FunctionPass and runOnModule(Module& M) with runOnFunction(Function& F). Then, getAnalysis<LoopInfoWrapperPass>() should work just fine.

Find parent of a declaration in Clang AST

I'm using clang to do some analysis and I need to find the parent of a declaration in the AST. For instance, in the following code I have int x and I want to get its parent, which should be the function declaration:
int main(int x) { return 0; }
I know as mentioned in this link http://comments.gmane.org/gmane.comp.compilers.clang.devel/2152 there is a ParentMap class to track parent nodes. However, this just represents a map from Stmt* -> Stmt* and I need to find parent of a declaration. Does anyone know how I could do this?
You can use ASTContext::getParents() to find the parent of an AST node.
Example code:
const Stmt* ST = str;
while (true) {
//get parents
const auto& parents = pContext->getParents(*ST);
if ( parents.empty() ) {
llvm::errs() << "Can not find parent\n";
return false;
}
llvm::errs() << "find parent size=" << parents.size() << "\n";
ST = parents[0].get<Stmt>();
if (!ST)
return false;
ST->dump();
if (isa<CompoundStmt>(ST))
break;
}
ASTContext::getParents() accepts either a Stmt or a Decl argument.
The ParentMap described in the linked thread is exactly what you are looking for. In Clang, the specific declaration classes all inherit from clang::Decl, which provides
virtual Stmt* getBody() const;
Alternatively, you might also be happy with the ready-made AST matchers, which make creating queries on the AST much easier. The clang-tidy checks make heavy use of them and are pretty easy to follow; see their sources.
Regarding the parent of a FunctionDecl, there is something to note: a function declaration may be a member of a class, or it may be a free-standing declaration.
If the FunctionDecl is a member of a class, it is a CXXMethodDecl, which you can check (given FunctionDecl *FD) with:
isa<CXXMethodDecl>(FD)
You can then get the enclosing class with CXXMethodDecl's getParent() method, which returns the CXXRecordDecl. FunctionDecl has no equivalent that returns the class.

Struct expression parameter vs. type parameter

I'm making an input range to iterate over a custom container that holds data points that need to remain accurately paired as inputs and targets. I need different Ranges for returning training data (double[][]), inputs (double[]) and the targets (also double[]). I managed to get the following code to compile and work perfectly, but I don't know why.
public struct DataRange(string type)
if( type == "TrainingData" ||
type == "InputData" ||
type == "TargetData" )
{
private immutable(int) length;
private uint next;
private Data data;
this(Data d){
this.length = d.numPoints;
this.next = 0;
this.data = d;
}
@property bool empty(){return next == length;}
@property auto front(){
static if(type == "TrainingData")
return this.data.getTrainingData(next);
else static if(type == "InputData")
return this.data.getInputData(next);
else return this.data.getTargetData(next);
}
void popFront(){++next;}
}
static assert(isInputRange!(DataRange!"TrainingData"));
static assert(isInputRange!(DataRange!"InputData"));
static assert(isInputRange!(DataRange!"TargetData"));
I've been reading "The D Programming Language" by Alexandrescu, and I have found parameterized structs of the form
struct S(T){...} // or
struct S(T[]){...}
but these take type parameters, not expressions like I've done. I haven't been able to find any similar examples on dlang.org with parameterized types.
This compiles and works on DMD 2.066 and GDC 4.9.0.
I don't even know why I tried this, and looking back at it I don't know why it works. Anybody know what I'm missing? Where is this documented?
OK, I found the answer. Although this wasn't specifically mentioned or described in any of the tutorials or anywhere in the book, I was eventually able to find it at http://dlang.org/template.html. Basically, there are two things going on here.
1.) Though my code says struct, this is really a template (that results in a struct). I have seen examples of this online and in the book, though it wasn't described as a template. It was a bit confusing because I didn't use the template keyword, and in the book they are described as "parameterized."
2.) From the website linked above...
Template parameters can be types, values, symbols, or tuples
So in my case the template parameter was a value (a string). The examples in the book used types.
Digging into the language specifications on the website reveals there is a lot more going on than is covered in the book!
Alternatively, you could use an enum to simplify the constraint so that a wrong template instantiation is impossible (even though the template constraint in your code already handles it perfectly). Example:
enum rangeKind{training, input, target};
public struct DataRange(rangeKind Kind)
{
}
void main(string[] args)
{
DataRange!(rangeKind.training) dr;
}

Syntax for std::binary_function usage

I'm a newbie at using the STL algorithms and am currently stuck on a syntax error. My overall goal is to filter the source list the way you would with LINQ in C#. There may be other ways to do this in C++, but I need to understand how to use the algorithms.
My user-defined function object to use as my function adapter is
struct is_Selected_Source : public std::binary_function<SOURCE_DATA *, SOURCE_TYPE, bool>
{
bool operator()(SOURCE_DATA * test, SOURCE_TYPE ref)const
{
if (ref == SOURCE_All)
return true;
return test->Value == ref;
}
};
And in my main program, I'm using as follows -
typedef std::list<SOURCE_DATA *> LIST;
LIST; *localList = new LIST;;
LIST* msg = GLOBAL_DATA->MessageList;
SOURCE_TYPE _filter_Msgs_Source = SOURCE_TYPE::SOURCE_All;
std::remove_copy(msg->begin(), msg->end(), localList->begin(),
std::bind1st(is_Selected_Source<SOURCE_DATA*, SOURCE_TYPE>(), _filter_Msgs_Source));
I'm getting the following error in Rad Studio 2010. The error means "Your source file used a typedef symbol where a variable should appear in an expression."
"E2108 Improper use of typedef 'is_Selected_Source'"
Edit -
After more experimentation in VS2010, which has better compiler diagnostics, I found that the problem is that the definition of remove_copy only allows unary functions. I changed the function to a unary one and got it to work.
(This is only relevant if you didn't accidentally omit some of your code from the question, and may not address the exact problem you're having)
You're using is_Selected_Source as a template even though you didn't define it as one. The last line in the 2nd code snippet should read std::bind1st(is_Selected_Source()...
Or perhaps you did want to use it as a template, in which case you need to add a template declaration to the struct.
template<typename SOURCE_DATA, typename SOURCE_TYPE>
struct is_Selected_Source : public std::binary_function<SOURCE_DATA *, SOURCE_TYPE, bool>
{
// ...
};
At a guess (though it's only a guess) the problem is that std::remove_copy expects a value, but you're supplying a predicate. To use a predicate, you want to use std::remove_copy_if (and then you'll want to heed @Cogwheel's answer).
I'd also note that:
LIST; *localList = new LIST;;
Looks wrong -- I'd guess you intended:
LIST *localList = new LIST;
instead.
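Putting these answers together, here is a hedged sketch of the filter, assuming a C++11 or later compiler rather than the ones named in the question. SourceType, SourceData and filterBySource are simplified stand-ins for the question's types, not the original code. It uses std::copy_if with a lambda in place of std::bind1st (deprecated in C++11 and removed in C++17), and std::back_inserter because writing through begin() of an empty list is undefined behavior. Note that std::remove_copy_if drops the elements for which the predicate is true, so keeping the selected entries needs either std::copy_if or a negated predicate.

```cpp
#include <algorithm>
#include <iterator>
#include <list>

// Simplified stand-ins for the question's SOURCE_TYPE and SOURCE_DATA.
enum class SourceType { All, Network, File };
struct SourceData { SourceType Value; };

// Copies the entries whose Value matches `filter` (or all, for All).
std::list<const SourceData *> filterBySource(
    const std::list<const SourceData *> &messages, SourceType filter) {
  std::list<const SourceData *> selected;
  std::copy_if(messages.begin(), messages.end(),
               std::back_inserter(selected),
               [filter](const SourceData *msg) {
                 return filter == SourceType::All || msg->Value == filter;
               });
  return selected;
}
```

Returning the list by value also avoids the new LIST allocation from the question, so there is nothing to delete afterwards.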