Get filename and location from Function

I have an LLVM pass that iterates over LLVM IR, and I would like to get the directory and filename of the original source code for both functions and basic blocks. I know that when I have an Instruction pointer, I can easily get this information using the code below (thanks to @hailinzeng, see "How to get the filename and directory from a LLVM Instruction?"):
const llvm::DebugLoc &location = i_iter->getDebugLoc();
if (location && debugLocationInfoOn) {
    std::string dbgInfo;
    llvm::raw_string_ostream rso(dbgInfo);
    location.print(rso);
    std::cout << rso.str();
}
However, since the classes Function and BasicBlock do not have a getDebugLoc() member function, this doesn't work for them. I saw another post here that uses the metadata, but I do not know how to get to the DILocation or DIScope from the metadata. Using
MDNode *n = inst->getMetadata("dbg");
DILocation loc(n);
gives the error below:
/usr/lib/llvm-3.9/include/llvm/IR/Metadata.def:83:42: note: forward declaration of 'llvm::DILocation'
HANDLE_SPECIALIZED_MDNODE_LEAF_UNIQUABLE(DILocation)
I'm using llvm 3.9.
UPDATE:
Thanks, Stanislav Pankevich. I wasn't including the right headers, but now I have a new issue. The DILocation constructor requires an LLVMContext, a StorageType, and an unsigned Line. How do I get the line number and storage type from a function pointer?
DILocation(LLVMContext &C, StorageType Storage, unsigned Line,
For those working with a similar issue, you can get LLVMContext using
llvm::MDNode * testmd = F.getMetadata("dbg");
F.getContext ()

If you look at the .ll file for your code, you will see that every function has a DINode associated with it, written as !<some_number>. That is the number of the metadata node that holds the debug info for that function. The type of that node is DISubprogram. You can access it like this:
SmallVector<std::pair<unsigned, MDNode *>, 4> MDs;
F.getAllMetadata(MDs);
for (auto &MD : MDs) {
    if (MDNode *N = MD.second) {
        if (auto *subProgram = dyn_cast<DISubprogram>(N)) {
            errs() << subProgram->getLine();
        }
    }
}
You can use all the information that is there in the debug node.

What if we want column details, which DISubprogram does not provide?
I tried this:
DILocation *debugLocation = dyn_cast<DILocation>(N);
debugLocation->getLine();
The sample.ll file does contain these lines:
!10 = !DILocation(line: 1, column: 1, scope: !1)
However, it crashes with a core dump at run time. Any suggestions on how to get it working?

Related

Workflow of LLVM and clang

I am just a beginner in LLVM; this webpage (https://www.cs.cornell.edu/~asampson/blog/llvm.html), Stack Overflow, and a fellow researcher have helped me a lot.
I would first like to illustrate what I am trying to work on (the problem) and then I will describe the approach that I have taken to work on the problem.
Then, I need your advice and guidance if I am missing anything.
Work Problem
My input is a C program, and the output is its SSA form in prefix representation, printed to an output file.
For example, if the C code segment is:
x=4;
x++;
z=x+7;
The output SSA form in prefix representation is :
( = x0 4)
( = x1 (+ x0 1) )
( = z (+ x1 7) )
Please ignore the actual IR instruction for now, just assume that I am able to read the IR and convert it to this form, with some extra statements (which I am not presenting here for readability).
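To make the target format concrete, here is a minimal sketch of the renaming step for straight-line code only. It is not the poster's convertFunctionToSSA: the Assign struct and the toSsaPrefix function are hypothetical names invented for this illustration, it works on a toy statement list rather than on LLVM IR, and it does not handle branches or phi nodes. Unlike the hand-written output above, it numbers every definition (so z becomes z0) and prints without the extra spaces.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical statement shape for this sketch: dest = lhs op rhs,
// or dest = lhs when op is empty (for example x = 4).
struct Assign {
    std::string dest;
    std::string op;   // "" for a plain copy
    std::string lhs;
    std::string rhs;  // unused when op is empty
};

// Returns the current SSA name of an operand. Variables that have already
// been assigned get their latest version number appended; constants and
// never-assigned names are returned unchanged.
static std::string useName(const std::map<std::string, int>& version,
                           const std::string& operand) {
    auto it = version.find(operand);
    if (it == version.end()) return operand;
    return operand + std::to_string(it->second);
}

// Converts straight-line assignments into SSA form in prefix notation.
std::vector<std::string> toSsaPrefix(const std::vector<Assign>& program) {
    std::map<std::string, int> version;  // latest version per variable
    std::vector<std::string> out;
    for (const Assign& a : program) {
        // Resolve the operands before bumping the destination version, so
        // that x = x + 1 reads the old x and defines the new one.
        std::string value = a.op.empty()
            ? useName(version, a.lhs)
            : "(" + a.op + " " + useName(version, a.lhs) + " " +
                  useName(version, a.rhs) + ")";
        auto it = version.find(a.dest);
        int next = (it == version.end()) ? 0 : it->second + 1;
        version[a.dest] = next;
        out.push_back("(= " + a.dest + std::to_string(next) + " " + value + ")");
    }
    return out;
}
```

Feeding it the three statements above, with x++ written as x = x + 1, yields (= x0 4), (= x1 (+ x0 1)) and (= z0 (+ x1 7)).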
My ignorant Approach of using LLVM (Please find the complete program below)
using namespace llvm;
namespace {
struct TestPass : public ModulePass {
    IRssa::ptr ir_ssa = IRssa::ptr(new IRssa());
    static char ID;
    typedef std::list<std::pair<std::string, std::list<Instruction *> > > funcDump;
    TestPass() : ModulePass(ID) { }
    std::map<std::string, funcDump> workingList;
    bool runOnModule(Module &M) {
        std::string funcName, bkName;
        for (Function &F : M) { //Found a new Function
            if (isa<Function>(F) && !(F.getName().startswith("llvm."))) {
                funcName = F.getName();
                std::pair<std::string, std::list<Instruction *> > funcBlockList;
                std::list<std::pair<std::string, std::list<Instruction *> > > wholeFuncBlocks;
                for (BasicBlock &B : F) { //Blocks of the Function
                    if (isa<BasicBlock>(B)) {
                        bkName = B.getName();
                    }
                    std::list<Instruction *> listInst;
                    for (auto &I : B) {
                        Instruction *ins;
                        ins = &I;
                        listInst.push_back(ins);
                    }
                    funcBlockList.first = bkName;
                    funcBlockList.second = listInst;
                    wholeFuncBlocks.push_back(funcBlockList);
                }
                workingList[funcName] = wholeFuncBlocks; //Mapping of the functions
            }
        }
        ir_ssa->setFunctionDump(workingList);
        funcDump funcData;
        funcData = workingList["start_program"]; //Starting from the start_program function
        convertFunctionToSSA(funcData, ir_ssa);
        std::ofstream outFile;
        outFile.open("Out.ssa");
        printSSA_toFile(outFile, ir_ssa);
        return false;
    }
};
}
char TestPass::ID = 0;
static RegisterPass<TestPass> X("testPass", "Testing A Pass");
static void registerTestPass(const PassManagerBuilder &, legacy::PassManagerBase &PM) {
    PM.add(new TestPass());
}
static RegisterStandardPasses RegisterMyPass(PassManagerBuilder::EP_ModuleOptimizerEarly, registerTestPass);
static RegisterStandardPasses RegisterMyPass0(PassManagerBuilder::EP_EnabledOnOptLevel0, registerTestPass);
//Automatically enable the pass (http://adriansampson.net/blog/clangpass.html)
Description:
As shown above, runOnModule() collects the IR instructions of every block of every function in the program into the workingList data structure (a std::map in this case). After all the functions in the given program have been read, I do my required task of reading the IR instructions one at a time, function by function and block by block. This happens in the user-defined function convertFunctionToSSA(funcData, ir_ssa), which takes the whole function's IR as an argument and returns the result of processing it in the argument ir_ssa. I also print the resulting value from ir_ssa to the output file outFile.
Now How do I Run (I type the following)
clang -O1 -g -Xclang -emit-llvm -c someProgram.c -o test.bc
opt -O1 -instnamer -mem2reg -simplifycfg -loops -lcssa -loop-simplify -loop-rotate -loop-unroll -unroll-count=15 -unroll-allow-partial -load src/libTestPass.so -testPass test.bc -o test
My Expectation
I assume (as per my understanding) that the above two commands do the following.
First, clang takes the program someProgram.c and generates IR as the output file "test.bc".
Next, opt takes the file "test.bc" and applies all the above passes one by one up to the last option, "-unroll-allow-partial". It also loads my library libTestPass.so (this .so file is generated by compiling the above ModulePass program) and finally runs the pass "-testPass", which I think is the pass where I do my processing (converting to the SSA prefix representation).
Your Advice and Comments
I am not sure if LLVM is actually running in the sequence as I am assuming (My Expectation). Kindly comment if I am missing anything or if my assumption is not correct. Also please feel free to ask more details if necessary.
Current Problem Faced
I am able to successfully convert most C programs, but on one specific program I am stuck with an error. Debugging this error led me to think that I am missing something, or that my assumption about how LLVM works with regard to the calling order of clang and opt is not correct.
Your help is highly appreciated.

Forward declaration of function LLVM

I'm trying to use forward declarations of functions in LLVM, but I'm not able to do it. The reason I need this is the following error:
error: invalid forward reference to function 'f' with wrong type!
Right now I'm trying to do it with this code:
std::vector<Type *> args_type = f->get_args_type();
Module* mod = get_module();
std::string struct_name("struct.");
struct_name.append(f->get_name());
Type* StructTy = mod->getTypeByName(struct_name);
if (!StructTy) {
    StructTy = Type::getVoidTy(getGlobalContext());
}
FunctionType *ftype = FunctionType::get(StructTy, args_type, false);
//Function *func = Function::Create(ftype, GlobalValue::InternalLinkage, f->get_name(), get_module());
Constant* c = mod->getOrInsertFunction(f->get_name(), ftype);
Function *func = cast<Function>(c);
But the declaration does not show up in the IR when I generate the code. When I create the function again using the same code shown above, it works. I wonder if that is because I insert a BasicBlock right afterwards, when I start inserting things into the function.
Right now this is what my IR looks like:
define internal void @main() {
entry:
...
}
define internal %struct.f @f(i32* %x) {
entry:
...
}
I believe that putting a declare %struct.f @f(i32*) before the @main function would fix this issue, but I can't figure out how to do it.
Summary: I just want to create a declare at the top of the file, so that I can define the function later and start inserting its instructions.

Ok, it seems LLVM does that 'automatically'.
I just realized that the functions changed their order when I ran the code again. So, if you create a function beforehand, even without inserting any code (a body), LLVM creates the prototype and waits for the body to be supplied later, as long as you reference this function with the getOrInsertFunction() method of the Module class.
I don't know if this is the right answer or if it's clear, but it solved my problem...

How can I get the name of function from StoreInst's Value In LLVM

I have a structure and it has a pointer to function as follows.
typedef struct
{
void (*p)();
int n;
} myStruct;
I used it as follows:
myStruct * a = malloc( sizeof(myStruct));
a->n=88;
a->p = &booooo;
a->p();
In LLVM, how can I get the name of the function (booooo) and of the struct element (a->p), so that I can save them in a symbol table and print them later?
I could find the function in the StoreInst.
When I print its value I get this result:
void (...)* bitcast (void ()* @booooo to void (...)*)
How can I get only the name (booooo) from the value?

There are (at least) two kinds of casts in LLVM IR: BitCastInst instructions and bitcast values (constant expressions). You have the latter. Fortunately, there is a method for retrieving the original value within the bitcast: stripPointerCasts(). It took me some time to figure out this distinction.
Here is my usage of the routine, where I was trying to identify the function called (BasicBlock::iterator I):
if (CallInst *ci = dyn_cast<CallInst>(&*I)) {
    Function *f = ci->getCalledFunction();
    if (f == NULL)
    {
        // Indirect call or call through a cast: look through pointer casts.
        Value* v = ci->getCalledValue();
        f = dyn_cast<Function>(v->stripPointerCasts());
        if (f == NULL)
        {
            continue;
        }
    }
    const char* fname = f->getName().data();
}

As explained in the previous question asking the same thing [marginally different], you are better off using the AST form that the Clang compiler produces, rather than the LLVM IR form. It is a much more direct representation of the C or C++ code than the LLVM IR, and easier to work with in general.
But from the StoreInst you can use getValueOperand to get the value that is being stored, and then getName of the value. Of course, as I also said in the comments on the previous answer, it is not hard to write code from which the original stored value is difficult to derive.
In other words, if we have an llvm::Instruction *inst, we could do this:
if (llvm::StoreInst* si = llvm::dyn_cast<llvm::StoreInst>(inst))
{
    std::string name = si->getValueOperand()->getName();
}
[Code is not tested, not compiled, no guarantee provided, I just wrote it as part of this answer with the intention that it may work]

How to use VARIANT* with dynamicCall?

I'm trying to use a COM object and I'm having a problem with the parameter type VARIANT*. I can use the functions of the COM object just fine, except when they have a parameter of this type.
The doc generated by generateDocumentation is :
QVariantList params = ...
object->dynamicCall("GetRanges(int,int,int&, QVariant&)", params);
According to the doc provided with the COM object, the parameters should be of type LONG, LONG, LONG* and VARIANT*, and it is specified that the VARIANT* is a pointer to a VARIANT containing an array of BSTR.
I should normally be able to retrieve the third and fourth parameter (of type LONG* and VARIANT*), and their values are not used by the function.
Here is my code (a and b are int previously initialized):
QStringList sl;
QVariantList params;
int i = -1;
params << QVariant (a);
params << QVariant (b);
params << QVariant (i);
params << QVariant (sl);
comobject->dynamicCall("GetRanges(int,int,int&,QVariant&)",params);
sl = params[3].toStringList();
i = params[2].toInt();
Now with that code, all I get is an error QAxBase: Error calling IDispatch member GetRanges: Unknown error, which is not very helpful.
I tried to change some things and I managed to progress (sort of) by using this code :
QStringList sl;
QVariant v = qVariantFromValue(sl);
QVariantList params;
int i = -1;
params << QVariant (a);
params << QVariant (b);
params << QVariant (i);
params << qVariantFromValue((void*)&v);
comobject->dynamicCall("GetRanges(int,int,int&,QVariant&)",params);
sl = params[3].toStringList();
i = params[2].toInt();
It gets rid of the error, and the value of i is correct at the end, but sl is still empty. And I know it should not be, because I have a sample demo in C# that works correctly.
So if anyone has an idea on how to make it work...
Otherwise, I looked around a bit and saw that it is also possible to query the interface and use it directly, but I didn't understand much of it, and I'm not sure it would solve my problem.
I'm on a Windows7 64 bits platform, and I'm using msvc2012 as compiler. I'm using Qt 5.1.0 right now, but it didn't work in the 5.0.2 either.

I guess you really can't do it with dynamicCall.
I finally found how to do it. It was easier than I'd thought. With the installation of Qt comes a tool called dumpcpp. Its full path for me was C:\Qt\Qt5.1.0x86\5.1.0\msvc2012\bin\dumpcpp.exe (obviously depends on settings). You can just add the bin folder to your path to make it easier to use.
Then I went into my project folder and executed this command :
dumpcpp -nometaobject {00062FFF-0000-0000-C000-000000000046} (the CLSID is just for the example, not the one I used)
It creates a header file, you can include it in the file where you're trying to use the COM Object.
In my case this file contained two classes (IClassMeasurement and ClassMeasurement) in a namespace (MeasurementLib). Again, the names are not the real ones.
In your initial project file, you can call the desired function like this :
MeasurementLib::ClassMeasurement test; //Do not use IClassMeasurement, you only get write access violations
QVariant rangesVar;
int p1 = 0;
int p2 = 0;
int p3 = 0;
test.getRanges(p1,p2,p3,rangesVar);
QStringList ranges = rangesVar.toStringList();
Hope this helps someone!

In gdb, I can call some class functions, but others "cannot be resolved". Why?

I have not worked on shared pointers yet .. I just know the concept. I'm trying to debug functions in the following c++ class, which stores data of an XML file (read-in via the xerces library).
// header file
class ParamNode;
typedef boost::shared_ptr<ParamNode> PtrParamNode;
class ParamNode : public boost::enable_shared_from_this<ParamNode> {
public:
    ...
    typedef enum { DEFAULT, EX, PASS, INSERT, APPEND } ActionType;
    bool HasChildren() const;
    PtrParamNode GetChildren();
    PtrParamNode Get(const std::string& name, ActionType = DEFAULT );
protected:
    ....
    ActionType defaultAction_;
};
Now if I'm debugging a piece of code in which I have an instance of the pointer to the class ParamNode, and it's called paramNode_
PtrParamNode paramNode_;
// read xml file with xerces
paramNode_ = xerces->CreateParamNodeInstance();
// now, the xml data is stored in paramNode_.
std::string probGeo;
// this works in the code, but not in (gdb)!!
paramNode_->Get("domain")->GetValue("gt",probGeo);
cout << probGeo << endl; // <-- breakpoint HERE
Using gdb I'm inspecting the paramNode_ object:
(gdb) p paramNode_
$29 = {px = 0x295df70, pn = {pi_ = 0x2957ac0}}
(gdb) p *paramNode_.px
$30 = {
<boost::enable_shared_from_this<mainclass::ParamNode>> = {weak_this_ = {px = 0x295df70, pn = {pi_ = 0x2957ac0}}},
_vptr.ParamNode = 0x20d5ad0 <vtable for mainclass::ParamNode+16>,
...
name_= {...},
children_ = {size_ = 6, capacity_ = 8, data_ = 0x2969798},
defaultAction_ = mainclass::ParamNode::EX, }
and print its members:
(gdb) ptype *paramNode_.px
type = class mainclass::ParamNode : public boost::enable_shared_from_this<mainclass::ParamNode> {
protected:
...
mainclass::ParamNode::ActionType defaultAction_;
public:
bool HasChildren(void) const;
mainclass::PtrParamNode GetChildren(void);
mainclass::PtrParamNode Get(const std::string &, mainclass::ParamNode::ActionType);
However, I can only call the functions HasChildren or GetChildren, whereas calling Get from gdb results in an error:
(gdb) p (paramNode_.px)->HasChildren()
$7 = true
(gdb) p (paramNode_.px)->GetChildren()
$8 = (mainclass::ParamNodeList &) @0x295dfb8: {
size_ = 6,
capacity_ = 8,
data_ = 0x29697a8
}
(gdb) p (paramNode_.px)->Get("domain")
Cannot resolve method mainclass::ParamNode::Get to any overloaded instance
(gdb) set overload-resolution off
(gdb) p (paramNode_.px)->Get("domain")
One of the arguments you tried to pass to Get could not be converted to what the function wants.
(gdb) p (paramNode_.px)->Get("domain", (paramNode_.px).defaultAction_)
One of the arguments you tried to pass to Get could not be converted to what the function wants.
In the code, executing the Get("domain") function works just fine. Why is that? I'm thankful if you include explanations in your answer, due to my limited knowledge of shared pointers.

gdb is not a compiler, it will not do the (not-so-)nice user-defined type conversions for you. If you wish to call a function that wants a string, you need to give it a string, not a const char*.
Unfortunately, gdb cannot construct an std::string for you on the command line, again because it is not a compiler and object creation is not a simple function call.
So you will have to add a little helper function to your program that takes a const char* and returns an std::string&. Note the reference here. It cannot return by value, because then gdb will not be able to pass the result by const reference (it's not a compiler!). You can choose to return a reference to a static object, or to an object allocated on the heap. In the latter case it will leak memory, but this is not a big deal since the function is meant to be called only from the debugger anyway.
std::string& SSS (const char* s)
{
    return *(new std::string(s));
}
Then in gdb
gdb> p (paramNode_.px)->Get(SSS("domain"))
should work.
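For the other option mentioned above, returning a reference to a static object, a minimal sketch could look like the following. The name SSS_static is made up for this illustration. It avoids the leak, at the cost that every call overwrites the same string, so it is only suitable for passing one string argument at a time.

```cpp
#include <string>

// Hypothetical debugger-only helper: copies the C string into a std::string
// with static storage duration and returns a reference to it. Nothing is
// leaked, but the next call overwrites the previous contents.
std::string& SSS_static(const char* s)
{
    static std::string holder;
    holder = s;
    return holder;
}
```

In gdb this would be used the same way, for example p (paramNode_.px)->Get(SSS_static("domain"), mainclass::ParamNode::DEFAULT). The second argument is spelled out here because, as another answer on this page points out, gdb does not understand default arguments. The helper also has to actually be linked into the binary being debugged for gdb to find it.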

In such a situation I just had success after giving the command
set overload-resolution off

A couple additions to the previous answer --
gdb will probably eventually learn how to do conversions like this. It can't now; but there is active work on improving support for C++ expression parsing.
gdb doesn't understand default arguments, either. This is partly a bug in gdb; but also partly a bug in g++, which doesn't emit them into the DWARF. I think DWARF doesn't even define a way to emit non-trivial default arguments.

n.m.'s answer is great, but to avoid having to edit your code and recompile take a look at the solution given in Creating C++ string in GDB.
In that solution they demonstrate allocating space for a std::string on the heap and then initializing a std::string to pass into the function they'd like to call from gdb.
