How to find out shared variables among functions by using LLVM API? - llvm

I have recently been using the LLVM API to analyze C++ programs. Now I want to find the variables shared among different functions. Is there any way to do that? AliasAnalysis does not seem to work.
I wrote a FunctionPass as follows:
bool EscapeAnalysis::runOnFunction(Function& F) {
EscapePoints.clear();
TargetData& TD = getAnalysis<TargetData>();
AliasAnalysis& AA = getAnalysis<AliasAnalysis>();
Module* M = F.getParent();
// errs() << *M << "\n";
// Walk through all instructions in the function, identifying those that
// may allow their inputs to escape.
for(inst_iterator II = inst_begin(F), IE = inst_end(F); II != IE; ++II) {
Instruction* I = &*II;
// The most obvious case is stores. Any store that may write to global
// memory or to a function argument potentially allows its input to escape.
if (StoreInst* S = dyn_cast<StoreInst>(I)) {
Type* StoreType = S->getOperand(0)->getType();
unsigned StoreSize = TD.getTypeStoreSize(StoreType);
Value* Pointer = S->getPointerOperand();
bool inserted = false;
for (Function::arg_iterator AI = F.arg_begin(), AE = F.arg_end();
AI != AE; ++AI) {
if (!isa<PointerType>(AI->getType())) continue;
AliasAnalysis::AliasResult R = AA.alias(Pointer, StoreSize, AI, ~0UL);
if (R != AliasAnalysis::NoAlias) {
EscapePoints.insert(S);
inserted = true;
break;
}
}
if (inserted)
continue;
for (Module::global_iterator GI = M->global_begin(), GE = M->global_end();
GI != GE; ++GI) {
errs() << *GI << "\n";
AliasAnalysis::AliasResult R = AA.alias(Pointer, StoreSize, GI, ~0UL);
errs() << "R: " << R << " , NoAlias: " << AliasAnalysis::NoAlias << "\n";
if (R != AliasAnalysis::NoAlias) {
EscapePoints.insert(S);
break;
}
}
// Calls and invokes potentially allow their parameters to escape.
// FIXME: This can and should be refined. Intrinsics have known escape
// behavior, and alias analysis may be able to tell us more about callees.
} else if (isa<CallInst>(I) || isa<InvokeInst>(I)) {
EscapePoints.insert(I);
// Returns allow the return value to escape. This is mostly important
// for malloc to alloca promotion.
} else if (isa<ReturnInst>(I)) {
EscapePoints.insert(I);
// Branching on the value of a pointer may allow the value to escape through
// methods not discoverable via def-use chaining.
} else if(isa<BranchInst>(I) || isa<SwitchInst>(I)) {
EscapePoints.insert(I);
}
// FIXME: Are there any other possible escape points?
}
return false;
}
I tested it on the following main.cpp:
#include <iostream>
using namespace std;
int X = 0;
int foo() {
X = 1;
int b = 1;
return 0;
}
int bar(int param) {
int y = X;
int z = 9;
int a = z;
++a;
return 0;
}
int main(int argc, char *argv[])
{
cout << "Hello world!" << endl;
return 0;
}
The global variable X is shared between the functions foo and bar.
But when I run the pass with the following command:
opt -load ./EscapeAnalysis.so -escape-analysis main.o | llc > main.ss
I get the result:
R: 1 , NoAlias: 0
The result is the same for every query.
When I print the contents of EscapePoints, I find that the variables a, z and y in function bar are included, which is not right.
Note: I wrote an opt pass to test the program.

Alias analysis is required if you want to identify when two different variables might point to the same memory. If you just want to check which variables are shared with other functions in the same module, you can:
Iterate over all instructions, and for each:
  Iterate over all its operands, and for each:
    Check whether it's a GlobalVariable (via isa, for instance), and if so:
      Iterate over all the global's uses (via use_begin and use_end), and for each:
        Check whether it's an Instruction, and if so:
          Retrieve the enclosing function (via getParent()->getParent()), and for that function:
            Check whether it is the currently-processed function. If not, it means you found a variable shared between the current function and another function.
There are also other ways of checking this, for example going over all the globals in the current module.
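The traversal described above can be sketched independently of LLVM. The block below is only a model of the logic, not LLVM code: Instr, Func and sharedGlobals are hypothetical stand-in names, and globals are represented by their names. In a real pass, the inner search over the other functions would be replaced by walking the global's use list and calling getParent()->getParent() on each using instruction, as the steps above describe.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Stand-in types (NOT the LLVM classes): a function is a list of
// instructions, and each instruction lists the globals it uses as operands.
struct Instr { std::vector<std::string> globalOperands; };
struct Func  { std::string name; std::vector<Instr> body; };

// Returns the globals that `current` uses and that at least one other
// function in `module` uses as well.
std::set<std::string> sharedGlobals(const Func& current,
                                    const std::vector<Func>& module)
{
    std::set<std::string> shared;
    for (const Instr& inst : current.body) {
        for (const std::string& global : inst.globalOperands) {
            // Model of "iterate over all the global's uses".
            for (const Func& other : module) {
                if (other.name == current.name)
                    continue; // a use inside the current function does not count
                for (const Instr& use : other.body)
                    for (const std::string& operand : use.globalOperands)
                        if (operand == global)
                            shared.insert(global);
            }
        }
    }
    return shared;
}
```

With the question's main.cpp modeled this way, foo and bar both reference X, so X would be reported for either of them, while the locals never appear because they are not globals.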

Make compiler assume that all cases are handled in switch without default

Let's start with some code. This is an extremely simplified version of my program.
#include <stdint.h>
volatile uint16_t dummyColorRecepient;
void updateColor(const uint8_t iteration)
{
uint16_t colorData;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2:
colorData = 345;
break;
}
dummyColorRecepient = colorData;
}
// dummy main function
int main()
{
uint8_t iteration = 0;
while (true)
{
updateColor(iteration);
if (++iteration == 3)
iteration = 0;
}
}
The program compiles with a warning:
./test.cpp: In function ‘void updateColor(uint8_t)’:
./test.cpp:20:25: warning: ‘colorData’ may be used uninitialized in this function [-Wmaybe-uninitialized]
dummyColorRecepient = colorData;
~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
As you can see, there is an absolute certainty that the variable iteration is always 0, 1 or 2. However, the compiler doesn't know that and it assumes that switch may not initialize colorData. (Any amount of static analysis during compilation won't help here because the real program is spread over multiple files.)
Of course I could just add a default statement, like default: colorData = 0; but this adds additional 24 bytes to the program. This is a program for a microcontroller and I have very strict limits for its size.
I would like to inform the compiler that this switch is guaranteed to cover all possible values of iteration.
"As you can see, there is an absolute certainty that the variable iteration is always 0, 1 or 2."
From the perspective of the toolchain, this is not true. You can call this function from someplace else, even from another translation unit. The only place where your constraint is enforced is in main, and even there it's done in such a way that it might be difficult for the compiler to reason about.
For our purposes, though, let's take as read that you're not going to link any other translation units, and that we want to tell the toolchain about that. Well, fortunately, we can!
If you don't mind being unportable, then there's GCC's __builtin_unreachable built-in to inform it that the default case is not expected to be reached, and should be considered unreachable. My GCC is smart enough to know that this means colorData is never going to be left uninitialised unless all bets are off anyway.
#include <stdint.h>
volatile uint16_t dummyColorRecepient;
void updateColor(const uint8_t iteration)
{
uint16_t colorData;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2:
colorData = 345;
break;
// Comment out this default case to get the warnings back!
default:
__builtin_unreachable();
}
dummyColorRecepient = colorData;
}
// dummy main function
int main()
{
uint8_t iteration = 0;
while (true)
{
updateColor(iteration);
if (++iteration == 3)
iteration = 0;
}
}
This won't add an actual default branch, because there's no "code" inside it. In fact, when I plugged this into Godbolt using x86_64 GCC with -O2, the program was smaller with this addition than without it — logically, you've just added a major optimisation hint.
There's actually a proposal to make this a standard attribute in C++ so it could be an even more attractive solution in the future.
Use the "immediately invoked lambda expression" idiom and an assert:
void updateColor(const uint8_t iteration)
{
const auto colorData = [&]() -> uint16_t
{
switch(iteration)
{
case 0: return 123;
case 1: return 234;
}
assert(iteration == 2);
return 345;
}();
dummyColorRecepient = colorData;
}
The lambda expression allows you to mark colorData as const. const variables must always be initialized.
The combination of assert + return statements allows you to avoid warnings and handle all possible cases.
assert doesn't get compiled in release mode, preventing overhead.
You can also factor out the function:
uint16_t getColorData(const uint8_t iteration)
{
switch(iteration)
{
case 0: return 123;
case 1: return 234;
}
assert(iteration == 2);
return 345;
}
void updateColor(const uint8_t iteration)
{
const uint16_t colorData = getColorData(iteration);
dummyColorRecepient = colorData;
}
You can get this to compile without warnings simply by adding a default label to one of the cases:
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
case 2: default:
colorData = 345;
break;
}
Alternatively:
uint16_t colorData = 345;
switch(iteration)
{
case 0:
colorData = 123;
break;
case 1:
colorData = 234;
break;
}
Try both, and use the shorter of the two.
I know there have been some good solutions, but alternatively, if your values are going to be known at compile time, instead of a switch statement you can use constexpr with a static function template and a couple of enumerations; it would look something like this within a single class:
#include <cstdint>
#include <iostream>
class ColorInfo {
public:
enum ColorRecipient {
CR_0 = 0,
CR_1,
CR_2
};
enum ColorType {
CT_0 = 123,
CT_1 = 234,
CT_2 = 345
};
template<const uint8_t Iter>
static constexpr uint16_t updateColor() {
if constexpr (Iter == CR_0) {
std::cout << "ColorData updated to: " << CT_0 << '\n';
return CT_0;
}
if constexpr (Iter == CR_1) {
std::cout << "ColorData updated to: " << CT_1 << '\n';
return CT_1;
}
if constexpr (Iter == CR_2) {
std::cout << "ColorData updated to: " << CT_2 << '\n';
return CT_2;
}
}
};
int main() {
const uint16_t colorRecipient0 = ColorInfo::updateColor<ColorInfo::CR_0>();
const uint16_t colorRecipient1 = ColorInfo::updateColor<ColorInfo::CR_1>();
const uint16_t colorRecipient2 = ColorInfo::updateColor<ColorInfo::CR_2>();
std::cout << "\n--------------------------------\n";
std::cout << "Recipient0: " << colorRecipient0 << '\n'
<< "Recipient1: " << colorRecipient1 << '\n'
<< "Recipient2: " << colorRecipient2 << '\n';
return 0;
}
The cout statements within the if constexpr are only added for testing purposes, but this should illustrate another possible way to do this without having to use a switch statement provided your values will be known at compile time. If these values are generated at runtime I'm not completely sure if there is a way to use constexpr to achieve this type of code structure, but if there is I'd appreciate it if someone else with a little more experience could elaborate on how this could be done with constexpr using runtime values. However, this code is very readable as there are no magic numbers and the code is quite expressive.
-Update-
After reading more about constexpr it has come to my attention that they can be used to generate compile time constants. I also learned that they can not generate runtime constants but they can be used within a runtime function. We can take the above class structure and use it within a runtime function as such by adding this static function to the class:
static uint16_t colorUpdater(const uint8_t input) {
// Don't forget to offset input due to std::cin with ASCII value.
if ( (input - '0') == CR_0)
return updateColor<CR_0>();
if ( (input - '0') == CR_1)
return updateColor<CR_1>();
if ( (input - '0') == CR_2)
return updateColor<CR_2>();
return updateColor<CR_2>(); // Return the default type
}
However, I want to change the naming of the two functions. The first function I will name colorUpdater(), and the new function shown above I will name updateColor(), as this seems more intuitive. So the updated class will now look like this:
class ColorInfo {
public:
enum ColorRecipient {
CR_0 = 0,
CR_1,
CR_2
};
enum ColorType {
CT_0 = 123,
CT_1 = 234,
CT_2 = 345
};
static uint16_t updateColor(uint8_t input) {
if ( (input - '0') == CR_0 ) {
return colorUpdater<CR_0>();
}
if ( (input - '0') == CR_1 ) {
return colorUpdater<CR_1>();
}
if ( (input - '0') == CR_2 ) {
return colorUpdater<CR_2>();
}
return colorUpdater<CR_0>(); // Return the default type
}
template<const uint8_t Iter>
static constexpr uint16_t colorUpdater() {
if constexpr (Iter == CR_0) {
std::cout << "ColorData updated to: " << CT_0 << '\n';
return CT_0;
}
if constexpr (Iter == CR_1) {
std::cout << "ColorData updated to: " << CT_1 << '\n';
return CT_1;
}
if constexpr (Iter == CR_2) {
std::cout << "ColorData updated to: " << CT_2 << '\n';
return CT_2;
}
}
};
If you want to use this with compile time constants only you can use it just as before but with the function's updated name.
#include <iostream>
int main() {
auto output0 = ColorInfo::colorUpdater<ColorInfo::CR_0>();
auto output1 = ColorInfo::colorUpdater<ColorInfo::CR_1>();
auto output2 = ColorInfo::colorUpdater<ColorInfo::CR_2>();
std::cout << "\n--------------------------------\n";
std::cout << "Recipient0: " << output0 << '\n'
<< "Recipient1: " << output1 << '\n'
<< "Recipient2: " << output2 << '\n';
return 0;
}
And if you want to use this mechanism with runtime values you can simply do the following:
int main() {
uint8_t input;
std::cout << "Please enter input value [0,2]\n";
std::cin >> input;
auto output = ColorInfo::updateColor(input);
std::cout << "Output: " << output << '\n';
return 0;
}
And this will work with runtime values.
Well, if you are sure you won't have to handle other possible values, you can just use arithmetic. This gets rid of the branching and the load.
void updateColor(const uint8_t iteration)
{
dummyColorRecepient = 123 + 111 * iteration;
}
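A quick way to gain confidence in such a formula is to check it at compile time against the values from the original switch. The following is a minimal sketch; colorFor is a hypothetical helper name that is not part of the question's code.

```cpp
#include <cstdint>

// Arithmetic replacement for the switch: 123, 234 and 345 are 111 apart.
constexpr std::uint16_t colorFor(std::uint8_t iteration)
{
    return static_cast<std::uint16_t>(123 + 111 * iteration);
}

// Compile-time checks that the formula reproduces every switch case.
static_assert(colorFor(0) == 123, "case 0 must match the switch");
static_assert(colorFor(1) == 234, "case 1 must match the switch");
static_assert(colorFor(2) == 345, "case 2 must match the switch");
```

The checks cost nothing at run time, and the build fails if someone later changes one of the colors so that the values are no longer evenly spaced.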
I'm going to extend Lightness Races in Orbit's answer.
The code I'm using currently is:
#ifdef __GNUC__
__builtin_unreachable();
#else
__assume(false);
#endif
__builtin_unreachable() works in GCC and Clang but not MSVC. I used __GNUC__ to check whether it is one of the first two (or another compatible compiler) and used __assume(false) for MSVC instead.
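One way to keep this out of the switch itself is to wrap the check in a macro. This is a sketch under a few assumptions: UNREACHABLE and lookupColor are hypothetical names, the std::abort() fallback for unknown compilers is my own choice, and the assert is added so that debug builds still catch a bad argument. C++23 also provides std::unreachable() in <utility>, which could replace the macro where it is available.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

#if defined(__GNUC__)            // GCC, Clang and compatible compilers
#  define UNREACHABLE() __builtin_unreachable()
#elif defined(_MSC_VER)          // MSVC
#  define UNREACHABLE() __assume(false)
#else                            // unknown compiler: fail loudly instead
#  define UNREACHABLE() std::abort()
#endif

std::uint16_t lookupColor(std::uint8_t iteration)
{
    // Debug builds verify the precondition; with NDEBUG this disappears.
    assert(iteration < 3 && "caller must pass 0, 1 or 2");
    switch (iteration)
    {
        case 0: return 123;
        case 1: return 234;
        case 2: return 345;
        default: UNREACHABLE(); // tells the optimizer no other value occurs
    }
}
```

Calling lookupColor with any value other than 0, 1 or 2 is undefined behavior in release builds, which is exactly the promise the hint makes to the compiler.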

openmpi/c++: defining a mpi data type for class with members of variable length (pointers pointing to malloced memory)?

I am currently learning to use Open MPI. My aim is to parallelize a simple program whose code I will post below.
The program is for testing my concept for parallelizing a much bigger program; I hope to learn everything I need for my actual problem if I succeed with this.
Basically it is a definition of a simple C++ class for lists. A list consists of two arrays, one of integers and one of doubles. Entries with the same indices belong together: the integer entry is some kind of list entry identifier (maybe an object ID) and the double entry is some kind of quantifier (maybe the weight of an object).
The basic purpose of the program is to add lists together (this is the task I want to parallelize). Adding works as follows: for each entry in one list, it is checked whether the same integer entry exists in the other list. If so, the double entry gets added to the double entry in the other list; if there is no such entry in the other list, both the integer and the double entries get appended to the end of that list.
Basically each summand in this list addition represents a storage and each entry is a type of object with a given amount (the int is the type and the double is the amount), so adding two lists means moving the stuff from the second storage to the first.
The order of the list entries is irrelevant, which means that the addition of lists is not only associative but commutative too!
My plan is to add a very large number of such lists (a few billion), so one way to parallelize would be to let each thread add a subset of the lists first and, when this is finished, distribute all the resulting sublists (one per thread) to all of the threads.
My current understanding of Open MPI is that only the last step (distributing the finished sublists) needs any special, non-standard stuff. Basically I need an AllReduce, but with a custom data type and a custom operation.
The first problem I have is understanding how to create a fitting MPI data type. I came to the conclusion that I probably need MPI_Type_create_struct to create a struct type.
I found this site with a nice example: http://mpi.deino.net/mpi_functions/MPI_Type_create_struct.html
from which I learned a lot, but the problem is that in this case there are fixed-size member arrays. In my case I have lists with arbitrarily sized member variables, or rather with pointers pointing to memory blocks of arbitrary size. So doing it as in the example would lead to creating a new MPI datatype for each list size (using fixed-size lists could help, but only in this minimalistic case; I want to learn how to do it with arbitrarily sized lists as preparation for my actual problem).
So my question is: how do I create a data type for this special case? What is the best way?
I even thought about writing some non-MPI code to serialize my class/object to a single block of bits (which would be a lot of work for my real problem, but in this example it should be easy). Then I could simply use an MPI function to distribute those blocks to all threads and translate them back to the actual objects. Each thread could then simply add the "number-of-threads" lists together, so that all threads end up with the same fully reduced list (because the operation is commutative, it does not matter whether the order is the same on each thread in the end).
The problem is that I do not know which MPI function to use to distribute such memory blocks to each thread, so that in the end each thread has an array of "number-of-threads" such blocks (similar to AllReduce, but with blocks).
But that's just another idea; I would like to hear from you what the best way is.
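The serialization idea could be sketched as below, without any MPI calls. FlatList, pack and unpack are hypothetical names, and FlatList is only a stand-in for the list class, using std::vector instead of malloc. The byte-wise copy assumes that all processes run on the same architecture. Blocks of different sizes like these could then presumably be exchanged with MPI_Allgatherv using MPI_BYTE, after the per-process sizes have been shared with MPI_Allgather.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the list class: parallel arrays of ids and amounts.
struct FlatList {
    std::vector<int> ids;
    std::vector<double> amounts;
};

// Buffer layout: [int n][n ints][n doubles], copied byte-wise.
std::vector<char> pack(const FlatList& l)
{
    assert(l.ids.size() == l.amounts.size());
    const int n = static_cast<int>(l.ids.size());
    std::vector<char> buf(sizeof(int) + n * (sizeof(int) + sizeof(double)));
    char* p = buf.data();
    std::memcpy(p, &n, sizeof n);
    p += sizeof n;
    if (n > 0) {
        std::memcpy(p, l.ids.data(), n * sizeof(int));
        p += n * sizeof(int);
        std::memcpy(p, l.amounts.data(), n * sizeof(double));
    }
    return buf;
}

// Inverse of pack: rebuilds the list from a buffer in the layout above.
FlatList unpack(const char* p)
{
    int n = 0;
    std::memcpy(&n, p, sizeof n);
    p += sizeof n;
    FlatList l;
    l.ids.resize(n);
    l.amounts.resize(n);
    if (n > 0) {
        std::memcpy(l.ids.data(), p, n * sizeof(int));
        p += n * sizeof(int);
        std::memcpy(l.amounts.data(), p, n * sizeof(double));
    }
    return l;
}
```

Because the element count is stored at the front of each block, a receiver can unpack every block in a concatenated buffer as long as it knows where each block starts.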
Thank you. Here is my fully working example program (ignore the MPI parts, that's just preparation; you can simply compile it with g++).
As you can see, I needed to create custom copy constructors because of the pointer members. I hope that's not a problem for MPI?
#include <iostream>
#include <cstdlib>
#if (CFG_MPI > 0)
#include <mpi.h>
#else
#define MPI_Barrier(xxx) // dummy code if not parallel
#endif
class list {
private:
int *ilist;
double *dlist;
int n;
public:
list(int n, int *il, double *dl) {
int i;
if (n>0) {
this->ilist = (int*)malloc(n*sizeof(int));
this->dlist = (double*)malloc(n*sizeof(double));
if (!ilist || !dlist) std::cout << "ERROR: malloc in constructor failed!" << std::endl;
} else {
this->ilist = NULL;
this->dlist = NULL;
}
for (i=0; i<n; i++) {
this->ilist[i] = il[i];
this->dlist[i] = dl[i];
}
this->n = n;
}
~list() {
free(ilist);
free(dlist);
ilist = NULL;
dlist = NULL;
this->n=0;
}
list(const list& cp) {
int i;
this->n = cp.n;
this->ilist = NULL;
this->dlist = NULL;
if (this->n > 0) {
this->ilist = (int*)malloc(this->n*sizeof(int));
this->dlist = (double*)malloc(this->n*sizeof(double));
if (!ilist || !dlist) std::cout << "ERROR: malloc in copy constructor failed!" << std::endl;
}
for (i=0; i<this->n; i++) {
this->ilist[i] = cp.ilist[i];
this->dlist[i] = cp.dlist[i];
}
}
list& operator=(const list& cp) {
if(this == &cp) return *this;
this->~list();
int i;
this->n = cp.n;
if (this->n > 0) {
this->ilist = (int*)malloc(this->n*sizeof(int));
this->dlist = (double*)malloc(this->n*sizeof(double));
if (!ilist || !dlist) std::cout << "ERROR: malloc in copy constructor failed!" << std::endl;
} else {
this->ilist = NULL;
this->dlist = NULL;
}
for (i=0; i<this->n; i++) {
this->ilist[i] = cp.ilist[i];
this->dlist[i] = cp.dlist[i];
}
return *this;
}
void print() {
int i;
for (i=0; i<this->n; i++)
std::cout << i << " : " << "[" << this->ilist[i] << " - " << (double)dlist[i] << "]" << std::endl;
}
list& operator+=(const list& cp) {
int i,j;
if(this == &cp) {
for (i=0; i<this->n; i++)
this->dlist[i] *= 2;
return *this;
}
double *dl;
int *il;
il = (int *) realloc(this->ilist, (this->n+cp.n)*sizeof(int));
dl = (double *) realloc(this->dlist, (this->n+cp.n)*sizeof(double));
if (!il || !dl)
std::cout << "ERROR: 1st realloc in operator += failed!" << std::endl;
else {
this->ilist = il;
this->dlist = dl;
il = NULL;
dl = NULL;
}
for (i=0; i<cp.n; i++) {
for (j=0; j<this->n; j++) {
if (this->ilist[j] == cp.ilist[i]) {
this->dlist[j] += cp.dlist[i];
break;
}
} if (j == this->n) {// no matching entry found in this
this->ilist[this->n] = cp.ilist[i];
this->dlist[this->n] = cp.dlist[i];
this->n++;
}
}
il = (int *) realloc(this->ilist, (this->n)*sizeof(int));
dl = (double *) realloc(this->dlist, (this->n)*sizeof(double));
if (!il || !dl)
std::cout << "ERROR: 2nd realloc in operator += failed!" << std::endl;
else {
this->ilist = il;
this->dlist = dl;
}
return *this;
}
};
int main(int argc, char **argv) {
int npe, myid;
#if (CFG_MPI > 0)
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD,&npe);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
#else
npe=1;
myid=0;
#endif
if (!myid) // reduce output
std::cout << "NPE = " << npe << " MYID = " << myid << std::endl;
int ilist[5] = {14,17,4,29,0};
double dlist[5] = {0.0, 170.0, 0.0, 0.0, 24.523};
int ilist2[6] = {14,117,14,129,0, 34};
double dlist2[6] = {0.5, 170.5, 0.5, 0.5, 24.0, 1.2};
list tlist(5, ilist, dlist);
list tlist2(6, ilist2, dlist2);
if (!myid) {
tlist.print();
tlist2.print();
}
tlist +=tlist2;
if (myid) tlist.print();
#if (CFG_MPI > 0)
MPI_Finalize();
#endif
return 0;
}

How to get the arguments of a function pointer from a CallExpr in Clang?

I am trying to analyse C++ source code with function calls within them. I am able to analyse normal function calls to get their arguments without problem using the source code below where ce is a CallExpr object:
1. if(ce != NULL) {
2. QualType q = ce->getType();
3. const Type *t = q.getTypePtrOrNull();
4.
5. if (t != NULL) {
6. llvm::errs() << "TYPE: " << t->isFunctionPointerType() << " " << q.getAsString() << " " << t->isPointerType() << "\n";
7. } else {
8. llvm::errs() << "FUNCTION CE HAS NO TYPE?\n";
9. }
10.
11.
12. const Decl* D = ce ->getCalleeDecl();
13. while(D->getPreviousDecl() != NULL)
14. D = D->getPreviousDecl();
15.
16. llvm::errs() << "Kind: " << D->getDeclKindName() << "\n";
17.
18. FunctionDecl* fd = (FunctionDecl*) llvm::dyn_cast<FunctionDecl>(D);
19. for(int x = 0; x< fd ->getNumParams(); x++) {
20. if(fd ->getParamDecl(x)->getType()->isAnyPointerType()) {
21. // Do Stuff Here
22. }
23. }
24. }
The problem with the above source code comes on line 18, when I try to typecast the Decl from the CallExpr to a FunctionDecl, this results in fd becoming NULL if the CallExpr is from a function pointer call.
I tried to debug by printing the kind on line 16. For function pointers, it reports that the Decl on line 12 is a VarDecl, not a FunctionDecl as for normal function calls.
I also tried using the isFunctionPointerType(), but this is returning false.
Here is a piece of source code that results in a segfault:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
void* (*mp)(size_t size);
void *mpp;
mp = &malloc;
mpp = mp(30);
free(mpp);
return (0);
}
Is there a way, using Clang, to detect whether a CallExpr is a function pointer call? And if so, how can I get a list of the arguments?
I am using clang 3.1
Thanks
Use getDirectCallee() function (I am not sure if it is available in clang 3.1 or not)
FunctionDecl *func = ce->getDirectCallee();
if (func != NULL){
for(int i = 0; i < func->getNumParams(); i++){
if(func->getParamDecl(i)->getType()->isFunctionPointerType()){
// Do stuff here
}
}
}
You should get function prototype from pointer declaration, after that you will be able to get information about return type and parameters types:
clang::CallExpr* expr;
...
auto decl = expr->getCalleeDecl();
if (decl != nullptr) {
if (decl->getKind() == clang::Decl::Var) {
clang::VarDecl *varDecl = clang::dyn_cast<clang::VarDecl>(decl);
if(varDecl->getType()->isFunctionPointerType() == true) {
const clang::PointerType *pt = varDecl->getType()->getAs<clang::PointerType>();
const clang::FunctionProtoType *ft = pt->getPointeeType()->getAs<clang::FunctionProtoType>();
if (ft != nullptr) {
std::string retTypeName = ft->getReturnType().getAsString();
...
auto paramsCount = ft->getNumParams();
for (size_t i = 0; i < paramsCount; ++i) {
clang::QualType paramType = ft->getParamType(i);
std::string paramTypeName = paramType.getAsString();
...
}
}
}
}
}
Maybe you can also use getArg(position) to get a particular argument; before that, call getNumArgs to find out how many arguments the call has.

Modern equivalent of LLVM AnnotationManager?

Now that LLVM's AnnotationManager is gone (it disappeared in the 2.6 release, I think?), how can I get the annotations for specific functions, globals, and instructions?
(For example, I have bitcode compiled from C void myFunction(__attribute__((annotate("foo"))) int var) --- given an Argument * reference to this int var argument, how might I determine which annotate attributes are attached to it?)
To get annotations for a specific function, traverse the entry BasicBlock of the function to find its calls to the @llvm.var.annotation intrinsic, as follows:
Module *module;
[...]
std::string getGlobalVariableString(std::string name)
{
// assumption: the zeroth operand of a Value::GlobalVariableVal is the actual Value
Value *v = module->getNamedValue(name)->getOperand(0);
if(v->getValueID() == Value::ConstantArrayVal)
{
ConstantArray *ca = (ConstantArray *)v;
return ca->getAsString();
}
return "";
}
void dumpFunctionArgAnnotations(std::string funcName)
{
std::map<Value *,Argument*> mapValueToArgument;
Function *func = module->getFunction(funcName);
if(!func)
{
std::cout << "no function by that name.\n";
return;
}
std::cout << funcName << "() ====================\n";
// assumption: @llvm.var.annotation calls are always in the function's entry block.
BasicBlock *b = &func->getEntryBlock();
// run through entry block first to build map of pointers to arguments
for(BasicBlock::iterator it = b->begin();it!=b->end();++it)
{
Instruction *inst = it;
if(inst->getOpcode()!=Instruction::Store)
continue;
// `store` operands: http://llvm.org/docs/LangRef.html#i_store
mapValueToArgument[inst->getOperand(1)] = (Argument *)inst->getOperand(0);
}
// run through entry block a second time, to associate annotations with arguments
for(BasicBlock::iterator it = b->begin();it!=b->end();++it)
{
Instruction *inst = it;
if(inst->getOpcode()!=Instruction::Call)
continue;
// assumption: Instruction::Call's operands are the function arguments, followed by the function name
Value *calledFunction = inst->getOperand(inst->getNumOperands()-1);
if(calledFunction->getName().str() != "llvm.var.annotation")
continue;
// `llvm.var.annotation` operands: http://llvm.org/docs/LangRef.html#int_var_annotation
Value *annotatedValue = inst->getOperand(0);
if(annotatedValue->getValueID() != Value::InstructionVal + Instruction::BitCast)
continue;
Argument *a = mapValueToArgument[annotatedValue->getUnderlyingObject()];
if(!a)
continue;
Value *annotation = inst->getOperand(1);
if(annotation->getValueID() != Value::ConstantExprVal)
continue;
ConstantExpr *ce = (ConstantExpr *)annotation;
if(ce->getOpcode() != Instruction::GetElementPtr)
continue;
// `ConstantExpr` operands: http://llvm.org/docs/LangRef.html#constantexprs
Value *gv = ce->getOperand(0);
if(gv->getValueID() != Value::GlobalVariableVal)
continue;
std::cout << " argument " << a->getType()->getDescription() << " " << a->getName().str()
<< " has annotation \"" << getGlobalVariableString(gv->getName().str()) << "\"\n";
}
}
AnnotationManager was deleted because it was useless (and it would not solve your problem). All annotations are handled via the global named 'llvm.global.annotations' and the annotation intrinsics, which you can parse to obtain the information you need.
Look at the IR to get an idea of how your C code was transformed and what the annotate attribute was turned into.

Segmentation Fault when trying to push a string to the back of a list

I am trying to write a logger class for my C++ calculator, but I'm experiencing a problem while trying to push a string into a list.
I have tried researching this issue and have found some information on it, but nothing that seems to help with my problem. I am using a rather basic C++ compiler with few debugging utilities, and I've not used C++ in quite some time (even then it was only a small amount).
My code:
#ifndef _LOGGER_H_
#define _LOGGER_H_
#include <iostream>
#include <list>
#include <string>
using std::cout;
using std::cin;
using std::endl;
using std::list;
using std::string;
class Logger
{
private:
list<string> mEntries;
public:
Logger() {}
~Logger() {}
// Public Methods
void WriteEntry(const string& entry)
{
mEntries.push_back(entry);
}
void DisplayEntries()
{
cout << endl << "**********************" << endl
<< "* Logger Entries *" << endl
<< "**********************" << endl
<< endl;
for(list<string>::iterator it = mEntries.begin();
it != mEntries.end(); it++)
{
// *** BELOW LINE IS MARKED WITH THE ERROR ***
cout << *it << endl;
}
}
};
#endif
I am calling the WriteEntry method by simply passing in a string, like so:
mLogger->WriteEntry("Testing");
Any advice on this would be greatly appreciated.
* CODE ABOVE HAS BEEN ALTERED TO HOW IT IS NOW *
Now, the line:
cout << *it << endl;
causes the same error. I'm assuming this has something to do with how I am trying to get the string value from the iterator.
The code I am using to call it is in my main.cpp file:
#include <iostream>
#include <string>
#include <sstream>
#include "CommandParser.h"
#include "CommandManager.h"
#include "Exceptions.h"
#include "Logger.h"
using std::string;
using std::stringstream;
using std::cout;
using std::cin;
using std::endl;
#define MSG_QUIT 2384321
#define SHOW_LOGGER true
void RegisterCommands(void);
void UnregisterCommands(void);
int ApplicationLoop(void);
void CheckForLoggingOutput(void);
void ShowDebugLog(void);
// Operations
double Operation_Add(double* params);
double Operation_Subtract(double* params);
double Operation_Multiply(double* params);
double Operation_Divide(double* params);
// Variable
CommandManager *mCommandManager;
CommandParser *mCommandParser;
Logger *mLogger;
int main(int argc, const char **argv)
{
mLogger->WriteEntry("Registering commands...\0");
// Make sure we register all commands first
RegisterCommands();
mLogger->WriteEntry("Command registration complete.\0");
// Check the input to see if we're using the program standalone,
// or not
if(argc == 0)
{
mLogger->WriteEntry("Starting application message pump...\0");
// Full version
int result;
do
{
result = ApplicationLoop();
} while(result != MSG_QUIT);
}
else
{
mLogger->WriteEntry("Starting standalone application...\0");
// Standalone - single use
// Join the args into a string
stringstream joinedStrings(argv[0]);
for(int i = 1; i < argc; i++)
{
joinedStrings << argv[i];
}
mLogger->WriteEntry("Parsing argument '" + joinedStrings.str() + "'...\0");
// Parse the string
mCommandParser->Parse(joinedStrings.str());
// Get the command names from the parser
list<string> commandNames = mCommandParser->GetCommandNames();
// Check that all of the commands have been registered
for(list<string>::iterator it = commandNames.begin();
it != commandNames.end(); it++)
{
mLogger->WriteEntry("Checking command '" + *it + "' is registered...\0");
if(!mCommandManager->IsCommandRegistered(*it))
{
// TODO: Throw exception
mLogger->WriteEntry("Command '" + *it + "' has not been registered.\0");
}
}
// Get each command from the parser and use its values
// to invoke the relevant command from the manager
double results[commandNames.size()];
int currentResultIndex = 0;
for(list<string>::iterator name_iterator = commandNames.begin();
name_iterator != commandNames.end(); name_iterator++)
{
string paramString = mCommandParser->GetCommandValue(*name_iterator);
list<string> paramStringArray = StringHelper::Split(paramString, ' ');
double params[paramStringArray.size()];
int index = 0;
for(list<string>::iterator param_iterator = paramStringArray.begin();
param_iterator != paramStringArray.end(); param_iterator++)
{
// Parse the current string to a double value
params[index++] = atof(param_iterator->c_str());
}
mLogger->WriteEntry("Invoking command '" + *name_iterator + "'...\0");
results[currentResultIndex++] =
mCommandManager->InvokeCommand(*name_iterator, params);
}
// Output all results
for(int i = 0; i < commandNames.size(); i++)
{
cout << "Result[" << i << "]: " << results[i] << endl;
}
}
mLogger->WriteEntry("Unregistering commands...\0");
// Make sure we clear up our resources
UnregisterCommands();
mLogger->WriteEntry("Command unregistration complete.\0");
if(SHOW_LOGGER)
{
CheckForLoggingOutput();
}
system("PAUSE");
return 0;
}
void RegisterCommands()
{
mCommandManager = new CommandManager();
mCommandParser = new CommandParser();
mLogger = new Logger();
// Known commands
mCommandManager->RegisterCommand("add", &Operation_Add);
mCommandManager->RegisterCommand("sub", &Operation_Subtract);
mCommandManager->RegisterCommand("mul", &Operation_Multiply);
mCommandManager->RegisterCommand("div", &Operation_Divide);
}
void UnregisterCommands()
{
// Unregister each command
mCommandManager->UnregisterCommand("add");
mCommandManager->UnregisterCommand("sub");
mCommandManager->UnregisterCommand("mul");
mCommandManager->UnregisterCommand("div");
// Delete the logger pointer
delete mLogger;
// Delete the command manager pointer
delete mCommandManager;
// Delete the command parser pointer
delete mCommandParser;
}
int ApplicationLoop()
{
return MSG_QUIT;
}
void CheckForLoggingOutput()
{
char answer = 'n';
cout << endl << "Do you wish to view the debug log? [y/n]: ";
cin >> answer;
switch(answer)
{
case 'y':
ShowDebugLog();
break;
}
}
void ShowDebugLog()
{
mLogger->DisplayEntries();
}
// Operation Definitions
double Operation_Add(double* values)
{
double accumulator = 0.0;
// Iterate over all values and accumulate them
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator += values[i];
}
// Return the result of the calculation
return accumulator;
}
double Operation_Subtract(double* values)
{
double accumulator = 0.0;
// Iterate over all values and negatively accumulate them
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator -= values[i];
}
// Return the result of the calculation
return accumulator;
}
double Operation_Multiply(double* values)
{
double accumulator = 0.0;
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator *= values[i];
}
// Return the value of the calculation
return accumulator;
}
double Operation_Divide(double* values)
{
double accumulator = 0.0;
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator /= values[i];
}
// Return the result of the calculation
return accumulator;
}
Did you remember to call mLogger = new Logger at some point? Did you accidentally delete mLogger before writing to it?
Try running your program in valgrind to see whether it finds any memory errors.
After your edit, the solution seems clear:
The first line in your main() is:
mLogger->WriteEntry("Registering commands...\0");
Here mLogger is a pointer that has never been initialized. Dereferencing it is "undefined behaviour", meaning anything can happen, and often bad things do.
To fix this you can either make it a "normal" variable rather than a pointer, or create a Logger instance using new (either at the declaration or as the first line in main).
I suggest not using a pointer, so that the logger is always there and is destroyed automatically.
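To illustrate the first option, here is a minimal sketch of the logger as a plain object instead of a pointer. The Logger below is a cut-down stand-in for the class in the question, and gLogger, EntryCount and LogStartup are names made up for this example, not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Cut-down stand-in for the Logger class from the question (assumption).
class Logger
{
private:
    std::list<std::string> mEntries;
public:
    void WriteEntry(const std::string& entry) { mEntries.push_back(entry); }
    std::size_t EntryCount() const { return mEntries.size(); }
};

// A plain object with static storage duration: it is constructed before
// main() starts and destroyed after main() returns, so there is no new,
// no delete and no window in which it is uninitialized.
Logger gLogger;

// Mirrors the first statement of main() in the question. It is safe here
// because gLogger already exists when this runs.
std::size_t LogStartup()
{
    gLogger.WriteEntry("Registering commands...");
    assert(gLogger.EntryCount() > 0);
    return gLogger.EntryCount();
}
```

With this layout the calls become gLogger.WriteEntry(...) instead of mLogger->WriteEntry(...), and the delete mLogger line in UnregisterCommands() is no longer needed.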
By the way, it seems you want to create every object on the heap through a pointer. That is not recommended unless it is necessary. Use a pointer ONLY when you want to control the creation (with new) and destruction (with delete) of the object explicitly. If you just need the object within a specific scope, don't use a pointer. You might come from another language such as Java or C#, where all objects are accessed through references. If so, you should approach C++ as a different language to avoid this kind of problem, and learn about RAII and the other C++-specific paradigms that those languages don't teach. If you come from C, you should likewise treat C++ as a different language. That might help you avoid complex problems like the one you showed here. May I suggest you read some of the C++ pointer, reference and RAII related questions on Stack Overflow.
First, you don't need to create the std::list on the heap. You should just use it as a normal member of the class.
class Logger
{
private:
list<string> mEntries; // no need to use a pointer
public:
Logger() // initialization is automatic, no need to do anything
{
}
~Logger() // clearing and destruction is automatic too, no need to do anything
{
}
//...
};
Next, entryData doesn't exist in this code, so I guess you meant to use entry. If it's not a typo, then you haven't shown the definition of entryData, and that is most likely the source of your problem.
In fact I would have written your class that way instead:
class Logger
{
private:
list<string> mEntries;
public:
// no need for constructor and destructor, use the default ones
// Public Methods
void WriteEntry(const string& entry) // take the argument by const reference to avoid an unnecessary copy
{
mEntries.push_back( entry ); // here the list will create a node with a string inside, so this is exactly like calling the copy constructor
}
void DisplayEntries()
{
cout << endl << "**********************" << endl
<< "* Logger Entries *" << endl
<< "**********************" << endl
<< endl;
for(list<string>::iterator it = mEntries.begin();
it != mEntries.end(); ++it) // if you want to avoid unnecessary copies, use ++it instead of it++
{
cout << *it << endl;
}
}
};
What's certain is that your segfault is from usage outside of this class.
Is an instance of Logger being copied anywhere (either through a copy constructor or operator=)? Since you have mEntries as a pointer to a list, if you copy an instance of Logger, they will share the value of the pointer, and when one is destructed, it deletes the list. The original then has a dangling pointer. A quick check is to make the copy constructor and operator= private and not implemented:
private:
void operator=(const Logger &); // not implemented
Logger(const Logger &); // not implemented
When you recompile, the compiler will flag any copies of any Logger instances.
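If your compiler supports C++11 (an assumption, the question doesn't say), the same check can be written with deleted functions, which report the error at compile time rather than at link time. This is only a sketch: the Logger here is a cut-down version of the pointer-based class from the question, and EntryCount and LogThrough are names made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <type_traits>

// Cut-down version of the pointer-based Logger from the question (assumption).
class Logger
{
private:
    std::list<std::string>* mEntries;
public:
    Logger() : mEntries(new std::list<std::string>()) {}
    ~Logger() { delete mEntries; }

    // C++11: any attempt to copy a Logger is now a compile error.
    Logger(const Logger&) = delete;
    Logger& operator=(const Logger&) = delete;

    void WriteEntry(const std::string& entry) { mEntries->push_back(entry); }
    std::size_t EntryCount() const { return mEntries->size(); }
};

// The type traits confirm that both copy operations are gone.
static_assert(!std::is_copy_constructible<Logger>::value, "Logger must not be copy constructible");
static_assert(!std::is_copy_assignable<Logger>::value, "Logger must not be copy assignable");

// Passing by reference still compiles; passing a Logger by value would not.
void LogThrough(Logger& logger, const std::string& entry)
{
    logger.WriteEntry(entry);
}
```

Any function that takes or returns a Logger by value, and any a = b between two loggers, will then show up as an error pointing at the exact line.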
If you need to copy instances of Logger, the fix is to follow the Rule of 3:
http://en.wikipedia.org/wiki/Rule_of_three_%28C%2B%2B_programming%29
You can do this by eliminating the need for the destructor (by not using a pointer: list<string> mEntries), or by adding the needed code to the copy constructor and operator= to make a deep copy of the list.
You only need to do
list<string> entries;
entries.push_back(entry);
You do not need to create a pointer to entries.
Nothing too obvious, though you typed
mEntries->push_back(string(entryData));
and I think you meant entry instead of entryData. You also don't need the string conversion on that line, and your function should take entry by const reference.
However, none of these things would cause your program to segfault. What compiler are you using?
You're missing the copy constructor. If the Logger object is copied and the original deleted, you'll be dereferencing memory that was previously deleted.
A simplified example of the problem
Logger a;
{
Logger b;
a=b;
}
a.WriteEntry("Testing");
Add a copy constructor. Note that a=b in the example above goes through operator=, so that needs the same deep copy.
Logger(const Logger& item)
{
mEntries = new list<string>();
std::copy(item.mEntries->begin(), item.mEntries->end(), std::back_inserter(*mEntries));
}
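For completeness, here is a sketch of the pointer-based class with all three special members in place, so that both copy construction and a=b make a deep copy. The assignment operator uses the copy-and-swap idiom, which is my choice for the example rather than something taken from the question, and EntryCount is a helper added only so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Cut-down version of the pointer-based Logger from the question (assumption).
class Logger
{
private:
    std::list<std::string>* mEntries;
public:
    Logger() : mEntries(new std::list<std::string>()) {}

    // Copy constructor: allocate a new list holding a copy of the entries.
    Logger(const Logger& other)
        : mEntries(new std::list<std::string>(*other.mEntries)) {}

    // Copy assignment via copy-and-swap: 'other' is already a deep copy
    // (made by the copy constructor), so swapping the pointers hands this
    // object the new list and lets 'other' delete the old one on exit.
    Logger& operator=(Logger other)
    {
        std::swap(mEntries, other.mEntries);
        return *this;
    }

    ~Logger() { delete mEntries; }

    void WriteEntry(const std::string& entry) { mEntries->push_back(entry); }
    std::size_t EntryCount() const { return mEntries->size(); }
};
```

With this version the simplified example above is safe: after the inner scope ends, a still owns its own list, so a.WriteEntry("Testing") writes to valid memory.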