LLVM storing Loop* in std::vector - c++

I've stumbled into something very peculiar - I'm writing an LLVM module Pass. I iterate over all functions of the module and then all loops of every non-declaration function and I store pointers to loops in a std::vector. Here's the source:
virtual bool runOnModule(Module& Mod){
std::vector<Loop*> loops;
// first gather all loop info
for(Module::iterator f = Mod.begin(), fend = Mod.end(); f != fend; ++f){
if (!(*f).isDeclaration()){
LoopInfo& LI = getAnalysis<LoopInfo>(*f);
for(LoopInfo::iterator l = LI.begin(), lend = LI.end(); l != lend; ++l){
loops.push_back(*l);
}
}
}
for (auto& l: loops) errs () << *l << " ";
return false; // the pass only inspects the module
}
Now if I run this I get a runtime error: it can't print the loops, so somehow I'm doing a null pointer dereference or something similar. Any ideas?

You have to make sure that the LoopInfo pass actually runs before your pass. Here is a complete example, standalone from opt:
class AnalyzeLoops : public FunctionPass {
public:
AnalyzeLoops()
: FunctionPass(ID) {}
void getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<LoopInfo>();
}
virtual bool runOnFunction(Function &F) {
LoopInfo &LI = getAnalysis<LoopInfo>();
for (LoopInfo::iterator L = LI.begin(), LE = LI.end(); L != LE; ++L) {
(*L)->dump();
}
return false;
}
static char ID;
};
In addition, when creating the passes, do:
PassManager PM;
PM.add(new LoopInfo());
PM.add(new AnalyzeLoops());
PM.run(*Mod);
I suspect that to make opt actually run LoopInfo before your pass, you should pass -loops too.
Also, note that I define getAnalysisUsage - this will make LLVM complain if LoopInfo didn't run before this pass, making the problem more obvious.
Note that LoopInfo is specifically a FunctionPass, and as an analysis it has to be used from another FunctionPass. The LoopInfo data structure doesn't really survive between different functions, and since it owns its data (those Loop* objects) they will be destroyed as well.
One thing you could do if you really need a ModulePass is just invoke LoopInfo manually and not as an analysis. When you iterate the functions in the module, for each function create a new LoopInfo object and use its runOnFunction method. Though even in this case, you have to make sure the LoopInfo that owns a given Loop* survives if you want to use the latter.
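The ownership point above can be illustrated without LLVM. The following is only a sketch with hypothetical stand-in types (ToyLoop, ToyLoopInfo, Collected); these are not the real LLVM classes. Each analysis object owns its loops, so raw pointers taken from it stay valid only as long as the owner is stored somewhere that outlives them.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
struct ToyLoop {
    std::string name;
};

struct ToyLoopInfo {
    // The analysis owns its loops; destroying it destroys them too.
    std::vector<std::unique_ptr<ToyLoop>> loops;

    ToyLoop *addLoop(const std::string &name) {
        loops.push_back(std::unique_ptr<ToyLoop>(new ToyLoop{name}));
        return loops.back().get();
    }
};

struct Collected {
    // Owners are stored next to the raw pointers so the pointers stay valid.
    std::vector<std::unique_ptr<ToyLoopInfo>> owners;
    std::vector<ToyLoop *> loops;
};

// Builds one ToyLoopInfo per "function" and keeps it alive in the result.
Collected collectLoops(const std::vector<std::string> &functionNames) {
    Collected result;
    for (const std::string &fn : functionNames) {
        std::unique_ptr<ToyLoopInfo> info(new ToyLoopInfo());
        result.loops.push_back(info->addLoop(fn + ".loop0"));
        result.owners.push_back(std::move(info)); // keep the owner alive
    }
    return result;
}
```

If the owners vector were dropped, every pointer in loops would dangle, which matches the crash described in the question.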

First of all, LoopInfo should run just once, before the for loop.
Secondly, LoopInfo::iterator only visits the top-level loops of the Function. To visit all loops you also need to iterate over the subloops of every loop. This can be implemented either as a recursive function or with a worklist. A recursive version looks like this:
virtual bool runOnFunction(Function &F) {
    LoopInfo *loopinfo;
    loopinfo = &getAnalysis<LoopInfo>();
    std::vector<Loop*> allLoops;
    for (LoopInfo::iterator Li = loopinfo->begin(), Le = loopinfo->end();
         Li != Le; Li++) {
        Loop *L = *Li;
        allLoops.push_back(L);
        dfsOnLoops(L, loopinfo, allLoops);
    }
    return false; // analysis only, the IR is not modified
}
// LoopS is taken by reference so the collected subloops reach the caller
void dfsOnLoops(Loop *L, LoopInfo *loopinfo, std::vector<Loop*> &LoopS) {
    std::vector<Loop *> subloops = L->getSubLoops();
    if (subloops.size()) {
        // recurse on subloops
        for (std::vector<Loop *>::iterator Li = subloops.begin(); Li != subloops.end(); Li++) {
            LoopS.push_back(*Li);
            dfsOnLoops(*Li, loopinfo, LoopS);
        }
    }
}
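The worklist alternative mentioned above could look roughly like the sketch below. It uses a hypothetical NestedLoop struct as a stand-in for a loop nest, not the LLVM Loop class, so the same idea would have to be adapted to Loop::getSubLoops() in a real pass.

```cpp
#include <vector>

// Hypothetical stand-in for a loop nest; not the LLVM Loop class.
struct NestedLoop {
    int id;
    std::vector<NestedLoop *> subLoops;
};

// Collects every loop reachable from the top-level loops with an explicit worklist.
std::vector<NestedLoop *> collectAllLoops(const std::vector<NestedLoop *> &topLevel) {
    std::vector<NestedLoop *> all;
    std::vector<NestedLoop *> worklist(topLevel.begin(), topLevel.end());
    while (!worklist.empty()) {
        NestedLoop *current = worklist.back();
        worklist.pop_back();
        all.push_back(current);
        // Queue the children so nested loops of any depth are visited.
        for (NestedLoop *sub : current->subLoops)
            worklist.push_back(sub);
    }
    return all;
}
```

The explicit worklist avoids deep call stacks, and the result is returned by value so there is no output parameter to get wrong.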

None of the answers really helped, but I managed to solve the problem myself. Basically, each LLVM pass can define a releaseMemory() method (read more here). The LoopInfo class implements that method, so the analysis information was lost every time we left the scope of the call to getAnalysis. I simply removed the releaseMemory() method in LoopInfo.h and the memory was no longer released. Note that this triggered a big change in the codebase and even opt had to be rebuilt, so doing this in general is probably a bad idea, and it would definitely not be easily accepted as a change to LLVM (I speculate, not sure).

I think the best way to solve this is to explicitly create LoopInfo objects and save them. Here is the code for LLVM 3.5:
using LoopInfoType=llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>;
std::vector<llvm::Loop*> loopVec;
std::vector<LoopInfoType*> loopInfoVec;
for(llvm::Module::iterator F = M.begin(); F!= M.end(); F++){
//skip declarations
if(F->isDeclaration()){
continue;
}
//TODO that scope problem
llvm::DominatorTree DT = llvm::DominatorTree();
DT.recalculate(*F);
LoopInfoType *loopInfo = new LoopInfoType();
loopInfo->releaseMemory();
loopInfo->Analyze(DT);
loopInfoVec.push_back(loopInfo);
for(llvm::LoopInfo::iterator lit = loopInfo->begin(); lit != loopInfo->end(); lit++){
Loop * L = * lit;
loopVec.push_back(L);
//L->dump();
}
}//for all functions
cin.get();
for(auto loop : loopVec){
std::cout << "loop\n";
loop->dump();
for(llvm::Loop::block_iterator bit = loop->block_begin(); bit != loop->block_end(); bit++){
llvm::BasicBlock * B = * bit;
B->dump();
std::cout << "\n\n";
}
}

Related

How to use F.getValueSymbolTable()

I want to get all the local variables in a function.
void getLocalVariables(Function &F) {
ValueSymbolTable *vst = F.getValueSymbolTable();
for (auto vs : vst) { // here it says: This scope-based "for" statement required the appropriate "begin" function, but was not found
auto s = vs.getKey();
auto v = vs.getValue();
}
}
The error is: This scope-based "for" statement required the appropriate "begin" function, but was not found. So how can I correct my code? Thanks.
I checked the documentation for ValueSymbolTable and finally found out how to use it. But actually, as arnt said, the entries are not local variables from the source code. They are temporary values generated in the IR.
void getLocalVariables(Function &F) {
// not tested yet
ValueSymbolTable *vst = F.getValueSymbolTable();
errs() << (*vst).size() << "\n.";
for (ValueSymbolTable::iterator VI = vst->begin(), VE = vst->end(); VI != VE; ++VI) {
Value *V = VI->getValue();
if (!isa<GlobalValue>(V) || cast<GlobalValue>(V)->hasLocalLinkage()) {
if (!V->getName().startswith("llvm.dbg"))
// Set name to "", removing from symbol table!
V->setName("");
}
}
}

std::list and garbage Collection algorithm

I have a server that puts 2 players together on request and starts a game Game in a new thread.
struct GInfo {Game* game; std::thread* g_thread;};
while (true) {
players_pair = matchPlayers();
Game* game = new Game(players_pair);
std::thread* game_T = new std::thread(&Game::start, game);
GInfo ginfo = {game, game_T};
_actives.push_back(ginfo); // std::list
}
I am writing a "Garbage Collector", that runs in another thread, to clean the memory from terminated games.
void garbageCollector() {
while (true) {
for (std::list<GInfo>::iterator it = _actives.begin(); it != _actives.end(); ++it) {
if (! it->game->isActive()) {
delete it->game; it->game = nullptr;
it->g_thread->join();
delete it->g_thread; it->g_thread = nullptr;
_actives.erase(it);
}
}
sleep(2);
}
}
This generates a segfault. I suspect it is because _actives.erase(it) is called inside the iteration loop.
For troubleshooting, I made _actives an std::vector (instead of std::list) and applied the same algorithm but using indexes instead of iterators, it works fine.
Is there a way around this?
Is the algorithm, data structure used fine? Any better way to do the garbage collection?
Help is appreciated!
If you look at the documentation for the erase method, you will see that it returns an iterator to the element after the one that was removed. Erasing invalidates the iterator you passed in, so incrementing it afterwards is undefined behavior.
The way to use the return value is to assign it to your iterator, like so:
for (std::list<GInfo>::iterator it = _actives.begin(); it != _actives.end();) {
if (! it->game->isActive()) {
delete it->game; it->game = nullptr;
it->g_thread->join();
delete it->g_thread; it->g_thread = nullptr;
it = _actives.erase(it);
}
else {
++it;
}
}
Since picking up the return value from erase advances the iterator to the next element, we have to make sure not to increment the iterator when that happens.
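The same pattern can be tried in isolation. The sketch below uses a plain std::list<int> and a hypothetical helper name (eraseEvens) purely for illustration; only std::list::erase and its returned iterator come from the standard library.

```cpp
#include <list>

// Removes every even value while iterating, using the iterator returned by erase().
void eraseEvens(std::list<int> &values) {
    for (auto it = values.begin(); it != values.end();) {
        if (*it % 2 == 0)
            it = values.erase(it); // erase() returns the element after the erased one
        else
            ++it;                  // only advance when nothing was erased
    }
}
```

Calling eraseEvens on a list holding 1, 2, 3, 4, 6, 7 leaves 1, 3, 7, including the case where two erased elements are adjacent.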
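On an unrelated note, names that begin with an underscore are reserved for the implementation in the global namespace, and names that begin with an underscore followed by an uppercase letter (or that contain a double underscore) are reserved everywhere, so a leading underscore is best avoided in your own code.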
Any better way to do the garbage collection?
Yes, don't use new, delete, or dynamic memory altogether:
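struct Players{};
struct Game{
    Game(Players&& players){}
    void start(){}
    bool isActive(){ return false; }
};
struct GInfo {
    // pass &game so the thread runs on this member, not on a copy of it
    GInfo(Players&& players_pair):
        game(std::move(players_pair)), g_thread(&Game::start, &game){}
    // a std::thread must be joined (or detached) before it is destroyed
    ~GInfo(){ if (g_thread.joinable()) g_thread.join(); }
    Game game;
    std::thread g_thread;
};
std::list<GInfo> _actives;
void someLoop()
{
    while (true) {
        _actives.emplace_back(matchPlayers());
    }
}
void garbageCollector() {
    while (true) {
        // std::list has its own remove_if member, which unlinks nodes
        // without moving the remaining elements (std::remove_if would
        // move-assign GInfo objects, and with them running threads)
        _actives.remove_if([](GInfo& i){ return !i.game.isActive(); });
        // note: _actives is shared between two threads and still needs a mutex
        sleep(2);
    }
}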
There might be a few typos, but that's the idea.

Iterating over the BasicBlocks of loops of function in LLVM IR in module pass

Does anybody know how to iterate over the basic blocks of the loops of each function in a module pass? I was trying:
bool runOnModule(Module &M) override
{
for(Module::iterator f = M.begin(), fend = M.end(); f != fend; ++f)
{
LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
for(Loop *L : LI)
{
for(BasicBlock *BB : L->getBlocks())
{
dbgs() << "basicb name: "<< BB->getName() <<"\n";
}
}
}
return true;
}
and it always gives the error
opt: /home/anurag/polly/llvm/include/llvm/PassAnalysisSupport.h:235: AnalysisType& llvm::Pass::getAnalysisID(llvm::AnalysisID) const [with AnalysisType = llvm::LoopInfoWrapperPass; llvm::AnalysisID = const void*]: Assertion `ResultPass && "getAnalysis*() called on an analysis that was not " "'required' by pass!"' failed.
There are two updates needed for this code. The first was also noted in this question, that when requesting the loop info from a module pass, you need to specify the function (adding the iterator access):
LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>(*f).getLoopInfo();
The second issue is that some functions in the module are "empty": declarations without definitions. Adding a size check skips those and avoids any issues with trying to find loops in empty functions.
if ((*f).size() == 0) continue;

Why can't one clone a `Space` in Gecode before solving the original one?

I'm looking for a way to copy Space instances in Gecode and then analyze the difference between the spaces later.
However it goes already wrong after the first copy. When one copies the code in the book Modelling and Programming in Gecode, as shown here below, and simply modifies it such that a copy is made first (SendMoreMoney* smm = m->copy(true);), one gets a Segmentation fault, regardless whether the shared option is true or false.
#include <gecode/int.hh>
#include <gecode/search.hh>
using namespace Gecode;
class SendMoreMoney : public Space {
protected:
IntVarArray l;
public:
SendMoreMoney(void) : l(*this, 8, 0, 9) {
IntVar s(l[0]), e(l[1]), n(l[2]), d(l[3]),
m(l[4]), o(l[5]), r(l[6]), y(l[7]);
// no leading zeros
rel(*this, s, IRT_NQ, 0);
rel(*this, m, IRT_NQ, 0);
// all letters distinct
distinct(*this, l);
// linear equation
IntArgs c(4+4+5); IntVarArgs x(4+4+5);
c[0]=1000; c[1]=100; c[2]=10; c[3]=1;
x[0]=s; x[1]=e; x[2]=n; x[3]=d;
c[4]=1000; c[5]=100; c[6]=10; c[7]=1;
x[4]=m; x[5]=o; x[6]=r; x[7]=e;
c[8]=-10000; c[9]=-1000; c[10]=-100; c[11]=-10; c[12]=-1;
x[8]=m; x[9]=o; x[10]=n; x[11]=e; x[12]=y;
linear(*this, c, x, IRT_EQ, 0);
// post branching
branch(*this, l, INT_VAR_SIZE_MIN(), INT_VAL_MIN());
}
// search support
SendMoreMoney(bool share, SendMoreMoney& s) : Space(share, s) {
l.update(*this, share, s.l);
}
virtual SendMoreMoney* copy(bool share) {
return new SendMoreMoney(share,*this);
}
// print solution
void print(void) const {
std::cout << l << std::endl;
}
};
// main function
int main(int argc, char* argv[]) {
// create model and search engine
SendMoreMoney* m = new SendMoreMoney;
SendMoreMoney* mc = m->copy(true);
DFS<SendMoreMoney> e(m);
delete m;
// search and print all solutions
while (SendMoreMoney* s = e.next()) {
s->print(); delete s;
}
return 0;
}
How can one make a real copy?
You have to call status() on the Space first.
I found this exchange in the Gecode mailing list archives: https://www.gecode.org/users-archive/2006-March/000439.html
It would seem that Gecode uses the copy function and constructor for its own internal purposes, so to make a "copy-by-value" copy of a space, you need to use the clone() function defined in the Space interface. However, as noted in Anonymous's answer, you need to call status() before calling clone(), or it will throw an exception of type SpaceNotStable.
I augmented my space with the function below to automatically call status, make the clone, and return a pointer of my derived type:
struct Example : public Space {
...
Example * cast_clone() {
status();
return static_cast<Example *>(this->clone());
}
...
};
As a workaround, one can create a totally independent space and then use equality constraints on the variable level to reduce the domains of these variables.
Example:
void cloneHalfValues(SendMoreMoney* origin) {
int n = l.size();
for(int i = 0x00; i < n/2; i++) {
if(origin->l[i].assigned()) {
rel(*this, l[i], IRT_EQ, origin->l[i].val());
}
}
}
The reason why one can't clone a Space is however still a mystery.

How can I find the depth of a recursive function in C++

How can I find the current depth inside a recursive function in C++ without passing in the previous level? i.e. is it possible to know how many times the function was called without using a parameter to keep track of the level and passing that number in as a parameter each time the function is called?
For example my recursive function looks like this:
#include <iostream>
void DoSomething(int level)
{
    std::cout << level << "\n";
    if (level > 10)
        return;
    DoSomething(++level);
}
int main()
{
    DoSomething(0);
}
Building on the answer already given by JoshD:
void recursive()
{
static int calls = 0;
static int max_calls = 0;
calls++;
if (calls > max_calls)
max_calls = calls;
recursive();
calls--;
}
This resets the counter after the recursive function is complete, but still tracks the maximum depth of the recursion.
I wouldn't use static variables like this for anything but a quick test, to be deleted soon after. If you really need to track this on an ongoing basis there are better methods.
You could use a static variable in the function...
void recursive()
{
static int calls = 0;
calls++;
recursive();
}
Of course, this will keep counting when you start a new originating call....
If you want it to be re-entrant and thread-safe, why not:
void rec(int &level) // reference to your level var
{
// do work
rec(++level); // go down one level
}
int main()
{
//and you call it like
int level=0;
rec(level);
cout<<level<<" levels."<<endl;
}
No static/global variables to mess up threading and you can use different variables for different recursive chains for re-entrancy issues.
You can use a local static variable, if you don't care about thread-safety.
Although, this will only give you a proper count the first time you run your recursive routine. A better technique would be a RAII guard-type class which contains an internal static variable. At the start of the recursive routine, construct the guard class. The constructor would increment the internal static variable, and the destructor would decrement it. This way, when you create a new stack-frame the counter increments by one, and when you return from each stack-frame the counter would decrement by one.
struct recursion_guard
{
recursion_guard() { ++counter; }
~recursion_guard() { --counter; }
static int counter;
};
int recursion_guard::counter = 0;
void recurse(int x)
{
recursion_guard rg;
if (x > 10) return;
recurse(x + 1);
}
int main()
{
recurse(0);
recurse(0);
}
Note, however, that this is still not thread-safe. If you need thread safety, you can replace the static-storage variable with a thread-local-storage variable, using either boost::thread_specific_ptr or the thread_local facilities introduced in C++11 (then known as C++0x).
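A thread-local variant of the guard might look like the following sketch. The names DepthGuard and deepest are made up for this example; the only language feature relied on is the C++11 thread_local keyword.

```cpp
// Variant of the guard idea with a per-thread counter (C++11 thread_local).
struct DepthGuard {
    DepthGuard() { ++depth(); }
    ~DepthGuard() { --depth(); }
    DepthGuard(const DepthGuard &) = delete;
    DepthGuard &operator=(const DepthGuard &) = delete;

    // Each thread gets its own counter, so concurrent recursions do not interfere.
    static int &depth() {
        thread_local int counter = 0;
        return counter;
    }
};

// Recurses until x reaches limit and returns the depth seen at the deepest call.
int deepest(int x, int limit) {
    DepthGuard guard;
    if (x >= limit)
        return DepthGuard::depth();
    return deepest(x + 1, limit);
}
```

Because the destructor runs on every return path, the counter is back at zero once the outermost call finishes, so a second top-level call starts counting from scratch.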
You could also pass in the level as a template parameter, if it can be determined at compile-time. You could also use a function object. This is by far and away the best option - less hassle, and static variables should be avoided wherever possible.
struct DoSomething {
DoSomething() {
calls = 0;
}
void operator()() {
std::cout << calls;
calls++;
if (calls < 10)
return operator()();
return;
}
int calls;
};
int main() {
DoSomething()(); // note the double ().
std::cin.get();
}
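The template-parameter idea mentioned above is not shown in the answer, so here is one possible sketch. The Descend struct and its run() member are hypothetical names; the depth is carried in the type, which only works when the maximum depth is known at compile time.

```cpp
// Depth is a compile-time template parameter; each level instantiates the next.
template <int Depth, int MaxDepth>
struct Descend {
    static int run() {
        // Depth is available here as a constant, no runtime argument needed.
        return Descend<Depth + 1, MaxDepth>::run();
    }
};

// Partial specialization ends the recursion when Depth reaches MaxDepth.
template <int MaxDepth>
struct Descend<MaxDepth, MaxDepth> {
    static int run() { return MaxDepth; }
};
```

Descend<0, 10>::run() walks levels 0 through 10 and returns the final depth. The trade-off is that every level is a separate instantiation, so this suits small, fixed depths.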
Convert level to an instance variable of a new object (typically a template) capable of containing the arguments and (possibly) the function. Then you can reuse the recursion accumulator interface.
You can also try using a global variable to log the depth.
#include <iostream>
int depth = 0;
void DoSomething()
{
    std::cout << ++depth << "\n";
    if (depth > 10)
        return;
    DoSomething();
}
int main()
{
    DoSomething();
}
I came here when I sensed that some recursion is required, because I was implementing a function that can validate the chain of trust in a certificate chain. This is not X.509 but instead it is just the basics wherein the issuer key of a certificate must match the public key of the signer.
bool verify_chain(std::vector<Cert>& chain,
Cert* certificate,
unsigned char* pOrigin = nullptr, int depth = 0)
{
bool flag = false;
if (certificate == nullptr) {
// use first element in case parameter is null
certificate = &chain[0];
}
if (pOrigin == nullptr) {
pOrigin = certificate->pubkey;
} else {
if (std::memcmp(pOrigin, certificate->pubkey, 32) == 0) {
return false; // detected circular chain
}
}
if (certificate->hasValidSignature()) {
if (!certificate->isRootCA()) {
Cert* issuerCert = certificate->getIssuer(chain);
if (issuerCert) {
flag = verify_chain(chain, issuerCert, pOrigin, depth+1);
}
} else {
flag = true;
}
}
if (pOrigin && depth == 1) {
pOrigin = nullptr;
}
return flag;
}
I needed to know the recursion depth so that I can correctly clean up pOrigin at the right stack frame during the unwinding of the call stack.
I used pOrigin to detect a circular chain, without which the recursive call can go on forever. For example,
cert0 signs cert1
cert1 signs cert2
cert2 signs cert0
I later realized that a simple for-loop can do it for simple cases when there is only one common chain.
bool verify_chain2(std::vector<Cert> &chain, Cert& cert)
{
Cert *pCert = &cert;
unsigned char *startkey = cert.pubkey;
while (pCert != nullptr) {
if (pCert->hasValidSignature()) {
if (!pCert->isRootCA()) {
pCert = pCert->getIssuer(chain);
if (pCert == nullptr
|| std::memcmp(pCert->pubkey, startkey, 32) == 0) {
return false;
}
continue;
} else {
return true;
}
} else {
return false;
}
}
return false;
}
But recursion is a must when there is not one common chain but instead the chain is within each certificate. I welcome any comments. Thank you.
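For the circular-chain check, one alternative to comparing against the starting key is to remember every certificate already visited, which also catches cycles that do not pass through the starting certificate. The sketch below is not the author's code: ToyCert and reachesRoot are hypothetical, certificates are identified by their index in the chain, and signature checking is left out entirely.

```cpp
#include <set>
#include <vector>

// Hypothetical minimal certificate: only the index of its issuer, -1 for a root.
struct ToyCert {
    int issuer;
};

// Walks issuer links from start. Returns true if a root is reached,
// false on a cycle or an out-of-range issuer index.
bool reachesRoot(const std::vector<ToyCert> &chain, int start) {
    std::set<int> visited;
    int current = start;
    while (current >= 0 && current < static_cast<int>(chain.size())) {
        if (!visited.insert(current).second)
            return false; // already seen: circular chain
        if (chain[current].issuer == -1)
            return true;  // reached a root
        current = chain[current].issuer;
    }
    return false;
}
```

With the three-certificate cycle from the example (cert0, cert1, cert2 signing each other in a ring) the walk stops as soon as an index repeats, so no depth counter is needed.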