How to obtain the basic blocks that are reachable from basic block A? - llvm

I want to know all of the basic blocks that a basic block can reach.
How might I go about doing this?
#include "llvm/IR/CFG.h"
BasicBlock *BB = ...;
for (BasicBlock *Pred : predecessors(BB)) {
}

You can walk through the successors of successors (and so on), e.g.:
std::unordered_set<BasicBlock *> reachable;
std::queue<BasicBlock *> worklist;
worklist.push(BB); // std::queue has push(), not insert()
while (!worklist.empty()) {
BasicBlock *front = worklist.front();
worklist.pop();
for (BasicBlock *succ : successors(front)) {
if (reachable.count(succ) == 0) {
/// We need the check here to ensure that we don't run
/// infinitely if the CFG has a loop in it
/// i.e. the BB reaches itself directly or indirectly
worklist.push(succ);
reachable.insert(succ); // std::unordered_set has insert(), not push()
}
}
}
If I didn't make any mistakes, reachable holds all the reachable basic blocks at the end of the while loop :)
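The same worklist idea can be tried without LLVM at all. The following is only a sketch under assumptions: `Cfg` and `reachableFrom` are hypothetical names, and block ids stand in for `BasicBlock *`. In a real pass you would iterate `successors(front)` instead of the map lookup.

```cpp
#include <map>
#include <queue>
#include <set>
#include <vector>

// Hypothetical stand-in for a CFG: block id -> ids of its successors.
using Cfg = std::map<int, std::vector<int>>;

// Returns every block reachable from `start` by following one or more edges.
// `start` itself is included only if some path leads back to it.
std::set<int> reachableFrom(const Cfg &cfg, int start) {
  std::set<int> reachable;
  std::queue<int> worklist;
  worklist.push(start);
  while (!worklist.empty()) {
    int front = worklist.front();
    worklist.pop();
    auto it = cfg.find(front);
    if (it == cfg.end())
      continue; // block with no recorded successors
    for (int succ : it->second) {
      // insert() reports whether succ was new; skipping blocks that were
      // already seen is what keeps the walk finite on a CFG with loops.
      if (reachable.insert(succ).second)
        worklist.push(succ);
    }
  }
  return reachable;
}
```

Folding the membership check into `insert(...).second` avoids a second lookup, but the separate `count` check used above works just as well.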

Related

std::list and garbage Collection algorithm

I have a server that puts 2 players together on request and starts a game Game in a new thread.
struct GInfo {Game* game; std::thread* g_thread;};
while (true) {
players_pair = matchPlayers();
Game* game = new Game(players_pair);
std::thread* game_T = new std::thread(&Game::start, game);
GInfo ginfo = {game, game_T};
_actives.push_back(ginfo); // std::list
}
I am writing a "Garbage Collector", that runs in another thread, to clean the memory from terminated games.
void garbageCollector() {
while (true) {
for (std::list<GInfo>::iterator it = _actives.begin(); it != _actives.end(); ++it) {
if (! it->game->isActive()) {
delete it->game; it->game = nullptr;
it->g_thread->join();
delete it->g_thread; it->g_thread = nullptr;
_actives.erase(it);
}
}
sleep(2);
}
}
This generates a segfault. I suspect it is because of the _actives.erase(it) call inside the iteration loop.
For troubleshooting, I made _actives a std::vector (instead of a std::list) and applied the same algorithm using indexes instead of iterators, and that works fine.
Is there a way around this?
Are the algorithm and the data structure fine? Is there a better way to do the garbage collection?
Help is appreciated!
If you look at the documentation for the erase method, you will see that it returns an iterator to the element after the one that was removed. The iterator you passed in is invalidated, which is why incrementing it afterwards crashes.
The way to use the return value is to assign it to your iterator, like so:
for (std::list<GInfo>::iterator it = _actives.begin(); it != _actives.end();) {
if (! it->game->isActive()) {
delete it->game; it->game = nullptr;
it->g_thread->join();
delete it->g_thread; it->g_thread = nullptr;
it = _actives.erase(it);
}
else {
++it;
}
}
Since picking up the return value from erase advances the iterator to the next element, we have to make sure not to increment the iterator when that happens.
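To see the pattern in isolation, here is a minimal sketch on a plain `std::list<int>`. The function name `eraseEvens` and the use of even numbers as the "finished" elements are assumptions made purely for illustration.

```cpp
#include <list>

// Removes every even value and returns how many elements were erased.
// The ints are hypothetical stand-ins for the finished games.
int eraseEvens(std::list<int> &values) {
  int erased = 0;
  for (auto it = values.begin(); it != values.end();) {
    if (*it % 2 == 0) {
      it = values.erase(it); // erase returns the element after the erased one
      ++erased;
    } else {
      ++it; // only advance manually when nothing was erased
    }
  }
  return erased;
}
```

Calling it on a list holding 1, 2, 4, 5, 6 leaves 1 and 5 behind.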
On an unrelated note, identifiers that start with an underscore are reserved for the implementation at global scope (and everywhere if the underscore is followed by an uppercase letter or a second underscore), so they are best avoided in your own code.
Any better way to do the garbage collection?
Yes, don't use new, delete or dynamic memory altogether:
struct Players{};
struct Game{
Game(Players&& players){}
};
struct GInfo {
GInfo(Players&& players_pair):
game(std::move(players_pair)),g_thread(&Game::start, &game){} // &game: run start() on this member, not on a copy
Game game;
std::thread g_thread;
};
std::list<GInfo> _actives;
void someLoop()
{
while (true) {
GInfo& ginfo = _actives.emplace_back(matchPlayers());
}
}
void garbageCollector() {
while (true) {
// std::list::remove_if is a member function, so no separate erase call is needed.
// It unlinks nodes in place and never moves a GInfo while its thread is running.
// Each g_thread must still be joined before its GInfo is destroyed.
_actives.remove_if([](GInfo& i){ return !i.game.isActive();});
//
sleep(2);
}
}
There might be a few typos, but that's the idea.
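One detail the idea above depends on is that a joinable `std::thread` must be joined (or detached) before it is destroyed. The sketch below shows one way to arrange that. `Task`, `Record` and `collect` are hypothetical names, and `Task` is a trivial stand-in for `Game` that finishes immediately.

```cpp
#include <atomic>
#include <cstddef>
#include <list>
#include <thread>

// Hypothetical stand-in for Game: runs briefly, then reports inactive.
struct Task {
  std::atomic<bool> active{true};
  void start() { active = false; }
  bool isActive() const { return active; }
};

struct Record {
  Task task;          // declared first, so it exists before the thread starts
  std::thread worker;
  // Pass &task so the thread works on this object, not on a copy.
  Record() : worker(&Task::start, &task) {}
  // Joining here means erasing a Record never destroys a joinable thread.
  ~Record() { if (worker.joinable()) worker.join(); }
};

// Erases finished records in place; std::list never relocates the survivors.
std::size_t collect(std::list<Record> &records) {
  std::size_t before = records.size();
  records.remove_if([](const Record &r) { return !r.task.isActive(); });
  return before - records.size();
}
```

Records are created with `records.emplace_back()`, which constructs them in place. Since a `Record` can be neither copied nor moved, `std::list` is a natural container for it. Note that this sketch still needs a mutex if one thread appends while another collects.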

What is the alternative to `TiXmlNode::FirstChild(const char *)` in TinyXML-2?

I am updating code that uses the legacy TinyXml library, to use new TinyXML-2 version instead.
While editing, I noticed that the function TiXmlNode::FirstChild(const char *) has no direct replacement in TinyXML-2.
My questions are:
Is there a convenient replacement for the aforementioned function that I missed?
In case there isn't, how should the example code below be updated for TinyXML-2?
// TiXmlElement *element; // assume this was correctly loaded
TiXmlNode *node;
if ((node = element->FirstChild("example")) != nullptr)
{
for (TiXmlElement *walk = node->FirstChildElement();
walk != nullptr;
walk = walk->NextSiblingElement())
{
// ...
}
}
tinyxml2 has
const XMLElement * XMLNode::FirstChildElement (const char *value=0) const
Your code block is much the same:
if (auto example = element -> FirstChildElement ("example"))
{
for (auto walk = example -> FirstChildElement();
walk;
walk = walk -> NextSiblingElement())
{
// walk the walk
}
}
Or you might look at my add-on for tinyxml2 with which your snippet would be:
for (auto walk : selection (element, "example/"))
{
// walk the walk
}

LLVM storing Loop* in std::vector

I've stumbled into something very peculiar - I'm writing an LLVM module Pass. I iterate over all functions of the module and then all loops of every non-declaration function and I store pointers to loops in a std::vector. Here's the source:
virtual bool runOnModule(Module& Mod){
std::vector<Loop*> loops;
// first gather all loop info
for(Module::iterator f = Mod.begin(), fend = Mod.end(); f != fend; ++f){
if (!(*f).isDeclaration()){
LoopInfo& LI = getAnalysis<LoopInfo>(*f);
for(LoopInfo::iterator l = LI.begin(), lend = LI.end(); l != lend; ++l){
loops.push_back(*l);
}
}
}
for (auto& l: loops) errs () << *l << " ";
}
Now if I run this I get a runtime error: it can't print the loops, and somehow I'm doing a null pointer dereference or something similar. Any ideas?
You have to make sure that the LoopInfo pass actually runs before your pass. Here is a complete example, standalone from opt:
class AnalyzeLoops : public FunctionPass {
public:
AnalyzeLoops()
: FunctionPass(ID) {}
void getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<LoopInfo>();
}
virtual bool runOnFunction(Function &F) {
LoopInfo &LI = getAnalysis<LoopInfo>();
for (LoopInfo::iterator L = LI.begin(), LE = LI.end(); L != LE; ++L) {
(*L)->dump();
}
return false;
}
static char ID;
};
In addition, when creating the passes, do:
PassManager PM;
PM.add(new LoopInfo());
PM.add(new AnalyzeLoops());
PM.run(*Mod);
I suspect that to make opt actually run LoopInfo before your pass, you should pass -loops too.
Also, note that I define getAnalysisUsage - this will make LLVM complain if LoopInfo didn't run before this pass, making the problem more obvious.
Note that LoopInfo is specifically a FunctionPass, and as an analysis it has to be used from another FunctionPass. The LoopInfo data structure doesn't really survive between different functions, and since it owns its data (those Loop* objects) they will be destroyed as well.
One thing you could do if you really need a ModulePass is just invoke LoopInfo manually and not as an analysis. When you iterate the functions in the module, for each function create a new LoopInfo object and use its runOnFunction method. Though even in this case, you have to make sure the LoopInfo that owns a given Loop* survives if you want to use the latter.
First of all, LoopInfo should run just once before the for loop.
Secondly, LoopInfo::iterator only visits the top-level loops of the Function. To visit all loops you also need to iterate over the subloops of every loop. This can be implemented either as a recursive function or with a worklist, like this:
virtual bool runOnFunction(Function &F) {
LoopInfo *loopinfo;
loopinfo = &getAnalysis<LoopInfo>();
std::vector<Loop*> allLoops;
for (LoopInfo::iterator Li = loopinfo->begin(), Le = loopinfo->end();
Li != Le; Li++) {
Loop *L = *Li;
allLoops.push_back(L);
dfsOnLoops(L, loopinfo, allLoops);
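}
return false; // analysis only: the function is not modified
}
// LoopS is taken by reference; a by-value vector would silently drop every push_back
void dfsOnLoops(Loop *L, LoopInfo *loopinfo, std::vector<Loop*> &LoopS) {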
std::vector<Loop *> subloops = L->getSubLoops();
if (subloops.size()) {
// recursive on subloops
for (std::vector<Loop *>::iterator Li = subloops.begin();Li != subloops.end(); Li++){
LoopS.push_back(*Li);
dfsOnLoops(*Li, loopinfo, LoopS);
}
}
}
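The recursion is easier to check on a toy tree than inside a pass. In this sketch `LoopNode` and `collectLoops` are hypothetical stand-ins (for `llvm::Loop` and the helper above), so nothing here is LLVM API. The point it illustrates is that the output vector has to be passed by reference.

```cpp
#include <vector>

// Hypothetical stand-in for llvm::Loop: a node that just points at
// its directly nested loops.
struct LoopNode {
  int id;
  std::vector<LoopNode *> subLoops;
};

// Appends `loop` and everything nested inside it, outermost first.
// `out` is taken by reference so the caller sees the additions.
void collectLoops(LoopNode *loop, std::vector<LoopNode *> &out) {
  out.push_back(loop);
  for (LoopNode *sub : loop->subLoops)
    collectLoops(sub, out);
}
```

Calling `collectLoops` once per top-level loop yields a pre-order list of every loop in the function.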
None of the answers really helped but I managed to solve the problem myself. Basically, each llvm pass can define a releaseMemory() method, read more here. The LoopInfo class had that method implemented and thus the analysis information would be lost every time we get out of scope from the call to getAnalysis. I simply removed the releaseMemory() method in Loopinfo.h and the memory was no longer released. Note that this triggered a big change in the codebase and even opt had to be rebuilt so doing this in general is probably a bad idea and this would definitely not be easily accepted as a change to llvm (I speculate, not sure).
I think the best way to solve this is to explicitly create LoopInfo objects and keep them alive for as long as you use their loops. Here is the code for LLVM 3.5:
using LoopInfoType=llvm::LoopInfoBase<llvm::BasicBlock, llvm::Loop>;
std::vector<llvm::Loop*> loopVec;
std::vector<LoopInfoType*> loopInfoVec;
for(llvm::Module::iterator F = M.begin(); F!= M.end(); F++){
//skip declarations
if(F->isDeclaration()){
continue;
}
//TODO that scope problem
llvm::DominatorTree DT = llvm::DominatorTree();
DT.recalculate(*F);
LoopInfoType *loopInfo = new LoopInfoType();
loopInfo->releaseMemory();
loopInfo->Analyze(DT);
loopInfoVec.push_back(loopInfo);
for(llvm::LoopInfo::iterator lit = loopInfo->begin(); lit != loopInfo->end(); lit++){
Loop * L = * lit;
loopVec.push_back(L);
//L->dump();
}
}//for all functions
cin.get();
for(auto loop : loopVec){
std::cout << "loop\n";
loop->dump();
for(llvm::Loop::block_iterator bit = loop->block_begin(); bit != loop->block_end(); bit++){
llvm::BasicBlock * B = * bit;
B->dump();
std::cout << "\n\n";
}
}

How to refactor this while loop to get rid of "continue"?

I have a while (!Queue.empty()) loop that processes a queue of elements. There are a series of pattern matchers going from highest-priority to lowest-priority order. When a pattern is matched, the corresponding element is removed from the queue, and matching is restarted from the top (so that the highest-priority matchers get a chance to act first).
So right now it looks something like this (a simplified version):
while (!Queue.empty())
{
auto & Element = *Queue.begin();
if (MatchesPatternA(Element)) { // Highest priority, since it's first
// Act on it
// Remove Element from queue
continue;
}
if (MatchesPatternB(Element)) {
// Act on it
// Remove Element from queue
continue;
}
if (MatchesPatternC(Element)) { // Lowest priority, since it's last
// Act on it
// Remove Element from queue
continue;
}
// If we got this far, that means no pattern was matched, so
// Remove Element from queue
}
This works, but I want to refactor this loop in some way to remove the use of the keyword continue.
Why? Because if I want to outsource a pattern matching to an external function, it obviously breaks. E.g.
void ExternalMatching(...)
{
if (MatchesPatternB(Element)) {
// Act on it
// Remove Element from queue
continue; // This won't work here
}
}
while (!Queue.empty())
{
auto & Element = *Queue.begin();
if (MatchesPatternA(Element)) {
// Act on it
// Remove Element from queue
continue;
}
ExternalMatching(...);
if (MatchesPatternC(Element)) {
// Act on it
// Remove Element from queue
continue;
}
// If we got this far, that means no pattern was matched, so
// Remove Element from queue
}
I don't want to have to write repetitive if statements like if (ExternalMatching(...)) { ... continue; }, I'd rather find a cleaner way to express this logic.
This simplified example might make it seem like a good idea to make pattern matching more general rather than having distinct MatchesPatternA, MatchesPatternB, MatchesPatternC, etc. functions. But in my situation the patterns are quite complicated, and I'm not quite ready to generalize them yet. So I want to keep that part as is, separate functions.
Any elegant ideas? Thank you!
If you have access to C++11, I would like to suggest another solution. Basically, I created a container of handlers and actions that can be adjusted at runtime. Depending on your requirements, that may be a pro or a con for your design. Here it is:
#include <functional>
typedef std::pair<std::function<bool(const ElementType &)>,
std::function<void(ElementType &)> > HandlerData;
typedef std::vector<HandlerData> HandlerList;
HandlerList get_handlers()
{
HandlerList handlers;
handlers.emplace_back([](const ElementType &el){ return MatchesPatternA(el); },
[](ElementType &el){ /* Action */ });
handlers.emplace_back([](const ElementType &el){ return MatchesPatternB(el); },
[](ElementType &el){ /* Action */ });
handlers.emplace_back([](const ElementType &el){ return MatchesPatternC(el); },
[](ElementType &el){ /* Action */ });
return handlers;
}
int main()
{
auto handlers = get_handlers();
while(!Queue.empty()) {
auto &Element = *Queue.begin();
for(auto &h : handlers) {
// check if handler matches the element
if(h.first(Element)) {
// act on element
h.second(Element);
break;
}
}
// remove element
Queue.pop_front();
}
}
I would recommend using a function that does the pattern matching (but does not act on the result) and then a set of functions that act on the different options:
enum EventType {
A, B, C //, D, ...
};
while (!queue.empty()) {
auto & event = queue.front();
EventType e = eventType(event); // Internally does MatchesPattern*
// and returns the match
switch (e) {
case A:
processA(event);
break;
case B:
processB(event);
break;
// ... remaining cases follow the same shape
}
}
This way you clearly separate the matching from the processing, and the loop is just a simple dispatcher.
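A compact, runnable variant of this classify-then-dispatch shape might look as follows. Everything in it is an assumption made for illustration: the `Kind` enum, the arithmetic "patterns" in `classify`, and the string log that stands in for the real processing functions.

```cpp
#include <deque>
#include <string>
#include <vector>

enum class Kind { A, B, None };

// Hypothetical matcher: the first pattern that applies wins.
Kind classify(int value) {
  if (value % 2 == 0) return Kind::A; // highest priority
  if (value % 3 == 0) return Kind::B;
  return Kind::None;
}

// Drains the queue, recording which handler ran for each element.
std::vector<std::string> drain(std::deque<int> &queue) {
  std::vector<std::string> log;
  while (!queue.empty()) {
    int value = queue.front();
    switch (classify(value)) {
    case Kind::A: log.push_back("A"); break;
    case Kind::B: log.push_back("B"); break;
    case Kind::None: log.push_back("none"); break;
    }
    queue.pop_front(); // removal happens once, in one place
  }
  return log;
}
```

Because removal sits after the switch, no branch needs `continue`, and any case body can be moved into its own function without changing the loop.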
Consider an interface:
class IMatchPattern
{
public:
virtual bool MatchesPattern(const Element& e) = 0;
};
Then you can organize a container of objects implementing IMatchPattern, to allow for iterative access to each pattern match method.
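As a rough sketch of that container, assuming a toy `Element` type and two made-up patterns (`IsNegative` and `IsEven`; neither comes from the question), the priority order is simply the order of the vector:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Element { int value; }; // hypothetical element type

class IMatchPattern {
public:
  virtual ~IMatchPattern() = default;
  virtual bool MatchesPattern(const Element &e) = 0;
};

// Two hypothetical patterns, stored later in priority order.
class IsNegative : public IMatchPattern {
public:
  bool MatchesPattern(const Element &e) override { return e.value < 0; }
};
class IsEven : public IMatchPattern {
public:
  bool MatchesPattern(const Element &e) override { return e.value % 2 == 0; }
};

// Returns the index of the first matching pattern, or -1 if none match.
int firstMatch(const std::vector<std::unique_ptr<IMatchPattern>> &patterns,
               const Element &e) {
  for (std::size_t i = 0; i < patterns.size(); ++i)
    if (patterns[i]->MatchesPattern(e))
      return static_cast<int>(i);
  return -1;
}
```

The queue loop would then call `firstMatch` once per element, act on the result, and remove the element afterwards, so no `continue` is needed.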
You can change your ExternalMatching to return bool, indicating that the processing has been done. This way the caller would be able to continue evaluating if necessary:
bool ExternalMatching(...)
{
if (MatchesPatternB(Element)) {
// Act on it
// Remove Element from queue
return true;
}
return false;
}
Now you can call it like this:
if (ExternalMatching1(...)) continue;
if (ExternalMatching2(...)) continue;
...
if (ExternalMatchingN(...)) continue;
Ok, I ended up rewriting the loop more akin to this.
Huge thanks and credit goes to Yuushi, dasblinkenlight, David Rodríguez for their help; this answer is based on a combination of their answers.
bool ExternalMatching(...)
{
bool Match;
if ((Match = MatchesPatternX(Element))) {
// Act on it
} else if ((Match = MatchesPatternY(Element))) {
// Act on it
}
return Match;
}
while (!Queue.empty())
{
auto & Element = Queue.front();
if (MatchesPatternA(Element)) { // Highest priority, since it's first
// Act on it
} else if (MatchesPatternB(Element)) {
// Act on it
} else if (ExternalMatching(...)) {
} else if (MatchesPatternC(Element)) { // Lowest priority, since it's last
// Act on it
}
// Remove Element from queue
}
Now, I know there's further room for improvement, see answers of Mateusz Pusz and Michael Sh. However, this is good enough to answer my original question, and it'll do for now. I'll consider improving it in the future.
If you're curious to see the real code (non-simplified version), see here:
https://github.com/shurcooL/Conception/blob/38f731ccc199d5391f46d8fce3cf9a9092f38c65/src/App.cpp#L592
Thanks everyone again!
I would like to suggest a Factory function that would take the Element and create an appropriate handler and return the interface pointer to the handler.
while (!Queue.empty())
{
auto & Element = *Queue.begin();
// get the appropriate handler object pointer e.g.
IPatternHandler *handler = Factory.GetHandler(Element);
handler->handle();
// clean up handler appropriately
}

In LLVM, how do you check if a block is a merge block

I'm writing an LLVM pass. My pass needs to know which blocks are merge blocks, that is, blocks which have more than one predecessor. How can I test for this in my code?
You can iterate over all predecessors like this:
#include "llvm/IR/CFG.h" // was "llvm/Support/CFG.h" before LLVM 3.5
BasicBlock *BB = ...;
for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
BasicBlock *Pred = *PI;
// ...
}
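You can check whether a BB has exactly one predecessor using this:
BasicBlock *BB = ...;
if (BB->getSinglePredecessor() != nullptr) /// exactly one predecessor
{ ... }
else /// zero predecessors, or more than one
{ ... }
Note that getSinglePredecessor() also returns a null pointer for a block with no predecessors at all, so to single out merge blocks you also need to rule that case out, e.g. by checking pred_begin(BB) != pred_end(BB).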
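The predecessor-counting idea can be tried outside LLVM on a toy graph. This is only a sketch: `Cfg` and `mergeBlocks` are hypothetical names, and integer ids stand in for `BasicBlock *`. It counts distinct predecessor blocks, whereas LLVM's predecessor iterators can list the same block more than once when several edges come from it.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for a CFG: block id -> ids of its successors.
using Cfg = std::map<int, std::vector<int>>;

// Returns the blocks that have two or more distinct predecessors.
std::set<int> mergeBlocks(const Cfg &cfg) {
  // Invert the successor lists into per-block predecessor sets.
  std::map<int, std::set<int>> preds;
  for (const auto &entry : cfg)
    for (int succ : entry.second)
      preds[succ].insert(entry.first);
  std::set<int> merges;
  for (const auto &entry : preds)
    if (entry.second.size() >= 2)
      merges.insert(entry.first);
  return merges;
}
```

In a diamond-shaped graph (0 branches to 1 and 2, both of which jump to 3), only block 3 is reported. A loop header that is entered from outside and from its own back edge counts as a merge block too.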