Remove Successors for each BB - llvm

I need to remove the successors from every basic block so I can insert new ones.
I tried this code, but it doesn't work:
void RemoveSuccessor(TerminatorInst *TI, unsigned SuccNum) {
  assert(SuccNum < TI->getNumSuccessors() &&
         "Trying to remove a nonexistant successor!");

  // If our old successor block contains any PHI nodes, remove the entry in the
  // PHI nodes that comes from this branch...
  //
  BasicBlock *BB = TI->getParent();
  TI->getSuccessor(SuccNum)->removePredecessor(BB);

  TerminatorInst *NewTI = 0;
  switch (TI->getOpcode()) {
  case Instruction::Br:
    // If this is a conditional branch... convert to unconditional branch.
    if (TI->getNumSuccessors() == 2) {
      cast<BranchInst>(TI)->setUnconditionalDest(TI->getSuccessor(1-SuccNum));
    } else { // Otherwise convert to a return instruction...
      Value *RetVal = 0;

      // Create a value to return... if the function doesn't return null...
      if (!(BB->getParent()->getReturnType())->isVoidTy())
        RetVal = Constant::getNullValue(BB->getParent()->getReturnType());

      // Create the return...
      NewTI = 0;
    }
    break;

  case Instruction::Invoke: // Should convert to call
  case Instruction::Switch: // Should remove entry
  default:
  case Instruction::Ret: // Cannot happen, has no successors!
    assert(0 && "Unhandled terminator instruction type in RemoveSuccessor!");
    abort();
  }

  if (NewTI) // If it's a different instruction, replace.
    ReplaceInstWithInst(TI, NewTI);
}
Example of the results:
if.then: ; preds = %for.inc, %entry, %for.body
preds should not include %for.body, since I remove the successors of %for.body before inserting the new ones.

Beyond my comment above asking for clarifications, this line seems fishy:
cast<BranchInst>(TI)->setUnconditionalDest(TI->getSuccessor(1-SuccNum));
From what I know, the setUnconditionalDest() method was removed years ago; what version are you using? In any case, I recommend creating a new unconditional BranchInst and then using ReplaceInstWithInst() to replace the conditional one.
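For reference, a minimal sketch of that suggestion (written against a recent C++ API where TerminatorInst no longer exists; adjust headers and types to your LLVM version):
#include <cassert>
#include "llvm/IR/Instructions.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
using namespace llvm;

// Turn a conditional branch into an unconditional branch to the successor we
// keep, dropping the successor at SuccNum (0 or 1).
static void dropBranchSuccessor(BranchInst *BI, unsigned SuccNum) {
  assert(BI->isConditional() && SuccNum < 2 && "expected a conditional branch");
  BasicBlock *Kept = BI->getSuccessor(1 - SuccNum);
  // Update PHI nodes in the successor we are removing.
  BI->getSuccessor(SuccNum)->removePredecessor(BI->getParent());
  // Replace the conditional branch with a fresh unconditional one.
  ReplaceInstWithInst(BI, BranchInst::Create(Kept));
}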


How can I optimize Astar for vast empty spaces?

I am creating a game with a 3D grid for flying entities, so I have a lot of points and connections in the air where there aren't any obstructions. I didn't want to decrease the resolution of my grid, so I thought I could just skip over chunks (or "empties", as I call them) of the A* map as long as they don't contain any obstructions, and I modified Godot's A* algorithm to do this.
Unfortunately this ended up being slower than looping through points one at a time due to the way I implemented this modification, which needs to loop through all the edge points of an empty.
2D representation of how one edge point of an empty connects to all other edge points:
This ends up looping through a larger number of points than letting the A* algorithm work its way through the empty.
So I'm sorta stumped on how to make this more efficient while still preserving the optimal path.
I could potentially narrow down what faces of the empty should be scanned over by first comparing the center points of all 8 faces of the empty (as my grid consists of hexagonal prisms). Or maybe I should somehow use the face center points of the empty's faces exclusively instead of all edge points.
I mainly want to know if anyone has worked on an issue like this before, and if so what would be the recommended solution?
Here is the A* loop for reference:
bool AStar::_solve(Point *begin_point, Point *end_point, int relevant_layers) {
    pass++;

    //make sure parallel layers are supported
    // or if *relevant_layers is 0 then use all points
    bool supported = relevant_layers == 0 || (relevant_layers & end_point->parallel_support_layers) > 0;
    if (!end_point->enabled || !supported) {
        return false;
    }

    bool found_route = false;

    Vector<Point *> open_list;
    SortArray<Point *, SortPoints> sorter;

    begin_point->g_score = 0;
    begin_point->f_score = _estimate_cost(begin_point->id, end_point->id);
    open_list.push_back(begin_point);

    while (!open_list.empty()) {
        Point *p = open_list[0]; // The currently processed point

        if (p == end_point) {
            found_route = true;
            break;
        }

        sorter.pop_heap(0, open_list.size(), open_list.ptrw()); // Remove the current point from the open list
        open_list.remove(open_list.size() - 1);
        p->closed_pass = pass; // Mark the point as closed

        //if the point is part of an empty, look through all of the edge points of said empty (as to skip over any points within the empty).
        OAHashMap<int, Point *> connections;

        PoolVector<Empty *> enabled_empties;
        int size = p->empties.size();
        PoolVector<Empty *>::Read r = p->empties.read();
        for (int i = 0; i < size; i++) {
            Empty *e = r[i];
            supported = relevant_layers == 0 || (relevant_layers & e->parallel_support_layers) > 0;
            //if the empty is enabled and the end point is not within the empty
            if (e->enabled && supported && !end_point->empties.has(e)) {
                enabled_empties.append(e);
                //can travel to any edge point
                for (OAHashMap<int, Point *>::Iterator it = e->edge_points.iter(); it.valid; it = e->edge_points.next_iter(it)) {
                    int id = *it.key;
                    Point *ep = *(it.value);
                    ep->is_neighbour = false;
                    //don't connect to the same point
                    if (id != p->id && (i == 0 || !connections.has(id))) {
                        connections.set(id, ep);
                    }
                }
            }
        }

        //add neighbours to connections
        for (OAHashMap<int, Point *>::Iterator it = p->neighbours.iter(); it.valid; it = p->neighbours.next_iter(it)) {
            int id = *it.key;
            Point *np = *(it.value); // The neighbour point
            np->is_neighbour = true;
            //don't need to check for duplicate point connections if no empties
            if (size == 0 || !connections.has(id)) {
                //don't add points within enabled empties since they're meant to be skipped over
                if (np->empties.size() > 0 && !np->on_empty_edge) {
                    bool in_enabled_empty = false;
                    PoolVector<Empty *>::Read r1 = np->empties.read();
                    for (int i = 0; i < np->empties.size(); i++) {
                        if (enabled_empties.has(r1[i])) {
                            in_enabled_empty = true;
                            break;
                        }
                    }
                    if (!in_enabled_empty) {
                        connections.set(id, np);
                    }
                } else {
                    connections.set(id, np);
                }
            }
        }

        for (OAHashMap<int, Point *>::Iterator it = connections.iter(); it.valid; it = connections.next_iter(it)) {
            Point *e = *(it.value); // The neighbour point

            //make sure parallel layers are supported
            // or if *relevant_layers is 0 then use all points
            supported = relevant_layers == 0 || (relevant_layers & e->parallel_support_layers) > 0;

            if (!e->enabled || e->closed_pass == pass || !supported) {
                continue;
            }

            real_t tentative_g_score = p->g_score + _compute_cost(p->id, e->id) * e->weight_scale;

            bool new_point = false;

            if (e->open_pass != pass) { // The point wasn't inside the open list.
                e->open_pass = pass;
                open_list.push_back(e);
                new_point = true;
            } else if (tentative_g_score >= e->g_score) { // The new path is worse than the previous.
                continue;
            }

            e->prev_point = p;
            e->prev_point_connected = e->is_neighbour;
            e->g_score = tentative_g_score;
            e->f_score = e->g_score + _estimate_cost(e->id, end_point->id);

            if (new_point) { // The position of the new points is already known.
                sorter.push_heap(0, open_list.size() - 1, 0, e, open_list.ptrw());
            } else {
                sorter.push_heap(0, open_list.find(e), 0, e, open_list.ptrw());
            }
        }
    }

    return found_route;
}
Note: I'm still not exactly sure what the sorter does.
The entire code can be seen here in a_star.cpp and a_star.h.
Edit:
If anyone wants to reference or use this: I've modified the A* code to add user-defined octants and a user-defined straight-line function (user-defined so they can work with any type of grid), which is used between octants when possible to further decrease runtime, and it works very well in terms of speed. The pathing is not optimal, though, especially when adding a lot of obstacles or restricting the available positions.
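The straight-line check itself boils down to something like this hypothetical sketch (none of these names come from the engine code; the sampling step and the blocked test are placeholders for whatever your grid provides): if the sampled segment between two points is unobstructed they can be connected directly, otherwise fall back to the normal A* expansion.
#include <cmath>
#include <functional>

struct Vec3 { float x, y, z; };

// Sample the segment from -> to every `step` units and reject the shortcut
// if any sample hits an obstruction.
bool straight_line_clear(const Vec3& from, const Vec3& to, float step,
                         const std::function<bool(const Vec3&)>& is_blocked)
{
    const Vec3 d{to.x - from.x, to.y - from.y, to.z - from.z};
    const float len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    const int samples = static_cast<int>(len / step) + 1;
    for (int i = 0; i <= samples; ++i) {
        const float t = static_cast<float>(i) / samples;
        const Vec3 p{from.x + d.x * t, from.y + d.y * t, from.z + d.z * t};
        if (is_blocked(p))
            return false; // an obstruction lies on the segment, no shortcut
    }
    return true; // segment is clear, the two points can be connected directly
}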

What are the differences between BasicBlock::getSingleSuccessor() and BasicBlock::getUniqueSuccessor() in LLVM?

I still don't understand the differences even after referring to the doxygen page:
getSingleSuccessor()
Return the successor of this block if it has a single successor.
Otherwise return a null pointer.
getUniqueSuccessor()
Return the successor of this block if it has a unique successor.
Otherwise return a null pointer.
and looking at the source code:
// BasicBlock.cpp
const BasicBlock *BasicBlock::getSingleSuccessor() const {
  const_succ_iterator SI = succ_begin(this), E = succ_end(this);
  if (SI == E) return nullptr; // no successors
  const BasicBlock *TheSucc = *SI;
  ++SI;
  return (SI == E) ? TheSucc : nullptr /* multiple successors */;
}

const BasicBlock *BasicBlock::getUniqueSuccessor() const {
  const_succ_iterator SI = succ_begin(this), E = succ_end(this);
  if (SI == E) return nullptr; // No successors
  const BasicBlock *SuccBB = *SI;
  ++SI;
  for (; SI != E; ++SI) {
    if (*SI != SuccBB)
      return nullptr;
    // The same successor appears multiple times in the successor list.
    // This is OK.
  }
  return SuccBB;
}
In LLVM IR, a block's terminator generally has one successor entry per case label (or similar), so for code like this example, getUniqueSuccessor() and getSingleSuccessor() deliver different results:
switch (foo) {
case 0:
case 1:
case 2:
default:
    printf("Hello, world\n");
}
The first block has four successors, all of them equal.
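In other words (a small sketch, assuming BB is the block that ends in that switch):
#include "llvm/IR/BasicBlock.h"
using namespace llvm;

// BB is terminated by the switch above: four successor slots, all pointing
// at the same block.
void inspect(const BasicBlock *BB) {
  const BasicBlock *Single = BB->getSingleSuccessor(); // nullptr: more than one successor slot
  const BasicBlock *Unique = BB->getUniqueSuccessor(); // the printf block: all slots are equal
  (void)Single;
  (void)Unique;
}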

Recursive insert of a chain into memory fails

This might be a long question, but I hope someone can help me figure out what's going wrong.
I am inserting a JSON object into already allocated memory with my own data type, which basically holds a union with the data and a ptrdiff_t to the next data type (in 8-bit steps).
template <typename T>
class BaseType
{
public:
    BaseType();
    explicit BaseType(T& t);
    explicit BaseType(const T& t);
    ~BaseType();

    inline void setNext(const ptrdiff_t& next);
    inline std::ptrdiff_t getNext();
    inline void setData(T& t);
    inline void setData(const T& t);
    inline T getData() const;

protected:
    union DataUnion
    {
        T data;
        ::std::ptrdiff_t size;

        DataUnion()
        {
            memset(this, 0, sizeof(DataUnion));
        } //init with 0
        explicit DataUnion(T& t);
        explicit DataUnion(const T& t);
    } m_data;

    long long m_next;
};
The implementation is straightforward, so nothing special happens there; it just sets/gets the values from the definition (I'll skip the full implementation here).
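Roughly, the accessors boil down to something like this (a sketch under the assumption that m_next simply stores the byte distance to the following element; the real code may differ):
template <typename T>
inline void BaseType<T>::setNext(const ptrdiff_t& next)
{
    m_next = next;            // byte distance to the following element
}

template <typename T>
inline std::ptrdiff_t BaseType<T>::getNext()
{
    return m_next;
}

template <typename T>
inline T BaseType<T>::getData() const
{
    return m_data.data;
}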
So here starts the code where something goes wrong:
std::pair<void*, void*> Page::insertObject(const rapidjson::GenericValue<rapidjson::UTF8<>>& value,
                                           BaseType<size_t>* last)
{
    //return ptr to the first element
    void* l_ret = nullptr;
    //prev element ptr
    BaseType<size_t>* l_prev = last;
    //position pointer
    void* l_pos = nullptr;
    //get the members
    for (auto it = value.MemberBegin(); it != value.MemberEnd(); ++it)
    {
        switch (it->value.GetType())
        {
        case rapidjson::kNullType:
            LOG_WARN << "null type: " << it->name.GetString();
            continue;

        case rapidjson::kFalseType:
        case rapidjson::kTrueType:
        {
            l_pos = find(sizeof(BaseType<bool>));
            void* l_new = new (l_pos) BaseType<bool>(it->value.GetBool());
            if (l_prev != nullptr)
                l_prev->setNext(dist(l_prev, l_new));
        }
        break;

        case rapidjson::kObjectType:
        {
            //pos for the obj id
            //and insert the ID of the obj
            l_pos = find(sizeof(BaseType<size_t>));
            std::string name = it->name.GetString();
            void* l_new = new (l_pos) BaseType<size_t>(common::FNVHash()(name));
            if (l_prev != nullptr)
                l_prev->setNext(dist(l_prev, l_new));

            //TODO something strange happens here!
            // pass the objid Object to the insertobj!
            // now recursive insert the obj
            // the second contains the last element inserted
            // l_pos current contains the last inserted element and get set to the
            // last element of the obj we insert
            l_pos = (insertObject(it->value, reinterpret_cast<BaseType<size_t>*>(l_new)).second);
        }
        break;

        case rapidjson::kArrayType:
        { //skip this at the moment till the bug is fixed
        }
        break;

        case rapidjson::kStringType:
        {
            // find pos where the string fits
            // somehow we get here sometimes and it does not fit!
            // which cant be since we lock the whole page
            l_pos = find(sizeof(StringType) + strlen(it->value.GetString()));
            //add the String Type at the pos of the FreeType
            auto* l_new = new (l_pos) StringType(it->value.GetString());
            if (l_prev != nullptr)
                l_prev->setNext(dist(l_prev, l_new));
        }
        break;

        case rapidjson::kNumberType:
        {
            //doesnt matter since long long and double are equal on x64
            //find pos where the string fits
            l_pos = find(sizeof(BaseType<long long>));
            void* l_new;
            if (it->value.IsInt())
            {
                //insert INT
                l_new = new (l_pos) BaseType<long long>(it->value.GetInt64());
            }
            else
            {
                //INSERT DOUBLE
                l_new = new (l_pos) BaseType<double>(it->value.GetDouble());
            }
            if (l_prev != nullptr)
                l_prev->setNext(dist(l_prev, l_new));
        }
        break;

        default:
            LOG_WARN << "Unknown member Type: " << it->name.GetString() << ":" << it->value.GetType();
            continue;
        }

        //so first element is set now, store it to return it.
        if (l_ret == nullptr)
        {
            l_ret = l_pos;
        }
        //prev is the l_pos now so cast it to this;
        l_prev = reinterpret_cast<BaseType<size_t>*>(l_pos);
    }
    //if we get here its in!
    return { l_ret, l_pos };
}
I start the insert like this:
auto firstElementPos = insertObject(value.MemberBegin()->value, nullptr).first;
Here value.MemberBegin()->value is the object to be inserted and ->name holds the name of the object. In the case below that's Person and everything between the {}.
The problem is: if I insert a JSON object which has one object inside, like so:
"Person":
{
"age":25,
"double": 23.23,
"boolean": true,
"double2": 23.23,
"firstInnerObj":{
"innerDoub": 12.12
}
}
it works properly and I can reproduce the object. But if I have more inner objects, like so:
"Person":
{
"age":25,
"double": 23.23,
"boolean": true,
"double2": 23.23,
"firstInnerObj":{
"innerDoub": 12.12
},
"secondInnerObj":{
"secInnerDoub": 12.12
}
}
it fails and I lose data, so I think my recursion goes wrong, but I don't see why. If you need any more information, let me know. You might take a look here, and at the client here.
The test.json needs to contain a JSON object like the one above, and the find only needs to contain {"oid__":2} to get the second object that was inserted.
I could track the issue down to the point where I recreate the object recursively in the code. Some of the next pointers seem to be incorrect:
void* Page::buildObject(const size_t& hash, void* start, rapidjson::Value& l_obj,
                        rapidjson::MemoryPoolAllocator<>& aloc)
{
    //get the meta information of the object type
    //to build it
    auto& l_metaIdx = meta::MetaIndex::getInstance();
    //get the meta dataset
    auto& l_meta = l_metaIdx[hash];
    //now we are already in an object here with l_obj!
    auto l_ptr = start;
    for (auto it = l_meta->begin(); it != l_meta->end(); ++it)
    {
        //create the name value
        rapidjson::Value l_name(it->name.c_str(), it->name.length(), aloc);
        //create the value we are going to add
        rapidjson::Value l_value;
        //now start building it up again
        switch (it->type)
        {
        case meta::OBJECT:
        {
            auto l_data = static_cast<BaseType<size_t>*>(l_ptr);
            //get the hash to optain the metadata
            auto l_hash = l_data->getData();
            //set to object and create the inner object
            l_value.SetObject();
            //get the start pointer which is the "next" element
            //and call recursive
            l_ptr = static_cast<BaseType<size_t>*>(buildObject(l_hash,
                (reinterpret_cast<char*>(l_data) + l_data->getNext()), l_value, aloc));
        }
        break;

        case meta::ARRAY:
        {
            l_value.SetArray();
            auto l_data = static_cast<ArrayType*>(l_ptr);
            //get the hash to optain the metadata
            auto l_size = l_data->size();
            l_ptr = buildArray(l_size, static_cast<char*>(l_ptr) + l_data->getNext(), l_value, aloc);
        }
        break;

        case meta::INT:
        {
            //create the data
            auto l_data = static_cast<BaseType<long long>*>(l_ptr);
            //with length attribute it's faster ;)
            l_value = l_data->getData();
        }
        break;

        case meta::DOUBLE:
        {
            //create the data
            auto l_data = static_cast<BaseType<double>*>(l_ptr);
            //with length attribute it's faster ;)
            l_value = l_data->getData();
        }
        break;

        case meta::STRING:
        {
            //create the data
            auto l_data = static_cast<StringType*>(l_ptr);
            //with length attribute it's faster
            l_value.SetString(l_data->getString()->c_str(), l_data->getString()->length(), aloc);
        }
        break;

        case meta::BOOL:
        {
            //create the data
            auto l_data = static_cast<BaseType<bool>*>(l_ptr);
            l_value = l_data->getData();
        }
        break;

        default:
            break;
        }

        l_obj.AddMember(l_name, l_value, aloc);
        //update the lptr
        l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
    }
    //return the l_ptr which current shows to the next lement. //see line above
    return l_ptr;
}
After hours and hours of debugging I found the small issue that causes this. The method that rebuilds the object after it was inserted returns a pointer that already points past the last element it handled, and after the switch case I called ->getNext() again, which causes a loss of data because it skips one element in the singly chained list.
The fix is to put the line
l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
only into the switch cases where it is not an object or array (Fix Commit). This actually also gave me the fix for an issue with inserting arrays.
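To see why the double advance loses an element, here is a tiny self-contained toy (hypothetical types, not the project's real ones): the builder already returns a pointer past the element it consumed, so advancing again afterwards skips the next element.
#include <cstdio>

// Toy chain: each node stores an offset (in nodes) to the next element.
struct Node {
    int value;
    int next;   // 0 = end of chain
};

// Consumes one node and returns a pointer that is ALREADY advanced past it,
// just like the recursive build calls above return the next element.
const Node* buildScalar(const Node* n) {
    std::printf("built %d\n", n->value);
    return n->next ? n + n->next : nullptr;
}

int main() {
    Node chain[] = { { 1, 1 }, { 2, 1 }, { 3, 0 } };
    const Node* p = chain;

    p = buildScalar(p);      // prints 1, p now points at node 2
    // p = p + p->next;      // the bug: advancing again here would skip node 2
    p = buildScalar(p);      // prints 2, as expected
    p = buildScalar(p);      // prints 3, end of chain
    return 0;
}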
Of course, nobody here who hasn't taken a deep look into the code could have known the real issue, but I still want to show the fix here. Thanks to @sehe, who helped a lot with figuring out what was going wrong.

In LLVM, how do you check if a block is a merge block

I'm writing an LLVM pass. My pass needs to know which blocks are merge blocks, that is, blocks that have more than one predecessor. How can I test for this in my code?
You can iterate over all predecessors like this:
#include "llvm/Support/CFG.h"
BasicBlock *BB = ...;
for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
BasicBlock *Pred = *PI;
// ...
}
You can check whether a BB has more than one predecessor using this:
BasicBlock *BB = ...;

if (BB->getSinglePredecessor() != nullptr) /// exactly one predecessor
{ ... }
else /// zero predecessors, or more than one
{ ... }
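Putting the two snippets together, a block is a merge block exactly when it has at least two incoming edges; a small helper along those lines (a sketch — the header name depends on your LLVM version, newer trees use "llvm/IR/CFG.h"):
#include "llvm/IR/CFG.h"   // "llvm/Support/CFG.h" on older LLVM releases
using namespace llvm;

static bool isMergeBlock(BasicBlock *BB) {
  unsigned NumPreds = 0;
  for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI)
    if (++NumPreds > 1)
      return true;   // at least two predecessors: merge block
  return false;
}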

faster than Stackwalk

Does anybody know of a better/faster way to get the call stack than "StackWalk"?
I also think that StackWalk can be slower on methods with a lot of variables...
(I wonder what commercial profilers do?)
I'm using C++ on Windows. :)
thanks :)
I don't know if it's faster, it won't show you any symbols, and I'm sure you can do better than this, but here is some code I wrote a while back when I needed this info (it only works on Windows):
struct CallStackItem
{
    void* pc;
    CallStackItem* next;

    CallStackItem()
    {
        pc = NULL;
        next = NULL;
    }
};

typedef void* CallStackHandle;

CallStackHandle CreateCurrentCallStack(int nLevels)
{
    void** ppCurrent = NULL;

    // Get the current saved stack pointer (saved by the compiler on the function prefix).
    __asm { mov ppCurrent, ebp };

    // Don't limit if nLevels is not positive
    if (nLevels <= 0)
        nLevels = 1000000;

    // ebp points to the old call stack, where the first two items look like this:
    // ebp -> [0] Previous ebp
    //        [1] previous program counter
    CallStackItem* pResult = new CallStackItem;
    CallStackItem* pCurItem = pResult;
    int nCurLevel = 0;

    // We need to read two pointers from the stack
    int nRequiredMemorySize = sizeof(void*) * 2;
    while (nCurLevel < nLevels && ppCurrent && !IsBadReadPtr(ppCurrent, nRequiredMemorySize))
    {
        // Keep the previous program counter (where the function will return to)
        pCurItem->pc = ppCurrent[1];
        pCurItem->next = new CallStackItem;

        // Go the the previously kept ebp
        ppCurrent = (void**)*ppCurrent;
        pCurItem = pCurItem->next;
        ++nCurLevel;
    }

    return pResult;
}

void PrintCallStack(CallStackHandle hCallStack)
{
    CallStackItem* pCurItem = (CallStackItem*)hCallStack;
    printf("----- Call stack start -----\n");
    while (pCurItem)
    {
        printf("0x%08x\n", pCurItem->pc);
        pCurItem = pCurItem->next;
    }
    printf("----- Call stack end -----\n");
}

void ReleaseCallStack(CallStackHandle hCallStack)
{
    CallStackItem* pCurItem = (CallStackItem*)hCallStack;
    CallStackItem* pPrevItem;
    while (pCurItem)
    {
        pPrevItem = pCurItem;
        pCurItem = pCurItem->next;
        delete pPrevItem;
    }
}
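For completeness, a small usage sketch of the helpers above (x86 only, since the walker reads ebp directly):
void DumpHere()
{
    CallStackHandle hStack = CreateCurrentCallStack(16); // capture up to 16 frames
    PrintCallStack(hStack);                              // print raw return addresses
    ReleaseCallStack(hStack);                            // free the chain
}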
I use Jochen Kalmbach's StackWalker.
I sped it up this way:
Most of the time is lost looking for the PDB files in the default directories and on PDB servers.
I use only one PDB path and implemented a whitelist for the images I want resolved (no need for me to look for user32.pdb).
Sometimes I don't need to dive to the bottom, so I defined a maximum depth.
code changes:
BOOL StackWalker::LoadModules()
{
    ...
    // comment this line out and replace to your pdb path
    // BOOL bRet = this->m_sw->Init(szSymPath);
    BOOL bRet = this->m_sw->Init(<my pdb path>);
    ...
}

BOOL StackWalker::ShowCallstack(int iMaxDeep /* new parameter */ ... )
{
    ...
    // define a maximal deep
    // for (frameNum = 0; ; ++frameNum )
    for (frameNum = 0; frameNum < iMaxDeep; ++frameNum )
    {
        ...
    }
}
Check out http://msdn.microsoft.com/en-us/library/bb204633%28VS.85%29.aspx - this is "CaptureStackBackTrace", although it's actually called "RtlCaptureStackBackTrace".
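A minimal sketch of using it (the symbol-free capture is what makes it fast; on older Windows versions the total frame count is limited to 62):
#include <windows.h>
#include <cstdio>

// Grab up to 62 raw return addresses for the current thread and print them.
void DumpBackTrace()
{
    void* frames[62] = {};
    USHORT count = CaptureStackBackTrace(0 /* frames to skip */, 62, frames, NULL);
    for (USHORT i = 0; i < count; ++i)
        printf("%2u: %p\n", i, frames[i]);
}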