How could I insert/remove an edge in LLVM?

Can I insert a new edge by changing its destination, and remove another edge given its source and destination? In other words, can I replace the destination basic block of an edge with another one, in order to modify the CFG?
I tried the getEdge() function in ProfileInfo, but it didn't work:
// to replace the basic block
Bb->getTerminator()->replaceUsesOfWith((*SI), (*rit));
// trying to set the new basic block as a new destination
xx = ProfileInfo::getEdge(Bb,(*rit));

A basic block has a single terminator instruction. However, that terminator can be any of several quite different instructions (br, switch, invoke, and so on), and some of them have multiple outgoing edges. So it's not quite as simple as you seem to assume.
What you can do is look at the terminator of a block and modify the instruction to branch to a different destination. This depends on the instruction, and (of course) on your specific needs.
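As a minimal sketch, assuming a reasonably recent LLVM where the generic successor accessors live on Instruction (on older versions they are on TerminatorInst), redirecting every edge from Bb to OldDest so that it points at NewDest could look like this; Bb, OldDest and NewDest are placeholder names for blocks you already hold:
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"

void retargetEdges(llvm::BasicBlock *Bb,
                   llvm::BasicBlock *OldDest,
                   llvm::BasicBlock *NewDest) {
  llvm::Instruction *TI = Bb->getTerminator();
  // Walk all outgoing edges of the terminator (br, switch, ...).
  for (unsigned i = 0, e = TI->getNumSuccessors(); i != e; ++i)
    if (TI->getSuccessor(i) == OldDest)
      TI->setSuccessor(i, NewDest); // rewrite this CFG edge
}
If NewDest contains PHI nodes, remember to update their incoming blocks as well.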

Related

eraseFromParent() vs removeFromParent() in llvm

I understand that the difference between eraseFromParent() and removeFromParent() is that the former unlinks the instruction from the BasicBlock and deletes it, while the latter just unlinks it without deleting it.
When should I use one over the other? I'm looking for some example scenarios.
I'll use instructions in basic blocks as the example. The other kinds of lists are similar.
Delete: If you want to drop an instruction completely, eraseFromParent() does that with one line of code.
Delete slightly later: If you want to drop an instruction, but use it to create something else, then it may make sense to remove it from its basic block, compute the replacement based on the instruction, and only then delete the instruction. For example, if you have a pass that replaces some computations with reads from global variables this approach can make sense. It depends on how you compute the replacement.
Don't delete: If you want to move an instruction elsewhere, then it may make sense to remove it from its basic block, do whatever else is necessary, and then insert it into its new home. For example, if you need to consider and perhaps move each instruction in a function exactly once, then it's simple to write one loop that removes the instructions, followed by one that inserts them into their new home. That way you don't risk matching newly moved instructions, as in the sketch below.
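A minimal sketch of the "don't delete" case, assuming a hypothetical predicate shouldHoist() that decides which instructions to move into the entry block:
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include <vector>

bool shouldHoist(llvm::Instruction *I); // hypothetical, not an LLVM API

void hoistMatches(llvm::Function &F) {
  // Collect first, then move, so the scan never sees already-moved
  // instructions.
  std::vector<llvm::Instruction *> ToMove;
  for (llvm::BasicBlock &BB : F)
    for (llvm::Instruction &I : BB)
      if (shouldHoist(&I))
        ToMove.push_back(&I);
  for (llvm::Instruction *I : ToMove) {
    I->removeFromParent(); // unlink, but keep the instruction alive
    I->insertBefore(F.getEntryBlock().getTerminator());
  }
}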

LLVM - given a register, get where it was last used in the IR representation

I am trying to keep track of data flow in my source code. For that, I'm looking at instructions of type load and obtaining the register they're loading the value from, using
*(LI->getPointerOperand())
LI being the instruction of type LoadInst. Now I need to know where this register was last accessed, so that I can track the data flow from that instruction to this one. Any suggestions will be highly appreciated.
Initially, simplify the problem by excluding loops and functions with multiple exits, so that the function's CFG is a single-entry, single-exit graph.
One (probably simplistic) way would be to first find all its users by doing something like:
llvm::Value *v = LI->getPointerOperand(); // the register for that LoadInst
auto users = v->users();
Then, using the PostDominatorTree and the getLevel method of DomTreeNodeBase (I think this was introduced with LLVM 5.0.0; if it's not available in your version, you could use getChildren and perform a BFS traversal), you could filter for the users with the highest level number.
I'm not sure what you want to do with loops, but if nothing special, the above should suffice. For dealing with multiple exits from functions you could make use of the mergereturn pass prior to any processing.
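A minimal sketch under the answer's assumptions (single entry/exit, no loops): among the users of the load's pointer operand, pick the one whose block sits deepest in the post-dominator tree. PDT is assumed to come from a post-dominator analysis you have already run.
#include "llvm/Analysis/PostDominators.h"
#include "llvm/IR/Instructions.h"

llvm::Instruction *latestUser(llvm::LoadInst *LI,
                              llvm::PostDominatorTree &PDT) {
  llvm::Value *Ptr = LI->getPointerOperand();
  llvm::Instruction *Best = nullptr;
  unsigned BestLevel = 0;
  for (llvm::User *U : Ptr->users()) {
    auto *I = llvm::dyn_cast<llvm::Instruction>(U);
    if (!I || I == LI)
      continue;
    unsigned Level = PDT.getNode(I->getParent())->getLevel();
    if (!Best || Level > BestLevel) {
      Best = I;
      BestLevel = Level;
    }
  }
  return Best; // nullptr if the pointer has no other instruction users
}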

FFTW3 library and plan reuse

I'm about to use the FFTW3 library for a specific task.
I have a heavy packet stream with variable frame sizes, which I currently process like this:
while (thereIsStillData) {
    copyDataToInputArray();
    createFFTWPlan();
    performExecution();
    destroyPlan();
}
Since creating plans is rather expensive, I want to modify my code to something like this:
while (thereIsStillData) {
    if (inputArraySizeDiffers()) destroyOldAndCreateNewPlan();
    copyDataToInputArray(); // e.g. `memcpy` or `std::copy`
    performExecution();
}
Can I do this? In other words, does a plan contain information derived from the data itself, such that a plan created for one array of size N will give incorrect results when executed on a different array of the same size N?
The fftw_execute() function does not modify the plan presented to it, and can be called multiple times with the same plan. Note, however, that the plan contains pointers to the input and output arrays, so if copyDataToInputArray() involves creating a different input (or output) array then you cannot afterwards use the old plan in fftw_execute() to transform the new data.
FFTW does, however, have a set of "New-array Execute Functions" that could help here, supposing that the new arrays satisfy some additional similarity criteria with respect to the old (see linked docs for details).
The docs do recommend:
If you are tempted to use the new-array execute interface because you want to transform a known bunch of arrays of the same size, you should probably go use the advanced interface instead
but that's talking about transforming multiple arrays that are all in memory simultaneously, and arranged in a regular manner.
Note, too, that if your variable frame size is not too variable -- that is, if it is always one of a relatively small number of choices -- then you could consider keeping a separate plan in memory for each frame size instead of recomputing a plan every time one frame's size differs from the previous one's.
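A minimal sketch of that per-size plan cache for complex 1-D transforms, under the assumption that each entry owns its own in/out buffers (a plan created with fftw_plan_dft_1d is tied to the arrays it was planned with); the PlanEntry and getPlan names are illustrative, not part of FFTW:
#include <fftw3.h>
#include <map>

struct PlanEntry {
    fftw_plan plan;
    fftw_complex *in;
    fftw_complex *out;
};

std::map<int, PlanEntry> cache; // one entry per distinct frame size

PlanEntry &getPlan(int n) {
    auto it = cache.find(n);
    if (it == cache.end()) {
        PlanEntry e;
        e.in  = fftw_alloc_complex(n);
        e.out = fftw_alloc_complex(n);
        // Expensive, but paid only once per distinct frame size.
        e.plan = fftw_plan_dft_1d(n, e.in, e.out,
                                  FFTW_FORWARD, FFTW_MEASURE);
        it = cache.emplace(n, e).first;
    }
    return it->second;
}
In the loop you then copy each frame into the cached entry's in buffer, call fftw_execute(entry.plan), and read the result from out. If you would rather keep your own buffers, fftw_execute_dft(plan, in, out) is the new-array execute variant, subject to the similarity criteria mentioned above.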

Efficient lookup of a buffer with stack of data modifications applied

I am trying to write a C++11 library, as part of a wider project, that implements a stack of changes (modification, insertion and deletion) on top of an original buffer. The aim is then to be able to quickly look "through" the changes and get the modified data out.
My current approach is:
Maintain an ordered list of changes, ordered by offset of the start of the change
Also maintain a stack of the same changes, so they can be rolled back in order
New changes are pushed onto the stack and inserted into the list at the right place
The changes-by-offset list may be modified if the change interacts with others
For example, a modification of bytes 5-10 invalidates the start of an earlier modification from 8-12
Also, insertion or deletion changes will change the apparent offset of data occurring after them (deleting bytes 5-10 means that what used to be byte 20 is now found at 15)
To find the modified data, you can look through the list for the change that applies (and the offset within that change that applies - another change might have invalidated some of it), or find the right offset in the original data if no change touched that offset
The aim here is to make the lookup fast - adding a change might take some effort to mess with the list, but lookups, which will greatly outnumber the modifications, should be pretty straightforward in an ordered list.
Also you don't need to continuously copy data - each change's data is kept with it, and the original data is untouched
Undo is then implemented by popping the last change off the stack and rolling back any changes its addition made to the list.
This seems to be quite a difficult task - there are a lot of things to take care of and I am quickly piling up complex code!
I feel sure that this must be a problem that has been dealt with in other software, but looking around various hex editors and so on hasn't pointed me to a useful implementation. Is there a name for this problem ("data undo stack" and friends haven't got me very far!), or a library that can be used, even as a reference, for this kind of thing?
I believe the most common approach (one I have used successfully in the past) is to simply store the original state and then put each change operation (what's being done + arguments) on the undo stack. Then, to get to a particular prior state you start from the original and apply all changes except the ones you want undone.
This is a lot easier to implement than trying to identify what parts of the data changed, and it works well unless the operations themselves are very time-consuming (and therefore slow to "replay" onto the original state).
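A minimal sketch of that replay scheme, with an assumed Op record for the three change kinds; undoing the last change is just replaying one operation fewer:
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Op {
    enum Kind { Modify, Insert, Erase } kind;
    std::size_t offset;
    std::vector<uint8_t> bytes;   // payload for Modify/Insert
    std::size_t length = 0;       // length for Erase
};

std::vector<uint8_t> replay(const std::vector<uint8_t> &original,
                            const std::vector<Op> &ops,
                            std::size_t upTo /* number of ops to apply */) {
    std::vector<uint8_t> buf = original; // original state stays untouched
    for (std::size_t i = 0; i < upTo && i < ops.size(); ++i) {
        const Op &op = ops[i];
        switch (op.kind) {
        case Op::Modify:
            std::copy(op.bytes.begin(), op.bytes.end(),
                      buf.begin() + op.offset);
            break;
        case Op::Insert:
            buf.insert(buf.begin() + op.offset,
                       op.bytes.begin(), op.bytes.end());
            break;
        case Op::Erase:
            buf.erase(buf.begin() + op.offset,
                      buf.begin() + op.offset + op.length);
            break;
        }
    }
    return buf; // undo of the last change: replay with upTo = ops.size() - 1
}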
I would look at persistent data structures, such as https://en.wikipedia.org/wiki/Persistent_data_structure and http://www.toves.org/books/persist/#s2 - or do a web search on terms from these. I think you could do this with a persistent tree whose leaves carry short strings.

Remove bytes in the middle of a file without moving the end?

For example, if I have lots of data entries stored in a file, each with a different size, and 1000 entries make the file around 100 MB large, and I then want to remove a 50 KB entry from the middle of the file, how can I remove those 50 KB of bytes without moving all the trailing bytes up to fill the gap?
I am using winapi functions such as these for the file management:
CreateFile, WriteFile, ReadFile and SetFilePointerEx
If you really want to do that, keep a flag in each entry. When you want to remove an entry from your file, simply invalidate that flag (logical removal) without deleting the entry physically. The next time you add an entry, go through the file, look for the first invalidated entry, and overwrite it; if all entries are valid, append to the end. This takes O(1) time to remove an entry and O(n) to add a new one, assuming that reading/writing a single entry from/to disk is the basic operation.
You can even optimize this further. At the beginning of the file, store a bitmap (1 for invalidated). E.g., 0001000... means that the 4th entry in your file is invalidated. When you add an entry, search for the first 1 in the bitmap and use random file I/O (in contrast to sequential file I/O) to move the file pointer directly to the entry to overwrite. Adding this way also takes only O(1) time.
Oh, I notice your comment. If you want to remove entries physically and still be efficient, a simple way is to swap the entry to remove with the very last one in your file and then truncate the last one away, assuming your entries are not sorted. The time is also good: O(1) for both adding and removing.
Edit: Just as Joe mentioned, this requires that all of your entries have the same size. You can implement a variant with variable-length entries, but that will be more complicated than what is discussed here.
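A minimal sketch of that swap-with-last idea for fixed-size records, using the WinAPI calls from the question; hFile (an open handle from CreateFile), recSize, index and count are assumed inputs:
#include <windows.h>
#include <vector>

bool removeRecord(HANDLE hFile, DWORD recSize, DWORD index, DWORD count) {
    std::vector<char> last(recSize);
    LARGE_INTEGER pos;
    DWORD n;
    // Read the last record.
    pos.QuadPart = (LONGLONG)(count - 1) * recSize;
    SetFilePointerEx(hFile, pos, nullptr, FILE_BEGIN);
    if (!ReadFile(hFile, last.data(), recSize, &n, nullptr)) return false;
    // Overwrite the record being removed with it.
    pos.QuadPart = (LONGLONG)index * recSize;
    SetFilePointerEx(hFile, pos, nullptr, FILE_BEGIN);
    if (!WriteFile(hFile, last.data(), recSize, &n, nullptr)) return false;
    // Truncate the file by one record.
    pos.QuadPart = (LONGLONG)(count - 1) * recSize;
    SetFilePointerEx(hFile, pos, nullptr, FILE_BEGIN);
    return SetEndOfFile(hFile) != FALSE;
}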
Let A = start of file, B = start of block to remove, C = end of block to remove
CreateFile with flag FILE_FLAG_RANDOM_ACCESS
SetFilePointerEx to position C, read from there to EOF into a buffer (this may be a large buffer given your file size; be careful with gigantic records, because any file I/O operation now has to allocate virtual memory of the record size to do even a simple move).
Copy buffer to position B in file
The file pointer should now be at position B + (size of the moved tail). Call SetEndOfFile to truncate the file at that position, then close it.
Note that this could be done far more easily with the memmove function. However, that requires you to map the entire file into memory, make the move, and write it back out. This is great for small files, but for files larger than 50-100 MB I would caution you about having enough contiguous virtual address space available.
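A sketch of those steps under the stated caveat (the tail after the removed block must fit in memory); B and C are the byte offsets defined above:
#include <windows.h>
#include <vector>

bool removeBlock(HANDLE hFile, LONGLONG B, LONGLONG C) {
    LARGE_INTEGER size, pos;
    if (!GetFileSizeEx(hFile, &size)) return false;
    // Read everything after the block [B, C) into a buffer.
    std::vector<char> tail((size_t)(size.QuadPart - C));
    pos.QuadPart = C;
    SetFilePointerEx(hFile, pos, nullptr, FILE_BEGIN);
    DWORD n;
    if (!tail.empty() &&
        !ReadFile(hFile, tail.data(), (DWORD)tail.size(), &n, nullptr))
        return false;
    // Write the tail back starting at B.
    pos.QuadPart = B;
    SetFilePointerEx(hFile, pos, nullptr, FILE_BEGIN);
    if (!tail.empty() &&
        !WriteFile(hFile, tail.data(), (DWORD)tail.size(), &n, nullptr))
        return false;
    // The file pointer is now at B + tail size; truncate here.
    return SetEndOfFile(hFile) != FALSE;
}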
You can simply keep flagging the unused space and, after some time, when the internal fragmentation exceeds a certain ratio, run a routine that compacts the file. With this scheme removals are fast, but some periodic reorganization is needed. If you have a separate file-handling scheme, you can divide the file into chunks, keep track of the free chunks, mark a chunk as unused when deleting, and reuse it on a later insertion. This scheme depends on the type of records in your file: fixed or variable length.