LLVM - given a register, get where it was last used in the IR representation

I am trying to keep track of data flow in my source code. For that, I'm looking at instructions of type load and obtaining which register they're loading the value from with the use of
*(LI->getPointerOperand())
where LI is the instruction of type LoadInst. Now I need to know where this register was last accessed, so that I can trace the data flow from that instruction to this one. Any suggestions will be highly appreciated.

Initially, simplify the problem by excluding loops and functions with multiple exits, so that you have a function CFG as a single entry and single exit graph.
One (probably simplistic) way would be to first find all its users by doing something like:
llvm::Value *v = LI->getPointerOperand(); // the register that the LoadInst reads from
auto users = v->users();
Then, using the PostDominatorTree and the getLevel method of DomTreeNodeBase on each user's parent block (I think this was introduced with LLVM 5.0.0; if it is not available in your version, you could use getChildren and perform a BFS traversal), you could keep only the users with the highest level number.
I'm not sure what you want to do with loops, but if nothing special, the above should suffice. For dealing with multiple exits from functions you could make use of the mergereturn pass prior to any processing.
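The filtering step can be illustrated without LLVM. The following is a minimal sketch in Python, assuming a toy post-dominator tree given as a child map; `node_levels`, `deepest_users`, and the block labels are hypothetical names, not LLVM API. Each user is represented by the label of the block that contains it. Levels are computed with a BFS, as suggested for versions without getLevel, and only the users with the highest level are kept.

```python
from collections import deque

def node_levels(children, root):
    """BFS from the root; a node's level is its depth in the tree."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            levels[child] = levels[node] + 1
            queue.append(child)
    return levels

def deepest_users(users, levels):
    """Keep only the users whose tree node has the highest level."""
    top = max(levels[u] for u in users)
    return [u for u in users if levels[u] == top]

# Toy tree (assumption): 'exit' is the root of the post-dominator tree.
children = {"exit": ["merge"], "merge": ["then", "else", "entry"]}
levels = node_levels(children, "exit")
selected = deepest_users(["merge", "then"], levels)
```

In an actual pass the same logic would run over the nodes of the PostDominatorTree instead of a dictionary.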

Related

How can I use Intel PIN to catch all loads to an array?

I'm profiling an application I have written using PIN. The source code of the application uses an array - I want PIN to catch every load instruction made to the array.
Currently, I have annotated the source code of the application I am trying to profile. Every time I read from the array, I first call a function startRegionOfInterest(). Once I finish reading from the array I call another function endRegionOfInterest(). I can use PIN to easily catch calls to these two functions - whenever a load exists between these two calls I assume it's a load to the array I'm interested in.
However, this is pretty coarse grained, and so I end up classifying a lot of loads that are NOT to the array of interest as reads to the array.
Is there an easier way for me to more precisely catch all loads made to the array I'm interested in?
In your startRegionOfInterest method, you can use some kind of indicator sequence to pass the array address to your PIN tool. E.g., store a magic constant, then store the array address, something like:
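void *volatile sink; // the pointer itself is volatile, so neither store is optimized away
void startRegionOfInterest(void *array) {
    sink = (void *)0x48829d2f384be; // magic constant marking the start of the sequence
    sink = array;                   // the next store carries the array address
}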
In your PIN tool, you look for a store of the magic constant (within the startRegionOfInterest call for extra specificity, if you want), and then you know the next store is the address of the array. You can communicate the length similarly.
If you implement the sequence with inline asm instead, you remove the variability associated with compiler and optimizer behavior, although I think the volatile approach should work in practice (you might have to skip some intervening non-store instructions).
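The detection logic on the tool side is a small state machine. Here is a minimal sketch in Python, assuming the tool sees the stored values as a simple sequence; the function name `find_array_addresses` and the sample trace are hypothetical, and a real PIN tool would do the same bookkeeping inside its own instrumentation callbacks.

```python
MAGIC = 0x48829d2f384be  # same constant as in startRegionOfInterest

def find_array_addresses(stored_values):
    """Return every value stored immediately after the magic constant."""
    addresses = []
    expect_address = False
    for value in stored_values:
        if expect_address:
            addresses.append(value)  # this store carries the array address
            expect_address = False
        elif value == MAGIC:
            expect_address = True    # the next store is the one we want
    return addresses

# Made-up trace of stored values, for illustration only.
trace = [0x10, MAGIC, 0x7FFD1000, 0x20]
found = find_array_addresses(trace)
```

With the address (and, similarly, the length) known, the tool can classify a load as an array access only when its effective address falls inside that range.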

How do I identify a loop in LLVM bitcode?

I have an LLVM bitcode file and I'm running a loop pass on it. Every time I arrive at a loop ("runOnLoop"), I extract several pieces of information about the loop body (e.g. the number of instructions in the body) and print it. However, I need a way to associate this information with a specific loop - in other words, I need to print the "name" of the loop the information was extracted from.
I'm not sure what you mean by "name", but one way is to print debugging information (line number/column) associated with the loop latch block or something similar.
Another way is to use metadata to uniquely identify each loop and associate the extracted information with that identifier.
I had a similar need too, so I created a pass for that. Please note that this approach is sensitive to compiler optimizations and does not preserve the ID when they are applied (e.g. if a function that contains a loop is inlined). For best results (closer to the source), use it on IR that has been compiled with -O0. Further optimizations can be applied afterwards, when you're done with your information gathering.
However, for something simple, I'd go with the first approach.

How to uniquely identify an instruction in LLVM Pass?

So I am trying to keep a count of how many times certain call instructions are called and I am struggling with identifying the instructions uniquely. I couldn't find something as an instruction ID in the documentation. I want to get the ID and pass it on to an external function that knows how to do the job.
So the question is how can I get a unique ID for those instructions (preferably as an integer)?
I take it you perform the counting at runtime, and in the pass you are just inserting code that performs that counting near the call instructions you are interested in. In this case the Instruction pointer should work just fine. The pointer does not change if you move an Instruction around; it only becomes invalid if you delete the Instruction.
To convert a pointer into an integer, use reinterpret_cast<uintptr_t>(i).
If you know which types of call instructions are possible, then you can just declare an enum covering all of them and pass the matching enum value as a parameter to the counting function whenever you come across that type of call instruction.
If you don't know all the possible call instructions, then you can pass the name of the function that is being called by the call instruction to the counting function. In this case you would have to implement the counting function in such a way that it maintains a map of function names and the count for that function.
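The bookkeeping for that counting function is small. Below is a minimal sketch in Python, assuming the callee name arrives as a string; `count_call` and `call_counts` are hypothetical names, and in practice the function would be written in a compiled language and linked with the instrumented program.

```python
from collections import Counter

# Map from callee name to the number of times a call to it was executed.
call_counts = Counter()

def count_call(callee_name):
    """Record one execution of a call to callee_name."""
    call_counts[callee_name] += 1

# Simulated sequence of instrumented calls, for illustration only.
for name in ["malloc", "free", "malloc"]:
    count_call(name)
```

Keying on the name aggregates all call sites of the same function, whereas keying on the instruction pointer keeps one counter per call site.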
Since a call instruction returns a value (Value*) for that particular call, I think all the Instruction* pointers that you get would be unique. So it won't serve your purpose if you use the pointer value as ID.

Use value from Input Port in Parameter of block - Simulink

I have a Simulink model that I plan on converting to C code and using elsewhere. I have defined 'input ports' in order to set variables in the Simulink model.
I am trying to find a way to use the input variables as part of a State-Space block, but I have tried everything I can think of and am not sure how else to go about it.
As mentioned, this will be converted to C/C++ code, so there is no option to use MATLAB in any way.
Say I use matrix A in the State-Space block parameter. Matrix A is defined like so: A = [Input1 0; Input2 0; 0 Input3]
I want to be able to change the values of the inputs through the code by setting the values of Input1, Input2, Input3, etc.
There is a very clear distinction in Simulink between Parameters and Signals. A parameter is something entered into a dialog, while a signal is something fed into or coming out of a block.
The matrices in the State-Space block are defined as parameters, and hence you will never be able to feed your signals into them.
You have two options.
1. Don't use the State-Space block. Rather, develop the state-space model yourself using more fundamental blocks (i.e. integrators, sums and product blocks). This is feasible for small models, but not really recommended.
2. Note that the parameters of a block are typically tunable. When you generate code, one of the files will be model_name_data.c, and this will contain a parameter structure allowing you to change the parameters.
Note that in either case, merely from a model design perspective, it'll be up to you to ensure that the changes to the model make sense (for instance don't make any loop, etc. go unstable).
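To make the first option concrete, here is a minimal sketch in Python rather than Simulink blocks or generated C. It assumes a hypothetical two-state system with A = [a11 0; a21 a22] and B = [1; 0] (not the matrix from the question), and uses a forward-Euler step as a stand-in for the integrator blocks. The function name `step` and all values are illustrative.

```python
def step(x, u, a11, a21, a22, dt):
    """One forward-Euler step of x' = A x + B u.

    Assumed for illustration: A = [[a11, 0], [a21, a22]], B = [1, 0].
    """
    dx0 = a11 * x[0] + u                # product and sum blocks for the first state
    dx1 = a21 * x[0] + a22 * x[1]       # product and sum blocks for the second state
    return [x[0] + dt * dx0, x[1] + dt * dx1]  # integrator blocks

x = [1.0, 0.0]
x = step(x, u=0.0, a11=-1.0, a21=1.0, a22=-2.0, dt=0.1)
```

The entries of A arrive as ordinary arguments on every step, which is what feeding them in as signals amounts to.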
You cannot tune the parameter after generating code if it has been inlined as a constant value, which is typically done because it results in the fastest code. To have full control over the behaviour, you have to use tunable parameters. There is a table with different code versions; depending on what you want, you can choose the right type of parameter.
Another lazy way to achieve this in many cases is using base workspace variables, which is very simple to set up and works fine in most cases.

Is it possible to update a list of list in real time as a program is running?

I have code that runs 24/7, and I am wondering if there is any method that would let me change variables in real time without causing an error. I have been using raw_input(), but this 'stops' the program since it runs sequentially.
My idea is to use a while true loop:
while True:
...
...
and for the first few loops, it'll use the default catch-all values that I have pre-programmed into the system. As it's running, I'd like to make changes to some constant terms (which act as controls) in 'real time'. So, in the next loop and beyond, it'll use the new values rather than the pre-programmed ones.
Some of your code or details of what you are trying to do would help.
But one way to do it is to have two processes, one process that reads from standard in with raw_input(), we can call it p1; and one that handles the data structure, in this case the list, we call it p2.
The two processes could communicate by message passing, using sockets or whatever you want.
You then need to avoid the race condition where new data has been read in p1 but not yet updated in p2, so that p2 carries on using out-of-date data. One way to do this is with locks.
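As a minimal sketch of the locking idea, the following uses two threads in one process rather than the two processes with sockets described above, which keeps the example short. It is written for Python 3, and the names `Settings`, `read_updates`, `gain`, and `limit` are hypothetical. The reader parses lines of the form key=value; here they come from a list so the example runs without a terminal, but iterating over sys.stdin would work the same way.

```python
import threading

class Settings:
    """Shared control values guarded by a lock."""

    def __init__(self, **defaults):
        self._lock = threading.Lock()
        self._values = dict(defaults)

    def update(self, key, value):
        with self._lock:
            self._values[key] = value

    def snapshot(self):
        # Copy under the lock so the caller never sees a half-applied update.
        with self._lock:
            return dict(self._values)

def read_updates(settings, lines):
    """Parse 'key=value' lines and apply them to the shared settings."""
    for line in lines:
        key, _, value = line.strip().partition("=")
        if key and value:
            settings.update(key, float(value))

settings = Settings(gain=1.0, limit=10.0)
reader = threading.Thread(
    target=read_updates, args=(settings, ["gain=2.5\n"]), daemon=True
)
reader.start()
reader.join()  # only for this demo; a real reader would keep running
current = settings.snapshot()
```

In the long-running program, the `while True:` loop would call `settings.snapshot()` at the top of each iteration and use that copy for the rest of the iteration, so an update takes effect from the next loop onwards.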