How to uniquely identify an instruction in LLVM Pass? - llvm

So I am trying to keep a count of how many times certain call instructions are executed, and I am struggling to identify the instructions uniquely. I couldn't find anything like an instruction ID in the documentation. I want to get the ID and pass it on to an external function that knows how to do the job.
So the question is how can I get a unique ID for those instructions (preferably as an integer)?

I take it you perform the counting at runtime, and in the pass you are just inserting code that performs that counting near the call instructions you are interested in. In this case the Instruction pointer should work just fine. The pointer does not change if you move an Instruction around; it only becomes invalid if you delete the Instruction.
To convert a pointer into an integer use reinterpret_cast<uintptr_t>(i); static_cast does not allow a pointer-to-integer conversion.
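A minimal sketch of that conversion, assuming a stand-in struct in place of llvm::Instruction so that it compiles without the LLVM headers. The names FakeInstruction and instructionId are hypothetical and only serve the illustration; the cast used is reinterpret_cast, which is what C++ requires for a pointer-to-integer conversion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::Instruction, used only so this sketch
// compiles without the LLVM headers.
struct FakeInstruction {
    int opcode;
};

// Turn an object's address into an integer ID (hypothetical helper).
// The ID is only meaningful while the object is alive.
inline std::uintptr_t instructionId(const FakeInstruction *inst) {
    assert(inst != nullptr && "a null pointer has no meaningful ID");
    return reinterpret_cast<std::uintptr_t>(inst);
}
```

Two distinct live objects always yield different IDs, and converting the integer back with reinterpret_cast gives the original pointer. The value is an address inside the compiler process, so it will probably differ from one compilation to the next.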

If you know the type of call instructions that are possible then you can just declare an enum for all possible type of call instructions and pass the enum value to the counting function whenever you come across that type of call instruction based on the parameter value.
If you don't know all the possible call instructions, then you can pass the name of the function that is being called by the call instruction to the counting function. In this case you would have to implement the counting function in such a way that it maintains a map of function names and the count for that function.
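A rough sketch of such a counting function, assuming the pass inserts a call to it before each tracked call instruction and passes the callee's name as a C string. The names countCall and callCounts are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical runtime-side table: callee name -> number of executed calls.
inline std::unordered_map<std::string, std::uint64_t> &callCounts() {
    static std::unordered_map<std::string, std::uint64_t> counts;
    return counts;
}

// Hypothetical counting function. extern "C" keeps the symbol name
// unmangled, which makes it easier to reference from inserted IR.
extern "C" void countCall(const char *calleeName) {
    assert(calleeName != nullptr);
    ++callCounts()[calleeName];  // operator[] creates a zero entry on first use
}
```

After countCall("foo") has run twice and countCall("bar") once, callCounts() holds 2 for "foo" and 1 for "bar".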
Since a call instruction is itself a value (Value*) for that particular call, I think all the Instruction* pointers that you get would be unique, one per call site. So the pointer value won't serve your purpose as an ID if you want to count per called function.

Related

LLVM - given a register, get where it was last used in the IR representation

I am trying to keep track of data flow in my source code. For that, I'm looking at instructions of type load and obtaining which register they're loading the value from with the use of
*(LI->getPointerOperand())
LI being the instruction of type LoadInst. Now I need to know where this register was last accessed, so that I can check the data flow from that instruction to this one. Any suggestions will be highly appreciated.
Initially, simplify the problem by excluding loops and functions with multiple exits, so that you have a function CFG as a single entry and single exit graph.
One (probably simplistic) way would be to first find all its users by doing something like:
llvm::Value *i = LI->getPointerOperand(); // the register for that LoadInst
auto users = i->users();
Then using the PostDominatorTree and the getLevel method of the DomTreeNodeBase (I think this was introduced with LLVM 5.0.0, if not available in your version you could use getChildren and perform a BFS traversal), you could filter through those with the highest level number.
I'm not sure what you want to do with loops, but if nothing special, the above should suffice. For dealing with multiple exits from functions you could make use of the mergereturn pass prior to any processing.
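For the fallback mentioned above (no getLevel, only getChildren), the level of each tree node can be computed with a BFS from the root. This is only a sketch on a toy tree: TreeNode and computeLevels are hypothetical stand-ins, not LLVM types, and the input is assumed to be a proper tree.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical stand-in for a dominator-tree node: each node only knows
// the indices of its children.
struct TreeNode {
    std::vector<std::size_t> children;
};

// Compute the depth ("level") of every node by BFS from the root.
// The root gets level 0, its children level 1, and so on.
inline std::vector<unsigned> computeLevels(const std::vector<TreeNode> &tree,
                                           std::size_t root) {
    assert(root < tree.size());
    std::vector<unsigned> level(tree.size(), 0);
    std::queue<std::size_t> worklist;
    worklist.push(root);
    while (!worklist.empty()) {
        std::size_t node = worklist.front();
        worklist.pop();
        for (std::size_t child : tree[node].children) {
            level[child] = level[node] + 1;
            worklist.push(child);
        }
    }
    return level;
}
```

With the levels in hand, filtering the users for the highest level is a matter of looking up the node of each user's basic block and keeping the maximum.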

Why does std::atomic_compare_exchange update the expected value?

Why do std::atomic_compare_exchange and all its siblings update the passed expected value?
I am wondering if there are any reasons besides the simplicity this gives in loops. For example, is there an intrinsic function which can do that in one operation to improve performance?
The processor has to load the current value, in order to do the "compare" part of the operation. When the comparison fails the caller needs to know the new value, to retry the compare-exchange (you almost always use it in a loop), so if it wasn't returned (e.g. by modifying the expected value that is passed by reference) then the caller would need to do another atomic load to get the new value. That's wasteful, because the processor has already loaded the value. You should only be messing about with low-level atomic operations when extreme performance is the only option, so in that case you do not want to perform two operations when one will do.
is there an intrinsic function which can do that in one operation to improve performance
That can do what, specifically? The instruction has to load the current value to do the comparison, so on a mismatch yielding the current value costs nothing and is pretty much guaranteed to be useful.
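A small sketch of the retry loop described above, using the member function compare_exchange_weak of std::atomic<int>. The function name fetchMultiply is hypothetical; the point is only that the loop never issues a second load, because a failed exchange has already refreshed expected.

```cpp
#include <atomic>
#include <cassert>

// Atomically multiply `value` by `factor` and return the previous value.
inline int fetchMultiply(std::atomic<int> &value, int factor) {
    int expected = value.load();
    // On failure, compare_exchange_weak writes the value it actually saw
    // into `expected`, so the loop retries without a separate load.
    while (!value.compare_exchange_weak(expected, expected * factor)) {
    }
    return expected;
}
```

If expected were passed by value instead, every failed iteration would need its own value.load() before retrying.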

Which costs more, computed goto/jump vs fastcall through function pointer?

I am in a dilemma: which would be the better-performing option for the main loop of a VM?
option 1 - force inlining of the instruction functions and use computed goto instead of a switch to jump to the label holding the (effectively inlined) code of the instruction... or...
option 2 - use a lookup array of function pointers, each pointing to a fastcall function, and the instruction determines the index.
Basically, what is better, a lookup table with jump addresses and in-line code or a lookup table with fastcall function addresses. Yes, I know, both are effectively just memory addresses and jumps back and forth, but I think fastcall may still cause some data to be pushed on the stack if out of register space, even if forced to use registers for the parameters.
Compiler is GCC.
I assume that by "virtual machine" you mean a simulated processor executing some sort of bytecode, similar to the "Java virtual machine", and not a whole simulated computer that allows installation of another OS (like in VirtualBox/VMware).
My suggestion is to let the compiler decide what performs best, and to write a big traditional "switch" on the current item of the bytecode stream. This will likely result in a jump table created by the compiler, so it is as fast (or slow) as your computed goto variant, but more portable.
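A minimal sketch of such a switch-based loop. The opcodes and the one-byte operand encoding are made up for the illustration, and the bytecode is assumed to be well formed (every opcode that takes an operand is followed by one).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for a tiny accumulator machine.
enum Op : std::uint8_t { OP_LOAD_IMM, OP_ADD_IMM, OP_JUMP, OP_HALT };

// Execute bytecode with a plain switch; the compiler is free to lower the
// switch to a jump table. Returns the accumulator when OP_HALT is reached,
// or -1 on an unknown opcode.
inline int run(const std::vector<std::uint8_t> &code) {
    int acc = 0;
    std::size_t pc = 0;
    for (;;) {
        assert(pc < code.size() && "ran off the end of the bytecode");
        switch (code[pc]) {
        case OP_LOAD_IMM: acc = code[pc + 1]; pc += 2; break;
        case OP_ADD_IMM:  acc += code[pc + 1]; pc += 2; break;
        case OP_JUMP:     pc = code[pc + 1]; break;  // only this op rewrites pc
        case OP_HALT:     return acc;
        default:          return -1;
        }
    }
}
```

All handlers live in one function, so the accumulator and program counter can stay in registers across instructions, and handlers such as OP_JUMP can rewrite pc directly instead of reporting it through a return value.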
Your variant 2 - lookup array of function pointers - is likely slower than inlined functions, as there is likely extra overhead with non-inlined functions, such as the handling of return values. After all, some of your VM-op functions (like "goto" or "set-register-to-immediate") have to modify the instruction pointer, others don't need to.
Generally, calls through function pointers (or jumps via a jump table) are slow on current CPUs, as branch prediction rarely predicts them correctly. So, if you think about optimizing your VM, try to find a set of instructions that requires as few code points as possible.

Interprocess Memory Editing - Finding changed addresses

I'm currently making one of those game trainers as a small project. I've already run into a problem: when you "go into a different level", the addresses for things such as fuel, cash and bullets change. This would also happen if, say, you were to restart the application.
How can I re-locate these addresses?
I feel like it's a fairly basic question, but it's one of those "it is or is not possible" questions to me. Should I just stop looking and forget the concept entirely? "Too hard?"
It's a bit hard to describe exactly how to do this, since it depends heavily on the program you're studying and on whether the author went out of their way to make your life difficult. Note that I've only done this once, but it worked reasonably well even though I only knew a little assembly.
What is probably happening is that the values are allocated on the heap using a call to malloc/new, and every time you change level they are cleaned up and re-allocated somewhere else. So the idea is to look at the assembly code of the program to find where the pointer returned by malloc is stored, and figure out a way to reliably read the content of the pointer and find the value you're looking for.
First thing you'll want is a debugger like OllyDbg and a basic knowledge of assembly. After that, start by setting a read and write breakpoint on the variable you want to examine. Since you said that you can't tell exactly where the variable is, you'll have to pause the process while it's running and search the program's memory for the value. Hopefully you'll end up with only a few results to sift through but be suspicious of anything that is on the stack since it might just be a copy for a function call or for local use.
Once the breakpoint is set just run the program until a break occurs. Now all you have to do is look at the code and examine how the variable is being accessed. If it's being passed as a parameter, go examine the call site of the function. If it's being accessed through a pointer, make a note of it and start examining the pointer. If it's being accessed as an offset of a pointer, that means it's part of a data structure so make a note of it and start examining the other variable. And so on.
Stay focused on your variable and just keep examining the code until you eventually find the root which can be one of two things:
A global variable that has a static address. This is the easiest scenario since you have a static address hardcoded straight into the code that you can use to reliably walk through the data structures.
A stack-allocated variable. This is trickier and I'm not entirely sure how to deal with this scenario reliably. It's possible that its address will have the same offset from the beginning of the stack most of the time, but it might not. You could also walk the stack to find the corresponding function and its parameters, but this is a bit tricky to get right.
Once you have an address all that's left to do is use ReadProcessMemory to locate your variable using the information you found. For example, if the address you have represents a pointer to a data structure where at offset 0x40 your fuel value is stored, then you'll have to read the value at the address, add 0x40 to it and do another read on the result.
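The two reads in that example can be sketched as follows. To keep the snippet self-contained it runs against a byte buffer that stands in for the target's memory; FakeMemory, readValue and readField are hypothetical names, and in a real trainer readValue would wrap ReadProcessMemory. The sketch also assumes a 32-bit target, so pointers are 4 bytes wide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the target process's memory.
using FakeMemory = std::vector<std::uint8_t>;

// Copy a T out of the fake memory at the given address.
template <typename T>
T readValue(const FakeMemory &mem, std::size_t address) {
    assert(address + sizeof(T) <= mem.size());
    T value;
    std::memcpy(&value, mem.data() + address, sizeof(T));
    return value;
}

// Follow one level of indirection: read the pointer stored at
// staticAddress, add fieldOffset, then read the field at that address.
inline std::int32_t readField(const FakeMemory &mem, std::size_t staticAddress,
                              std::size_t fieldOffset) {
    std::uint32_t base = readValue<std::uint32_t>(mem, staticAddress);
    return readValue<std::int32_t>(mem, base + fieldOffset);
}
```

Longer chains work the same way: each intermediate read yields the base address for the next offset.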
Note that the address is only valid as long as the executable doesn't change in any way. If it's recompiled or patched then you have to start over. I believe you'll also have to be careful about Windows' ASLR which might change the address around every time you start the program.
Comment box was too small to fit this so I'll put it here.
If it's esp plus a constant then I believe that this is a parameter and not a local variable (do confirm by checking the layout of the calling convention). If that's the case, then you should step the program until it returns to its caller, figure out how the parameter is being set (look for push instructions before the call instruction) and continue exploring from there. When I did this I had to unwind the stack once or twice before I found the global pointer to the data structure.
Also the esi register is not related to the stack (I had to look it up) so I'd check how it's being set. It could be that it contains the address of the data structure and the constant is the offset to the variable. If you figure out how the register is set you'll be that much closer to the pointer.

Lua garbage collection and C userdata

In my game engine I expose my Vector and Color objects to Lua, using userdata.
Now, for every Vector and Color created from within Lua scripts, even a local one, Lua's memory usage goes up a bit, and it doesn't fall until the garbage collector runs.
The garbage collector causes a small lagspike in my game.
Shouldn't the Vector and Color objects be immediately deleted if they are only used as arguments? For example like: myObject:SetPosition( Vector( 123,456 ) )
They aren't right now - Lua's memory usage rises to 1.5 MB within a second, then the lag spike occurs and it goes back to about 50 KB.
How can I solve this problem, is it even solvable?
You can run a lua_setgcthreshold(L,0) to force an immediate garbage collection after you exit the function.
Edit: for 5.1 I'm seeing the following:
int lua_gc (lua_State *L, int what, int data);
Controls the garbage collector.
This function performs several tasks, according to the value of the parameter what:
* LUA_GCSTOP: stops the garbage collector.
* LUA_GCRESTART: restarts the garbage collector.
* LUA_GCCOLLECT: performs a full garbage-collection cycle.
* LUA_GCCOUNT: returns the current amount of memory (in Kbytes) in use by Lua.
* LUA_GCCOUNTB: returns the remainder of dividing the current amount of bytes of memory in use by Lua by 1024.
* LUA_GCSTEP: performs an incremental step of garbage collection. The step "size" is controlled by data (larger values mean more steps) in a non-specified way. If you want to control the step size you must experimentally tune the value of data. The function returns 1 if the step finished a garbage-collection cycle.
* LUA_GCSETPAUSE: sets data as the new value for the pause of the collector (see §2.10). The function returns the previous value of the pause.
* LUA_GCSETSTEPMUL: sets data as the new value for the step multiplier of the collector (see §2.10). The function returns the previous value of the step multiplier.
In Lua, the only way an object like userdata can be deleted is by the garbage collector. You can call the garbage collector directly, like B Mitch wrote (use lua_gc(L, LUA_GCSTEP, ...)), but there is no guarantee that exactly your temporary object will be freed.
The best way to solve this is to avoid the creation of temporary objects. If you need to pass fixed parameters to methods like SetPosition, try to modify the API so that it also accepts numeric arguments, avoiding the creation of a temporary object, like so:
myObject:SetPosition(123, 456)
Lua Gems has a nice piece about optimization for Lua programs.
Remember, Lua doesn't know until runtime whether or not you saved those objects: you could have put them in a table in the registry, for example. You shouldn't even notice the impact of collecting 1.5 MB, so there's another problem here.
Also, making a new object for that is really wasteful. Remember that in Lua every object has to be dynamically allocated, so you're calling malloc to... make a Vector object to hold two numbers? Write your function to take a pair of numeric arguments as an overload.