LLVM insert opcode before instruction

I want to insert a specific opcode before a BasicBlock's terminator, in my case before a ReturnInst.
Is it possible?
Example:
TerminatorInst* terminator = BasicBlock->getTerminator();
if (isa<ReturnInst>(terminator))
{
//Insert opcode here.
}

Problem solved by using llvm::InlineAsm
llvm::InlineAsm *AsmCode = llvm::InlineAsm::get(Asm, nopInstruction, "", true, false, llvm::InlineAsm::AD_Intel);
where Asm is an llvm::FunctionType* and nopInstruction is an llvm::StringRef (built from a char*).

By "opcode" I guess you mean "instruction".
All instructions have a constructor which receives another instruction as its last parameter; that constructor creates the new instruction and then inserts it right before the instruction that was given as the last argument, precisely what you're looking for.
For more information about this, see the "creating and inserting new instruction" section of the user guide.

Related

LLVM Metadata reference instruction

I want to create an MDNode that references another instruction.
MDNode *mdNode = MDNode::get(ctx, llvm::LocalAsMetadata::get(decider)); // decider is an instruction
phi->setMetadata("carry", mdNode); // phi is an instruction
Unfortunately the verifier fails with "Invalid operand for global metadata!" I'm starting to think that this is not possible with the current Metadata API (it seems it might have been handled in the past). Any thoughts?

Are MachineBasicBlocks supposed to implicitly fall through to their successors?

I'm debugging an LLVM target backend, and I am chasing a problem where a certain basic block ends up jumping to "nothing", i.e. just after the end of the function, when compiled with optimizations turned on.
One thing I noticed is that after instruction selection, the machine basic block has a successor but no instruction to actually jump there:
BB#1: derived from LLVM BB %switch.lookup
Predecessors according to CFG: BB#0
%vreg5<def> = SEXT %vreg2, %SREG<imp-def,dead>; DLDREGS:%vreg5 GPR8:%vreg2
%vreg6<def,tied1> = ANDIWRdK %vreg5<tied0>, -2, %SREG<imp-def,dead>; DLDREGS:%vreg6,%vreg5
%vreg7<def> = LDIWRdK 4; DLDREGS:%vreg7
%vreg8<def> = LDIRdK 0; LD8:%vreg8
%vreg9<def> = LDIRdK 1; LD8:%vreg9
CPWRdRr %vreg6<kill>, %vreg7<kill>, %SREG<imp-def>; DLDREGS:%vreg6,%vreg7
%vreg0<def> = Select8 %vreg9<kill>, %vreg8<kill>, 1, %SREG<imp-use>; GPR8:%vreg0 LD8:%vreg9,%vreg8
Successors according to CFG: BB#2(?%)
I see similar ISel results from the x86 LLVM backend and the end result doesn't have a jump-to-nothingness, so I assume this, on its own, is not a problem:
BB#1: derived from LLVM BB %switch.lookup
Predecessors according to CFG: BB#0
%vreg7<def> = MOVSX32rr8 %vreg3; GR32:%vreg7 GR8:%vreg3
%vreg8<def,tied1> = AND32ri %vreg7<tied0>, 65534, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg7
%vreg9<def,tied1> = SUB32ri8 %vreg8<tied0>, 4, %EFLAGS<imp-def>; GR32:%vreg9,%vreg8
%vreg0<def> = SETNEr %EFLAGS<imp-use>; GR8:%vreg0
Successors according to CFG: BB#2(?%)
So my question is: what is the mechanism by which these CFG-specified successors are supposed to be turned into real jumps? Does the x86 backend implement something special for this to work that the backend I'm debugging doesn't?
Should I change my ISelLowering class to lower Select8 into something that ends with an explicit jump, or is that unnecessary (maybe potentially even detrimental for some optimization to kick in) and there's some other magic that I need to do so that these implicit successors are correctly lowered?

It is perfectly valid for a MachineBasicBlock to fall through to the next block:
That is valid. Passes that want to reorder basic blocks should only do so if the AnalyzeBranch and related target hooks (Insert/Remove) allow it.

winDBG .call equivalent in visual studio

For a homemade debugging tool built as a VS add-in, I need to:
break at some arbitrary point in my application
call into another method and break there (without adding code in that spot before runtime)
run some other command from my VS add-in at that second breakpoint
My first instinct on how to do this hit a wall at Hans' excellent answer here.
My second idea would be to set up the call to the other method from the breakpoint and have it execute when the application is allowed to continue (if you can see another way to do what I need, feel free to point it out, though!).
This would be trivial with WinDBG: just use .call and go. Unfortunately, I need to do this in Visual Studio.
Hence my question: is there some way to do this in VS? I cannot find an equivalent to .call, nor a way to manipulate the registers and stack and emulate .call myself.

After some investigation, I believe the answer to this question is: there is no equivalent to .call in VS.
The only solution is to emulate the behavior of .call yourself by manipulating the stack pointer, instruction pointer, etc. This will obviously have limitations, e.g. mine will only work for the Microsoft x64 calling convention. Conversion to x86 and its myriad of calling conventions is left as an exercise for the reader ;)
Depending on your actual need, I have found two ways to do this. Remember, this is for calling into a function the next time the debuggee runs (so that you can break into it, since nested breakpoints are not supported). If you just need to call a function without breaking again, you are much better off just using the Immediate window to call it directly!
The easy way:
This will trick VS into thinking the current frame is in a DLL and method of your choosing. This is useful if the Expression Evaluator does not want to work in the DLL you are stopped in and needs to be in a different one.
WARNING: You cannot actually execute the method call you are faking without corrupting your stack (unless the method you are calling is very simple and you are very lucky).
Use the following to do this directly in the debugger via the Immediate window:
@rsp=@rsp-8
*((__int64*)$rsp)=@rip
@rip={,,<DLL to jump in.dll>}<method to call>
Now VS sees the DLL and method you specified as your current frame. Once you are done, use the following to return to the previous state:
@rip=*((__int64*)$rsp)
@rsp=@rsp+8
This can also be automated in a VS add-in by running these statements through EnvDTE.Debugger.GetExpression(), as demonstrated with the other method below.
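To make the register manipulation above easier to follow, here is a minimal C++ model of it. This is only a simulation under stated assumptions: Cpu, fake_call and undo_fake_call are hypothetical names, and the stack is modelled as a map of 8-byte slots rather than real debuggee memory.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical model of the two registers involved and of the stack memory.
struct Cpu {
    std::uint64_t rip;  // instruction pointer
    std::uint64_t rsp;  // stack pointer
    std::unordered_map<std::uint64_t, std::uint64_t> stack;  // address -> 8-byte slot
};

// Mirrors the three Immediate window statements: reserve a slot,
// store the current RIP there as the return address, then redirect RIP.
void fake_call(Cpu& cpu, std::uint64_t target) {
    assert(cpu.rsp >= 8);          // guard against wrapping the simulated stack pointer
    cpu.rsp -= 8;
    cpu.stack[cpu.rsp] = cpu.rip;
    cpu.rip = target;
}

// Mirrors the two restore statements: reload RIP from the slot, then release it.
void undo_fake_call(Cpu& cpu) {
    cpu.rip = cpu.stack.at(cpu.rsp);  // throws if no return address was stored
    cpu.rsp += 8;
}
```

Calling fake_call followed by undo_fake_call leaves rip and rsp at their starting values, which is why the two statement groups can be used as a pair in the debugger.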
The hard way:
This will work for actually calling the DLL and function you want and later returning from it cleanly. It is more complicated and more dangerous; any mistake will corrupt your stack.
It is also harder to get right for both debug and release mode, since the optimizer might have done complex things you were not expecting with the code of your callee and caller.
The idea is to emulate the Microsoft x64 calling convention (documented here) and break in the function called. We need to do the following things:
push parameters beyond the first 4 to the stack, in right to left order
create the shadow space on the stack(1)
push the return address, i.e. the current value of RIP
set RIP to the address of the function to call, just like above
save all the registers that the callee may change, and the caller might not expect to change. This basically means saving everything marked 'volatile' here.
set a breakpoint in the callee
run the debuggee
when the debuggee breaks again, perform whatever operations we want
step out
restore the saved registers
return RSP to the correct location (i.e. tear down the shadow space)
remove the breakpoint
(1) 32 bytes of scratch space for the callee to spill the first 4 arguments that are passed by registers (usually; the callee can actually use this however it likes).
Here is a simplified chunk of my VS add-in to do this for a very basic case (non-member function taking one parameter set to 0 and not touching too many registers). Anything beyond this is again left as an exercise for the reader ;)
EnvDTE90a.Debugger4 dbg = (EnvDTE90a.Debugger4)DTE.Debugger;
string method = "{,,dllname.dll}function";
string RAX = null, RCX = null, flags = null;
// get the address of the function to call and the address to break at (function address + a bit, to skip some prolog and help our breakpoint actually hit)
Expression expr = dbg.GetExpression3(method, dbg.CurrentThread.StackFrames.Item(1), false, false, false, 0);
// bail out before using the value if the expression could not be evaluated
if (!expr.IsValidValue)
    return;
string addr = expr.Value;
string addrToBreak = (UInt64.Parse(addr.Substring(2), NumberStyles.HexNumber) + 2).ToString();
// set a breakpoint in the function to jump into
EnvDTE.Breakpoints bpsAdded = dbg.Breakpoints.Add("", "", 0, 0, "", dbgBreakpointConditionType.dbgBreakpointConditionTypeWhenTrue, "c++", "", 0, addrToBreak, 0, dbgHitCountType.dbgHitCountTypeNone);
if (bpsAdded.Count != 1)
return;
// set up the shadow space and parameter space
// NB: for 1 parameter: 4 words of shadow space, no further parameters. But since the stack needs to be 16-byte aligned (i.e. 2 words) and the return address takes a single word, we need to offset by 5!
dbg.GetExpression3("@rsp=@rsp-8*5", dbg.CurrentStackFrame, false, true, false, 0);
// set up the return address
dbg.GetExpression3("@rsp=@rsp-8*1", dbg.CurrentStackFrame, false, true, false, 0);
dbg.GetExpression3("*((__int64*)$rsp)=@rip", dbg.CurrentStackFrame, false, true, false, 0);
// save the registers
RAX = dbg.GetExpression3("@rax", dbg.CurrentStackFrame, false, true, false, 0).Value;
RCX = dbg.GetExpression3("@rcx", dbg.CurrentStackFrame, false, true, false, 0).Value;
// save the flags
flags = dbg.GetExpression3("@efl", dbg.CurrentStackFrame, false, true, false, 0).Value;
// set up the parameter for the call
dbg.GetExpression3("@rcx=0x0", dbg.CurrentStackFrame, false, true, false, 0);
// set the instruction pointer to our target function
dbg.GetExpression3("@rip=" + addr, dbg.CurrentStackFrame, false, true, false, 0);
// run the debuggee until the breakpoint in the callee is hit
dbg.Go(true);
// DO SOMETHING USEFUL HERE ! ;)
dbg.StepOut(true);
// restore all registers
dbg.GetExpression3("@rax=" + RAX, dbg.CurrentStackFrame, false, true, false, 0);
dbg.GetExpression3("@rcx=" + RCX, dbg.CurrentStackFrame, false, true, false, 0);
// restore flags
dbg.GetExpression3("@efl=" + flags, dbg.CurrentStackFrame, false, true, false, 0);
// tear down the shadow space
dbg.GetExpression3("@rsp=@rsp+8*5", dbg.CurrentStackFrame, false, true, false, 0);
}
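
The fixed offset of 5 words plus 1 word for the return address moves RSP by 48 bytes, so it leaves the callee correctly aligned only when RSP was 8 bytes off a 16-byte boundary at the point where the debuggee stopped. As a rough sketch, the adjustment could instead be computed from the current RSP value. The C++ below assumes the Microsoft x64 rule that RSP is 16-byte aligned just before the return address is pushed; CallFrame and plan_call_frame are hypothetical names, not part of any debugger API.

```cpp
#include <cassert>
#include <cstdint>

// Result of planning an emulated call (hypothetical helper type).
struct CallFrame {
    std::uint64_t return_slot;  // where the return address goes; also RSP at callee entry
    std::uint64_t padding;      // bytes added above the arguments to restore alignment
};

// rsp: stack pointer where the debuggee is stopped.
// stack_args: number of 8-byte arguments beyond the first four.
CallFrame plan_call_frame(std::uint64_t rsp, unsigned stack_args) {
    assert(rsp % 8 == 0);  // x64 stacks are always at least 8-byte aligned

    const std::uint64_t shadow = 32;                  // 4 words of shadow space
    const std::uint64_t args = 8ull * stack_args;     // stack-passed arguments

    // Lowest address of shadow space + arguments if no padding were used.
    std::uint64_t call_site = rsp - shadow - args;

    // Round down to a 16-byte boundary; the gap sits above the arguments.
    const std::uint64_t padding = call_site % 16;
    call_site -= padding;

    // The return address occupies the word just below the aligned call site.
    return CallFrame{call_site - 8, padding};
}
```

With rsp ending in 8 and no stack arguments, the total adjustment comes out at 48 bytes, which is the same as the 8*5 plus 8*1 used in the add-in code; with a 16-byte aligned rsp it comes out at 40 bytes instead.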

Gdb conditional step based on memory address?

I'm wondering if it's possible to create a script that will continue the program's execution (after a break) step by step based on the memory address value.
So, if I'm tracing a function and it goes into a high memory value, I'd call the gdb script until the memory value is below a set value - then it would break again.
I'm very new to gdb and still reading the manual/tutorials, but I'd like to know if my goal is possible :) - and if you could bump me to the proper direction, even better ;)
Thanks!
Edit, updated with pseudocode:
while (1) {
cma = getMemoryAddressForCurrentInstruction();
if (cma > 0xdeadbeef) {
stepi;
} else {
break;
}
}

You're talking about the Program Counter (sometimes called the instruction pointer). It's available in gdb as $pc. Your pseudocode can be translated into this actual gdb command:
while $pc > 0xdeadbeef
stepi
end
It'll be slow, since it's starting and stopping the program for every instruction, but as far as I know there's no fast way to do it if you don't know exactly what address you're looking for. If you do, then you can just set a breakpoint there:
break *0xf0abcdef
cont
This will run until the program counter hits 0xf0abcdef.

Searching for thread start parameters at top of stack

I've inherited some code that worked on Windows 2000. It uses a small piece of assembly code to locate the base address of the stack, then uses an offset to grab the parameter value passed to the thread start function.
However, this doesn't work on Windows 2008 Server. The offset is obviously different.
#define TEB_OFFSET 4
DWORD * pStackBase;
__asm { mov eax,fs:[TEB_OFFSET]}
__asm { mov pStackBase,eax}
// Read the parameter off the stack
#define PARAM_0_OF_BASE_THEAD_START_OFFSET -3
g_dwCtrlRoutineAddr = pStackBase[PARAM_0_OF_BASE_THEAD_START_OFFSET];
After experimenting, I modified the code to look up the stack until it finds the first non-NULL value (a hack):
DWORD* pStack = pStackBase;
do
{
pStack--;
}
while (*pStack == NULL);
// Read the parameter off the stack
g_dwCtrlRoutineAddr = *pStack;
It works! But I want a 'correct' solution.
Does anyone know a safer/better solution for getting the parameter passed to the starting function of a thread on Windows 2008 Server?
The thread start function is ntdll!_RtlUserThreadStart
And the first parameter I'm trying to locate is the address of the function kernel32!CtrlRoutine

Bizarre. Looking with the debugger, the first non-null value on the thread's stack is the value of the argument, the 4th argument to CreateThread. Which gets handed to you on a silver platter when you write the thread procedure like this:
DWORD WINAPI threadProc(void* arg) {
// arg is the value you are looking for
// etc..
}
Not sure how that's related to "kernel32!CtrlRoutine" unless this is a thread used in a service.

You can get the address of kernel32!CtrlRoutine from its RVA, using a table of (kernel32 version, CtrlRoutine RVA) pairs covering every version you need to support. It's the most reliable way.
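
A minimal sketch of such a lookup in C++ might look like the following. The names RvaTable and resolve_ctrl_routine are hypothetical, and no table entries are supplied: the version string and the RVA for each kernel32 build would have to be collected separately (for example from symbol files), and the module base is assumed to come from something like GetModuleHandle.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Maps a kernel32.dll version string to the RVA of CtrlRoutine in that build.
// Entries must be gathered per build; none are assumed here.
using RvaTable = std::map<std::string, std::uint32_t>;

// Writes module base + RVA to address_out and returns true when the version is
// known. Returns false for an unknown version, so the caller can refuse to
// proceed instead of guessing an address.
bool resolve_ctrl_routine(const RvaTable& table,
                          const std::string& version,
                          std::uint64_t kernel32_base,
                          std::uint64_t& address_out) {
    const RvaTable::const_iterator it = table.find(version);
    if (it == table.end()) {
        return false;
    }
    address_out = kernel32_base + it->second;
    return true;
}
```

Failing on an unknown version is deliberate: a wrong RVA would point at arbitrary code, which is worse than not resolving the address at all.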

This is also discussed at http://www.latenighthacking.com/projects/2003/sendsignal/
