As both target-independent IR and target-specific instructions can be represented as SDNodes, is there a function to tell whether an SDNode has a physical instruction associated with it, i.e. whether it has been lowered or has passed instruction selection?
You can use dyn_cast<MachineSDNode>(N) or isa<MachineSDNode>(N).
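For reference, a minimal sketch of such a check during a DAG traversal (the helper names are illustrative, not part of the LLVM API):

#include "llvm/CodeGen/SelectionDAGNodes.h"
using namespace llvm;

// True once instruction selection has replaced the node with a target
// instruction; before ISel, nodes are not MachineSDNodes and this is false.
static bool hasBeenSelected(const SDNode *N) {
  return isa<MachineSDNode>(N);
}

// dyn_cast additionally gives access to the machine-level opcode.
static unsigned machineOpcodeOrZero(SDNode *N) {
  if (auto *MN = dyn_cast<MachineSDNode>(N))
    return MN->getMachineOpcode(); // the target instruction's opcode
  return 0;
}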
What does the following command mean?
sYmbol.NEW _VectorTable 0x34000000
sYmbol.NEW _ResetVector 0x34000020
sYmbol.NEW _InitialSP 0x34000100
Data.Set EAXI:_VectorTable %Long _InitialSP _ResetVector+1
The command Data.Set writes data values to your target's memory. The syntax of the command is
Data.Set <address>|<address_range> [<access_width>] {value(s)}
The <address> to which the data is written has the form:
<access_class>:<address_offset>
A full address, just the address offset, or the values you want to write can also be represented by debug symbols. These symbols are usually the variables, function names and labels defined in your target application, and they are declared to the debugger by loading the target application's ELF file.
In this case, however, the symbols are declared in the debugger manually with the command sYmbol.NEW.
Anyway: by replacing the symbols with their values in the command Data.Set EAXI:_VectorTable %Long _InitialSP _ResetVector+1, we get the command
Data.Set EAXI:0x34000000 %Long 0x34000100 0x34000021
So what does this command actually do?
The access-width specifier %Long indicates that 32-bit values should be written. As a result, the address is incremented automatically by 4 for each specified data value.
The value 0x34000100 is written to address EAXI:0x34000000
The value 0x34000021 is written to address EAXI:0x34000004
The <access_class> "EAXI" indicates that the debugger should access the address 0x34000000 directly via the AXI bus (Advanced eXtensible Interface). By writing directly to the AXI bus, you bypass your target's CPU core (and with it any MMU, MPU or caches). The leading 'E' of the access class EAXI indicates that the write operation may also be performed while the CPU core is running (or is considered to be running, e.g. in Prepare mode). The description of all possible access classes is specific to the target's core architecture, so you can find it in the debugger's "Target Architecture Manual".
And what exactly does this mean for your target and the application running on it?
Well, I don't know your chip or SoC (nor do I know your application).
But from the data I see, I guess that you are debugging a chip with an ARM architecture - probably a Cortex-M. Your chip's boot ROM seems to start at address 0x34000000, while your actual application's start-up code starts at 0x34000020 (maybe with the symbol _start).
For Cortex-M cores you have to program, at offset 0 of your vector table (in the boot ROM), the initial value of the stack pointer, while at offset 4 you have to write the initial value of the program counter. In your case the program counter should be initialized with 0x34000021. Why 0x34000021 and not 0x34000020? Because your start-up code is probably encoded in ARM Thumb (Cortex-M cores can only execute Thumb code). By setting the least significant bit of the initial value for the program counter to 1, the core knows that it should start decoding Thumb instructions. (Not setting the least significant bit to 1 on a Cortex-M will cause an exception.)
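For illustration, a minimal sketch of such a vector table in C++ (GNU toolchain assumed; the names _estack, Reset_Handler and the section name .isr_vector are made up and would come from your own start-up code and linker script):

extern "C" void Reset_Handler();   // start-up code, compiled as Thumb
extern "C" unsigned int _estack;   // top-of-stack symbol from the linker script

using VectorEntry = void (*)();

// The linker script must place .isr_vector at the start of the boot ROM
// (address 0x34000000 in the question above).
__attribute__((section(".isr_vector"), used))
const VectorEntry vector_table[] = {
    reinterpret_cast<VectorEntry>(&_estack), // offset 0: initial stack pointer
    Reset_Handler,                           // offset 4: initial PC; the
                                             // toolchain sets bit 0 (Thumb bit)
                                             // in the function's address
};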
I have an intrinsic, and it must be lowered to an instruction that works with a fixed register.
So it takes the contents of this register and another argument as input. After execution, the result of the instruction is placed into that same fixed register.
What should I do in SelectionDAGBuilder::visitIntrinsicCall to describe such behaviour for an SDNode?
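For what it's worth, a common way to model this in SelectionDAGBuilder::visitIntrinsicCall is a glued CopyToReg / CopyFromReg pair around the target node. A minimal sketch, where MyTarget::R7 and MyTargetISD::MY_OP are hypothetical placeholders for the fixed register and the target-specific opcode:

SDLoc dl = getCurSDLoc();
SDValue Chain = DAG.getRoot();

// Copy the first argument's value into the fixed physical register.
Chain = DAG.getCopyToReg(Chain, dl, MyTarget::R7,
                         getValue(I.getArgOperand(0)), SDValue());
SDValue Glue = Chain.getValue(1);

// Emit the target node; the glue operand ties it to the CopyToReg above
// so nothing can clobber the register in between.
SDValue Ops[] = {Chain, getValue(I.getArgOperand(1)), Glue};
SDValue Node = DAG.getNode(MyTargetISD::MY_OP, dl,
                           DAG.getVTList(MVT::Other, MVT::Glue), Ops);

// Read the result back out of the same fixed register.
SDValue Result = DAG.getCopyFromReg(Node.getValue(0), dl, MyTarget::R7,
                                    MVT::i32, Node.getValue(1));
setValue(&I, Result);
DAG.setRoot(Result.getValue(1));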
Now I am writing a library to mock trivial functions for C/C++. It is used like this: MOCK(mocked, substitute).
If you call the mocked function, the substitute function will be called instead.
To implement it, I modify the attributes of the code page and inject jump code into the function. I have implemented this for x86 CPUs and I want to port it to ARM. But I have a problem when I inject the binary code.
For example, the address of the substitute function is 0x91f1, and the address of the function to mock is 0x91d1. So I want to inject an ARM branch instruction at 0x91d1 that jumps to the substitute function.
According to documentation online, the relative offset is
(0x91f1 - (0x91d1 + 8)) / 4 = 6
so the binary instruction is:
0xea000006
Because my ARM emulator (I use the Android ARMv7 emulator) is little endian, the binary code to inject is:
0x060000ea
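(For reference, that arithmetic can be reproduced in a few lines of C++; 0xEA000000 is the ARM-mode B instruction with the always condition, and the low 24 bits hold the word offset. As the answer below explains, though, this ARM-mode encoding is the wrong tool for odd, i.e. Thumb, target addresses.)

#include <cstdint>
#include <cstdio>

int main() {
    uint32_t from = 0x91d1, to = 0x91f1;
    // In ARM mode, PC reads as the branch instruction's address + 8.
    int32_t word_offset = static_cast<int32_t>(to - (from + 8)) / 4;
    uint32_t insn = 0xEA000000u | (word_offset & 0x00FFFFFF);
    std::printf("B encoding: 0x%08x\n", insn); // prints 0xea000006
}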
But when I executed the mocked function after injecting the branch code, a segmentation fault occurred. I don't know why the branch instruction is wrong. I have not studied the ARM architecture, so I don't know whether ARM branch instructions have some limits.
The addresses you are branching to are odd-numbered, meaning they are in Thumb mode.
There is an obvious problem with your approach.
If the target is in Thumb mode, you either need to be in Thumb mode at the point you are branching from, or you need to use a bx (Branch and Exchange) instruction.
Your functions are in Thumb mode (+1 at the target) but you are using an ARM-mode branch encoding (the B A1 encoding?), so either you are not in Thumb mode, or you are using an ARM-mode instruction while in Thumb mode.
The ARM family allows loading registers with values. One of those registers is the PC (Program Counter).
Some alternatives:
Load the PC register with the destination address (absolute).
Add an offset to the PC register.
Use a multiply-and-add instruction that targets the PC register.
Push the destination address onto the stack and pop it into the PC register.
These choices, plus fixing up the destination of the branch instruction, are all different options; none of them is "best". Pick the one that suits you and is easiest to maintain. One of them (loading the PC with an absolute address) is sketched below.
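A minimal sketch of that first option for a Thumb-2 (ARMv7) target, using the LDR.W PC, [PC, #0] literal-load idiom, which reaches any address and preserves the Thumb bit. It assumes the patch site is 4-byte aligned (insert a Thumb NOP, 0xBF00, first if it is not) and that you have already made the page writable:

#include <cstdint>
#include <cstring>

void patch_thumb_jump(void *mocked, void *substitute) {
    // Strip the Thumb bit to get the real byte address of the code.
    auto *code = reinterpret_cast<uint8_t *>(
        reinterpret_cast<uintptr_t>(mocked) & ~uintptr_t(1));

    // LDR.W PC, [PC, #0]: loads the literal that follows it into PC.
    const uint16_t ldr_pc[2] = {0xF8DF, 0xF000}; // little-endian halfwords
    uint32_t target = static_cast<uint32_t>(
        reinterpret_cast<uintptr_t>(substitute)); // bit 0 stays set: Thumb

    std::memcpy(code, ldr_pc, sizeof ldr_pc);
    std::memcpy(code + sizeof ldr_pc, &target, sizeof target);

    // The instruction cache must see the new bytes before the next call.
    __builtin___clear_cache(reinterpret_cast<char *>(code),
                            reinterpret_cast<char *>(code) + 8);
}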
I'm a new user of the LLVM compiler infrastructure. I've gone through the LLVM Programmer's Manual and understood how to iterate over basic blocks. I wanted to know whether there are any predefined passes for counting instructions. I understand there is instcount, but that returns the total number of instructions, while I'm targeting primarily integer and floating-point operations. Also, what should I do in cases where an expression has operands of different types?
The InstCount pass already has a separate counter for each instruction type, in addition to a total instruction count. For example, the number of add instructions will be stored in the NumAddInst statistic variable. You can use that pass or reuse some of its code.
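If you prefer your own counters, here is a minimal sketch of a legacy-pass-manager function pass that separates integer from floating-point binary operations (the pass name "opcounter" and the class name are made up for illustration):

#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Pass.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
struct OpCounter : public FunctionPass {
  static char ID;
  OpCounter() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    unsigned IntOps = 0, FPOps = 0;
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (auto *BO = dyn_cast<BinaryOperator>(&I)) {
          // Classify by the result type; this also settles the
          // mixed-operand question, since LLVM binary operators
          // require both operands to match the result type.
          if (BO->getType()->isFPOrFPVectorTy())
            ++FPOps;
          else if (BO->getType()->isIntOrIntVectorTy())
            ++IntOps;
        }
    errs() << F.getName() << ": int ops = " << IntOps
           << ", fp ops = " << FPOps << "\n";
    return false; // analysis only; the IR is not modified
  }
};
} // namespace

char OpCounter::ID = 0;
static RegisterPass<OpCounter> X("opcounter", "Count integer/FP binary ops");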
I'm trying to understand a small binary using gdb, but there is something I can't find a way to achieve: how can I find the list of jumps that point to a specified address?
I have a small set of instructions in the disassembled code and I want to know from where it is called.
I first thought about searching for the corresponding instruction in .text, but since there are many kinds of jumps, and addresses can be relative, this can't work.
Is there a way to do that?
Alternatively, if I put a breakpoint on this address, is there a way to know the address of the previous instruction (in this case, the jump)?
If this is some subroutine being called from other places, then it must respect some ABI when it is called.
Depending on the CPU used, the return address (and therefore the place from where it was called) will be stored somewhere (on the stack or in some register). If you replace the original code with code that examines this, you can build a list of return addresses. Or, more simply, as you suggested: if you use gdb and put a breakpoint at that routine, you can see from where it was called by using the bt command.
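A minimal gdb sketch of that breakpoint approach, logging every call site as the program runs (the address is a placeholder for the routine you are interested in):

(gdb) break *0x8048a30
(gdb) commands
> silent
> bt 2          # frame #1 is the call site
> continue
> end
(gdb) run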
If it was an actual jump (as opposed to a "jump to subroutine") that led you there (which I doubt, if it's called from many places, unless it's a kind of longjmp/setjmp), then you will probably not be able to determine where it was called from, unless the CPU you are using allows you to trace execution in some way.