How can i track a specific loop in binary instrumentation by using pin tool? - c++

I am fresh in using intel pin tool, and want to track a certain loop in a binary file, but i found in each run the address of the instructions changed in each run, how can i find a specific instruction or a specific loop even it change in each run ? Edit 0: I have the following address, which one of them is the RVA:( the first section of address(small address) are constant for each run, but the last section(big address) changed for each run) Address loop_repeation No._of_Instruction_In_Loop
4195942 1 8
4195972 1 3
....... ... ...
140513052566480 1 2
...... ... ...

the address of the instructions changed in each run, how can i find a specific instruction or a specific loop even it change in each run ?
This is probably because you have ASLR enabled (which is enabled by default on Ubuntu). If you want your analyzed program to load at the same address in each run, you might want to:
1) Disable ASLR:
Disable it system-wide: sysctl -w kernel.randomize_va_space=0 as explained here.
Disable it per process: $> setarch $(uname -m) -R /bin/bash as explained here.
2) Calculate delta (offsets) in your pintool:
For each address that you manipulate, you need to use a RVA (Relative Virtual Address) rather than a full VA (Virtual Address).
Example:
Let's say on your first run your program loads at 0x80000000 (this is the "Base Address"), and a loop starts at 0x80000210
On the second run, the program loads at 0x90000000 ("Base Address") and the loops starts at 0x90000210
Just calculate the offsets of the loops from the Base Address:
Base_Address - Program_Address = offset
0x80000210 - 0x80000000 = 0x210
0x90000210 - 0x90000000 = 0x210
As both resulting offsets are the same, you know you have the exactly the same instruction, independently of the base address of the program.
How to do that in your pintool:
Given an (instruction) address, use IMG_FindByAddress to find the corresponding image (module).
From the image, use IMG_LowAddress to get the base address of the module.
Subtract the module base from the instruction: you have the RVA.
Now you can compare RVA between them and see if they are the same (they also must be in the same module).
Obviously this doesn't work for JITed code as JITed code has no executable module (think mmap() [linux] or VirtualAlloc() [windows])...
Finally there's a good paper (quite old now, but still applicable) on doing a loop detection with pin, if that can help you.

Related

Why do the first digits of disassembled instructions in gdb not match the value in rip? Can anyone provide background?

First time disassembling a program in a few months using GDB and on a new linux VM. Last time, when I disassembled a program, set a breakpoint, and ran, the value returned by "i r rip" would EXACTLY match the address of one of the program instructions.
This time, the value returned by "i r rip" == 0x5...54699 <main+15" while the assembly address shown for <+15> == "0x0...0699".
Is GDB now using relative addressing and zeroing the more significant (irrelevant?) address bits similar to what Wireshark does for sequence numbers?
This is my screen dump:
Disassembled code and rip query
You are looking at position-independent executable (PIE).
This executable is linked to load at address 0, and is relocated to 0x54... address on execution.
If you disas main before first running the binary, GDB will show the original linked-at addresses. If you do the same command after first run, GDB will show relocated (actual) addresses.
You can also link non-PIE binary with gcc t.c -no-pie. That binary will exhibit the behavior you expect: the output of disas main will not change between before and after first run, and the disassembly will match the actual value of rip at runtime.

With ASLR turned on, are all sections of an image get loaded at the same offsets relative to the image base address every time?

Do different sections of libc (such as .text, .plt, .got, .bss, .rodata, and others) get loaded at the same offset relative to the libc base address every time?
I know the loader loads libc at a random location every time I run my program.
Thank you in advance.
I guess I found the answer to my own question. I wrote a pin-tool using Intel PIN that on every libc section get loaded outputs the section offset relative to the address of libc. Here are the sections having get loaded at same offsets with their corresponding offsets (the thing before - is the library name which is libc with its version and what comes after that is the section name):
libc.so.6-.note.gnu.build-id 0x0000000000000270
libc.so.6-.note.ABI-tag 0x0000000000000294
libc.so.6-.gnu.hash 0x00000000000002b8
libc.so.6-.dynsym 0x0000000000003d80
libc.so.6-.dynstr 0x0000000000010ff8
libc.so.6-.gnu.version 0x00000000000169d8
libc.so.6-.gnu.version_d 0x0000000000017b68
libc.so.6-.gnu.version_r 0x0000000000017ee0
libc.so.6-.rela.dyn 0x0000000000017f10
libc.so.6-.rela.plt 0x000000000001f680
libc.so.6-.plt 0x000000000001f7c0
libc.so.6-.plt.got 0x000000000001f8a0
libc.so.6-.text 0x000000000001f8b0
libc.so.6-__libc_freeres_fn 0x0000000000172b10
libc.so.6-__libc_thread_freeres_fn 0x0000000000175030
libc.so.6-.rodata 0x0000000000175300
libc.so.6-.stapsdt.base 0x0000000000196650
libc.so.6-.interp 0x0000000000196660
libc.so.6-.eh_frame_hdr 0x000000000019667c
libc.so.6-.eh_frame 0x000000000019bb38
libc.so.6-.gcc_except_table 0x00000000001bc3cc
libc.so.6-.hash 0x00000000001bc810
libc.so.6-.tdata 0x00000000003c07c0
libc.so.6-.tbss 0x00000000003c07d0
libc.so.6-.init_array 0x00000000003c07d0
libc.so.6-__libc_subfreeres 0x00000000003c07e0
libc.so.6-__libc_atexit 0x00000000003c08d8
libc.so.6-__libc_thread_subfreeres 0x00000000003c08e0
libc.so.6-.data.rel.ro 0x00000000003c0900
libc.so.6-.dynamic 0x00000000003c3ba0
libc.so.6-.got 0x00000000003c3d80
libc.so.6-.got.plt 0x00000000003c4000
libc.so.6-.data 0x00000000003c4080
libc.so.6-.bss 0x00000000003c5720
And there are indeed sections that get loaded at different offsets every time. You may see them in below. However, as I do not recognize them and to me not important, I would like to conclude that yes the sections we most concern about get loaded at the same offset each time the program runs.
libc.so.6-.note.stapsdt
libc.so.6-.gnu.warning.sigstack
libc.so.6-.gnu.warning.sigreturn
libc.so.6-.gnu.warning.siggetmask
libc.so.6-.gnu.warning.tmpnam
libc.so.6-.gnu.warning.tmpnam_r
libc.so.6-.gnu.warning.tempnam
libc.so.6-.gnu.warning.sys_errlist
libc.so.6-.gnu.warning.sys_nerr
libc.so.6-.gnu.warning.gets
libc.so.6-.gnu.warning.getpw
libc.so.6-.gnu.warning.re_max_failures
libc.so.6-.gnu.warning.lchmod
libc.so.6-.gnu.warning.getwd
libc.so.6-.gnu.warning.sstk
libc.so.6-.gnu.warning.revoke
libc.so.6-.gnu.warning.mktemp
libc.so.6-.gnu.warning.gtty
libc.so.6-.gnu.warning.stty
libc.so.6-.gnu.warning.chflags
libc.so.6-.gnu.warning.fchflags
libc.so.6-.gnu.warning.__compat_bdflush
libc.so.6-.gnu.warning.__memset_zero_constant_len_parameter
libc.so.6-.gnu.warning.__gets_chk
libc.so.6-.gnu.warning.inet6_option_space
libc.so.6-.gnu.warning.inet6_option_init
libc.so.6-.gnu.warning.inet6_option_append
libc.so.6-.gnu.warning.inet6_option_alloc
libc.so.6-.gnu.warning.inet6_option_next
libc.so.6-.gnu.warning.inet6_option_find
libc.so.6-.gnu.warning.getmsg
libc.so.6-.gnu.warning.putmsg
libc.so.6-.gnu.warning.fattach
libc.so.6-.gnu.warning.fdetach
libc.so.6-.gnu.warning.setlogin
libc.so.6-.gnu_debuglink
libc.so.6-.shstrtab

PE header - New Section - How to calculate OEP address in memory?

I just succeeded in adding a section and some code to a pe file (in this case my proxy.exe) in Windows 10. I also changed the entrypoint to the beginning of my new section. When i run the patched .exe my stub gets called and i hear a high frequency Beep (i tested with a beep first), so far so good.
But now i want to return to the OEP. But how can i calculate the address my oep will have in memory?
I thought about that problem and i came up with a ?solution?:
Count the Rsize of all sections except the one i freshly added, then i get the address of my stub in memory (delta offset) and subtract the total section size from it. Now i should be at the beginning of the .text section in memory. Afterwards i add the offset for the oep and thats it.
Will that way work or did i make a mistake?Is there another, probably easier way? And what about dlls, is there a difference for the whole procedure?
http://i.imgur.com/tpUH8x4.png <- Picture of stub
Much thanks in advance! ;)
With OEP I guess you imply fixing IMAGE_OPTIONAL_HEADER. AddressOfEntryPoint.
Ok if I have got it correctly, then you will take the IP -> subtract the offset with the expected RVA (related virtual address) to get the loaded address of the module and add it to the OEP's RVA to get the real address of the entry point.
If you do the maths right this should work.
I dont have commenting privileges yet I think it should go in as a comment rather than answer as I havent tried exactly this thing but it is a proposal, which is similar to what a compiler will do for the binary.
OEP is an RVA, during injection you have the ORIG_RVA and the injected_RVA that you have substituted. In your injection logic just add
offset:push orig_rva.
Then for the offset add a relocation entry in the PE's relocation table. The loader will do the rest.

How to query BIOS using GRUB?

I am trying to make a small kernel for 80386 processor mainly for learning purpose and want to get the full memory map of the available RAM.
I have read that it is possible and better to do so with the help of GRUB than directly querying the BIOS.
Can anybody tell me how do I do it ?
Particularly, for using bios functionality in real mode we use bios interrupts and get the desired values in some registers , what is the actual equivalent way when we want to use GRUB provided functions ?
Here is the process I use in my kernel (note that this is 32bit). In my bootstrap assembly file, I tell GRUB to provide me with a memory map:
.set MEMINFO, 1 << 1 # Get memory map from GRUB
Then, GRUB loads the address of the multiboot info structure into ebx for you (this structure contains the address of the memory map). Then I call into C code to handle the actual iteration and processing of the memory map. I do something like this to iterate over the map:
/* Macro to get next entry in memory map */
#define MMAP_NEXT(m) \
(multiboot_memory_map_t*)((uint32_t)m + m->size + sizeof(uint32_t))
void read_mmap(multiboot_info_t* mbt){
multiboot_memory_map_t* mmap = (multiboot_memory_map_t*) mbt->mmap_addr;
/* Iterate over memory map */
while((uint32_t)mmap < mbt->mmap_addr + mbt->mmap_length) {
// process the current memory map entry
mmap = MMAP_NEXT(mmap);
}
}
where multiboot_info_t and multiboot_memory_map_t are defined as in the Gnu multiboot.h file. As Andrew Medico posted in the comments, here is a great link for getting started with this.

GDB Patching results in "Cannot access memory at address 0x

I have a program that I need to patch using GDB. The issue is there is a line of code that makes a "less than or equal test" and fails causing the program to end with a Segmentation fault. The program is already compiled and I do not have the source so I cannot change the source code obviously. However, using GDB, I was able to locate where the <= test is done and then I was able to locate the memory address which you can see below.
(gdb) x/100i $pc
... removed extra lines ...
0x7ffff7acb377: jle 0x7ffff7acb3b1
....
All I need to do is change the test to a 'greater than or equal to' test and then the program should run fine. The opcode for jle is 0x7e and I need to change it to 0x7d. My assignment gives instructions on how to do this as follows:
$ gdb -write -q programtomodify
(gdb) set {unsigned char} 0x8040856f = 0x7d
(gdb) quit
So I try it and get...
$ gdb -write -q player
(gdb) set {unsigned char} 0x7ffff7acb377 = 0x7d
Cannot access memory at address 0x7ffff7acb377
I have tried various other memory addresses and no matter what I try I get the same response. That is my only problem, I don't care if it's the wrong address or wrong opcode instruction at this point, I just want to be able to modify the memory.
I am running Linux Mint 14 via VMware Player
Thank
Cannot access memory at address 0x7ffff7acb377
You are trying to write to an address where some shared library resides. You can find out which library that is with info sym 0x7ffff7acb377.
At the time when you are trying to perform the patch, the said shared library has not been loaded yet, which explains the message you get.
Run the program to main. Then you should be able to write to the address. However, you'll need to have write permission on the library to make your write "stick".