How to disassemble a memory range with GDB? - gdb

I'm trying to disassemble a program to see a syscall assembly instruction (the INT instruction, I believe) and the handler with GDB and have written a little program (see below) for it that opens and closes a file.
I was able to follow the call to fopen with GDB until it executed a call.
When I tried to tell GDB "disassemble 0x...." (address of call) it responded with 'No function contains specified address.'
Is it possible to force GDB to disassemble that memory address (or display it as assembler as well as possible)? If so, how?
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE* f;
    f = fopen("main.c", "r");
    if (!f) {
        perror("open");
        return -1;
    }
    fclose(f);
    return 0;
}

Yeah, disassemble is not the best command to use here.
The command you want is "x/i" (examine as instructions):
(gdb) x/i 0xdeadbeef

Do you only want to disassemble your actual main? If so try this:
(gdb) info line main
(gdb) disas STARTADDRESS ENDADDRESS
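Newer GDB versions expect a comma between the two addresses (disas STARTADDRESS,ENDADDRESS). Like so, on GDB 6.8: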
USER#MACHINE /cygdrive/c/prog/dsa
$ gcc-3.exe -g main.c
USER#MACHINE /cygdrive/c/prog/dsa
$ gdb a.exe
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
...
(gdb) info line main
Line 3 of "main.c" starts at address 0x401050 <main> and ends at 0x401075 <main+
(gdb) disas 0x401050 0x401075
Dump of assembler code from 0x401050 to 0x401075:
0x00401050 <main+0>: push %ebp
0x00401051 <main+1>: mov %esp,%ebp
0x00401053 <main+3>: sub $0x18,%esp
0x00401056 <main+6>: and $0xfffffff0,%esp
0x00401059 <main+9>: mov $0x0,%eax
0x0040105e <main+14>: add $0xf,%eax
0x00401061 <main+17>: add $0xf,%eax
0x00401064 <main+20>: shr $0x4,%eax
0x00401067 <main+23>: shl $0x4,%eax
0x0040106a <main+26>: mov %eax,-0xc(%ebp)
0x0040106d <main+29>: mov -0xc(%ebp),%eax
0x00401070 <main+32>: call 0x4010c4 <_alloca>
End of assembler dump.
I don't see your system interrupt call, however. (It's been a while since I last tried to make a system call in assembly; INT 21h, as far as I recall.)

This isn't the direct answer to your question, but since you seem to just want to disassemble the binary, perhaps you could just use objdump:
objdump -d program
This should give you its disassembly. You can add -S if you want it source-annotated.

You can force gcc to output directly to assembly code by adding the -S switch
gcc -S hello.c

fopen() is a C library function and so you won't see any syscall instructions in your code, just a regular function call. At some point, it does call open(2), but it does that via a trampoline. There is simply a jump to the VDSO page, which is provided by the kernel to every process. The VDSO then provides code to make the system call. On modern processors, the SYSCALL or SYSENTER instructions will be used, but you can also use INT 80h on x86 processors.

If all you want is to see the disassembly with the INT call, use objdump -d as someone mentioned, but pass the -static option when compiling. Otherwise the fopen function is not compiled into the ELF file and is linked at runtime.

GDB's disassemble command has a /m modifier to include source code alongside the instructions. This is the equivalent of objdump -S, with the extra benefit of confining the output to just the one function (or address range) of interest.

You don't have to use gdb. GCC will do it.
gcc -S foo.c
This will create foo.s which is the assembly.
gcc -m32 -c -g -Wa,-a,-ad foo.c > foo.lst
The above version will create a listing file that has both the C and the assembly generated by it. GCC FAQ

Full example for disassembling a memory range with the C source interleaved:
/opt/gcc-arm-none-eabi-9-2019-q4-major/bin/arm-none-eabi-gdb
(gdb) file /root/ncs/zephyr/samples/hello_world/build_nrf9160dk_nrf9160ns/zephyr/zephyr.elf
(gdb) directory /root/ncs/zephyr/samples/hello_world/src
# Step 1: find the source line that contains the address
(gdb) info line *0x000328C0
# Step 2: disassemble around it; -0x04 to +0x04 is your range size
(gdb) disassemble /m 0x000328C0-0x04, 0x000328C0+0x04
# The same range, with the raw instruction bytes
(gdb) disassemble /r 0x000328C0-0x04, 0x000328C0+0x04
(gdb) info thread
(gdb) interpreter-exec mi -thread-info

The accepted answer is not really correct; it works only in some circumstances.
(gdb) disas STARTADDRESS ENDADDRESS
The highest-upvoted answer is correct. Read no further if you don't wish to understand why it is correct.
(gdb) x/i 0xdeadbeef
With an appropriately meaningless hex address.
I have an STM32 and I have relocated the code with PIC. The normal boot address is 0x8000000, with a 0x200 vector table. So a normal entry is 0x8000200. However, I have programmed the binary to 0x80040200 (two NOR flash sectors away) and wish to debug there.
The issue gdb has with this is that 'file foo.elf' tells it the code is in the first range. Special commands like 'disassemble' actually look at the binary on the host. For the cross-debug case, gdb would have to read memory on the remote target, which could be expensive. So it appears that 'x/i' (examine as code) is the best option. The debug information that gdb depends on (where routines start and end) is not present in a random binary chunk.
To combine the answers above for PIC code on an embedded cross system, you need to create multiple ELF files, one for each possible target location. Use GDB's file command to select the one with the proper symbol locations.
This will NOT work for cross development.
You can use GCC-generated debug symbols. The steps are:
Build at the normal link address.
Extract the symbols.
Use symbol-file with an offset for the runtime address.
(gdb) help symbol-file
Load symbol table from executable file FILE.
Usage: symbol-file [-readnow | -readnever] [-o OFF] FILE
OFF is an optional offset which is added to each section address.
You can then switch symbol files to a relocated run address to use the first answer.
If you have a case where the code is relocated but the data is absolute, you need to link twice and choose the relocated ELF file (only the symbols are relocated; the code is the same). This is desirable with NOR flash that is XIP (execute-in-place), as the memory devices for .text and .rodata are different from those for .data and .bss; that is the case on many low- to mid-range embedded devices. However, gcc does not support this code generation option (at least on ARM). You must use a 'static base' register (for example, r9 as u-boot does).

There is another way, using gdb, that I wanted to present on top of the suggestions above:
Launch your program with gdb, set a breakpoint on main with break *main, and run.
Then you can use info proc mappings.
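On Linux, info proc mappings reports the same kind of information as /proc/PID/maps. As a rough sketch of how such a table answers "which file does this address belong to?", the Python below looks an address up in maps-style text. The function name find_mapping and the sample lines are hypothetical and serve only as illustration; the field layout (range, permissions, offset, device, inode, path) is that of /proc/PID/maps, not of GDB's own output.

```python
def find_mapping(maps_text, address):
    """Return (path, file_offset) for the mapping containing address, or None.

    maps_text is text in the /proc/PID/maps layout:
    start-end perms offset dev inode [path]
    """
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= address < end:
            # Offset into the backing file = mapping offset + distance from start.
            file_offset = int(fields[2], 16) + (address - start)
            path = fields[5].strip() if len(fields) > 5 else ""
            return path, file_offset
    return None


# Hypothetical sample, not taken from a real process.
SAMPLE_MAPS = """\
08048000-08049000 r-xp 00000000 08:01 131 /home/user/a.out
b7e30000-b7fd0000 r-xp 00000000 08:01 262 /lib/libc.so.6
"""

hit = find_mapping(SAMPLE_MAPS, 0xB7E31234)
print(hit)
```

Once the address is known to fall inside a mapped file, x/i on that address is the command that will actually disassemble it.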

Related

GDB Debugger cannot use scanf [duplicate]

I'm following these lessons from OpenSecurityTraining.
I've reached the lab part where I have to train myself on a CMU Bomb. They provide an x86_64 compiled CMU Bomb that you can find here to train on: CMU Bomb x86-64, originally from a 32-bit bomb from CMU Labs for Computer Systems: A Programmer's Perspective (CS:APP) 1st edition.
I had a virtualized 64-bit Elementary OS distribution where I disassembled the CMU Bomb without problems using GDB. Now I have a 64-bit Ubuntu 14.04 LTS (not virtualized), and when I try to reproduce what I did on Elementary OS, I get the famous error.
I run these commands :
gdb ./bomb-x64
(gdb) b main
Breakpoint 1 at 0x400dbd: file bomb.c, line 37. -- why bomb.c ?
(gdb) r
...
bomb.c: no such file or directory
Edit : I can create breakpoints on others functions of the CMU Bomb and it works as expected.
Example :
(gdb) b phase_1
Breakpoint 3 at 0x400f00
(gdb) r
Breakpoint 1, 0x0000000000400f00 in phase_1 ()
(gdb) disas
Dump of assembler code for function phase_1:
=> 0x0000000000400f00 <+0>: sub $0x8,%rsp
0x0000000000400f04 <+4>: mov $0x4023b0,%esi
0x0000000000400f09 <+9>: callq 0x401308 <strings_not_equal>
0x0000000000400f0e <+14>: test %eax,%eax
0x0000000000400f10 <+16>: je 0x400f17 <phase_1+23>
0x0000000000400f12 <+18>: callq 0x40140a <explode_bomb>
0x0000000000400f17 <+23>: add $0x8,%rsp
0x0000000000400f1b <+27>: retq
End of assembler dump.
I've heard of ia32-libs, but this doesn't change anything since I'm on 64-bit Ubuntu and run a 64-bit compiled CMU Bomb. Am I wrong?
Use the dir command to set the source path:
dir /usr/src/debug
Your source code should be present in that path.
The executable contains debugging symbols, which indicate the file (and particular line in the file) corresponding to each bit of assembled code. This is what allows you to step through C code in the debugger. The debugging symbols are put there by the compiler (e.g. by using the -g argument to gcc).
If you don't have the C files that were used to compile the executable, the debugger won't be able to show you the C, and you'll be limited to looking at assembly.
(gdb) list
/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/tasks.c: No such file or directory.
(gdb) set substitute-path /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/ C:/Espressif/frameworks/esp-idf-v4.4/

How to map $eip in gdb to output of objdump -d?

I have an incomplete stacktrace which stops at a known library (linux i686 architecture). In order to ascertain the function last called, I am trying to map $eip as output by gdb, to an address within a file generated by "objdump -d library.so".
I thought I might be able to use the From address output from "info shared" within gdb, along with the $eip to calculate an offset, which I could then translate to an offset from the disassembly text section of the objdump -d output?
Not sure if this approach is sensible, but trying it in a simple test harness app with a shared library does not give me an address within the right function.
Any help much appreciated.
I thought I might be able to use the From address output from "info shared" within gdb, along with the $eip to calculate an offset, which I could then translate to an offset from the disassembly text section of the objdump -d output?
Yes, that is exactly what you need to do.
The From address in GDB's display tells you where the .text section of the shared library was loaded.
Running readelf -S foo.so | grep '\.text' will tell you the address of .text in foo.so itself. Subtract one from the other, and you get the relocation for that shared library (it will be page-aligned).
Now take the $eip from GDB, subtract the relocation, and you'll get an address that matches the output of nm and objdump for foo.so.
However, GDB will have already completed all of the above steps internally. If it wasn't able to deduce which function $eip ended up in, you shouldn't expect that performing these steps manually will produce any better result.
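The subtraction described above can be sketched in a few lines of Python. The function name and all numeric values below are hypothetical, chosen only to make the arithmetic visible; text_addr is assumed to be the .text address that readelf -S prints for the library file.

```python
def runtime_to_file_address(eip, from_addr, text_addr):
    """Map a runtime address inside a shared library to the address
    that objdump and nm show for the same library file.

    eip        -- runtime program counter reported by GDB ($eip)
    from_addr  -- "From" column of `info shared` for the library
    text_addr  -- .text address reported by `readelf -S` for the file
    """
    relocation = from_addr - text_addr  # load bias of the library
    if eip < from_addr:
        raise ValueError("eip lies below the library's .text section")
    return eip - relocation


# Hypothetical values, for illustration only.
from_addr = 0xB7E30450  # From address shown by `info shared`
text_addr = 0x00000450  # .text address shown by `readelf -S foo.so`
eip = 0xB7E31234        # $eip from the truncated backtrace

file_addr = runtime_to_file_address(eip, from_addr, text_addr)
print(hex(file_addr))
```

The resulting value is what you would search for in the objdump -d listing of the library.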

How to map PC (ARMv5) address to source code?

I'm developing on an ARM9E processor running Linux. Sometimes my application crashes with the following message :
[ 142.410000] Alignment trap: rtspserverd (996) PC=0x4034f61c
Instr=0xe591300c Address=0x0000000d FSR 0x001
How can I translate the PC address to actual source code? In other words, how can I make sense out of this message?
With objdump. Dump your executable, then search for 4034f61c:.
The -x, --disassemble, and -l options are particularly useful.
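As a small sketch of the "search for 4034f61c:" step, the Python below pulls the fields out of the kernel message quoted in the question and builds the string to look for in the objdump listing. The names TRAP_RE, parse_alignment_trap and objdump_search_key are hypothetical helpers, and the pattern is written only against the message format shown above.

```python
import re

# Matches the PC / Instr / Address fields of the message quoted in the question.
TRAP_RE = re.compile(
    r"PC=0x(?P<pc>[0-9a-fA-F]+)\s+"
    r"Instr=0x(?P<instr>[0-9a-fA-F]+)\s+"
    r"Address=0x(?P<addr>[0-9a-fA-F]+)"
)


def parse_alignment_trap(message):
    """Return the pc, instr and addr fields of the message as integers."""
    match = TRAP_RE.search(message)
    if match is None:
        raise ValueError("not an alignment trap message")
    return {name: int(value, 16) for name, value in match.groupdict().items()}


def objdump_search_key(pc):
    """objdump prefixes each instruction with its address in hex, then a colon."""
    return "%x:" % pc


MESSAGE = (
    "[ 142.410000] Alignment trap: rtspserverd (996) PC=0x4034f61c\n"
    "Instr=0xe591300c Address=0x0000000d FSR 0x001"
)

fields = parse_alignment_trap(MESSAGE)
print(objdump_search_key(fields["pc"]))
```

The search only finds a match if the PC lies in the address range that the dumped file was linked at.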
You can turn on listings in the compiler and tell the linker to produce a map file. The map file will give you the meaning of the absolute addresses up to the function where the problem occurs, while the listing will help you pinpoint the exact location of the exception within the function.
For example in gcc you can do
gcc -Wa,-a,-ad -c foo.c > foo.lst
to produce a listing in the file foo.lst.
-Wa, sends the following options to the assembler (gas).
-a tells gas to produce a listing on standard output.
-ad tells gas to omit debug directives, which would otherwise add a lot of clutter.
The option for the GNU linker to produce a map file is -M or --print-map. If you link with gcc you need to pass the option to the linker with an option starting with -Wl,, for example -Wl,-M.
Alternatively you could also run your application in the debugger (e.g. gdb) and look at the stack dump after the crash with the bt command.

Disassemble x86 code in a program at a specific address

Is there any x86 disassembler framework that can be used to analyze code from a specific address in a program, as in:
info = disassemble( startAddress , stopAddress)
It should show every instruction and its operands and any other info that is good for analysis but it should have also fast mode where it isn't so important to obtain that much info for each instruction, but only for some of them that can be specified.
Is GNU binutils not good enough? Here's how to do that with the objdump utility:
# Disassemble from virtual address 0x80000000 to 0x80000100
objdump -d program --start-address=0x80000000 --stop-address=0x80000100
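If the goal is something closer to the disassemble(startAddress, stopAddress) call from the question, one possible sketch is to wrap that objdump invocation in Python. The function names below are hypothetical; the only objdump options used are the ones shown above (-d, --start-address, --stop-address), and the output is returned as plain text rather than parsed.

```python
import subprocess


def objdump_range_command(binary, start, stop):
    """Build the objdump argument list for one address range."""
    return [
        "objdump", "-d", binary,
        "--start-address=0x%x" % start,
        "--stop-address=0x%x" % stop,
    ]


def disassemble_range(binary, start, stop):
    """Run objdump on the range and return its textual listing.

    Requires objdump on PATH; raises CalledProcessError if it fails.
    """
    cmd = objdump_range_command(binary, start, stop)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


cmd = objdump_range_command("program", 0x80000000, 0x80000100)
print(" ".join(cmd))
```

This gives no per-instruction operand information; for that, one of the disassembler libraries mentioned in the other answers would be needed.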
Google's protobuf uses libdisasm for that matter. Sad thing is that (judging from source code) it only supports ia32 and x86 and homepage states that "it is x86 specific and will not be expanded to include other CPU architectures". But since you didn't mention other archs, this library may be sufficient.
The cross-platform InstructionAPI suits this purpose, it will analyze the code and can print out the disassembly or provide a machine-independent view of the instructions for you to query. InstructionAPI is a shared library that you would link your code against.
http://www.paradyn.org/html/manuals.html
I would debug to the start point and look at the disassembly output of the debugger. A more brutal method is to disassemble it all and search for the function name in the disassembly file. objconv can do this, but it is slow on very big files.
Try BeaEngine

GCC: Compile to assembly with clarified correspondence to code?

When you break into the debugger in VS and open the disassembly window, each assembly fragment is displayed below its corresponding code section (more or less). GCC with -S outputs only the stripped-down assembly.
Is there an option in GCC to show some correspondence to the original code?
Source code is C++.
Compile your code with gcc -g, then disassemble with objdump -S yourfile. This will give you a disassembly interspersed with the source.
If you are asking about debugging, in gdb use the disassemble command with a /m (mixed) flag:
(gdb) disas /m main
would disassemble main with C++ code interspersed with assembler, assuming the code is available and you compiled with the -g flag.
Disassemble the object instead. The code normally given by -S is exactly what gcc generates for your code, without the startup code or other things that the linker puts together. Addendum: of course, having debug info in the object helps a lot.
gcc yourFile.C -S -fverbose-asm
Not exactly what you're looking for, but more useful than nothing.