How to map $eip in gdb to output of objdump -d?

I have an incomplete stacktrace which stops at a known library (linux i686 architecture). In order to ascertain the function last called, I am trying to map $eip as output by gdb, to an address within a file generated by "objdump -d library.so".
I thought I might be able to use the From address output from "info shared" within gdb, along with the $eip to calculate an offset, which I could then translate to an offset from the disassembly text section of the objdump -d output?
Not sure if this approach is sensible, but trying it in a simple test harness app with a shared library does not give me an address within the right function.
Any help much appreciated.

I thought I might be able to use the From address output from "info shared" within gdb, along with the $eip to calculate an offset, which I could then translate to an offset from the disassembly text section of the objdump -d output?
Yes, that is exactly what you need to do.
The From address in GDB's display tells you where the .text section of the shared library was loaded in memory.
Running readelf -S foo.so | grep '\.text' will tell you the address of .text (the Addr column) in foo.so itself. Subtract one from the other, and you get the relocation for that shared library (it will be page-aligned).
Now take the $eip from GDB, subtract the relocation, and you'll get an address that will match the output of nm and objdump for foo.so.
However, GDB will have already completed all of the above steps internally. If it wasn't able to deduce which function $eip ended up in, you shouldn't expect that performing these steps manually will produce any better result.
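To make the arithmetic above concrete, here is a minimal Python sketch. All three input values are made up for illustration and do not come from a real session; in practice they would be read from info shared, readelf -S foo.so and p $eip.

```python
def relocation(from_addr: int, text_addr: int) -> int:
    """Load bias: where .text landed in memory minus where the file places it."""
    return from_addr - text_addr


def file_address(eip: int, from_addr: int, text_addr: int) -> int:
    """Translate a runtime $eip into an address as printed by objdump -d or nm."""
    return eip - relocation(from_addr, text_addr)


# Hypothetical values, for illustration only.
from_addr = 0xb7fc5420   # "From" column of "info shared" in GDB
text_addr = 0x00000420   # address of .text reported by readelf -S foo.so
eip = 0xb7fc5a3c         # value of $eip when the program stopped

print(hex(relocation(from_addr, text_addr)))         # the load bias
print(hex(file_address(eip, from_addr, text_addr)))  # look this up in objdump -d
```

If the computed relocation is not page-aligned, one of the inputs was probably read from the wrong column or the wrong library.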

Related

nm versus gdb break

I am working on Ubuntu 14.04 LTS.
I have an executable file exec compiled from file.c. The file.c makes use of functions from a static library. For example, let's say that fubar() is a function from the static library that is used in file.c.
This is something that I have noticed.
nm exec | grep fubar gives a certain value.
(on my system and for my executable, 0808377f)
gdb ./exec and then break fubar gives a different value.
(on my system and for my executable, 0x8083785)
When I do a similar thing for another executable file (exec1, compiled from file1.c), both commands output the same value.
Both commands are supposed to output the same virtual address, aren't they? I am obviously missing something. Can someone explain what exactly is happening, and what the difference between the two commands is?

Barring unusual things like -fPIE, what is going on here is that the gdb command break function actually means "break after the function prologue for function". This way, arguments are set up properly by the time the breakpoint is hit.
If you want to break exactly at the first instruction of a function, use the * syntax, like:
(gdb) break *function
If you do this the addresses will probably match.
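As a quick check against the numbers quoted in the question, the gap between the two addresses can be computed directly. This Python snippet only does the subtraction; reading the result as the length of the skipped prologue is an assumption based on the explanation above, since the actual instructions of fubar were not shown.

```python
nm_addr = 0x0808377f      # address reported by nm for fubar
break_addr = 0x08083785   # address GDB reported for "break fubar"

# Number of bytes between the function's entry point and GDB's breakpoint.
skipped = break_addr - nm_addr
print(skipped)
```

A gap of only a few bytes is what you would expect if GDB stepped over a short prologue rather than resolving a different symbol.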

Measure static memory usage for C++ ported to embedded platform

I have created a small program as a proof-of-concept for a system which is to be implemented on an embedded platform. The program is written in C++11 with use of std and compiled to run on a laptop. The final program, which will be implemented later, is an embedded system. We do not have access to the compiler of the embedded platform.
I would like to know if there is a way to determine a program's static memory footprint (the size of the compiled binaries) in a sensible and comparable way when it is to be ported to an embedded platform.
The requirement is that the size of the binary is less than 10 kB.
Our binary has a size of 700 kB when compiled and stripped with the following flags:
g++ options: -Os -s -ffunction-sections -fdata-sections
linker options: -s -Wl,--gc-sections
strip libmodel.a -s -R .comment -R .gnu.version --strip-unneeded -R .note
It took up 4MB before we used strip and optimization options.
I am still way off, and it is not really that big a program. How can I justify a comparison in any way with an equivalent program on an embedded platform?

Note that the size of the binary can be a little deceptive: uninitialised variables (the .bss sections) will not necessarily take up physical space in the binary, as they are generally just noted as present without actually having any space given to them. The space is normally allocated by the OS loader when it runs your program.
objdump (http://www.gnu.org/software/binutils/) or perhaps elfdump or the elf tool chain (http://sourceforge.net/apps/trac/elftoolchain/) will help you determine the size of your various segments, data and text, as well as the size of individual functions, globals, etc. All these programs "look" into your compiled binary and extract a lot of information, such as the size of the .text and .data sections and the various symbols with their locations and sizes, and can even disassemble the .text section.
An example of using elfdump on an ELF image test.elf might be elfdump -z test.elf > output.txt. This will dump everything, including the text section disassembly. For example, from an elfdump on my system I saw:
Section #6: .text, type=NOBITS, addr=0x500, off=0x5f168
size=149404(0x2479c), link=0, info=0, align=16, entsize=1
flags=<WRITE,ALLOC,EXECINSTR>
Section #7: .text, type=NOBITS, addr=0x24c9c, off=0x5f168
size=362822(0x58946), link=0, info=0, align=4, entsize=1
flags=<WRITE,ALLOC,EXECINSTR,INCLUDE>
....
Section #9: .rodata, type=NOBITS, addr=0x7d5e4, off=0x5f168
size=7670(0x1df6), link=0, info=0, align=4, entsize=1
flags=<WRITE,ALLOC>
So I can see how much my code is taking up (the .text sections) and my read only data. Later in the file I then see...
Symbol table ".symtab"
  #  Value    Size  Bind  Type  Section  Name
---  -----    ----  ----  ----  -------  ----
218  0x7c090  130   LOC   FUNC  .text    IRemovedThisName
So I can see that my function IRemovedThisName takes 130 bytes. A quick script would allow you to list functions sorted by size and variables sorted by size. This could point you at places to optimize.
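A "quick script" of that kind could look roughly like the Python sketch below. It assumes the symbol listing comes from nm -S (address, size, type, name on each line) rather than from elfdump, because that format is easier to parse. Apart from IRemovedThisName, the sample symbols are hypothetical.

```python
def symbols_by_size(nm_output: str):
    """Parse `nm -S` style lines into (size, type, name) tuples, largest first."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split(None, 3)
        # Only lines that carry a size have four fields: address, size, type, name.
        if len(parts) != 4:
            continue
        _addr, size_hex, sym_type, name = parts
        if len(sym_type) != 1:
            continue  # not a symbol-type letter, so this is not an address/size line
        try:
            size = int(size_hex, 16)
        except ValueError:
            continue
        symbols.append((size, sym_type, name))
    return sorted(symbols, reverse=True)


# Hypothetical sample in the layout printed by `nm -S`.
sample = """\
0007c090 00000082 t IRemovedThisName
00070000 00000010 T small_function
0009a000 00000400 B big_buffer
         U printf
"""

for size, sym_type, name in symbols_by_size(sample):
    print(f"{size:8d} {sym_type} {name}")
```

To run it on a real binary you would feed it the text produced by nm -S on that file; undefined symbols and symbols without a recorded size are simply skipped.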
For a good example of objdump, try http://www.thegeekstuff.com/2012/09/objdump-examples/, specifically section 3, which shows you how to get the contents of the section headers using the -h option.
As to how the program will compare on two different platforms I think you will just have to compile on both platforms and compare the results you get from your obj/elfdump on each system - the results will depend on the system instruction set, how well each compiler can optimize, general hardware architecture differences etc.
If you don't have access to the embedded system, you might try using a cross-compiler, configured for your eventual target, on your laptop. This would give you a binary suited to the embedded platform and the tools to analyze the file (i.e. the cross-platform version of objdump). This would give you some ball-park figures for how the program would look on the eventual embedded system.
Hope this helps.
EDIT: This will also help: "How to get the size of a C function from inside a C program or with inline assembly?"

It turned out that the included libraries took up an enormous amount of space (as was pointed out in the comments). By removing these, it was possible to reduce the size to nearly nothing in combination with the following flags:
set(CMAKE_CXX_FLAGS "-Os -s -ffunction-sections -fdata-sections -DNO_STD -fno-rtti -fno-exceptions")
set(CMAKE_EXE_LINKER_FLAGS "-s -Wl,--gc-sections")
And stripping away any unnecessary code using:
strip libmodel.a -s -R .comment -R .gnu.version --strip-unneeded -R .note
The 4 MB could be reduced to 9.4 kB, which is below our limit.
In summary, std takes up a tremendous amount of space.

How may I determine the cxxabi before loading a shared object with dlopen()?

I would like to determine that I am loading a compatible binary before calling dlopen(). I want to determine the cxxabi level before I load the library.

You could scan the list of symbols used by the binary before opening it. I am not sure how to do this in a program, although you can read the source for readelf for hints.
Using readelf -d -s -W /usr/lib/libstdc++.so.6 | c++filt | less on a Linux system, I see some symbols marked like this: __gnu_cxx::__verbose_terminate_handler()@@CXXABI_1.3
However, I would probably just try dlopen() and if it returns NULL, use dlerror() to report an error, then let the user figure it out.
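If you do want to scan before loading, one rough approach is to collect the CXXABI version tags from the readelf output and compare them yourself. The Python sketch below only does the text extraction; the sample lines after the first one use hypothetical symbol names, and whether comparing these tags is enough to establish compatibility is an assumption you would need to verify for your toolchain.

```python
import re


def cxxabi_versions(readelf_output: str):
    """Return the distinct numeric CXXABI_x.y[.z] tags in the text, sorted ascending."""
    found = set(re.findall(r"CXXABI_(\d+(?:\.\d+)*)", readelf_output))
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


# Sample text in the style of `readelf -d -s -W ... | c++filt` output.
# Only the first symbol name comes from the answer above; the rest are made up.
sample = """\
__gnu_cxx::__verbose_terminate_handler()@@CXXABI_1.3
example_symbol_a@@CXXABI_1.3.6
example_symbol_b@@CXXABI_1.3.8
example_symbol_c@@GLIBCXX_3.4
"""

versions = cxxabi_versions(sample)
print(versions)       # all tags seen
print(versions[-1])   # the highest one
```

Running the same extraction on both the library you want to load and the libstdc++ already in your process gives you two lists to compare before calling dlopen().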

How to map PC (ARMv5) address to source code?

I'm developing on an ARM9E processor running Linux. Sometimes my application crashes with the following message :
[ 142.410000] Alignment trap: rtspserverd (996) PC=0x4034f61c
Instr=0xe591300c Address=0x0000000d FSR 0x001
How can I translate the PC address to actual source code? In other words, how can I make sense out of this message?

With objdump. Dump your executable, then search for 4034f61c:.
The -x, --disassemble, and -l options are particularly useful.

You can turn on listings in the compiler and tell the linker to produce a map file. The map file will give you the meaning of the absolute addresses up to the function where the problem occurs, while the listing will help you pinpoint the exact location of the exception within the function.
For example in gcc you can do
gcc -Wa,-a,-ad -c foo.c > foo.lst
to produce a listing in the file foo.lst.
-Wa, sends the following options to the assembler (gas).
-a tells gas to produce a listing on standard output.
-ad tells gas to omit debug directives, which would otherwise add a lot of clutter.
The option for the GNU linker to produce a map file is -M or --print-map. If you link with gcc you need to pass the option to the linker with an option starting with -Wl,, for example -Wl,-M.
Alternatively you could also run your application in the debugger (e.g. gdb) and look at the stack dump after the crash with the bt command.

How to disassemble a memory range with GDB?

I'm trying to disassemble a program to see a syscall assembly instruction (the INT instruction, I believe) and the handler with GDB and have written a little program (see below) for it that opens and closes a file.
I was able to follow the call to fopen with GDB until it executed a call.
When I tried to tell GDB "disassemble 0x...." (address of call) it responded with 'No function contains specified address.'
Is it possible to force GDB to disassemble (or display it in assembler as good as possible) that memory address? If so, how?
#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE* f;
    f = fopen("main.c", "r");
    if (!f) {
        perror("open");
        return -1;
    }
    fclose(f);
    return 0;
}

Yeah, disassemble is not the best command to use here.
The command you want is "x/i" (examine as instructions):
(gdb) x/i 0xdeadbeef

Do you only want to disassemble your actual main? If so try this:
(gdb) info line main
(gdb) disas STARTADDRESS ENDADDRESS
Like so:
USER@MACHINE /cygdrive/c/prog/dsa
$ gcc-3.exe -g main.c
USER@MACHINE /cygdrive/c/prog/dsa
$ gdb a.exe
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
...
(gdb) info line main
Line 3 of "main.c" starts at address 0x401050 <main> and ends at 0x401075 <main+
(gdb) disas 0x401050 0x401075
Dump of assembler code from 0x401050 to 0x401075:
0x00401050 <main+0>: push %ebp
0x00401051 <main+1>: mov %esp,%ebp
0x00401053 <main+3>: sub $0x18,%esp
0x00401056 <main+6>: and $0xfffffff0,%esp
0x00401059 <main+9>: mov $0x0,%eax
0x0040105e <main+14>: add $0xf,%eax
0x00401061 <main+17>: add $0xf,%eax
0x00401064 <main+20>: shr $0x4,%eax
0x00401067 <main+23>: shl $0x4,%eax
0x0040106a <main+26>: mov %eax,-0xc(%ebp)
0x0040106d <main+29>: mov -0xc(%ebp),%eax
0x00401070 <main+32>: call 0x4010c4 <_alloca>
End of assembler dump.
I don't see your system interrupt call, however. (It's been a while since I last tried to make a system call in assembly; INT 21h though, last I recall.)

This isn't the direct answer to your question, but since you seem to just want to disassemble the binary, perhaps you could just use objdump:
objdump -d program
This should give you its disassembly. You can add -S if you want it source-annotated.

You can force gcc to output directly to assembly code by adding the -S switch
gcc -S hello.c

fopen() is a C library function and so you won't see any syscall instructions in your code, just a regular function call. At some point, it does call open(2), but it does that via a trampoline. There is simply a jump to the VDSO page, which is provided by the kernel to every process. The VDSO then provides code to make the system call. On modern processors, the SYSCALL or SYSENTER instructions will be used, but you can also use INT 80h on x86 processors.

If all you want is to see the disassembly containing the INT instruction, use objdump -d as someone mentioned, but compile with the -static option. Otherwise the fopen function is not compiled into the ELF file and is linked at runtime.

GDB's disassemble command has a /m modifier to include source code alongside the instructions. This is the equivalent of objdump -S, with the extra benefit of confining the output to just the one function (or address range) of interest.

You don't have to use gdb. GCC will do it.
gcc -S foo.c
This will create foo.s which is the assembly.
gcc -m32 -c -g -Wa,-a,-ad foo.c > foo.lst
The above version will create a listing file that has both the C and the assembly generated by it. GCC FAQ

Full example of disassembling a memory range together with its C source:
/opt/gcc-arm-none-eabi-9-2019-q4-major/bin/arm-none-eabi-gdb
(gdb) file /root/ncs/zephyr/samples/hello_world/build_nrf9160dk_nrf9160ns/zephyr/zephyr.elf
(gdb) directory /root/ncs/zephyr/samples/hello_world/src
# Step 1: look up the source line for the address
(gdb) info line *0x000328C0
# Step 2: disassemble with source; -0x04 ~ +0x04 is your range size
(gdb) disassemble /m 0x000328C0-0x04, 0x000328C0+0x04
# The same range, with the raw instruction bytes
(gdb) disassemble /r 0x000328C0-0x04, 0x000328C0+0x04
(gdb) info thread
(gdb) interpreter-exec mi -thread-info

The accepted answer is not really correct. It does work in some circumstances.
(gdb) disas STARTADDRESS ENDADDRESS
The highest upvoted answer is correct. Read no further if you don't wish to understand why it is correct.
(gdb) x/i 0xdeadbeef
With an appropriately meaningless hex address.
I have an STM32 and I have relocated the code with PIC. The normal boot address is 0x8000000, with a 0x200 vector table. So a normal entry is 0x8000200. However, I have programmed the binary to 0x80040200 (two NOR flash sectors away) and wish to debug there.
The issue gdb has with this is that 'file foo.elf' shows the code as being in the first range. Special commands like 'disassemble' will actually look at the binary on the host. For the cross-debug case, gdb would have to look at memory on the remote, which could be expensive. So it appears that 'x/i' (examine as code) is the best option. The debug information that gdb depends on (where routines start and end) is not present in a random binary chunk.
To combine the answers above for PIC code on an embedded cross system,
You need to create multiple elf files, one for each possible target location. Use the GDB's file command to select the one with proper symbol locations.
This will NOT work for Cross development
You can use the approach from "generating gcc debug symbols". The steps are:
1. Build at the normal link address.
2. Extract the symbols.
3. Use symbol-file with an offset for the runtime address.
(gdb) help symbol-file
Load symbol table from executable file FILE.
Usage: symbol-file [-readnow | -readnever] [-o OFF] FILE
OFF is an optional offset which is added to each section address.
You can then switch symbol files to a relocated run address to use the first answer.
If you have a case where the code is relocated but the data is absolute, you need to link twice and choose the relocated elf files (only the symbols are relocated; the code is the same). This is desirable with NOR flash that is XIP (execute-in-place), as the memory devices for .text and .rodata are different from those for .data and .bss. This is the case on many lower-to-middle scale embedded devices. However, gcc does not support this code generation option (at least on ARM). You must use a 'static base' register (for example, r9 as u-boot does).

There is another way, which I wanted to present, of using gdb on top of the suggestions above:
Launch your program with gdb, set a breakpoint on main (break *main), and run.
Then you can use info proc mappings.
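Once you have the mappings, finding the region that contains a given address is a simple range check. The Python sketch below assumes the text is in the /proc/<pid>/maps layout (start-end, permissions, file offset, device, inode, path), which is not the same column layout as GDB's info proc mappings output. The paths and addresses in the sample are hypothetical.

```python
def find_mapping(maps_text: str, address: int):
    """Return (path, offset_in_file) for the mapping containing address, else None."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= address < end:
            file_offset = int(fields[2], 16)
            path = fields[5].strip() if len(fields) > 5 else ""
            return path, address - start + file_offset
    return None


# Hypothetical sample in /proc/<pid>/maps format.
sample = """\
08048000-08049000 r-xp 00000000 08:01 131090     /home/user/a.out
b7e3a000-b7fe2000 r-xp 00000000 08:01 262200     /lib/libexample.so
bffdf000-c0000000 rw-p 00000000 00:00 0          [stack]
"""

hit = find_mapping(sample, 0xb7e3b234)
print(hit[0], hex(hit[1]))
```

The returned value is an offset into the file. It lines up with the addresses printed by objdump -d only when the mapped segment's file offset and virtual address coincide, so treat it as a starting point rather than a guaranteed match.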
