Member variable allocated at start of memory - c++

I'm trying to use C++ on an STM32 device, compiling with gcc. The device loads the code and starts executing it, but hard faults on any member variable write.
I can see with GDB that member variables are stored at the beginning of memory (0x7 to be specific); of course the STM32 hard faults at the first write to that location.
I can see that the BSS section is not generated unless I declare a variable in main (I used readelf on the final ELF file).
Shouldn't member variables be placed in .bss?
I'm compiling and linking with -nostdlib -mcpu=cortex-m0plus -fno-exceptions -O0 -g.
The linker script is:
ENTRY(start_of_memory);
MEMORY {
    rom (rx)  : ORIGIN = 0x08000000, LENGTH = 16K
    ram (xrw) : ORIGIN = 0x20000000, LENGTH = 2K
}
SECTIONS {
    .text : {
        *(.text)
    } > rom
    .data : {
        *(.data)
        *(.data.*)
    } > ram
    .bss : {
        *(.bss)
        *(.bss.*)
        *(COMMON)
    } > ram
}
The output of readelf (no variables declaration, only object usage):
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x8000000
Start of program headers: 52 (bytes into file)
Start of section headers: 76536 (bytes into file)
Flags: 0x5000200, Version5 EABI, soft-float ABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 14
Section header string table index: 13
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 08000000 010000 0005a8 00 AX 0 0 4
[ 2] .rodata PROGBITS 080005a8 0105a8 00005c 00 A 0 0 4
[ 3] .ARM.attributes ARM_ATTRIBUTES 00000000 010604 00002d 00 0 0 1
[ 4] .comment PROGBITS 00000000 010631 000049 01 MS 0 0 1
[ 5] .debug_info PROGBITS 00000000 01067a 000a93 00 0 0 1
[ 6] .debug_abbrev PROGBITS 00000000 01110d 0003b8 00 0 0 1
[ 7] .debug_aranges PROGBITS 00000000 0114c5 000060 00 0 0 1
[ 8] .debug_line PROGBITS 00000000 011525 000580 00 0 0 1
[ 9] .debug_str PROGBITS 00000000 011aa5 000416 01 MS 0 0 1
[10] .debug_frame PROGBITS 00000000 011ebc 000228 00 0 0 4
[11] .symtab SYMTAB 00000000 0120e4 000640 10 12 86 4
[12] .strtab STRTAB 00000000 012724 000344 00 0 0 1
[13] .shstrtab STRTAB 00000000 012a68 00008f 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
y (purecode), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x010000 0x08000000 0x08000000 0x00604 0x00604 R E 0x10000
Section to Segment mapping:
Segment Sections...
00 .text .rodata
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
Symbol table '.symtab' contains 100 entries:
Main (init platform probably does not use any variables):
int main(void) {
    init_platform(SPEED_4_MHz);
    gpio testpin(GPIO_A, 5);
    testpin.dir(MODE_OUTPUT);
    while (1) {
        testpin.high();
        wait();
        testpin.low();
        wait();
    }
    return 0;
}
Update #1:
The vector table is at the beginning of memory; sp and msp are initialized successfully.
(gdb) p/x *0x00000000
$2 = 0x20000700
(gdb) p/x *0x00000004
$3 = 0x80000f1
(gdb) info registers
sp 0x20000700 0x20000700
lr 0xffffffff -1
pc 0x80000f6 0x80000f6 <main()+6>
xPSR 0xf1000000 -251658240
msp 0x20000700 0x20000700
psp 0xfffffffc 0xfffffffc
Putting a breakpoint on a constructor of the GPIO class, I can see that variables are at 0x00000XXX:
Breakpoint 2, gpio::gpio (this=0x7, port=0 '\000', pin=5 '\005') at gpio.cpp:25
25 mypin = pin;
(gdb) p/x &mypin
$6 = 0xb
I tried making mypin a public member variable (it was private); it made no difference.
I'm starting to think that dynamic allocation is needed with C++.

Address 0x7 is in the initial vector table in ROM, it is not writeable.
Unfortunately you don't have a section to populate the vector table, so this code is never going to work. You also don't appear to have a stack, which is where the members of gpio would be placed (because it is defined inside a function without the static keyword).
Start by taking the linker script provided as part of the STM32Cube package and then (if you must) modify it a little bit at a time until you break it. Then you will know what you have broken. It is not reasonable to write such a naïve linker script as this and expect it to work on a microcontroller.

of course the STM32 hard faults at the first write of that location.
An STM32 does not "fault" if you try to write to FLASH. The write will simply have no effect.
You need to have a vector table at the beginning of the FLASH memory. It has to contain, at a minimum, a valid stack pointer address and the firmware entry point.
Your linker script and your code (I understand you do not use any ST-supplied startup code) are far from sufficient.
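As a rough illustration of what those first two vector table entries look like, the following Python sketch decodes them from a raw image and checks them against the memory regions in the question's linker script. The helper name check_vector_table and the checks it performs are assumptions made for this example, not part of any ST tooling; the sample words are the values GDB printed in the question's update.

```python
import struct

# Memory regions taken from the linker script in the question.
ROM_START, ROM_SIZE = 0x08000000, 16 * 1024
RAM_START, RAM_SIZE = 0x20000000, 2 * 1024


def check_vector_table(image: bytes) -> dict:
    """Decode the first two vector table words of a Cortex-M image.

    Word 0 is the initial main stack pointer, word 1 is the reset vector.
    Hypothetical helper, for illustration only.
    """
    if len(image) < 8:
        raise ValueError("image too short to hold SP and reset vector")
    initial_sp, reset = struct.unpack_from("<II", image, 0)
    return {
        "initial_sp": initial_sp,
        "reset_vector": reset,
        # A descending stack may start at the very end of RAM.
        "sp_in_ram": RAM_START < initial_sp <= RAM_START + RAM_SIZE,
        # Cortex-M executes Thumb code only, so bit 0 of the vector is set.
        "reset_is_thumb": bool(reset & 1),
        "reset_in_rom": ROM_START <= (reset & ~1) < ROM_START + ROM_SIZE,
    }


# The two words GDB showed at 0x00000000 and 0x00000004 in the question.
sample = struct.pack("<II", 0x20000700, 0x080000F1)
result = check_vector_table(sample)
for key, value in result.items():
    print(key, hex(value) if type(value) is int else value)
```

Running the same check on the first 8 bytes of a binary dumped from the ELF file is a quick way to confirm that the table actually ended up at the start of flash.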
My advice:
Create the project using STM32Cube.
Then see how it should be done.
With this knowledge you can start to reinvent the wheel.

The issue was in the launch script:
Not working:
toolchain\bin\arm-none-eabi-gdb.exe ^
-ex "target remote 127.0.0.1:3333" ^
-ex "load" ^
-ex "b main" ^
-ex "b unmanaged_isr_call" ^
-ex "b hard_fault_isr" ^
-ex "j main" binaries\main.elf
Working:
toolchain\bin\arm-none-eabi-gdb.exe ^
-ex "target remote 127.0.0.1:3333" ^
-ex "load" ^
-ex "b unmanaged_isr_call" ^
-ex "b hard_fault_isr" ^
-ex "set $pc = &main" binaries\main.elf
That made it work.
The issue was in j main.
The jump command does not modify the stack frame, which is where the compiler places all the objects.
With set $pc, execution starts at exactly the given address; with jump, execution starts at the first C line after the address, skipping the function prologue. A big difference!
From the gdb jump documentation:
The jump command does not change the current stack frame, or the stack pointer, or the contents of any memory location or any register other than the program counter. If locspec resolves to an address in a different function from the one currently executing, the results may be bizarre if the two functions expect different patterns of arguments or of local variables. For this reason, the jump command requests confirmation if the jump address is not in the function currently executing. However, even bizarre results are predictable if you are well acquainted with the machine-language code of your program.
The first instructions of main make space on the stack for the objects "created" by main, space that the objects need during execution (verified by launching both commands and seeing different msp values at the first C line).
With jump, those instructions are not executed and the space is not allocated on the stack: when the code calls a function, the parameters overwrite member data.

Related

C++ const char* messes up the multiboot header

This is weird.
So I'm trying to make a little kernel and I decided to use C++ for this. I did everything and I now have an (almost) working VGA Text Mode Driver. Why almost? Because whenever I pass the write method a const char* the multiboot header literally disappears.
And after a bit of fiddling I realized that ANY const char* use makes it go bonkers. Even just a variable.
The weird thing is that if I never create a const char* it just works. I can print individual characters too.
Note: I based this on the Bare Bones tutorial on OSDev.
Here's the relevant code:
# Main.asm
MBALIGN  equ 1 << 0
MEMINFO  equ 1 << 1
FLAGS    equ MBALIGN | MEMINFO
MAGIC    equ 0x1BADB002
CHECKSUM equ -(MAGIC + FLAGS)

section .multiboot
align 4
    dd MAGIC
    dd FLAGS
    dd CHECKSUM

section .bss
align 16
stack_bottom:
    resb 16384
stack_top:

section .text
global _start:function (_start.end - _start)
_start:
    mov esp, stack_top
    extern kernel_main
    call kernel_main
    cli
.hang: hlt
    jmp .hang
.end:
// Main.cpp
void init() {
    Drivers::VGA vga;
    vga.putc('h');
    vga.write("hello", 5);
}

extern "C" void kernel_main() {
    init();
}

// Part of VGA.cpp
void VGA::write(const char* data, size_t size) {
    for (size_t i = 0; i < size; i += 1) {
        s_buffer[i] = vga_entry(data[i], _color);
    }
}
[...]
u16 VGA::vga_entry(unsigned char c, u8 color) {
    return (u16)c | (u16)color << 8;
}
# Linker.ld
ENTRY(_start)
SECTIONS {
    . = 1M;
    .text : ALIGN(4K) {
        KEEP(*(.multiboot))
        *(.text)
    }
    .rodata : ALIGN(4K) {
        *(.rodata)
    }
    .data : ALIGN(4K) {
        *(.data)
    }
    .bss : ALIGN(4K) {
        *(COMMON)
        *(.bss)
    }
}
Compiler Options: -target i686-pc-elf -c -IKernel -ffreestanding -nostdlib++ -fno-exceptions -fno-rtti -fno-stack-protector -m32 -fno-use-cxa-atexit
Toolchain: Clang, Nasm and ld.lld
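The problem is that ld.lld has a bug(?) or something. It puts the read-only data section before .text, so the multiboot header is no longer at the start of the image and the bootloader does not see it. Note that the section in question is named .rodata.str1.1, which the *(.rodata) pattern in the linker script does not match, so the linker is presumably placing it as an orphan section; changing the pattern to *(.rodata*) might avoid the problem.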
Here's the output of readelf -S Kernel with ld.lld
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .rodata.str1.1 PROGBITS 00100000 001000 000001 01 AMS 0 0 1
[ 2] .text PROGBITS 00101000 002000 0002fe 00 AX 0 0 4096
[ 3] .data PROGBITS 00102000 003000 000004 00 WA 0 0 4096
[ 4] .bss NOBITS 00103000 003004 004000 00 WA 0 0 4096
[ 5] .comment PROGBITS 00000000 003004 000029 01 MS 0 0 1
[ 6] .symtab SYMTAB 00000000 003030 0001a0 10 8 14 4
[ 7] .shstrtab STRTAB 00000000 0031d0 000044 00 0 0 1
[ 8] .strtab STRTAB 00000000 003214 00018f 00 0 0 1
And here's the output with system's ld (GNU)
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00100000 001000 0002fe 00 AX 0 0 4096
[ 2] .rodata.str1.1 PROGBITS 001002fe 0012fe 000001 01 AMS 0 0 1
[ 3] .data PROGBITS 00101000 002000 000004 00 WA 0 0 4096
[ 4] .bss NOBITS 00102000 002004 004000 00 WA 0 0 4096
[ 5] .comment PROGBITS 00000000 002004 000015 01 MS 0 0 1
[ 6] .symtab SYMTAB 00000000 00201c 0001f0 10 7 19 4
[ 7] .strtab STRTAB 00000000 00220c 00018f 00 0 0 1
[ 8] .shstrtab STRTAB 00000000 00239b 000044 00 0 0 1
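To make the effect of that layout difference concrete, here is a small Python sketch of the header search a Multiboot 1 loader performs: the header has to sit, 4-byte aligned, entirely within the first 8192 bytes of the image file. The function name and the two synthetic images are assumptions for illustration; the offsets 0x1000 and 0x2000 are the .text file offsets from the two readelf listings above.

```python
import struct

MULTIBOOT_MAGIC = 0x1BADB002
SEARCH_LIMIT = 8192  # Multiboot 1: header must lie within the first 8 KiB


def find_multiboot_header(image: bytes):
    """Return the file offset of a valid Multiboot 1 header, or None."""
    limit = min(len(image), SEARCH_LIMIT)
    # The three mandatory fields take 12 bytes and must be 4-byte aligned.
    for off in range(0, limit - 11, 4):
        magic, flags, checksum = struct.unpack_from("<III", image, off)
        if magic == MULTIBOOT_MAGIC and (magic + flags + checksum) & 0xFFFFFFFF == 0:
            return off
    return None


# Same header values as Main.asm: FLAGS = MBALIGN | MEMINFO = 3.
flags = 3
checksum = -(MULTIBOOT_MAGIC + flags) & 0xFFFFFFFF
header = struct.pack("<III", MULTIBOOT_MAGIC, flags, checksum)

# Synthetic stand-ins for the two kernels: .text (which starts with the
# header) at file offset 0x1000 with GNU ld and at 0x2000 with ld.lld.
gnu_ld_like = bytes(0x1000) + header + bytes(0x100)
lld_like = bytes(0x2000) + header + bytes(0x100)

print(find_multiboot_header(gnu_ld_like))  # offset of the header
print(find_multiboot_header(lld_like))     # None: header starts at byte 8192
```

With .text pushed to file offset 0x2000, the header begins exactly at byte 8192, just past the range a loader is required to scan.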

"ld: warning: cannot find entry symbol _start; defaulting to 0000000000401000" with custom *.o file

I've compiled a simple object file and tried to link it with ld, but it gave that warning. However, the file does have a _start symbol. Here's the readelf output for the object.
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x40
Start of program headers: 0 (bytes into file)
Start of section headers: 59392 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 6
Section header string table index: 5
(...)
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00000040
000000000000005c 0000000000000000 AX 0 0 1
[ 2] .data PROGBITS 0000000000000000 00001040
0000000000001000 0000000000000000 WA 0 0 8
[ 3] .symtab SYMTAB 0000000000000000 00003400
0000000000000030 0000000000000018 4 2 8
[ 4] .strtab STRTAB 0000000000000000 00003800
0000000000000400 0000000000000000 0 0 1
[ 5] .shstrtab STRTAB 0000000000000000 00003000
0000000000000400 0000000000000000 0 0 1
(...)
Symbol table '.symtab' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 87 FUNC GLOBAL DEFAULT 1 _start
What could be the problem here?
So, I found the problem. The sh_info field of the symbol table's section header must be the index of the _start function in the symbol table. For some reason the linker changes that value later, but it worked just fine!
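For context, the ELF specification defines sh_info of a SHT_SYMTAB section as one greater than the index of the last local symbol, which is the index of the first non-local symbol when locals come first. A minimal Python sketch of that rule, using a hypothetical helper name and the two-entry symbol table from the question:

```python
# Symbol binding values from the ELF specification.
STB_LOCAL, STB_GLOBAL = 0, 1


def symtab_sh_info(bindings):
    """Return the sh_info value for a symbol table.

    `bindings` lists the binding of each symbol in table order; local
    symbols are expected to come before all non-local ones.
    Hypothetical helper, for illustration only.
    """
    for index, binding in enumerate(bindings):
        if binding != STB_LOCAL:
            return index
    # Every symbol is local: sh_info equals the number of symbols.
    return len(bindings)


# Table from the question: entry 0 is the null symbol (LOCAL),
# entry 1 is _start (GLOBAL).
print(symtab_sh_info([STB_LOCAL, STB_GLOBAL]))  # 1
```

The readelf listing in the question shows Info = 2 for .symtab, which by this rule would mark both entries, including _start, as local; that is consistent with the linker not finding a global _start.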

Hook a statically linked ELF binary

I have an application that is an ELF binary with OpenSSL statically linked in, and I want to hook some of its OpenSSL functions to get the pre-master key, which would allow me to decrypt the connections using Wireshark.
I know how to hook a shared library with LD_PRELOAD or LD_LIBRARY_PATH, but this is a statically linked binary.
Fortunately, the debug symbols were not stripped from the binary, so all the named functions I want to hook can be identified.
What do I have to do to hook this statically linked ELF?
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x80ceae0
Start of program headers: 52 (bytes into file)
Start of section headers: 3285112 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 8
Size of section headers: 40 (bytes)
Number of section headers: 28
Section header string table index: 27
Program Headers:
Elf file type is EXEC (Executable file)
Entry point 0x80ceae0
There are 8 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
INTERP 0x000134 0x08048134 0x08048134 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x309507 0x309507 R E 0x1000
LOAD 0x309520 0x08352520 0x08352520 0x13168 0x29934 RW 0x1000
DYNAMIC 0x31c0fc 0x083650fc 0x083650fc 0x00100 0x00100 RW 0x4
NOTE 0x000148 0x08048148 0x08048148 0x00020 0x00020 R 0x4
GNU_EH_FRAME 0x2ccc30 0x08314c30 0x08314c30 0x0a06c 0x0a06c R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame .gcc_except_table
03 .data .dynamic .ctors .dtors .jcr .got .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
Symbol Table:
...
8627: 081ddbb0 408 FUNC GLOBAL DEFAULT 12 SSL_free
8629: 081de360 190 FUNC GLOBAL DEFAULT 12 SSL_copy_session_id
8665: 081deba0 148 FUNC GLOBAL DEFAULT 12 SSL_get_shared_ciphers
8848: 081df2f0 17 FUNC GLOBAL DEFAULT 12 SSL_CTX_set_default_passw
8927: 081e03a0 42 FUNC GLOBAL DEFAULT 12 SSL_CTX_set_cert_store
8996: 081de2d0 94 FUNC GLOBAL DEFAULT 12 SSL_get_peer_certificate
9079: 081e0250 14 FUNC GLOBAL DEFAULT 12 SSL_get_verify_result
9130: 081e52e0 269 FUNC GLOBAL DEFAULT 12 SSL_CTX_use_RSAPrivateKey
9193: 081e0f70 20 FUNC GLOBAL DEFAULT 12 SSL_SESSION_get_ex_data
9266: 081e0230 17 FUNC GLOBAL DEFAULT 12 SSL_set_verify_result
9305: 081df350 17 FUNC GLOBAL DEFAULT 12 SSL_CTX_set_verify_depth
9394: 081de230 14 FUNC GLOBAL DEFAULT 12 SSL_CTX_get_verify_depth
9409: 081e1840 36 FUNC GLOBAL DEFAULT 12 SSL_CTX_remove_session
9590: 081e3390 63 FUNC GLOBAL DEFAULT 12 SSL_rstate_string
9655: 081df8c0 122 FUNC GLOBAL DEFAULT 12 SSL_set_ssl_method
9662: 081e0360 20 FUNC GLOBAL DEFAULT 12 SSL_CTX_get_ex_data
9691: 081de330 38 FUNC GLOBAL DEFAULT 12 SSL_get_peer_cert_chain
9696: 081e0d20 20 FUNC GLOBAL DEFAULT 12 SSL_CTX_set_client_CA_lis
9798: 081e0d50 68 FUNC GLOBAL DEFAULT 12 SSL_get_client_CA_list
9810: 081de6f0 138 FUNC GLOBAL DEFAULT 12 SSL_write
...
You'll have to use GDB with a breakpoint command (perhaps involving Python scripting), or Systemtap. There is no direct way to interpose functions which are not listed in the .dynsym section (which is of course missing due to static linking).

gdb add-symbol-file all sections and load address

I'm debugging a boot loader (syslinux) with gdb and the gdb stub of qemu. At some point the main file loads a shared object, ldlinux.elf.
I would like to add the symbols for that file in gdb. The command add-symbol-file seems like the way to go. However, as it is a relocatable file, I have to specify the memory address it has been loaded at. And here comes the problem.
Although I know the base address at which the LOAD segment has been loaded, add-symbol-file works section-wise and wants me to specify the address at which each section has been loaded.
Can I tell gdb to load all the symbols of all the sections provided that I specify the base address of the file in memory?
Does the behavior of gdb make sense? The section headers aren't used for running an ELF and are even optional. I can't see a use case where specifying the load address of the sections would be useful.
Example
Here are the program headers and section headers of the shared object.
Elf file type is DYN (Shared object file)
Entry point 0x4c60
There are 3 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00000000 0x00000000 0x1db10 0x20bfc RWE 0x1000
DYNAMIC 0x01d618 0x0001d618 0x0001d618 0x00098 0x00098 RW 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10
Section to Segment mapping:
Segment Sections...
00 .gnu.hash .dynsym .dynstr .rel.dyn .rel.plt .plt .text .rodata .ctors .dtors .data.rel.ro .dynamic .got .got.plt .data .bss
01 .dynamic
02
There are 29 section headers, starting at offset 0x78618:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .gnu.hash GNU_HASH 00000094 000094 0007e0 04 A 2 0 4
[ 2] .dynsym DYNSYM 00000874 000874 0015c0 10 A 3 1 4
[ 3] .dynstr STRTAB 00001e34 001e34 0010f4 00 A 0 0 1
[ 4] .rel.dyn REL 00002f28 002f28 000ce8 08 A 2 0 4
[ 5] .rel.plt REL 00003c10 003c10 000568 08 AI 2 6 4
[ 6] .plt PROGBITS 00004180 004180 000ae0 04 AX 0 0 16
[ 7] .text PROGBITS 00004c60 004c60 013816 00 AX 0 0 4
[ 8] .rodata PROGBITS 00018480 018480 00462f 00 A 0 0 32
[ 9] .ctors INIT_ARRAY 0001cab0 01cab0 000010 00 WA 0 0 4
[10] .dtors FINI_ARRAY 0001cac0 01cac0 000004 00 WA 0 0 4
[11] .data.rel.ro PROGBITS 0001cae0 01cae0 000b38 00 WA 0 0 32
[12] .dynamic DYNAMIC 0001d618 01d618 000098 08 WA 3 0 4
[13] .got PROGBITS 0001d6b0 01d6b0 0000d0 04 WA 0 0 4
[14] .got.plt PROGBITS 0001d780 01d780 0002c0 04 WA 0 0 4
[15] .data PROGBITS 0001da40 01da40 0000d0 00 WA 0 0 32
[16] .bss NOBITS 0001db20 01db10 0030dc 00 WA 0 0 32
[17] .comment PROGBITS 00000000 01db10 000026 01 MS 0 0 1
[18] .debug_aranges PROGBITS 00000000 01db38 0010c0 00 0 0 8
[19] .debug_info PROGBITS 00000000 01ebf8 021ada 00 0 0 1
[20] .debug_abbrev PROGBITS 00000000 0406d2 009647 00 0 0 1
[21] .debug_line PROGBITS 00000000 049d19 00bd3a 00 0 0 1
[22] .debug_frame PROGBITS 00000000 055a54 004574 00 0 0 4
[23] .debug_str PROGBITS 00000000 059fc8 00538c 01 MS 0 0 1
[24] .debug_loc PROGBITS 00000000 05f354 01312d 00 0 0 1
[25] .debug_ranges PROGBITS 00000000 072481 0005d0 00 0 0 1
[26] .shstrtab STRTAB 00000000 072a51 000101 00 0 0 1
[27] .symtab SYMTAB 00000000 072b54 003530 10 28 504 4
[28] .strtab STRTAB 00000000 076084 002593 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
If I try to load the file at the address 0x7fab000 then it will relocate the symbols so that the .text section starts at 0x7fab000.
(gdb) add-symbol-file bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000
add symbol table from file "bios/com32/elflink/ldlinux/ldlinux.elf" at
.text_addr = 0x7fab000
(y or n) y
Reading symbols from bios/com32/elflink/ldlinux/ldlinux.elf...done.
And then all the symbols are off by 0x4c60 bytes.
So, finally, I made my own command with Python and the readelf tool. It's not very clean, since it runs readelf in a subprocess and parses its output instead of parsing the ELF file directly, but it works (for 32-bit ELF files only).
It uses the section headers to generate and run an add-symbol-file command with all the sections correctly relocated. The usage is pretty simple: you give it the ELF file and the base address of the file. And since remove-symbol-file wasn't working properly when given just the filename, I made a remove-symbol-file-all command that generates and runs the right remove-symbol-file -a address command.
(gdb) add-symbol-file-all bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000
add symbol table from file "bios/com32/elflink/ldlinux/ldlinux.elf" at
.text_addr = 0x7fafc50
.gnu.hash_addr = 0x7fab094
.dynsym_addr = 0x7fab874
.dynstr_addr = 0x7face34
.rel.dyn_addr = 0x7fadf28
.rel.plt_addr = 0x7faec08
.plt_addr = 0x7faf170
.rodata_addr = 0x7fc34e0
.ctors_addr = 0x7fc7af0
.dtors_addr = 0x7fc7b00
.data.rel.ro_addr = 0x7fc7b20
.dynamic_addr = 0x7fc8658
.got_addr = 0x7fc86f0
.got.plt_addr = 0x7fc87bc
.data_addr = 0x7fc8a80
.bss_addr = 0x7fc8b60
(gdb) remove-symbol-file-all bios/com32/elflink/ldlinux/ldlinux.elf 0x7fab000
Here is the code to be added in the .gdbinit file.
python
import subprocess
import re

def relocatesections(filename, addr):
    # Parse the output of "readelf -S" into a list of section dicts and
    # return it together with the link-time address of .text.
    p = subprocess.Popen(["readelf", "-S", filename], stdout = subprocess.PIPE)
    sections = []
    textaddr = '0'
    for line in p.stdout.readlines():
        line = line.decode("utf-8").strip()
        # Keep only section rows such as "[ 7] .text PROGBITS ...".
        if not line.startswith('[') or line.startswith('[Nr]'):
            continue
        line = re.sub(r' +', ' ', line)
        # Turn "[ 7]" into "7" so the row splits cleanly on spaces.
        line = re.sub(r'\[ *(\d+)\]', r'\g<1>', line)
        fieldsvalue = line.split(' ')
        fieldsname = ['number', 'name', 'type', 'addr', 'offset', 'size', 'entsize', 'flags', 'link', 'info', 'addralign']
        sec = dict(zip(fieldsname, fieldsvalue))
        if sec['number'] == '0':
            continue
        sections.append(sec)
        if sec['name'] == '.text':
            textaddr = sec['addr']
    return (textaddr, sections)

class AddSymbolFileAll(gdb.Command):
    """The right version for add-symbol-file"""

    def __init__(self):
        super(AddSymbolFileAll, self).__init__("add-symbol-file-all", gdb.COMMAND_USER)
        self.dont_repeat()

    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        filename = argv[0]
        if len(argv) > 1:
            offset = int(str(gdb.parse_and_eval(argv[1])), 0)
        else:
            offset = 0
        (textaddr, sections) = relocatesections(filename, offset)
        # .text goes first as the positional address; every other loaded
        # section is passed with "-s name address".
        cmd = "add-symbol-file %s 0x%08x" % (filename, int(textaddr, 16) + offset)
        for s in sections:
            addr = int(s['addr'], 16)
            # Sections with address 0 (debug info etc.) are not loaded.
            if s['name'] == '.text' or addr == 0:
                continue
            cmd += " -s %s 0x%08x" % (s['name'], addr + offset)
        gdb.execute(cmd)

class RemoveSymbolFileAll(gdb.Command):
    """The right version for remove-symbol-file"""

    def __init__(self):
        super(RemoveSymbolFileAll, self).__init__("remove-symbol-file-all", gdb.COMMAND_USER)
        self.dont_repeat()

    def invoke(self, arg, from_tty):
        argv = gdb.string_to_argv(arg)
        filename = argv[0]
        if len(argv) > 1:
            offset = int(str(gdb.parse_and_eval(argv[1])), 0)
        else:
            offset = 0
        (textaddr, _) = relocatesections(filename, offset)
        cmd = "remove-symbol-file -a 0x%08x" % (int(textaddr, 16) + offset)
        gdb.execute(cmd)

AddSymbolFileAll()
RemoveSymbolFileAll()
end
Can I tell gdb to load all the symbols of all the sections provided that I specify the base address of the file in memory?
Yes, but you need to provide the address of .text section, i.e. 0x7fab000+0x00004c60 here. I agree: it's quite annoying to have to fish out address of .text, and I wanted to fix it many times, so that e.g.
(gdb) add-symbol-file foo.so #0x7abc0000
just works. Feel free to file a feature request in GDB bugzilla.
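The address arithmetic described here can be sketched in a few lines of Python. The function name and the hand-picked subset of sections are assumptions for illustration; the link-time addresses are copied from the section headers in the question, and the base address is the one the asker used.

```python
def add_symbol_file_command(path, base, sections):
    """Build an add-symbol-file command line for a file loaded at `base`.

    `sections` maps section names to their link-time addresses (the Addr
    column of readelf -S). Hypothetical helper, for illustration only.
    """
    # GDB expects the run-time address of .text as the positional argument.
    parts = ["add-symbol-file %s 0x%x" % (path, base + sections[".text"])]
    for name, addr in sections.items():
        # Sections with address 0 are not loaded into memory.
        if name == ".text" or addr == 0:
            continue
        parts.append("-s %s 0x%x" % (name, base + addr))
    return " ".join(parts)


# Addresses taken from the readelf -S output in the question.
sections = {".text": 0x4C60, ".rodata": 0x18480, ".data": 0x1DA40, ".bss": 0x1DB20}
print(add_symbol_file_command("ldlinux.elf", 0x7FAB000, sections))
```

For .text this gives 0x7fab000 + 0x4c60 = 0x7fafc60. Newer GDB releases also document an -o offset option for add-symbol-file that applies one offset to every section; whether it is available depends on the GDB version in use.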
Does the behavior of gdb make sense?
I am guessing that this is rooted in how GDB was used to debug embedded ROMs, where each section can be at arbitrary memory address.

How to count static initializers in an ELF file?

I'm trying to count static initializers in a C++ file.
The solution I already have (which used to work with gcc-4.4) is to look at the size of the .ctors ELF section.
After an upgrade to gcc-4.6, this seems to no longer return valid results (calculated number of static initializers is 0, which doesn't match reality, e.g. as returned by nm).
Now the issue is I'd like the solution to work even in absence of symbols (otherwise I'd have used nm).
Below is the output of readelf -SW of an example executable:
There are 35 section headers, starting at offset 0x4f39820:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 00000174 000174 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 00000188 000188 000020 00 A 0 0 4
[ 3] .note.gnu.build-id NOTE 000001a8 0001a8 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 000001cc 0001cc 000918 04 A 5 0 4
[ 5] .dynsym DYNSYM 00000ae4 000ae4 00a5e0 10 A 6 1 4
[ 6] .dynstr STRTAB 0000b0c4 00b0c4 00ef72 00 A 0 0 1
[ 7] .gnu.version VERSYM 0001a036 01a036 0014bc 02 A 5 0 2
[ 8] .gnu.version_r VERNEED 0001b4f4 01b4f4 000450 00 A 6 13 4
[ 9] .rel.dyn REL 0001b944 01b944 268480 08 A 5 0 4
[10] .rel.plt REL 00283dc4 283dc4 0048c8 08 A 5 12 4
[11] .init PROGBITS 0028868c 28868c 00002e 00 AX 0 0 4
[12] .plt PROGBITS 002886c0 2886c0 0091a0 04 AX 0 0 16
[13] .text PROGBITS 00291860 291860 3ac5638 00 AX 0 0 16
[14] malloc_hook PROGBITS 03d56ea0 3d56ea0 00075a 00 AX 0 0 16
[15] google_malloc PROGBITS 03d57600 3d57600 008997 00 AX 0 0 16
[16] .fini PROGBITS 03d5ff98 3d5ff98 00001a 00 AX 0 0 4
[17] .rodata PROGBITS 03d5ffc0 3d5ffc0 ffa640 00 A 0 0 64
[18] .eh_frame_hdr PROGBITS 04d5a600 4d5a600 0004b4 00 A 0 0 4
[19] .eh_frame PROGBITS 04d5aab4 4d5aab4 001cb8 00 A 0 0 4
[20] .gcc_except_table PROGBITS 04d5c76c 4d5c76c 0003ab 00 A 0 0 4
[21] .tbss NOBITS 04d5df0c 4d5cf0c 000014 00 WAT 0 0 4
[22] .init_array INIT_ARRAY 04d5df0c 4d5cf0c 000090 00 WA 0 0 4
[23] .ctors PROGBITS 04d5df9c 4d5cf9c 000008 00 WA 0 0 4
[24] .dtors PROGBITS 04d5dfa4 4d5cfa4 000008 00 WA 0 0 4
[25] .jcr PROGBITS 04d5dfac 4d5cfac 000004 00 WA 0 0 4
[26] .data.rel.ro PROGBITS 04d5dfc0 4d5cfc0 1b160c 00 WA 0 0 32
[27] .dynamic DYNAMIC 04f0f5cc 4f0e5cc 000220 08 WA 6 0 4
[28] .got PROGBITS 04f0f7ec 4f0e7ec 00a800 04 WA 0 0 4
[29] .data PROGBITS 04f1a000 4f19000 0206b8 00 WA 0 0 32
[30] .bss NOBITS 04f3a6c0 4f396b8 04c800 00 WA 0 0 32
[31] .comment PROGBITS 00000000 4f396b8 00002a 01 MS 0 0 1
[32] .shstrtab STRTAB 00000000 4f396e2 00013e 00 0 0 1
[33] .symtab SYMTAB 00000000 4f39d98 4ff960 10 34 140163 4
[34] .strtab STRTAB 00000000 54396f8 144992a 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
Should I be looking at .init or .init_array instead? Could you point me to corresponding documentation that explains the change between gcc or linker versions?
Static constructors can be triggered by any of the three sections .init, .ctors, or .init_array (oldest to newest in that order). .init contains a fragment of code; .ctors and .init_array contain pointers to code. The difference between .ctors and .init_array has to do with the overall order in which constructors are executed. As far as I know, none of this is documented anywhere other than code comments and mailing list posts, but it's probably worth checking the ELF ABI documents (both the generic gABI and the processor-specific psABI).
You cannot deduce the number of static constructors in a file from the size of any of these sections. It is permitted, and common, for compilers to generate a single special function which invokes all of the constructors in a file, and reference only that one function in whichever of the sections it uses. All you can know for sure (without examining the contents of the sections, applying relocations, and chasing pointers / call instructions into the .text segment and reverse engineering whatever gets called) is: in an object file, if at least one of these sections has nonzero size, then there is at least one file- or global-scope constructor in the file; if all three sections are empty, then there are none. (In an executable, all three sections are always nonempty, because the data structures that they define need headers and trailers, which are automatically added at link time.)
Note also that constructors for block-scoped static objects are not invoked from any of these sections; they're invoked the first time control reaches their declaration.
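As a worked example of what the section sizes do and do not tell you, the Python sketch below counts pointer slots for the readelf output in the question. It assumes a 32-bit target (4-byte pointers) and assumes the usual GCC layout in which a linked .ctors section carries one header word and one trailer word; the function name is hypothetical.

```python
def pointer_entries(section_size, pointer_size=4, sentinel_words=0):
    """Count function-pointer slots in a .ctors or .init_array section.

    `sentinel_words` is the number of header/trailer words to discount
    (assumed to be 2 for a linked .ctors section, 0 for .init_array).
    """
    words, remainder = divmod(section_size, pointer_size)
    if remainder:
        raise ValueError("section size is not a multiple of the pointer size")
    return max(words - sentinel_words, 0)


# Sizes from the readelf -SW output in the question.
init_array_entries = pointer_entries(0x90)               # .init_array
ctors_entries = pointer_entries(0x8, sentinel_words=2)   # .ctors

print(init_array_entries)  # 36
print(ctors_entries)       # 0
```

Under those assumptions the 8-byte .ctors section holds no entries at all, which matches the count of 0 the asker got, while .init_array holds 36 pointers. Each of those pointers may still run any number of constructors, so 36 is a count of registered initialization functions, not of static initializers.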
I am assuming you have access to all the source code of your application (and perhaps of all the libraries it calls). This is obviously true for free software.
Then you might measure this more precisely at compilation time, when compiling your application with a recent version of GCC (e.g. 4.7 or 4.8). You could extend GCC with MELT (a high-level domain-specific language for extending GCC), or with painful GCC plugins coded in C++, to measure such things.
And I am not entirely sure that your question is precisely defined. If your application is, for example, linked to a shared library that uses visibility tricks to hide its static constructors, then how many static constructors that library calls is not really well defined.