I have a simple hello world program and after i dumpbin it with /headers flag, i get this output:
FILE HEADER VALUES
8664 machine (x64)
D number of sections
5A3D287F time date stamp Fri Dec 22 18:45:03 2017
48F file pointer to symbol table
2D number of symbols
0 size of optional header
0 characteristics
Summary
F .data
A0 .debug$S
2F .drectve
24 .pdata
B9 .text$mn
18 .xdata
What exactly xdata section do and what it contains? No info on msdn.
For future reference:
.text: codesegment (think functions); there can be multiple of those when enabling function sections or when comdat is involved (for example templates)
.data: datasegment (think global vars); there can be multiple of those when enabling data sections or when comdat is involved (for example templates)
.bss: datasegment initialized to zeros (not present above); there can be multiple of those when enabling data sections or when comdat is involved (for example templates)
.debug: Debug info; like others, there can be multiple of these when function sections are involved.
.pdata: for x86_64, this is the "exception info" for a method, it defines the start/end of a function, and a pointer to the unwind info (see .xdata); inside object files this is duplicated per function
.drectve: not sure; but from the name I'd guess linker directives.
.xdata: for x86_64; this is the unwind info part that pdata points to. It contains where the exception handler of a function is, and what to do to unwind it when an exception occurs: https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
The "$" postfix is used for sorting. Given:
- .sec$z
- .sec$data
- .sec$a
The sections are sorted before they are merged into an executable (so .sec$a first, then data, then z), this can be used to create start/end symbols to a pe section.
The repeated sections are for things like c++ templates, the compiler will instantiate a template in any translation unit that needs it and then the linker will pick one of those instantiations (usually the first encountered).
Less common are compiler-specific features like Microsoft's __declspec(selectany) that allow a variable to be defined more than once and again the linker will simply pick one of those definitions and discard the rest.
gcc's ld scripts will take all the .text* sections to create the final .text of the linked executable. You can examine those scripts to get an idea of how the linker creates an executable out of object files.
Related
I used objdump -t on the debug-info file of a program to find the address ranges of each function. There are a few functions the bounds of which can not be determined using this method. Because objdump reports 0 for their sizes. These symbols are shown, below:
deregister_tm_clones 0000000000197ce0
register_tm_clones 0000000000197d20
__do_global_dtors_aux 0000000000197d70
frame_dummy 0000000000197db0
_fini 00000000004e9474
_init 00000000001889e8
How can I determine their sizes? I can only imagine using GDB disas command on the start address and find the end of the disassembly for the function. This may not work in all cases. What is the standard approach?
UPDATE:
I am implementing a Pintool to generate callstacks at runtime. I only need symbols in certain binaries. In other words, I need a subset of functions (e.g., those in the GTK library) to be included in the callstack. Therefore, at runtime, I will need the ranges for these libraries.
On the other hand, I need the ranges for the symbols to find their outgoing jumps. This is a sign of tail-call elimination, which necessitates stack updates.
I'm trying to understand exactly how the exception table (.arm.extab) works.
I'm aware that this is compiler dependent, so I'll restrict myself to armcc (as I'm using Keil).
A typical entry in the table looks something like:
b0aa0380 2a002c00 01000000 00000000
To my understanding, the first word encodes instructions for the personality routine, while the third word is a R_ARM_PREL31 relocation to the start of the catch block.
What baffles me is the second word - it appears to be split into 2 shorts, the second of which measures some distance from the start of the throwing function, but I'm not sure exactly to what (nor what the first short does).
Is there any place where the structure of these entries is documented?
I've found 2 relevant documents, but as far as I can see they have no compiler-dependent information, so they're not sufficient:
https://github.com/ARM-software/abi-aa/releases/download/2022Q1/aaelf32.pdf
https://github.com/ARM-software/abi-aa/releases/download/2022Q1/ehabi32.pdf
If you happen to have the byte ordering missed up, the below applies. Some information is probably useful even if the byte-order is correct in your original example.
extab and exidx are sections added by the AAPCS which is a newer ARM ABI.
For the older APCS, the frame pointer or fp is a root of a linked of the active routine back to the main routine (or _start). With AAPCS records are created and placed in the exidx and extab sections. These are needed to unwind stacks (and resources) when the fp is used as a generic register.
The exidx is an ordered table of routine start addresses and an extab index (or can't unwind). A PC (program counter) can be examined and search via the table to find the corresponding extab entry.
The ARM EHABI documentation has a section 7 on Exception-handling Table entries. These are extab entries and you can at least start from there to learn more. There are two defined,
Generic (or C++)
ARM compact
The compact model will be used for most 'C' code. There are no objects to be destroyed on the stack as with C++. The hex 8003aab0 gives,
1000b for the leading nibble, so this is compact.
0000b for the index. Su16—Short
03h - pop 16 bytes, some locals or padding.
aah - pop r4-r6
b0h - finish
Table 4, ARM-defined frame-unwinding instructions gives the unwinding data of each byte.
The next is 0x002c002a which is an offset to the generic personality routine. The next four values should be the 8.2 Data Structures, which are a size and should be zero... Next would be stride and then a four byte object type info. The offset 0x2c002a would be to call the objects destructor or some sort of wrapper to do this.
I think all C++ code is intended to use this Generic method. Other methods are for different languages and NOT compilers.
Related Q/A and links:
Arm exidx - about the exidx.
ARM link and frame pointer - situation for older APCS and many AAPCS functions.
Linux ARM Unwind - sample unwinding code for 'C'.
prel31 - SO Q/A on prel31 in Linux code above.
Generating unwind in ARM gnu assembler
gas ARM directives See: .cantunwind, .vsave, etc.
I just used DUMPBIN for the first time and I see the term HIGHLOW repeatedly in the output file:
BASE RELOCATIONS #7
11000 RVA, E0 SizeOfBlock
...
3B5 HIGHLOW 2001753D ___onexitbegin
3C1 HIGHLOW 2001753D ___onexitbegin
...
I'm curious what this term stands for. I didn't find anything on Google or Stackoverflow about it.
To apply a fixup, a delta is calculated as the difference between the
preferred base address, and the base where the image is actually
loaded.
The basic idea is that when doing a fixup at some address, we must know
what memory must be changed ("offset" field)
what value is needed for its relocation ("delta" value)
which parts of relocated data and delta value to use ("type" field)
Here are some possible values of the "type" field
HIGH - add higher word (16 bits) of delta to the 16-bit value at "offset"
LOW - add lower word of delta to the value at "offset"
HIGHLOW - add full delta to the 32-bit value at "offset"
In other words, HIGHLOW type tells the program that it's doing a fix-up on offset "offset" from the page of this relocation block*, and that there is a doubleword that needs to be modified in order to have properly working executable.
* all of the relocation entries are grouped into blocks, and every block has a page on which its entries are applied
Let's say that you have this instruction in your code:
section .data
message: "Hello World!", 0
section .code
...
mov eax, message
...
You run assembler and immediately after it you run disassembler. Now your code looks like this:
mov eax, dword [0x702000]
You're now curious why is it 0x700000, and when you look into file dump, you see that
ImageBase: 0x00700000
Now you understand where did this number come from and you'e ready to run the executable.
Loader which loads executable files into memory and creates address space for them finds out, that memory 0x700000 is unavailable and it needs to place that file somewhere else. It decides that 0xf00000 will be OK and copies the file contents there.
But, your program was linked to work only with data on 0x700000 and there was no way for linker to know that its output would be relocated. Because of this, loader must do its magic. It
calculates delta value - the old address (image base) is 0x700000 but it wants 0xf00000 (preferred address). It subtracts one from another and gets 0x800000 as result.
gets to the .reloc section of the file
checks if there is still another page (4KB of data) to be relocated. If no, it continues toward calling file´s entry point.
4.for every relocation for the current page, it
gets data at relocation offset
adds the delta value (in the way as type field states)
places the new value at relocation offset
continues on step 3
There are also more types of relocation entry and some of them are architecture-specific. To see a full list, read the "Microsoft Portable Executable and Common Object File Format, section 6.6.2. Fixup Types".
What you see here is the content of the "Base relocation table" in Microsoft Windows executable files.
Base relocation tables are necessary in Windows for DLL files and they are optional for executable files; they contain information about the location of address information in the EXE/DLL file that must be updated when the actual address of the DLL file in memory is known (when loading the DLL into memory). Windows uses the information stored in this table to update the address information.
The table supports different types of addresses while the naming is Microsoft-specific: ABSOLUTE (= dummy), HIGH, LOW, HIGHLOW, HIGHADJ and MIPS_JMPADDR.
The full name of the constant is "IMAGE_REL_BASED_HIGHLOW".
The "ABSOLUTE" type is typically a dummy entry inserted to ensure the parts of the table are a multiple of 4 (or 8) bytes long.
On x86 CPUs only the "HIGHLOW" type is used: It tells Windows about the location of an absolute (32-bit) address in the file.
Some background info:
In your example the "Image Base" could be 0x20000000 which means that the EXE/DLL file has been compiled to be loaded into address 0x20000000. At the addresses 0x200113B5 (0x20000000 + 0x11000 + 0x3B5) and 0x200113C1 there are absolute addresses.
Let's say the memory at location 0x200113B5 contains the value 0x20012345 which is the address of a function or variable in the program.
Maybe the memory at address 0x20000000 cannot be used and Windows decides to load the DLL into the memory at 0x50000000 instead. Then the 0x20012345 must be replaced by 0x50012345.
The information in the base relocation table is used by Windows to find all addresses that must be replaced.
When debugging in Visual Studio, if symbols for a call stack are missing, for example:
00 > HelloWorld.exe!my_function(int y=42) Line 291
01 dynlib2.dll!10011435()
[Frames below may be incorrect and/or missing, no symbols loaded for dynlib2.dll]
02 dynlib2.dll!10011497()
03 HelloWorld.exe!wmain(int __formal=1, int __formal=1) Line 297 + 0xd bytes
04 HelloWorld.exe!__tmainCRTStartup() Line 594 + 0x19 bytes
05 HelloWorld.exe!wmainCRTStartup() Line 414
06 kernel32.dll!_BaseProcessStart#4() + 0x23 bytes
the debugger will display the warning Frames below may be incorrect and/or missing.
(Note that only lines 01 and 02 have no symbols. Line 00, where I set a breakpoint and all other lines have symbols loaded.)
Now, I know how to fix the warning (->get pdb file), what I do not quite get is why it is displayed after all! The stack I pasted above is fully OK, it's just that I do not have a pdb file for the dynlib2.dll module.
Why does the debugger need a symbols file to make sure the stack is correct?
I think this is because not all the functions follow the "standard" stack layout. Usually every function starts with:
push ebp
mov ebp,esp
and ends with
pop ebp
ret
By this every function creates its so-called stack frame. EBP always points to the beginning of the top stack frame. In every frame the first two values are the pointer to the previous stack frame, and the function return address.
Using this information one can easily reconstruct the stack. However:
This stack information won't include function names and parameters info.
Not all the functions obey this stack frame layout. If some optimizations are enabled (/Oy, omit stack frame pointers, for instance) - the stack layout is different.
I tried to understand this myself a while ago.
As of 2013, FPO is not used within MSFT and is generally frowned upon. I did come across a different MS binary technology used internally, that probably hampers naive EBP chain traversal: Basic Block Tools.
As noted in the post, PDBs do include 'StackFrameTypeEnum', and elsewhere it's hinted that they include the 'unwind program' for a stack frame. So all in all, they are still needed, and the gory details of why exactly - are not documented.
Symbols are decoupled from the associated binary code to reduce the size of shipping binaries. Check how big your PDB files are - huge, especially compared to the matching binary file (EXE/DLL). You would not want that overhead every time the binary is shipped, installed and used. This is especially important at load time. The symbol information is only for debugging after all, not required for correct operation of your code. Provided you keep symbols that match your shipped binaries, you can still debug problems post mortem with all symbols loaded.
If I understand correctly, the .bss section in ELF files is used to allocate space for zero-initialized variables. Our tool chain produces ELF files, hence my question: does the .bss section actually have to contain all those zeroes? It seems such an awful waste of spaces that when, say, I allocate a global ten megabyte array, it results in ten megabytes of zeroes in the ELF file. What am I seeing wrong here?
Has been some time since i worked with ELF. But i think i still remember this stuff. No, it does not physically contain those zeros. If you look into an ELF file program header, then you will see each header has two numbers: One is the size in the file. And another is the size as the section has when allocated in virtual memory (readelf -l ./a.out):
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
INTERP 0x000114 0x08048114 0x08048114 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x00454 0x00454 R E 0x1000
LOAD 0x000454 0x08049454 0x08049454 0x00104 0x61bac RW 0x1000
DYNAMIC 0x000468 0x08049468 0x08049468 0x000d0 0x000d0 RW 0x4
NOTE 0x000128 0x08048128 0x08048128 0x00020 0x00020 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
Headers of type LOAD are the one that are copied into virtual memory when the file is loaded for execution. Other headers contain other information, like the shared libraries that are needed. As you see, the FileSize and MemSiz significantly differ for the header that contains the bss section (the second LOAD one):
0x00104 (file-size) 0x61bac (mem-size)
For this example code:
int a[100000];
int main() { }
The ELF specification says that the part of a segment that the mem-size is greater than the file-size is just filled out with zeros in virtual memory. The segment to section mapping of the second LOAD header is like this:
03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
So there are some other sections in there too. For C++ constructor/destructors. The same thing for Java. Then it contains a copy of the .dynamic section and other stuff useful for dynamic linking (i believe this is the place that contains the needed shared libraries among other stuff). After that the .data section that contains initialized globals and local static variables. At the end, the .bss section appears, which is filled by zeros at load time because file-size does not cover it.
By the way, you can see into which output-section a particular symbol is going to be placed by using the -M linker option. For gcc, you use -Wl,-M to put the option through to the linker. The above example shows that a is allocated within .bss. It may help you verify that your uninitialized objects really end up in .bss and not somewhere else:
.bss 0x08049560 0x61aa0
[many input .o files...]
*(COMMON)
*fill* 0x08049568 0x18 00
COMMON 0x08049580 0x61a80 /tmp/cc2GT6nS.o
0x08049580 a
0x080ab000 . = ALIGN ((. != 0x0)?0x4:0x1)
0x080ab000 . = ALIGN (0x4)
0x080ab000 . = ALIGN (0x4)
0x080ab000 _end = .
GCC keeps uninitialized globals in a COMMON section by default, for compatibility with old compilers, that allow to have globals defined twice in a program without multiple definition errors. Use -fno-common to make GCC use the .bss sections for object files (does not make a difference for the final linked executable, because as you see it's going to get into a .bss output section anyway. This is controlled by the linker script. Display it with ld -verbose). But that shouldn't scare you, it's just an internal detail. See the manpage of gcc.
The .bss section in an ELF file is used for static data which is not initialized programmatically but guaranteed to be set to zero at runtime. Here's a little example that will explain the difference.
int main() {
static int bss_test1[100];
static int bss_test2[100] = {0};
return 0;
}
In this case bss_test1 is placed into the .bss since it is uninitialized. bss_test2 however is placed into the .data segment along with a bunch of zeros. The runtime loader basically allocates the amount of space reserved for the .bss and zeroes it out before any userland code begins executing.
You can see the difference using objdump, nm, or similar utilities:
moozletoots$ objdump -t a.out | grep bss_test
08049780 l O .bss 00000190 bss_test1.3
080494c0 l O .data 00000190 bss_test2.4
This is usually one of the first surprises that embedded developers run into... never initialize statics to zero explicitly. The runtime loader (usually) takes care of that. As soon as you initialize anything explicitly, you are telling the compiler/linker to include the data in the executable image.
A .bss section is not stored in an executable file. Of the most common sections (.text, .data, .bss), only .text (actual code) and .data (initialized data) are present in an ELF file.
That is correct, .bss is not present physically in the file, rather just the information about its size is present for the dynamic loader to allocate the .bss section for the application program.
As thumb rule only LOAD, TLS Segment gets the memory for the application program, rest are used for dynamic loader.
About static executable file, bss sections is also given space in the execuatble
Embedded application where there is no loader this is common.
Suman