How do relocations work in COFF object (not image) files

What steps exactly are taken by the linker while resolving relocations in an object file before creating the final image? More specifically, how does the linker treat the value which is already stored at the relocation site? Does it always add it to the final VA/RVA, or is it sometimes ignored (e.g. for certain relocation types)?
I couldn't find a clear explanation in the MS PE/COFF Specification, and after googling and experimenting for a while, all I could find out was this:
In the MS COFF spec, chapter 5.6.2 "Base Relocation Types", it is said that "The base relocation applies all 32 bits of the difference to the 32-bit field at offset", which I guess means that the relocation should take into account whatever address is already stored at the specified offset. However, chapter 5.6 (the .reloc section) is only relevant to image files, and not object files.
The dumpbin utility adds a column named "Applied To" when printing the relocations table, which seems to always (no matter the relocation type) contain the value which is stored at the relocation site.
The Relocation Directives chapter in the DJGPP COFF Specification clearly states that the value currently stored at the location should be added to the address of the symbol pointed to by the relocation table entry.
Can you point me to any (relevant) documentation which explains how relocations are handled by the linker?

The relocation section used in "image files" has a slightly different purpose from the relocation information present in "object files".
Unlike Linux shared libraries, Windows DLLs do not typically use position-independent code. Instead they are defined relative to a fixed base address. The Windows loader, however, has the ability to relocate a DLL in the event of a conflict. To support this, DLL images contain relocation sections that specify what data needs to be modified when the image is relocated. Many intra-DLL symbol references will use "eip" (or rip) relative addressing, so they may not need to be modified on DLL relocation.
Image file relocations are always specified relative to the base address of the executable image. Object file relocations are specified relative to the address (within an image, using the image's preferred base address) of a symbol in a symbol table. Image files don't have a symbol table (they have an IAT, but that's not a symbol table). The set of supported relocations in object files is richer than the set supported in image files.
The details are covered in the "COFF Relocations (Object Only)" section of the PE/COFF spec (I'm looking at version 3 as I type this).
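To make the object-file case concrete, here is a minimal Python sketch of how a linker might patch two common x86 relocation types. It assumes the usual COFF convention that the addend is whatever value is already stored at the relocation site (which matches the "Applied To" column and the DJGPP description mentioned in the question). The helper name `apply_i386_reloc` and the addresses are made up for illustration; only the two type constants come from the spec.

```python
import struct

# Relocation type values from the PE/COFF spec (x86 object files).
IMAGE_REL_I386_DIR32 = 0x0006  # the target's 32-bit VA
IMAGE_REL_I386_REL32 = 0x0014  # 32-bit displacement relative to the next instruction


def apply_i386_reloc(section: bytearray, offset: int, rel_type: int,
                     symbol_va: int, section_va: int) -> None:
    """Patch one 32-bit relocation site in place (hypothetical helper)."""
    # Assumption: the value already stored at the site is the addend.
    (addend,) = struct.unpack_from("<i", section, offset)
    if rel_type == IMAGE_REL_I386_DIR32:
        value = symbol_va + addend
    elif rel_type == IMAGE_REL_I386_REL32:
        # The displacement is measured from the byte after the 4-byte field.
        site_va = section_va + offset
        value = symbol_va + addend - (site_va + 4)
    else:
        raise ValueError(f"unsupported relocation type {rel_type:#x}")
    struct.pack_into("<I", section, offset, value & 0xFFFFFFFF)
```

For example, a site holding the addend 8 and relocated with DIR32 against a symbol at 0x403000 ends up holding 0x403008.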

Related

Object file is 2.5x larger on linux than on macOS or Windows

I have a file which, when compiled to object file, has the following size:
On Windows, using MSVC, it's 8MB.
On macOS, using clang, it's 8MB.
On linux (Ubuntu 18.04 or Gentoo), using either gcc or clang, it's 20MB.
The file (detailed below) is a representation of (a part of) a unicode table along with character properties. The encoding is utf8.
It occurred to me that the problem might be that libstdc++ can't handle the file well, so I tried libc++ with clang on Gentoo, but it didn't do anything (the object file size remained the same).
Then I thought that it might be some optimization doing something odd, but once again I had no size improvements when I went from -O3 to -O0.
The file, on line 50 includes UnicodeTable.inc. The UnicodeTable.inc contains a std::array of the unicode codepoints.
I tried changing std::array to C style array, but again, the object file size did not change.
I have the preprocessed version of the CodePoint.cpp which can be compiled with $CC -xc++ CodePoint.i -c -o CodePoint.o. CodePoint.i contains about 40k lines of STL code and about 130k lines of unicode table.
I tried uploading the preprocessed CodePoint.i to gists.github.com and to paste.pound-python.org, but both refused the 170k lines long file.
At this point I'm out of ideas and would greatly appreciate any help regarding finding out the source of the "bloated" object file size.
From the output of size you linked you can see that there are 12 MB of relocations in the elf object (section .rela.dyn). If a 64 bit relocation takes 24 bytes and you have 132624 table entries with 4 pointers to strings each, this pretty much explains the 12 MB difference (132624 *4 * 24 = 12731904 ~ 12 MB ).
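The 24-byte figure and the total can be checked with a few lines of Python. This is only a sanity check of the arithmetic above; it assumes the relocations are ELF64 "rela" entries, whose three 8-byte fields (r_offset, r_info, r_addend) are modeled here with a `struct` format string.

```python
import struct

# Elf64_Rela layout: r_offset (u64), r_info (u64), r_addend (i64).
ELF64_RELA_SIZE = struct.calcsize("<QQq")


def relocation_bytes(entries: int, pointers_per_entry: int) -> int:
    """Total size of the relocation records for a table of pointers."""
    return entries * pointers_per_entry * ELF64_RELA_SIZE


total = relocation_bytes(132624, 4)
print(ELF64_RELA_SIZE, total)  # prints: 24 12731904
```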
Apparently the other formats either use a more efficient relocation type or link the references directly and just relocate the whole block together with the strings as one piece of memory.
Since you are linking this to a shared library the dynamic relocations will not go away.
I am not sure if it is possible to avoid this with the code you currently use.
However, I think a unicode code point must have a maximal size. Why don't you store the code points by value in char arrays in the RawCodePoint struct? The size of each code point string should be no larger than the pointer you currently store, and the locality of reference of the table lookup may actually improve.
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t MAX_CP_SIZE = 4; // Check if that is correct
struct RawCodePointLocal {
    // The strings are stored by value, so no pointers need relocating.
    const std::array<char, MAX_CP_SIZE> original;
    const std::array<char, MAX_CP_SIZE> normal;
    const std::array<char, MAX_CP_SIZE> folded_case;
    const std::array<char, MAX_CP_SIZE> swapped_case;
    bool is_letter;
    bool is_punctuation;
    bool is_uppercase;
    std::uint8_t break_property;
    std::uint8_t combining_class;
};
This way you should not need relocations for the entries.
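To check the "4 bytes" assumption before committing to it, a small Python sketch like the one below may help. It pads a UTF-8 string to a fixed width and refuses anything longer. The helper name `pack_code_point` is hypothetical. A single code point never needs more than 4 bytes in UTF-8, but if the normalized or case-folded forms in your table can span several code points, the limit would have to be larger, which is why the helper raises instead of truncating.

```python
MAX_CP_SIZE = 4  # longest UTF-8 encoding of a single code point


def pack_code_point(text: str, size: int = MAX_CP_SIZE) -> bytes:
    """Encode text as UTF-8 and pad with NULs to a fixed width (hypothetical helper)."""
    raw = text.encode("utf-8")
    if len(raw) > size:
        raise ValueError(f"{text!r} needs {len(raw)} bytes, limit is {size}")
    return raw.ljust(size, b"\x00")


# The highest code point, U+10FFFF, still fits in four bytes.
print(len(chr(0x10FFFF).encode("utf-8")))  # prints: 4
```

Running every string of the generated table through such a check would show whether any entry exceeds the fixed size.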

What does xdata section do?

I have a simple hello world program, and after I run dumpbin on it with the /headers flag, I get this output:
FILE HEADER VALUES
8664 machine (x64)
D number of sections
5A3D287F time date stamp Fri Dec 22 18:45:03 2017
48F file pointer to symbol table
2D number of symbols
0 size of optional header
0 characteristics
Summary
F .data
A0 .debug$S
2F .drectve
24 .pdata
B9 .text$mn
18 .xdata
What exactly does the .xdata section do, and what does it contain? I found no info on MSDN.
For future reference:
.text: codesegment (think functions); there can be multiple of those when enabling function sections or when comdat is involved (for example templates)
.data: datasegment (think global vars); there can be multiple of those when enabling data sections or when comdat is involved (for example templates)
.bss: datasegment initialized to zeros (not present above); there can be multiple of those when enabling data sections or when comdat is involved (for example templates)
.debug: Debug info; like others, there can be multiple of these when function sections are involved.
.pdata: for x86_64, this is the "exception info" for a method, it defines the start/end of a function, and a pointer to the unwind info (see .xdata); inside object files this is duplicated per function
.drectve: not sure; but from the name I'd guess linker directives.
.xdata: for x86_64; this is the unwind info part that pdata points to. It contains where the exception handler of a function is, and what to do to unwind it when an exception occurs: https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=vs-2019
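As a rough illustration of the .pdata/.xdata relationship on x64, the sketch below splits raw .pdata bytes into 12-byte records of three 32-bit fields: function start, function end, and the location of the unwind info in .xdata. The class and function names are made up for the example. Note that in an object file these fields still await relocation, so the numbers only become meaningful RVAs in a linked image.

```python
import struct
from typing import List, NamedTuple


class RuntimeFunction(NamedTuple):
    begin_rva: int        # start of the function
    end_rva: int          # end of the function
    unwind_info_rva: int  # where the unwind info lives (.xdata)


def parse_pdata(data: bytes) -> List[RuntimeFunction]:
    """Split raw x64 .pdata bytes into 12-byte function records."""
    entry = struct.Struct("<III")
    usable = len(data) - len(data) % entry.size  # ignore a trailing partial record
    return [RuntimeFunction(*entry.unpack_from(data, off))
            for off in range(0, usable, entry.size)]
```

With this layout, the 0x24 (36) bytes of .pdata in the dump above would correspond to three function records.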
The "$" postfix is used for sorting. Given:
- .sec$z
- .sec$data
- .sec$a
The sections are sorted before they are merged into an executable (so .sec$a first, then .sec$data, then .sec$z); this can be used to create start/end symbols for a PE section.
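The ordering can be mimicked with a few lines of Python. This is only a model of the sorting rule, not of what any particular linker does internally: names are grouped by the part before the "$" and each group is sorted by name. The function name `merge_order` is made up.

```python
from collections import defaultdict
from typing import Dict, List


def merge_order(section_names: List[str]) -> Dict[str, List[str]]:
    """Group sections by the name before '$' and sort each group by name."""
    groups = defaultdict(list)
    for name in section_names:
        base, _, _suffix = name.partition("$")
        groups[base].append(name)
    return {base: sorted(names) for base, names in groups.items()}


print(merge_order([".sec$z", ".sec$data", ".sec$a"]))
# prints: {'.sec': ['.sec$a', '.sec$data', '.sec$z']}
```

A variable placed in .sec$a therefore ends up at the start of the merged .sec section and one placed in .sec$z at the end, which is what makes the start/end symbol trick work.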
The repeated sections are for things like C++ templates: the compiler will instantiate a template in any translation unit that needs it, and then the linker will pick one of those instantiations (usually the first encountered).
Less common are compiler-specific features like Microsoft's __declspec(selectany), which allow a variable to be defined more than once; again, the linker will simply pick one of those definitions and discard the rest.
gcc's ld scripts will take all the .text* sections to create the final .text of the linked executable. You can examine those scripts to get an idea of how the linker creates an executable out of object files.

What's the meaning of HIGHLOW in a disassembled binary file?

I just used DUMPBIN for the first time and I see the term HIGHLOW repeatedly in the output file:
BASE RELOCATIONS #7
11000 RVA, E0 SizeOfBlock
...
3B5 HIGHLOW 2001753D ___onexitbegin
3C1 HIGHLOW 2001753D ___onexitbegin
...
I'm curious what this term stands for. I didn't find anything on Google or Stack Overflow about it.
To apply a fixup, a delta is calculated as the difference between the
preferred base address, and the base where the image is actually
loaded.
The basic idea is that when doing a fixup at some address, we must know:
- what memory must be changed ("offset" field)
- what value is needed for its relocation ("delta" value)
- which parts of the relocated data and delta value to use ("type" field)
Here are some possible values of the "type" field:
- HIGH - add the higher word (16 bits) of delta to the 16-bit value at "offset"
- LOW - add the lower word of delta to the 16-bit value at "offset"
- HIGHLOW - add the full delta to the 32-bit value at "offset"
In other words, HIGHLOW type tells the program that it's doing a fix-up on offset "offset" from the page of this relocation block*, and that there is a doubleword that needs to be modified in order to have properly working executable.
* all of the relocation entries are grouped into blocks, and every block has a page on which its entries are applied
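The three type values can be written down as a small Python function. The numeric constants are the IMAGE_REL_BASED_* values from the PE/COFF spec; the function name `fix_value` is made up, and the sketch only computes the new value rather than touching any memory.

```python
IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, value is left alone
IMAGE_REL_BASED_HIGH = 1
IMAGE_REL_BASED_LOW = 2
IMAGE_REL_BASED_HIGHLOW = 3


def fix_value(rel_type: int, value: int, delta: int) -> int:
    """Return the relocated value for one fixup (hypothetical helper)."""
    if rel_type == IMAGE_REL_BASED_ABSOLUTE:
        return value
    if rel_type == IMAGE_REL_BASED_HIGH:
        # 16-bit field: add the upper 16 bits of the delta.
        return (value + (delta >> 16)) & 0xFFFF
    if rel_type == IMAGE_REL_BASED_LOW:
        # 16-bit field: add the lower 16 bits of the delta.
        return (value + delta) & 0xFFFF
    if rel_type == IMAGE_REL_BASED_HIGHLOW:
        # 32-bit field: add the whole delta.
        return (value + delta) & 0xFFFFFFFF
    raise ValueError(f"fixup type {rel_type} not covered by this sketch")


print(hex(fix_value(IMAGE_REL_BASED_HIGHLOW, 0x00702000, 0x800000)))  # prints: 0xf02000
```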
Let's say that you have this instruction in your code:
section .data
message: db "Hello World!", 0
section .code
...
mov eax, message
...
You run assembler and immediately after it you run disassembler. Now your code looks like this:
mov eax, 0x702000
You're now curious why the address is based on 0x700000, and when you look into the file dump, you see that
ImageBase: 0x00700000
Now you understand where this number came from and you're ready to run the executable.
The loader, which loads executable files into memory and creates an address space for them, finds that the memory at 0x700000 is unavailable and that it needs to place the file somewhere else. It decides that 0xf00000 will be OK and copies the file contents there.
But your program was linked to work only with data at 0x700000, and there was no way for the linker to know that its output would be relocated. Because of this, the loader must do its magic. It
1. calculates the delta value: the preferred address (image base) is 0x700000, but the image is actually loaded at 0xf00000. It subtracts one from the other and gets 0x800000 as the result.
2. goes to the .reloc section of the file
3. checks whether there is another page (4KB of data) to be relocated. If not, it continues toward calling the file's entry point.
4. for every relocation for the current page, it
   - gets the data at the relocation offset
   - adds the delta value (in the way the type field states)
   - places the new value at the relocation offset
5. continues with step 3
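The loop described above can be sketched in Python as follows. It is a simplified model, not what the Windows loader actually runs. It assumes the image has already been laid out in a buffer indexed by RVA and handles only HIGHLOW and ABSOLUTE entries. The block layout (a 32-bit page RVA, a 32-bit block size, then 16-bit entries with a 4-bit type and a 12-bit offset) follows the PE/COFF spec; the function name is made up.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3   # add the full delta to a 32-bit value


def apply_base_relocations(image: bytearray, reloc: bytes,
                           preferred_base: int, actual_base: int) -> int:
    """Patch an image laid out by RVA; returns the number of fixups applied."""
    delta = actual_base - preferred_base            # step 1
    pos = 0                                         # step 2: start of the .reloc data
    patched = 0
    while pos + 8 <= len(reloc):                    # step 3: is there another block?
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break                                   # empty or malformed block ends the walk
        end = min(pos + block_size, len(reloc))
        for entry_pos in range(pos + 8, end - 1, 2):    # step 4
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            rel_type, offset = entry >> 12, entry & 0x0FFF
            if rel_type == IMAGE_REL_BASED_ABSOLUTE:
                continue
            if rel_type != IMAGE_REL_BASED_HIGHLOW:
                raise NotImplementedError(f"fixup type {rel_type} not handled in this sketch")
            site = page_rva + offset
            (value,) = struct.unpack_from("<I", image, site)
            struct.pack_into("<I", image, site, (value + delta) & 0xFFFFFFFF)
            patched += 1
        pos += block_size                           # move on to the next block
    return patched
```

With the numbers from the example, a stored 0x702000 becomes 0xf02000 once the delta of 0x800000 has been added.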
There are also more types of relocation entry, and some of them are architecture-specific. To see a full list, read the "Microsoft Portable Executable and Common Object File Format Specification", section 6.6.2, "Fixup Types".
What you see here is the content of the "Base relocation table" in Microsoft Windows executable files.
Base relocation tables are necessary in Windows for DLL files and they are optional for executable files; they contain information about the location of address information in the EXE/DLL file that must be updated when the actual address of the DLL file in memory is known (when loading the DLL into memory). Windows uses the information stored in this table to update the address information.
The table supports different types of addresses while the naming is Microsoft-specific: ABSOLUTE (= dummy), HIGH, LOW, HIGHLOW, HIGHADJ and MIPS_JMPADDR.
The full name of the constant is "IMAGE_REL_BASED_HIGHLOW".
The "ABSOLUTE" type is typically a dummy entry inserted to ensure the parts of the table are a multiple of 4 (or 8) bytes long.
On x86 CPUs only the "HIGHLOW" type is used: It tells Windows about the location of an absolute (32-bit) address in the file.
Some background info:
In your example the "Image Base" could be 0x20000000 which means that the EXE/DLL file has been compiled to be loaded into address 0x20000000. At the addresses 0x200113B5 (0x20000000 + 0x11000 + 0x3B5) and 0x200113C1 there are absolute addresses.
Let's say the memory at location 0x200113B5 contains the value 0x20012345 which is the address of a function or variable in the program.
Maybe the memory at address 0x20000000 cannot be used and Windows decides to load the DLL into the memory at 0x50000000 instead. Then the 0x20012345 must be replaced by 0x50012345.
The information in the base relocation table is used by Windows to find all addresses that must be replaced.
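The numbers in this example can be reproduced with plain arithmetic. The snippet below just replays them; the image base of 0x20000000, the stored pointer 0x20012345 and the new base 0x50000000 are the assumed values from the explanation above, not values read from the actual file.

```python
image_base = 0x20000000   # assumed preferred base
page_rva = 0x11000        # "11000 RVA" from the block header
offset = 0x3B5            # first HIGHLOW entry in the block
new_base = 0x50000000     # where Windows loads the DLL instead

site_va = image_base + page_rva + offset   # address holding the pointer
delta = new_base - image_base              # amount every pointer moves by
old_pointer = 0x20012345
new_pointer = (old_pointer + delta) & 0xFFFFFFFF

print(hex(site_va), hex(delta), hex(new_pointer))
# prints: 0x200113b5 0x30000000 0x50012345
```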

What does a dangerous relocation error mean?

I am getting a linking error:
dangerous relocation: l32r: Literal placed after use:
I am still trying to debug; however, I want to better understand this error. I understand what relocation is; however, I am not sure how it can be dangerous and was looking for some clarification. Also, a small code snippet that could generate this type of error would be helpful.
In short, what is "a dangerous relocation"?
This is a two-part answer, as there are really two questions here, one general ("what's a dangerous relocation?") and one specific to the Xtensa ("why can't you have a literal placed after where it's used in the code?").
What's all this dangerous relocation stuff about, anyway?
To understand what a 'dangerous relocation' is, we must first understand what a relocation is. As a compiler is generating an object file from some piece of code, it will need to reference symbols that are defined somewhere else: perhaps in another object file in the link, or perhaps in a shared library. However, the compiler does not know the addresses of external symbols when compiling a given object file. It must emit a relocation to serve as a named placeholder, telling the linker "OK, shove the address of foobar into this spot, and oh, you have to do X, Y, and Z to it to make it fit into the instructions there."
Most of the time, this works without a hitch, you get a binary out of your linker, and Bob's your uncle. When this process breaks down, and the linker cannot make the address of the symbol the compiler gave it fit into the instructions at the site of the relocation, it gives up and tosses out a 'dangerous relocation' message (among others -- the all-too-common 'relocation truncated to fit' pops out of this process as well) to inform the programmer that something has gone terribly wrong.
What's wrong with a literal placed after where it's used?
Now that we know what a generic 'dangerous relocation' is, we can move on to the second half of the error message, namely "l32r: Literal placed after use". The Xtensa uses an instruction known as L32R to load constant values from memory that don't fit into the Xtensa's MOVI immediate load instruction, which has a 12-bit signed immediate field. The L32R instruction is described in the Xtensa ISA reference as follows:
L32R is a PC-relative 32-bit load from memory. It is typically used to load constant
values into a register when the constant cannot be encoded in a MOVI instruction.
L32R forms a virtual address by adding the 16-bit one-extended constant value encoded
in the instruction word shifted left by two to the address of the L32R
plus three with the two least significant bits cleared. Therefore, the offset can always
specify 32-bit aligned addresses from -262141 to -4 bytes from the address of the L32R
instruction. 32 bits (four bytes) are read from the physical address. This data is then
written to address register at.
Given the restrictions on L32R quoted above, the error message breaks down quite nicely: the compiler generated a L32R to load a constant (which could be a value or an address) somewhere in your code, but either the constant's value was not available to the compiler (think extern const), or the address needed to be filled in by the linker (this is the likely case). So, it emitted this L32R relocation to tell the linker to 'fill in the blank' in the L32R instruction with the address of a constant value or constant address somewhere in your program. However, the linker couldn't find anywhere in the previous 256KB of code -- or literal pool, depending on how your compiler and Xtensa core are configured -- to shove a constant, so it gave up and spat out the error message you asked about.
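The addressing rule quoted from the ISA reference can be turned into a small Python model, which also shows why a literal placed after the instruction can never be reached. This is a sketch of the PC-relative form only (it ignores the LITBASE/absolute literal mode), and the function names are made up.

```python
L32R_MIN_OFFSET = -262144  # imm16 == 0x0000, measured from the aligned base
L32R_MAX_OFFSET = -4       # imm16 == 0xFFFF


def l32r_base(pc: int) -> int:
    """Address of the L32R plus three, with the two low bits cleared."""
    return (pc + 3) & ~3


def l32r_target(pc: int, imm16: int) -> int:
    """Literal address encoded by a PC-relative L32R at address pc."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    offset = (imm16 - 0x10000) << 2   # one-extended, then shifted left by two
    return (l32r_base(pc) + offset) & 0xFFFFFFFF


def l32r_can_reach(pc: int, literal_addr: int) -> bool:
    """True if a 32-bit aligned literal is within range of the L32R at pc."""
    if literal_addr % 4:
        return False
    offset = literal_addr - l32r_base(pc)
    return L32R_MIN_OFFSET <= offset <= L32R_MAX_OFFSET


print(l32r_can_reach(0x40001000, 0x40000FFC))  # prints: True
print(l32r_can_reach(0x40001000, 0x40001004))  # prints: False
```

Because the offset is always negative, any literal at a higher address than the instruction fails the check, which is the "Literal placed after use" situation.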
How does one fix this?
Unfortunately, a 'dangerous relocation' of this sort depends on code size, so unless you have a bona fide compiler or linker bug on your hands, reproducing it with a small snippet of code will be impossible. There are two possible causes you can try to address, though.
There's no room for my literal pool!
If you are compiling with -mno-text-section-literals (which is the default), the linker gets fed the literal pools as separate sections which it then has to interleave with the code sections. If you have a particularly large object file in your link, it may have over 256KB of code in its .text section, leaving nowhere in the range of a L32R instruction for the linker to place the associated literal pool section at. Compiling with -mtext-section-literals should eliminate the error; if it does not work, you have that flag on already, or if you are using -ffunction-sections (which places each function into its own section; it is sometimes used in embedded work to allow the linker to throw out unused code), read on.
The linker (or assembler) still can't find a place to put my literals!
When the compiler and assembler are told to emit literals into the text section, they restrict placement of the literal pools to before the functions that use them (i.e. before the ENTRY instruction of the function) in order to minimize the risk that the literal pools will be executed as code, with obviously bad results. If you have an extremely long function in your code -- I shudder to think what sort of function could generate more than 256KB of code -- the 'default' literal pool placed before the ENTRY instruction can wind up out of range of L32R instructions near the end of the function. Normally, the compiler will emit an assembler directive known as .literal_position, as well as a jump around the mid-function literal pool, to provide the assembler and linker with an extra place to shove literals into. You can tell the compiler to output an assembler listing using -save-temps and then search it for .literal_position directives; if one isn't present in a function that has L32R instructions past the 256KB mark, congratulations! You just found a compiler bug!
What else could happen to produce this?
The only other circumstance I see that can provoke such a problem is if there is nowhere before the ENTRY instruction that the compiler or linker can put a literal pool, and the compiler can't figure this out on its own -- this can occur with interrupt handlers, or functions that are explicitly placed at the beginning of a physical memory boundary by the linker script. In this case, you will need to insert the .literal_position directive and its associated jump & label by hand in an asm statement at the top of the culprit function in order to provide the assembler with a place to put the culprit function's literals. As the GAS manual puts it:
The assembler will automatically place text section literal pools before ENTRY
instructions, so the .literal_position directive is only needed to specify some other
location for a literal pool. You may need to add an explicit jump instruction to skip
over an inline literal pool.
For example, an interrupt vector does not begin with an ENTRY instruction so the
assembler will be unable to automatically find a good place to put a literal pool.
Moreover, the code for the interrupt vector must be at a specific starting address, so
the literal pool cannot come before the start of the code. The literal pool for the
vector must be explicitly positioned in the middle of the vector (before any uses of the
literals, due to the negative offsets used by PC-relative L32R instructions).
Wait, I'm using the absolute literal option!
If you have the LITBASE option enabled in your Xtensa core and are getting this error, this is a sign that your literal pool has overflowed. The compiler should generate the 'glue' needed to switch literal pools in this case, though: if it doesn't, congratulations! You have just found a compiler bug!
Here's a link that might be helpful for you: http://www.mail-archive.com/mspgcc-users#lists.sourceforge.net/msg11488.html
Good luck :)

Getting the offset to .text section code PE file format? VirtualAddress, PointerToRawData?

I've been trying to do this for about two days, with no success. I have been reading over many PE file format tutorials to no avail.
I map a 32 bit executable into memory via CreateFileMapping which works perfectly. My program then loops through the section headers, and checks the characteristics against my default characteristics (to make sure the section is executable and is code). If it is true the program returns the (PIMAGE_SECTION_HEADER) pointer to that section header (program works perfectly so far).
Now that I have the pointer, there are two specific entries of the structure that have baffled me: PointerToRawData and VirtualAddress. When I print the entries with cout, I get:
VirtualAddress = 4096, PointerToRawData = 1536.
From what I have read in the PE documentation, PointerToRawData is supposedly an offset (RVA???) to the first byte of the section's data on disk (am I correct?), and is a multiple of an alignment value (512). The question is: what do I set this value to in order to obtain a pointer which I can use to access the section's data? On a memory-mapped file, would it be better to use (VirtualAddress value + the imagebase value) to find the first byte of the section?
Another point of confusion is VirtualSize vs SizeOfRawData. This has confused me because in this article - http://msdn.microsoft.com/en-us/library/ms809762.aspx, it says "The SizeOfRawData field (seems a bit of a misnomer) later on in the structure holds the rounded up value" yet my VirtualSize is greater than my SizeOfRawData value which has led to confusion on which one I should use.
The object of this program is to find the executable section (.text section) and perform a bitwise operation on all the bits in the section, and end the operation before the next section.
I don't want it to seem like I expect a spoonfeed, I just want some clarifications.
Thank you for your time/help, it is appreciated.
I don't happen to have the spec handy or any PE code to look at for reference (I'm writing this on my iPad from my couch ;) but the key point to realize is that there are two modes to consider: all talk of RVAs is only relevant when the PE is mapped into memory and the alignment there is page-alignment. When you're reading the file off disk, the offsets are file offsets and each section is using the file alignment.
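To make the two modes concrete, here is a small Python sketch of the file-offset side. It assumes the file was mapped as a plain file (not loaded as an image), so section data is found through PointerToRawData rather than VirtualAddress. The `Section` tuple and both helper names are made up, and the field values in the usage lines are illustrative only.

```python
from typing import NamedTuple, Optional, Tuple


class Section(NamedTuple):
    name: str
    virtual_address: int      # RVA once loaded as an image
    virtual_size: int         # size in memory
    pointer_to_raw_data: int  # offset within the file on disk
    size_of_raw_data: int     # size on disk, rounded up to the file alignment


def rva_to_file_offset(section: Section, rva: int) -> Optional[int]:
    """Translate an RVA inside the section to an offset in the file on disk."""
    delta = rva - section.virtual_address
    if 0 <= delta < section.size_of_raw_data:
        return section.pointer_to_raw_data + delta
    return None  # outside the section, or in memory-only (zero-filled) space


def section_file_range(section: Section) -> Tuple[int, int]:
    """File offsets [start, end) of the bytes that really belong to the section."""
    length = section.size_of_raw_data
    if section.virtual_size:
        length = min(length, section.virtual_size)  # skip alignment padding
    start = section.pointer_to_raw_data
    return start, start + length


text = Section(".text", 0x1000, 0x0F00, 0x600, 0x1000)  # illustrative values
print(section_file_range(text))               # prints: (1536, 5376)
print(rva_to_file_offset(text, 0x1010))       # prints: 1552
```

In a flat file mapping, the pointer to the first byte of the section would then be the mapping's base pointer plus the start offset returned here.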
I hope this helps.