How to specify an offset with the add-symbol-file command that gets added to all sections in GDB

I have a shared object file that is loaded into memory dynamically, and I load its symbols in gdb with add-symbol-file:
(gdb) add-symbol-file shared.so 0x1234
This places the .text section at address 0x1234. We can also specify section-specific addresses.
Can we specify an offset that gets added to all section addresses?
Something like --slide in the lldb debugger, which slides the load address by an offset.

I have one shared object file which gets loaded in memory dynamically
On most platforms, when you dynamically load a shared object (e.g. via dlopen), GDB will automatically add symbols from it without any need for add-symbol-file.
If that isn't happening, you should describe your platform, and the mechanism by which the shared object gets loaded.
Can we specify an offset that gets added to all section addresses?
Yes.
Assuming the shared object was loaded with relocation 0x123456000, and that the .text section in that shared object starts at offset 0xabcd, add-symbol-file shared.so 0x123456000+0xabcd will do exactly what you want.
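The arithmetic in that answer can be sketched in a few lines of Python. This is only an illustration: the helper name add_symbol_file_command is hypothetical, and the numbers are the ones used in the answer. As far as I know, GDB's documentation for add-symbol-file also lists an -o offset option in newer releases, which adds an offset to every section's start address; check help add-symbol-file on your build before relying on it.

```python
def add_symbol_file_command(path, load_base, text_offset):
    """Build the GDB command that places .text at load_base + text_offset.

    load_base   -- address at which the shared object was mapped
    text_offset -- offset of the .text section inside the shared object
    """
    text_addr = load_base + text_offset
    return f"add-symbol-file {path} {text_addr:#x}"


# Values taken from the answer above.
print(add_symbol_file_command("shared.so", 0x123456000, 0xabcd))
```

GDB also evaluates the sum itself, so typing 0x123456000+0xabcd directly at the prompt gives the same address.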

Related

Why does ZwUnmapViewOfSection() unmap the memory of the whole process, when given the PE base address as argument?

In code that creates a process in a suspended state and unmaps the program's memory, I often see the following code:
ZwUnmapViewOfSection(remoteProcessInfo->hProcess, static_cast<PVOID>(remoteImageBaseAddress))
According to the MSDN documentation, ZwUnmapViewOfSection unmaps a single section that contains the remoteImageBaseAddress.
However, PE binaries have multiple sections (.text, .data etc.), so doesn't this call unmap only one of them?
What am I missing? I also don't understand why remoteImageBaseAddress (the base address) is given as the argument, as this address doesn't belong to the .text section (executable code).
I think you are confusing PE file sections with the Windows memory-management Section object type.
A section in a PE file is just a piece of raw data; sections are usually categorized by their attributes.
In Windows, a section object is an internal memory-management object used to map files on disk into a process's virtual address space. An executable image is mapped as a single view of one such section object, so when ZwUnmapViewOfSection is passed an address inside that view (here the image base), it unmaps the whole view, that is, the entire mapped image, from the process's address space. It does not alter the PE file's own sections.
I hope you understand this now.

How does GDB know where an executable has been relocated?

I know modern OSs such as Linux don't always execute an application at the same address it was originally linked. When a debugger starts looking around though, it needs to know the relationship between the original link address and the final executing address. How does GDB calculate the offset?
Clarifications: I'm not talking about virtual memory. That is, I have (what I believe to be) a reasonable understanding of how virtual memory works and am operating entirely within that address space. I have symbols that are at one location when I dump the symbol table from the ELF, but at another location when I get their address out of memory.
In this particular case, I have a string which in the linked executable is at address 0x0E984141. In a dump of memory from that process, it is at address 0x0E3F2781. Everything in the .rodata section at least has been shifted by 0x5919C0. It appears to be something like Address Space Layout Randomization.
I know modern OSs such as Linux don't always execute an application at the same address it was originally linked.
This is only possible for position-independent executables (linked with -pie flag).
When a debugger starts looking around though, it needs to know the relationship between the original link address and the final executing address.
Correct.
How does GDB calculate the offset?
The same way GDB calculates the offset for shared libraries (a PIE executable is really a special case of a shared library). There is a defined interface between ld.so and GDB, consisting of the _dl_debug_state() function (on which GDB sets an internal breakpoint, and which ld.so calls whenever it maps a new ELF image into the process) and struct r_debug. The latter points to a linked list of struct link_map entries, and the l_addr member of each entry is the offset between the linked-at and loaded-at addresses.
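To make the role of l_addr concrete, here is a small Python sketch using the two addresses from the question. The function names are hypothetical; the only assumption is the relation described above, namely that the runtime address equals the linked address plus the load offset.

```python
# Addresses of the same string, taken from the question.
LINKED_ADDR = 0x0E984141   # address in the ELF symbol table
RUNTIME_ADDR = 0x0E3F2781  # address observed in the memory dump


def load_bias(linked, runtime):
    """Offset between linked-at and loaded-at address (what l_addr holds)."""
    return runtime - linked


def to_runtime(linked, bias):
    """Translate any linked-at address of the same image to its runtime address."""
    return linked + bias


bias = load_bias(LINKED_ADDR, RUNTIME_ADDR)
print(hex(bias))                           # negative: the image moved down
print(hex(to_runtime(LINKED_ADDR, bias)))  # matches RUNTIME_ADDR
```

The bias comes out negative here because the image ended up below its link address; its magnitude is the 0x5919C0 shift the question observed.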
If I understand what you are getting at, I think what you are actually referring to is virtual memory addressing. This is not handled by GDB; it is handled by the operating system.
http://www.cs.utexas.edu/users/witchel/372/lectures/15.VirtualMemory.pdf
On Linux, every process has its own address space in virtual memory.
The ELF executable contains a header describing the segments in memory (and their corresponding sections in the executable).

Is it possible to change the entry point of a process from a DLL?

The default entry point for most application processes is usually 0x401000.
Is there any way we could shift or change the entry point of a process? For example, if I wanted to change the entry point to 0x901000 externally using a DLL (assuming that the process loaded the DLL via C++)?
I'm trying to create a DLL to edit the process's default entry point.
Yes, you can change ImageBase in the Optional Header of the Portable Executable, if your linker allows this.
Older linkers set ImageBase=0x10000 when linking an executable and 0x400000 when linking a DLL; as the quote below explains, the default for Win32 executables was later changed to 0x400000. In any case this number is chosen arbitrarily (I guess because it is easy to remember and looks good in debuggers), and the loader may ignore it if the memory is already occupied.
See http://msdn.microsoft.com/en-us/library/ms809762.aspx
Table 3. paragraph IMAGE_OPTIONAL_HEADER.ImageBase:
When the linker creates an executable, it assumes that the file will be memory-mapped to a specific location in memory. That address is stored in this field, assuming a load address allows linker optimizations to take place. If the file really is memory-mapped to that address by the loader, the code doesn't need any patching before it can be run. In executables produced for Windows NT, the default image base is 0x10000. For DLLs, the default is 0x400000. In Windows 95, the address 0x10000 can't be used to load 32-bit EXEs because it lies within a linear address region shared by all processes. Because of this, Microsoft has changed the default base address for Win32 executables to 0x400000. Older programs that were linked assuming a base address of 0x10000 will take longer to load under Windows 95 because the loader needs to apply the base relocations.
On Windows, the default load address for EXEs is 0x400000 - so that's where that part of 0x401000 comes from.
The 0x1000 component is the offset into the image in memory where (usually) the text segment that holds the bulk of the code starts. That's where this particular program's entry point is.
That offset is a field in the PE header, as is the default load address of 0x400000. Both can be changed, but be aware that for EXEs, relocation information is often stripped: since the default load address is always guaranteed to be free when a new process is first created, relocation information is often assumed not to be needed for EXEs.
If that is the case for your EXE then you can't change the load address without doing major surgery to the image to manually identify and fix up any references that are relative to the assumed 0x400000 load address used during compilation/linking.
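As a rough illustration of where those two fields live, the following Python sketch reads ImageBase and AddressOfEntryPoint from a PE header and adds them. It assumes the standard PE layout (e_lfanew at offset 0x3C, a 20-byte COFF header after the signature, AddressOfEntryPoint at offset 16 of the optional header). The function names are hypothetical, and fake_pe32_header builds a synthetic header for the demo only, not a loadable executable.

```python
import struct


def pe_entry_point(data):
    """Return (image_base, entry_rva, image_base + entry_rva) for a PE image."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    opt = e_lfanew + 4 + 20                      # skip signature + COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x10B:                           # PE32: 4-byte ImageBase
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:                         # PE32+: 8-byte ImageBase
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unknown optional header magic")
    return image_base, entry_rva, image_base + entry_rva


def fake_pe32_header(image_base, entry_rva):
    """Synthetic, truncated PE32 header used only to exercise the parser."""
    buf = bytearray(0x40 + 4 + 20 + 32)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)      # e_lfanew
    buf[0x40:0x44] = b"PE\0\0"
    opt = 0x40 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x10B)      # PE32 magic
    struct.pack_into("<I", buf, opt + 16, entry_rva)
    struct.pack_into("<I", buf, opt + 28, image_base)
    return bytes(buf)


base, rva, entry = pe_entry_point(fake_pe32_header(0x400000, 0x1000))
print(hex(base), hex(rva), hex(entry))
```

With the values discussed above (ImageBase 0x400000, entry offset 0x1000) the sum is the familiar 0x401000. To inspect a real file, pass the bytes read from it to pe_entry_point instead of the synthetic header.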

Understanding Dynamic Library loading in Linux

I am trying to understand Dynamic Library loading in Linux from here [1] and want to clarify the concept. Concretely, when a dynamic library is loaded in a process in a Linux environment, it is loaded at any point in the address space. Now, a library has a code segment, and a data segment. The code segment's address is not defined pre-linking so it is 0x0000000 while for data segment, some number is defined to be an address.
But here is the trick, this address of data segment is not actually the true address. Actually, at whatever position code segment is loaded, data segment's pre-defined address is added to it.
Am I correct here?
One more thing from the referenced article. What does this statement mean?
However, we have the constraint that the shared library must still have a unique data instance in each process. While it would be possible to put the library data anywhere we want at runtime, this would require leaving behind relocations to patch the code and inform it where to actually find the data, destroying the always read-only property of the code and thus sharability.
[1] http://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html
Actually, at whatever position code segment is loaded, data segment's pre-defined address is added to it.
Yes. The "VirtAddr" of the data segment will be added to the base address.
What does this statement mean?
It means that when the library accesses its own static data, it should not rely on relocations in the library code. Otherwise the dynamic linker may need to patch the binary code, which would prevent those parts of the library code from being shared between processes (if process1 loads library lib1 at 0x40000000, and process2 loads lib1 at 0x50000000, their data relocations will be different).
So a different solution is used in practice. Both library code and data are loaded together, and the offset between code and data is the same in every process. The "solution" follows the text you cited: http://www.technovelty.org/linux/plt-and-got-the-key-to-code-sharing-and-dynamic-libraries.html
As you can see from the above headers, the solution is that the read-write data section is always put at a known offset from the code section of the library. This way, via the magic of virtual-memory, every process sees its own data section but can share the unmodified code. All that is needed to access data is some simple maths; address of thing I want = my current address + known fixed offset.
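The "simple maths" can be illustrated with a short Python sketch. The two load bases are the ones from the example above; the segment virtual addresses (code at 0x0, data at 0x200000) are made-up values for illustration, and runtime_addresses is a hypothetical helper.

```python
# Hypothetical segment virtual addresses from the library's program headers.
CODE_VADDR = 0x0
DATA_VADDR = 0x200000


def runtime_addresses(load_base):
    """Return (code_addr, data_addr) for a library mapped at load_base."""
    return load_base + CODE_VADDR, load_base + DATA_VADDR


# process1 and process2 map the same library at different bases.
for base in (0x40000000, 0x50000000):
    code, data = runtime_addresses(base)
    print(f"base={base:#x} code={code:#x} data={data:#x} data-code={data - code:#x}")
```

The absolute addresses differ per process, but data minus code is the same in both, which is why the code can find its data with a fixed offset from its own address and needs no patching.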

Specify the memory start address for a process

Is it possible to load a process at a user-specified (predetermined) address?
Thanks,
Ashutosh
The base address is specified in the PE file. If you mean an EXE that you're compiling in MSVC, then you can set the base address in the linker settings. If you've got an arbitrary EXE or DLL, you could alter the base address by hand, with a good PE resource. You should also turn off ASLR, which is both a project setting and a flag in the PE file.
Most EXE files load at their preferred base address because, when a process starts, the EXE is the first thing mapped into the address space; it's not unheard of for EXE files to omit the relocation table. DLLs, however, sometimes have to be rebased. It's not a good idea to depend on loading at a specific base address.