I have a compiled ELF file, libfoo.so, that exports some class methods, like:
struct Bar {
void f();
void g();
};
I know the exact declaration of these classes, but the definition is compiled into the .so. I'm using the same compiler (gcc >= 7) as the .so, so the name mangling and ABI match. This means that if I add the above declaration to my code, I'll be able to call the methods implemented in libfoo.so directly.
However, I don't want to pollute my namespace with toplevel libfoo stuff (and I don't control libfoo source). So I'd like to declare struct Bar inside namespace foo {}. Now, once I do that, the name mangling no longer matches and the loader will not be able to find the functions in question.
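To see why the names stop matching, here is a minimal sketch of how the Itanium C++ ABI (used by gcc on Linux) mangles a nested name. The helper `mangle_void_method` is hypothetical and only covers non-template functions with no parameters; real symbols also involve substitutions, qualifiers and argument types.

```python
def mangle_void_method(*scopes_and_name):
    """Mangle a nested, non-template function that takes no parameters.

    Each component is written as <length><identifier>, the nested name is
    wrapped in N...E, and a trailing 'v' stands for an empty parameter list.
    """
    if len(scopes_and_name) < 2:
        raise ValueError("expected at least one enclosing scope and a function name")
    body = "".join(f"{len(part)}{part}" for part in scopes_and_name)
    return f"_ZN{body}Ev"


print(mangle_void_method("Bar", "f"))         # _ZN3Bar1fEv
print(mangle_void_method("foo", "Bar", "f"))  # _ZN3foo3Bar1fEv
```

Wrapping the declaration in namespace foo adds the `3foo` component, so the loader looks for a symbol that libfoo.so never exported.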
I do not want to write thunks for all of them as a) there's thousands and b) it causes issues with virtual functions and destructors.
I'm thinking of using objcopy and renaming all exported symbols in libfoo.so, but hoping there's a better solution here.
Both libfoo and my code are using C++14 (can be moved to C++17) and gcc 7 (can be moved to later). Compiled for 64bit ELF on Linux.
It ended up being a lot more complex than expected, but I did manage. If libfoo.so has the function Bar::f() (mangled as _ZN3Bar1fEv), and I want to define foo::Bar::f() (_ZN3foo3Bar1fEv), I first need to find the relative offset of the function in libfoo.so. With nm you get something like:
0000000000abcd00 T _ZN3Bar1fEv
I have a script that parses this output, and generates assembly code like this:
# Export start of the translation space
.global START
START:
    nop
.global _ZN3foo3Bar1fEv              # Desired target name
_ZN3foo3Bar1fEv:
    movq $0x0000abcd12345678, %rax   # arbitrary constant
    addq $0xabcd00, %rax             # offset from libfoo
    pushq %rax
    ret
# same for every other needed symbol
# Export end of the translation space
.global END
END:
    nop
This defines a "naked" function with the desired name, which adds an arbitrary constant to the offset and jumps to it. The extra jump is a perf hit I can live with.
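The generator script itself is not shown, so the following is only a sketch of what it might look like. The function names and the explicit old-to-new rename table are assumptions; deriving the new mangled name automatically is harder than inserting `3foo`, because substitution references such as `S_` shift when a namespace is added.

```python
PLACEHOLDER = 0x0000ABCD12345678  # same arbitrary constant as in the text


def parse_nm(nm_output):
    """Return {symbol: offset} for global text-section ('T') symbols."""
    symbols = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "T":
            symbols[fields[2]] = int(fields[0], 16)
    return symbols


def generate_stubs(offsets, renames):
    """Emit one trampoline per (old mangled name -> new mangled name) pair."""
    lines = [".text", ".global START", "START:", "    nop"]
    for old, new in renames.items():
        lines += [
            f".global {new}",
            f"{new}:",
            f"    movq ${PLACEHOLDER:#018x}, %rax  # patched with the load base later",
            f"    addq ${offsets[old]:#x}, %rax  # offset of {old} in libfoo.so",
            "    pushq %rax",
            "    ret",
        ]
    lines += [".global END", "END:", "    nop"]
    return "\n".join(lines) + "\n"


offsets = parse_nm("0000000000abcd00 T _ZN3Bar1fEv\n                 U malloc\n")
asm = generate_stubs(offsets, {"_ZN3Bar1fEv": "_ZN3foo3Bar1fEv"})
print(asm)
```

In practice the nm output would come from running nm on libfoo.so, and the rename table would have to be produced by a real mangler or demangler rather than by hand.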
The arbitrary constant is needed to deal with ASLR - the 0xabcd00 is the base offset, but when libfoo.so is loaded to some base address (e.g. 0x04000000) the actual address of the function (i.e. if you wanted a function pointer) is 0x04abcd00. When generating the asm code, I used a temporary constant that I'll fix up after loading:
const uintptr_t offset = 0xabcd00; // Expected offset in libfoo.so
uintptr_t real = (uintptr_t) dlsym(RTLD_DEFAULT, "_ZN3Bar1fEv"); // real address with ASLR
uintptr_t aslr_base = real - offset; // This is the actual base to be applied.
// Search everything between START and END and update the arbitrary constant
extern "C" void START(); extern "C" void END();
uint8_t *p = (uint8_t*)&START;
uint8_t *end = (uint8_t*)&END;
mprotect(ALIGN_TO_PAGE(p), end-p+PAGE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC); // mark the memory as writable
// update all instances of the arbitrary constant
// stop early enough that the 8-byte read never runs past END
while (p + sizeof(uintptr_t) <= end)
{
    if (*(uintptr_t*)p == 0x0000abcd12345678)
        *(uintptr_t*)p = aslr_base;
    p++;
}
You can then restore the memory protections. The asm can be optimized to not use rax by doing calculations using just push and mov with indirect addressing using rsp, but this works for my case.
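The fix-up can be checked offline on a copy of the stub bytes. This is a sketch that simulates the patch on a Python bytearray rather than on live memory; `patch_placeholders` is a hypothetical name, and the byte sequence is the standard x86-64 encoding of the four instructions in the stub (48 B8 imm64, 48 05 imm32, 50, C3).

```python
import struct

PLACEHOLDER = 0x0000ABCD12345678


def patch_placeholders(code, load_base):
    """Replace every little-endian 64-bit PLACEHOLDER in code with load_base."""
    needle = struct.pack("<Q", PLACEHOLDER)
    replacement = struct.pack("<Q", load_base)
    count = 0
    pos = code.find(needle)
    while pos != -1:
        code[pos:pos + 8] = replacement
        count += 1
        pos = code.find(needle, pos + 8)
    return count


# movabs $PLACEHOLDER, %rax ; add $0xabcd00, %rax ; push %rax ; ret
stub = bytearray(
    b"\x48\xb8" + struct.pack("<Q", PLACEHOLDER)
    + b"\x48\x05" + struct.pack("<I", 0xABCD00)
    + b"\x50\xc3"
)
patched = patch_placeholders(stub, 0x04000000)
base, = struct.unpack_from("<Q", stub, 2)     # immediate of the movabs
offset, = struct.unpack_from("<I", stub, 12)  # immediate of the add
print(patched, hex(base + offset))            # 1 0x4abcd00
```

As with the C++ loop, a coincidental occurrence of the placeholder bytes elsewhere between START and END would also be rewritten, which is why the constant should be something unlikely to appear in real code.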
What doesn't work
objcopy allows redefining or injecting static symbols, but has no support for editing the dynamic symbol table. None of the other tools I found support it either.
Linker scripts allow defining aliases (PROVIDE(alias = real)), but only if the target is defined at the time of linking; you cannot create an alias to a dynamic symbol.
Sourcing the ASLR offset from a global variable does not work - The linker kept asking that all code be compiled with -fPIC, which it was. This would have removed the need to modify the code.
Related
I am programming an STM32F413 microcontroller with SystemWorkbench 4 stm32. The interrupt vectors are defined in an assembly startup file as weak aliases, as follows:
.weak TIM1_UP_TIM10_IRQHandler
.thumb_set TIM1_UP_TIM10_IRQHandler,Default_Handler
They are referenced in an object as follows:
g_pfnVectors:
.word _estack
.word Reset_Handler
.word NMI_Handler
.....
.word TIM1_UP_TIM10_IRQHandler
.....
So that the g_pfnVectors is a list of the addresses of the IRQ Handler functions. They are declared as weak aliases, so that if they are not defined by the user, the default handler is used.
I have defined the handler like this:
extern "C" {
void TIM1_UP_TIM10_IRQHandler() {
if (SU_TIM->SR & TIM_SR_UIF) {
SU_TIM->SR &= ~TIM_SR_UIF;
...
}
}
}
This works fine with the normal compiler optimization flags. However, I wanted to see whether I would get smaller and possibly faster code with -flto (mainly to try it out, I don't really need it). But when compiling with -flto, g++ ignores my implementation of the handler and just uses the default handler; my handler isn't in the code at all.
So I tried to force g++ to include the function by adding __attribute__((used)) to the function definition, but it was still not compiled. However if I give it another name, then it was included in the binary. Also if I remove the weak alias and just have a reference to the handler in the startup file, it works too.
So somehow the weak aliases don't work with g++ link time optimization. Maybe someone can tell me what the error is and what I'm doing wrong here.
EDIT:
I have looked at which symbols are created with nm on the resulting .elf File, and the TIM1_UP_TIM10_IRQHandler is exported as a weak symbol with the address of the DefaultHandler. However when viewing just the .o file from the compilation unit containing the TIM1_UP_TIM10_IRQHandler function, it is exported as a symbol in the text section (T). So the linker, for some reason, chooses to keep the weak symbol, even though there is a strong symbol with the same name.
I think you should inform the compiler that it is an interrupt handler, using __attribute__ ((interrupt ("IRQ"))). This is not normally needed, as on the F4 the hardware aligns the stack to 8 bytes by default.
If that does not help, a workaround is to assign the handler to a function pointer, which will prevent it from being discarded (as long as the pointer itself is not discarded - check with your debugger).
The last resort is to change the .s file with the vector table definitions.
For those looking for this, still, there is apparently a confirmed bug in GCC 7 related to link-time optimization (-flto):
https://bugs.launchpad.net/gcc-arm-embedded/+bug/1747966
I have just run into this, again, with GCC 8 (gcc-arm-none-eabi-8-2019-q3-update release), the behavior is still the same.
The workaround that also works for me (from https://github.com/ObKo/stm32-cmake/issues/78) is to remove or comment the weak definitions at the end of the startup_XXX.s file, so change, for example
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
to
/*
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
*/
and replace them with your own implementation in a source file:
void NMI_Handler(void)
{
//...
}
All weak handlers that are actually called need to be removed. For example, if you have UART1_Handler() defined in the HAL/LL drivers, you need to remove the corresponding .weak entry from the startup_XXX.s file. Otherwise the interrupt will lock up the MCU by getting stuck in the default infinite loop, instead of executing the intended interrupt handler and returning from the interrupt so that other code can resume.
This bug is still present in gcc-arm-none-eabi-9-2020-q3-update, but only for C handlers. Weirdly enough, handlers written in C++ (and declared with extern "C" linkage) are no longer affected by this bug.
As another workaround, rather than messing with the startup.s file, I found that putting the IRQ handlers in separate .c files and building those (and only those) without LTO does the trick.
For those using CubeIDE and generating IRQ/HAL handlers with CubeMX (aka. "Device Configuration Tool"), all auto-generated handlers are in Core\Src\stm32XXXX_it.c, you just have to edit the properties of this file and remove LTO from the compilation options.
This is suboptimal, but it fits well with auto-generated IRQ/HAL handlers: only the first call (from IRQ handler to HAL handler) is unoptimized, but the HAL code itself is correctly optimized.
AVR g++ has a pointer size of 16 bits. However, my particular chip (the ATmega2560) has 256 KB of flash (program memory). To support this, the compiler automatically generates trampoline sections in the same section of ROM as the currently executing code, which then contain the extended assembly code to jump into high memory or back. In order for trampolines to be generated, you must take the address of something that sits in high memory.
In my scenario, I have a bootloader that I have written sitting in high memory. The application code needs to be able to call a function in the bootloader. I know the address of this function and need to be able to directly address it by hard-coding the address in my code.
How can I get the compiler/linker to generate the appropriate trampoline for an arbitrary address?
The compiler and linker will only generate trampoline code when the far address is a symbolic address rather than a literal constant already in the code. Something like this (assuming the address you want to jump to is 0x20000):
extern void (*farfun)() = 0x20000;
farfun ();
Will definitely not work, it doesn't cause the linker to do anything because the address is already resolved.
You should be able to inject the symbol address in the linker command line like so:
extern void farfun ();
farfun ();
compiling "normally" and linking with
-Wl,--defsym,farfun=0x20000
I think it's clear that you need to make sure yourself that something sensible sits at farfun.
You will most probably also need --relax.
EDIT
Never tried this myself, but maybe:
You could probably try to store the function address in a table in high memory and declare it like this:
extern void (*farfunctable [10])();
(farfunctable [0])();
and use the very same linker command to resolve the external symbol. Your table at 0x20000 (in the bootloader) then needs to look like this:
extern void func1();
extern void func2();
void (*farfunctable [10])() = {
    func1,
    func2,....
};
I would recommend putting func1() ... func10() in a different module from farfunctable in order to make the linker know it has to generate trampolines.
I was planning on putting a dispatch struct (that is, a struct with function pointers to all the various functions). Your solution works well, but requires knowing all of the locations of all of the functions ahead of time. Is there a way to execute a function call to a far address that isn't known at compile time?
[...] My goal was to put the struct with pointers to the functions in a fixed location. That way, it would be a single thing that needed a fixed address rather than every external function.
So you have two applications, let's call them App and Boot, where Boot provides some functionalities that App wants to use. The following problems have to be addressed:
How to get addresses from Boot into App.
How to build a jump table for Boot.
Avoid constructs that will crash when App tries to use code from Boot, like: Using indirect calls or jumps, using static constructors or using static storage in Boot.
App uses Addresses of boot.elf directly
Linking with -Wl,-R,boot.elf
A simple way would be to just link app.elf against boot.elf by means of -Wl,-R,boot.elf. Option -R instructs the linker to use symbol values from the specified file without dragging in any code. The problem is that there's no way to specify which symbols to use; for example, this might lead to a situation where App uses libgcc functions from Boot.
Defining Symbols by means of -Wl,--defsym,symbol=value
A bit more control over which symbols are being defined can be implemented by following a specific naming convention. Suppose that all symbols from Boot that have "boot" in their name should be "exported", then you could just
> avr-nm -g boot.elf | grep ' T ' | awk '/boot/ { printf("--defsym %s=0x%s\n",$3,$1) }' > syms.opt
This prints global symbol values, and grep keeps only the symbols in the text section. awk then transforms lines like 00020102 T boot1 into lines like --defsym boot1=0x00020102, which are written to an option file syms.opt. The option file can then be provided to the linker by means of -Wl,@syms.opt.
The advantage of an option file is that it is easier to provide than plain options in a build environment like make: app.elf would depend (amongst others) on syms.opt, which in turn would depend on boot.elf.
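If the build already uses Python, the same filtering can be done without the shell pipeline. This is a sketch only: `defsym_options` is a hypothetical helper, and unlike the awk pattern it matches "boot" against the symbol name rather than the whole line.

```python
def defsym_options(nm_output, needle="boot"):
    """Turn `avr-nm -g` lines for text symbols into --defsym options."""
    options = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "T" and needle in fields[2]:
            address, _, symbol = fields
            options.append(f"--defsym {symbol}=0x{address}")
    return options


sample = "00020102 T boot1\n00020140 T boot2\n00020200 T __udivmodhi4\n00800200 D boot_state\n"
for option in defsym_options(sample):
    print(option)
# --defsym boot1=0x00020102
# --defsym boot2=0x00020140
```

The sample lines other than boot1 are made up for illustration; writing the resulting lines to syms.opt gives the same option file as before.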
Defining Symbols in a Linker Script Snippet
An alternative would be to define the symbols in a linker script augmentation, which you would provide by means of -T syms.ld during link and which would contain
"boot1"=ABSOLUTE(0x00020102);
"boot2"=...
...
INSERT AFTER .text
Defining Symbols in an Assembly Module
Yet another way to define the symbols would be by means of an assembly module which contains definitions like .global boot1 together with boot1 = 0x00020102.
All these approaches have in common that all symbols must be defined, or otherwise the linker will throw an undefined symbol error. This means boot.elf must be available, and it does not matter whether just one symbol is undefined or whether dozens of symbols are undefined.
Let Boot provide a Dispatch Table
The problem with using boot.elf directly, like lined out in the previous section, is that it introduces a direct dependency. This means that if Boot is improved or refactored, then you'll also have to re-compile App each time, even if the interface did not change.
A solution is to let Boot provide a dispatch table whose position and layout are known ahead of time. Only when the interface itself changes, App will have to be rebuilt. Just refactoring Boot will not require to re-build App.
The Assembly Module with the Jump Table
As explained in the "Crash" section below, addresses in a dispatch table (and hence indirect jumps) won't work because EIND has a wrong value. Therefore, let's assume we have a table of JMPs to the desired Boot functions, like in an assembly module boot-table.sx that reads:
;;; Linker description file boot.ld locates input section .boot.table
;;; right after .vectors, hence the address of boot_table will be
;;; text-section-start + _VECTORS_SIZE, where the latter is
;;; #define'd in <avr/io.h>.
;;; No "x" section flag so that the linker won't relax JMPs to RJMPs.
.section .boot.table,"a",@progbits
.global boot_table
.type boot_table,@object
boot_table:
    jmp boot1
    jmp boot2
.size boot_table, .-boot_table
In this example, we are going to locate the jump table right after .vectors, so that its location is known ahead of time. The respective symbol definitions in App's syms.opt will then read
--defsym boot1=0x20000+vectors_size+0*4
--defsym boot2=0x20000+vectors_size+1*4
provided Boot is located at 0x20000. Symbol vectors_size can be defined in a C/C++ module, here by abusing avr-gcc attribute "address":
#include <avr/io.h>
__attribute__((__address__(_VECTORS_SIZE)))
char vectors_size;
Locating the Jump Table
In order to locate input section .boot.table, we need our own linker description file, which you might already use for Boot anyway. We start with a linker script from the avr-gcc installation at ./avr/lib/ldscripts/avr6.xn, copy it to boot.ld, and add the following 2 lines after vectors:
...
.text :
{
*(.vectors)
KEEP(*(.vectors))
*(.boot.table)
KEEP(*(.boot.table))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
...
Auto-Generating Boot's Jump Table Module and the Symbols for App
It's highly advisable to have an interface description used by both App and Boot, say common.h. Moreover, in order to keep Boot's boot-table.sx and App's syms.opt in sync with the interface, it's a good idea to auto-generate these two files from common.h. To that end, assume that common.h reads:
#ifndef COMMON_H
#define COMMON_H
#define EX __attribute__((__used__,__externally_visible__))
EX int boot1 /* #boot_table:0 */ (int);
EX int boot2 /* #boot_table:1 */ (void);
#endif /* COMMON_H */
For the sake of simplicity, let's assume that this is C code or that the interfaces are extern "C", so that the symbols in source code match the assembly names and there's no need to use mangled names. It's easy enough to generate boot-table.sx and syms.opt from common.h using the magic comments. The magic comment follows directly after the symbol, so a regex can retrieve the token to the left of the magic comment, something like this in Python:
import re
import sys

# ... symbol /* #boot_table:index */...
pat = re.compile (r".*(\b\w+\b)\s*/\* #boot_table:(\d+) \*/.*")
for line in sys.stdin.readlines():
    match = re.match (pat, line)
    if match:
        index = int (match.group(2))
        symbol = match.group(1)
Output template for syms.opt would be something like:
asm_line = "--defsym {symbol}=0x20000+vectors_size+4*{index}\n"
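Putting the pieces together, a generator could look roughly like the sketch below. It reuses the regex and the syms.opt template from above; the function names, the check for contiguous indices and the 0x20000 default are assumptions, and only the jmp lines of boot-table.sx are rendered.

```python
import re

PAT = re.compile(r".*(\b\w+\b)\s*/\* #boot_table:(\d+) \*/.*")


def parse_interface(header_text):
    """Return the exported symbols ordered by their #boot_table index."""
    entries = {}
    for line in header_text.splitlines():
        match = PAT.match(line)
        if match:
            entries[int(match.group(2))] = match.group(1)
    if sorted(entries) != list(range(len(entries))):
        raise ValueError("boot_table indices must be unique and contiguous from 0")
    return [entries[i] for i in range(len(entries))]


def render_syms_opt(symbols, boot_start=0x20000):
    """One --defsym line per symbol; each JMP slot is 4 bytes wide."""
    return "".join(
        f"--defsym {sym}={boot_start:#x}+vectors_size+4*{i}\n"
        for i, sym in enumerate(symbols)
    )


def render_jump_table(symbols):
    """The jmp lines that go under boot_table: in boot-table.sx."""
    return "".join(f"    jmp {sym}\n" for sym in symbols)


COMMON_H = """\
#define EX __attribute__((__used__,__externally_visible__))
EX int boot1 /* #boot_table:0 */ (int);
EX int boot2 /* #boot_table:1 */ (void);
"""
symbols = parse_interface(COMMON_H)
print(render_syms_opt(symbols), end="")
print(render_jump_table(symbols), end="")
```

Because both files are rendered from the same ordered list, a slot index can never refer to different functions in App and Boot.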
Code that will crash
Using Boot code from App is subject to several restrictions:
Indirect Calls and Jumps
These will crash because the start addresses of App resp. Boot are in different 128KiB segments of flash. When the address of a code symbol is taken, the compiler does this per gs(symbol) which instructs the linker to generate a stub and resolve gs() to that stub in .trampolines if the target address is outside the 128KiB segment where the trampolines are located. An explanation of gs() can be found in this answer, there is however more to it: The startup code will effectively initialize
EIND = __vectors >> 17;
see gcrt1.S, the AVR-LibC bits of start-up code crt<device>.o. The compiler assumes EIND never changes during execution, see EIND and more than 128KiB of Flash in the GCC documentation.
This means code in Boot assumes EIND = 1 but is called with EIND = 0 and hence EICALL resp. EIJMP will target the wrong address. This means common code must avoid indirect calls and jumps, and should be compiled with -fno-jump-tables so that switch/case won't generate such tables.
This also implies that the dispatch table described above won't work if it just held gs(symbol) entries, because App and Boot will disagree on EIND.
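The EIND mismatch can be made concrete with a little address arithmetic. This sketch assumes the documented behaviour of EICALL/EIJMP (the target word address is EIND in the upper bits and Z in the lower 16 bits) and uses 0x20102 purely as an example byte address in Boot; the helper names are hypothetical.

```python
def eind_for(byte_address):
    """EIND as set by the startup code: byte address of __vectors >> 17."""
    return byte_address >> 17


def eicall_target(eind, z):
    """Byte address reached by EICALL/EIJMP for the given EIND and Z."""
    word_address = (eind << 16) | (z & 0xFFFF)
    return word_address * 2


target = 0x20102                 # example code address inside Boot
z = (target // 2) & 0xFFFF       # what fits into the 16-bit Z register

boot_eind = eind_for(0x20000)    # Boot starts at 0x20000
app_eind = eind_for(0x00000)     # App starts at 0

print(hex(eicall_target(boot_eind, z)))  # 0x20102
print(hex(eicall_target(app_eind, z)))   # 0x102
```

With App's EIND the indirect call lands at 0x102 inside App's own segment instead of in Boot, which is the crash described above. A direct JMP or CALL encodes the full address and is unaffected.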
Data in Static Storage
If common Boot code is using data in static storage, the data might collide with App's static storage. One way out is to avoid static storage in the respective parts of Boot and to pass addresses of, say, some data buffer by means of pointer arguments to the respective functions.
One could have completely separate RAM areas; one for Boot and one for App, but that would be a waste of RAM because the applications will never run at the same time.
Static Constructors
Boot's static constructors will be bypassed if App uses code from Boot. This includes:
C++ code in Boot that explicitly or implicitly generates such constructors.
C/C++ code in Boot that relies on __attribute__((__constructor__)) or code in section .initN which is supposed to run prior to main.
Start-up code that initializes static storage, EIND etc., which is also run by locating it in some .initN sections, but will be bypassed if App calls Boot code.
In C, let's say you have a variable called variable_name. Let's say it's located at 0xaaaaaaaa, and at that memory address, you have the integer 123. So in other words, variable_name contains 123.
I'm looking for clarification around the phrasing "variable_name is located at 0xaaaaaaaa". How does the compiler recognize that the string "variable_name" is associated with that particular memory address? Is the string "variable_name" stored somewhere in memory? Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Variable names don't exist anymore after the compiler runs (barring special cases like exported globals in shared libraries or debug symbols). The entire act of compilation is intended to take those symbolic names and algorithms represented by your source code and turn them into native machine instructions. So yes, if you have a global variable_name, and compiler and linker decide to put it at 0xaaaaaaaa, then wherever it is used in the code, it will just be accessed via that address.
So to answer your literal questions:
How does the compiler recognize that the string "variable_name" is associated with that particular memory address?
The toolchain (compiler & linker) works together to assign a memory location for the variable. It's the compiler's job to keep track of all the references, and the linker puts in the right addresses later.
Is the string "variable_name" stored somewhere in memory?
Only while the compiler is running.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Yes, that's pretty much what happens, except it's a two-stage job with the linker. And yes, it uses memory, but it's the compiler's memory, not anything at runtime for your program.
An example might help you understand. Let's try out this program:
int x = 12;
int main(void)
{
return x;
}
Pretty straightforward, right? OK. Let's take this program, and compile it and look at the disassembly:
$ cc -Wall -Werror -Wextra -O3 example.c -o example
$ otool -tV example
example:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl 0x00000096(%rip),%eax
0000000100000f6a popq %rbp
0000000100000f6b ret
See that movl line? It's grabbing the global variable (in an instruction-pointer relative way, in this case). No more mention of x.
Now let's make it a bit more complicated and add a local variable:
int x = 12;
int main(void)
{
volatile int y = 4;
return x + y;
}
The disassembly for this program is:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl $0x00000004,0xfc(%rbp)
0000000100000f6b movl 0x0000008f(%rip),%eax
0000000100000f71 addl 0xfc(%rbp),%eax
0000000100000f74 popq %rbp
0000000100000f75 ret
Now there are two movl instructions and an addl instruction. You can see that the first movl is initializing y, which it's decided will be on the stack (base pointer - 4). Then the next movl gets the global x into a register eax, and the addl adds y to that value. But as you can see, the literal x and y strings don't exist anymore. They were conveniences for you, the programmer, but the computer certainly doesn't care about them at execution time.
A C compiler first creates a symbol table, which stores the relationship between the variable name and where it's located in memory. When compiling, it uses this table to replace all instances of the variable with a specific memory location, as others have stated. You can find a lot more on it on the Wikipedia page.
All variables are substituted by the compiler. First they are substituted with references and later the linker places addresses instead of references.
In other words, the variable names are no longer available once the compiler has run.
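A toy model may make the two stages easier to picture. This is only an illustration, not how any real compiler is structured: `compile_unit` and `link` are made-up names, the "instructions" are plain tuples, and addresses are handed out 4 bytes apart starting at 0xaaaaaaaa.

```python
def compile_unit(statements):
    """'Compile' (op, name) pairs into code plus a list of pending references."""
    code, relocations = [], []
    for op, name in statements:
        relocations.append((len(code), name))  # remember where the name was used
        code.append([op, None])                # address not known yet
    return code, relocations


def link(code, relocations, base=0xAAAAAAAA, size=4):
    """Assign an address to each name and patch it into the code."""
    addresses = {}
    for index, name in relocations:
        if name not in addresses:
            addresses[name] = base + size * len(addresses)
        code[index][1] = addresses[name]
    return [tuple(instruction) for instruction in code]


source = [("load", "variable_name"), ("store", "other"), ("load", "variable_name")]
program = link(*compile_unit(source))
for op, address in program:
    print(op, hex(address))
# load 0xaaaaaaaa
# store 0xaaaaaaae
# load 0xaaaaaaaa
```

The final program contains only operations and addresses; the strings "variable_name" and "other" lived in the tool's own dictionaries and are gone once it exits.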
This is what's called an implementation detail. While what you describe is the case in all compilers I've ever used, it's not required to be the case. A C compiler could put every variable in a hashtable and look them up at runtime (or something like that) and in fact early JavaScript interpreters did exactly that (now, they do Just-In-Time compilation that results in something much more raw).
Specifically for common compilers like VC++, GCC, and LLVM: the compiler will generally assign a variable to a location in memory. Variables of global or static scope get a fixed address that doesn't change while the program is running, while variables within a function get a stack address, that is, an address relative to the current stack pointer, which changes every time a function is called. (This is an oversimplification.) Stack addresses become invalid as soon as the function returns, but have the benefit of having effectively zero overhead to use.
Once a variable has an address assigned to it, there is no further need for the name of the variable, so it is discarded. Depending on the kind of name, the name may be discarded at preprocess time (for macro names), compile time (for static and local variables/functions), and link time (for global variables/functions.) If a symbol is exported (made visible to other programs so they can access it), the name will usually remain somewhere in a "symbol table" which does take up a trivial amount of memory and disk space.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it
Yes.
and if so, wouldn't it have to use memory in order to make that substitution?
Yes. But it's the compiler, after it compiled your code, why do you care about memory?
I want to ask order of function signature, call and definition
like, which one would the computer look first, second and third
So:
#include <iostream>
using namespace std;
void max(void);
void min(void);
int main() {
    max();
    min();
    return 0;
}
void max() {
return;
}
void min() {
return;
}
So this is what I think:
the computer will go to main and look at the function call, then it will look at the
function signature, and last, it will look at the definition.
Is that right?
Thanks
Is that right?
No.
You need to understand the difference between function declarations and function definitions, the difference between compilation, linking, and execution, and the difference between non-virtual and virtual functions.
Function declarations
This is a function declaration: void max(void);. It doesn't tell the compiler anything about what the function does. What it does is to tell the compiler how to call the function and how to interpret the result. When the compiler is compiling the body of some function, call it function A, the compiler doesn't need to know what other functions do. All it needs to know is what to do with the functions that function A calls. The compiler might generate code in assembly or some intermediate language that corresponds to your C++ function calls. Or it might reject your C++ code because your code doesn't make sense.
Determining whether your code makes sense is another key purpose of those function declarations. This is particularly important in C++ where multiple functions can have the same name. How would the compiler know which of the half dozen or so max functions to call if it didn't know about those functions? When your C++ code calls some function, the compiler must find one best match (possibly involving type conversions) with one of those function declarations. Your code doesn't make sense if the compiler can't find a match at all, or if it finds more than one match but can't distinguish one as the best match.
When the compiler does find a best match, the generated code will be in the form of a call to an undefined external reference to that function. Where that function lives is not the job of the compiler.
Function definitions
That void max(void) was a function declaration. The corresponding void max() {...} is the definition of that function. When the compiler is processing void max() {...} it doesn't have to worry about what other functions have called it. It just has to worry about processing void max() {...}. The body of this function becomes assembly or intermediate language code that is inserted into some compiled object file. The compiler marks the address of the entry point to this generated code as such.
Compilation versus linking
So far I've talked about what the compiler does. It generates chunks of low-level code that correspond to your C++ code. That generated code is not ready for prime time because of those external references. Resolving those undefined external references is the job of the linker. The linker is what builds your executable from multiple object files and libraries. It keeps track of where it has put those chunks of code in the executable. What about those undefined external references? If the linker has already placed the definition for a reference in the executable, it simply fills in the placeholder for that reference. If the linker hasn't come across the definition yet, it puts the reference and the placeholder onto a list of still-unresolved references. Every time the linker adds a chunk of code to the executable, it checks that list to see if it can fix any of those still-unresolved references. At the end, you will either have all references resolved or you will still have some outstanding ones. The latter is an error. The former means that you have an executable.
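The bookkeeping described here can be sketched in a few lines. This is a toy, not a real linker: `link_objects` is a made-up name, each "object file" is just a dict of defined symbols (with sizes) and referenced symbols, and placing a definition simply means assigning it the next free address.

```python
def link_objects(objects):
    """Place definitions in order and track still-unresolved references."""
    addresses, unresolved, next_address = {}, set(), 0
    for obj in objects:
        for name, size in obj["defines"].items():
            if name in addresses:
                raise ValueError(f"multiple definition of {name}")
            addresses[name] = next_address
            next_address += size
            unresolved.discard(name)       # a pending reference is now fixed
        for name in obj["references"]:
            if name not in addresses:
                unresolved.add(name)       # definition not seen yet
    if unresolved:
        raise LookupError("undefined reference to: " + ", ".join(sorted(unresolved)))
    return addresses


main_o = {"defines": {"main": 16}, "references": ["max", "min"]}
funcs_o = {"defines": {"max": 8, "min": 8}, "references": []}
layout = link_objects([main_o, funcs_o])
print(layout)  # {'main': 0, 'max': 16, 'min': 24}
```

Note that main refers to max and min before the linker has seen them, and that is fine: the references wait on the unresolved list until the second object supplies the definitions. Leaving funcs_o out would raise the familiar "undefined reference" error instead.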
Execution
When your code runs, those function calls are really just some stack management wrapped around the machine language equivalent of that evil goto statement. There's no examining your function declarations; those don't even exist by the time the code is executed. Return? That's a goto also.
Non-virtual versus virtual functions
What I said above pertains to non-virtual functions. Run-time dispatching does occur for virtual functions. That run-time dispatching has nothing to do with examining function declarations. Those virtual functions are perhaps an issue for a different question.
One last thing:
Get out of the habit of using namespace std; Think of it as akin to smoking. It's a bad habit.
As you may know, the compiler converts the program into machine code (via several intermediate steps). Here is the disassembly of the machine code for main() when compiled with Visual Studio 2012 in debug mode on Windows 8:
int main() {
00C24400 push ebp # Setup stack frame
00C24401 mov ebp,esp
00C24403 sub esp,0C0h
00C24409 push ebx
00C2440A push esi
00C2440B push edi
00C2440C lea edi,[ebp-0C0h] # Fill with guard bytes
00C24412 mov ecx,30h
00C24417 mov eax,0CCCCCCCCh
00C2441C rep stos dword ptr es:[edi]
max();
00C2441E call max (0C21302h) # Call max
min();
00C24423 call min (0C2126Ch) # Call min
return 0;
00C24428 xor eax,eax
}
00C2442A pop edi # Restore stack frame
00C2442B pop esi
00C2442C pop ebx
00C2442D add esp,0C0h
00C24433 cmp ebp,esp
}
00C24435 call __RTC_CheckEsp (0C212D5h) # Check for memory corruption
00C2443A mov esp,ebp
00C2443C pop ebp
00C2443D ret
The exact details will vary from compiler to compiler and operating system to operating system. If min() or max() had arguments or return values, they would be passed as appropriate for the architecture. The key point is that the compiler has already worked out what the arguments and return values are and created machine code that just passes or accepts them.
You can learn more details if you wish to help with debugging or to do low level calls but be aware that the machine code emitted can be highly variable. For example, here is the same code compiled on the same system in release mode (i.e. with optimizations on):
return 0;
01151270 xor eax,eax
}
01151272 ret
As you can see, it has detected that min() and max() do nothing and removed them completely. Since there is now no stack frame to setup and restore, that is gone, leaving a single instruction to set eax to 0 then returning (since the return value is in the eax register).
I have a program which loads DLLs and I need to call one of the non-exported functions it contains. Is there any way I can do this, via searching in a debugger or otherwise? Before anyone asks, yes I have the prototypes and stuff for the functions.
Yes there is, at least sort of, but it isn't a good idea.
In C/C++, all a function pointer is, is an address in memory. So if you were somehow able to find the address of this function, you could call it.
Let me ask some questions though, how do you know this DLL contains this function? Do you have the source code? Otherwise I don't know how you could know for certain that this function exists or if it is safe to call. But if you have the source code, then just expose the function. If the DLL writer didn't expose this function, they never expect you to call it and can change/remove the implementation at any time.
Warnings aside, if you have debug symbols or a MAP file, you can find the offset of the function in the DLL. If you don't have anything but the DLL, then there is no way to know where that function exists in the DLL - it is not stored in the DLL itself.
Once you have the offset you can then insert that into the code like so:
const DWORD_PTR funcOffset = 0xDEADBEEF;
typedef void (*UnExportedFunc)();
....
void CallUnExportedFunc() {
    // This will get the DLL base address (which can vary)
    HMODULE hMod = GetModuleHandleA("My.dll");
    if (hMod == NULL) return; // the DLL is not loaded in this process
    // Calculate the actual address
    DWORD_PTR funcAddress = (DWORD_PTR)hMod + funcOffset;
    // Cast the address to a function pointer
    UnExportedFunc func = (UnExportedFunc)funcAddress;
    // Call the function
    func();
}
Also realize that the offset of this function WILL CHANGE EVERY TIME the DLL is rebuilt so this is very fragile and let me say again, not a good idea.
I realize this question is rather old, but shf301 has the right idea here. The only thing I would add is to implement a pattern search on the target library. If you have IDA or OllyDbg, you can search for the function and view the binary/hex data which surrounds that function's starting address.
In most cases, there will be some sort of binary signature which rarely changes. The signature may hold wildcards which may change between builds, but ultimately there should be at least one successful hit while searching for this pattern, unless extremely drastic changes have occurred between builds (at which point, you could just figure out the new signature for that particular version).
The way that you would implement a binary pattern search is like so:
// Returns true if the bytes at pData match bMask wherever szMask has an 'x'
bool bCompare(const BYTE* pData, const BYTE* bMask, const char* szMask)
{
    for (; *szMask; ++szMask, ++pData, ++bMask)
        if (*szMask == 'x' && *pData != *bMask)
            return false;
    return true;
}
// Returns the absolute address of the first match, or 0 if there is none
DWORD_PTR FindPattern(DWORD_PTR dwAddress, DWORD_PTR dwLen, const BYTE* bMask, const char* szMask)
{
    const DWORD_PTR maskLen = strlen(szMask);
    // Stop early enough that bCompare never reads past the end of the region
    for (DWORD_PTR i = 0; i + maskLen <= dwLen; i++)
        if (bCompare((const BYTE*)(dwAddress + i), bMask, szMask))
            return dwAddress + i;
    return 0;
}
Example usage:
typedef void (*UnExportedFunc)();
//...
void CallUnExportedFunc()
{
    // This will get the DLL base address (which can vary)
    HMODULE hMod = GetModuleHandleA("My.dll");
    if (hMod == NULL) return;
    // Get module info (MODULEINFO is declared in psapi.h)
    MODULEINFO modinfo = { 0 };
    if (!GetModuleInformation(GetCurrentProcess(), hMod, &modinfo, sizeof(modinfo))) return;
    // This will search the module for the address of a given signature
    DWORD_PTR funcAddress = FindPattern(
        (DWORD_PTR)hMod, modinfo.SizeOfImage,
        (const BYTE*)"\xC7\x06\x00\x00\x00\x00\x89\x86\x00\x00\x00\x00\x89\x86",
        "xx????xx????xx"
    );
    if (funcAddress == 0) return; // signature not found
    // FindPattern already returns an absolute address, so the base is not added again
    // Cast the address to a function pointer
    UnExportedFunc func = (UnExportedFunc)funcAddress;
    // Call the function
    func();
}
This works by passing in the base address of the loaded library (obtained via GetModuleHandle), the length in bytes to search, the binary data to search for, and a mask which specifies which bytes of the binary string must match ('x') and which are to be ignored ('?'). The function then walks through the memory of the loaded module, searching for a match. In some cases there may be more than one match; if so, make your signature longer or more specific until only one match remains.
Again, you would need to do the initial binary search in a disassembly application in order to know what this signature is, but once you have that then this method should work a little better than manually finding the function offset every time the target is built. Hope this helps.
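Since a signature is only useful when it matches exactly once, it can help to count the hits rather than stop at the first one. Below is a minimal, portable sketch of the same masked search over an ordinary byte buffer. The name find_all and its parameters are made up for illustration and are not part of any Windows API. It takes the pattern length from the mask and never reads past the end of the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns every offset in [data, data + len) at which `pattern` matches.
// mask[i] == 'x' means byte i must match; any other character is a wildcard.
// The pattern length is taken from the mask, so pattern bytes may be zero.
std::vector<std::size_t> find_all(const std::uint8_t* data, std::size_t len,
                                  const std::uint8_t* pattern,
                                  const std::string& mask)
{
    std::vector<std::size_t> hits;
    const std::size_t n = mask.size();
    if (n == 0 || n > len) return hits;
    // Stop early enough that the comparison never reads past the buffer.
    for (std::size_t i = 0; i + n <= len; ++i) {
        bool match = true;
        for (std::size_t j = 0; j < n && match; ++j)
            if (mask[j] == 'x' && data[i + j] != pattern[j]) match = false;
        if (match) hits.push_back(i);
    }
    return hits;
}
```

If the returned vector holds more than one offset, the signature is ambiguous and should be extended. If it is empty, the build has probably changed enough that a new signature is needed.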
If the function you want isn't exported, then it won't be in the export address table. Assuming Visual Studio was used to produce this DLL and you have its associated PDB (program database) file, then you can use Microsoft's DIA (debug interface access) APIs to locate the desired function either by name or, approximately, by signature.
Once you have the function (symbol) from the PDB, you will also have its RVA (relative virtual address). You can add the RVA to the loaded module's base address to determine the absolute virtual address in memory where the function is stored. Then, you can make a function call through that address.
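The arithmetic is simply base + RVA. Here is a small portable sketch of that step. Everything in it is a stand-in: hidden_answer plays the non-exported function, module_anchor plays the module's load address, and fake_base/fake_rva simulate the values you would really get from the loader and the PDB. It also assumes a platform where function pointers round-trip through std::uintptr_t, which holds on mainstream desktop targets.

```cpp
#include <cstdint>

// Hypothetical stand-in for the non-exported function we want to reach.
static int hidden_answer() { return 42; }

// Hypothetical stand-in for the module's load address; in real code this
// would come from the loader (for example, the HMODULE value).
static char module_anchor;

using HiddenFn = int (*)();

// Simulates the load base reported by the loader.
std::uintptr_t fake_base()
{
    return reinterpret_cast<std::uintptr_t>(&module_anchor);
}

// Simulates what a symbol lookup would give: the offset from the base.
// Unsigned arithmetic wraps, so this works whichever address is larger.
std::uintptr_t fake_rva()
{
    return reinterpret_cast<std::uintptr_t>(&hidden_answer) - fake_base();
}

// Absolute address = load base + RVA, then cast to the known prototype.
HiddenFn resolve(std::uintptr_t base, std::uintptr_t rva)
{
    return reinterpret_cast<HiddenFn>(base + rva);
}
```

Calling resolve(fake_base(), fake_rva())() then invokes hidden_answer. In the real case the prototype in the typedef must match the target's actual signature and calling convention, or the call will corrupt the stack.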
Alternatively, if this is just a one-off thing that you need to do (i.e. you don't need a programmatic solution), you can use windbg.exe in the Debugging Tools for Windows toolkit to attach to your process and discover the address of the function you care about. In WinDbg, you can use the x command to "examine symbols" in a module.
For example, you can do x mymodule!*foo* to see all functions whose name contains "foo". As long as you have symbols (PDB) loaded for your module, this will show you the non-export functions as well. Use .hh x to get help on the x command.
Even if you can find the function address, it's not in general safe to call a function created by a compiler that thought it was making a "private" internal-use-only function.
Modern compilers with link-time-optimization enabled may make a specialized version of a function that only does what the specific callers need it to do.
Don't assume that a block of machine code that looks like the function you want actually follows the standard ABI and implements everything the source code says.
In gcc's case, it does use special names for specialized versions of a function that aren't inlined but take advantage of a special case (like constant propagation) from multiple callers.
e.g. in this objdump -drwC output (where -C is demangle):
42944c: e8 cf 13 0e 00 call 50a820
429451: 48 8b 7b 48 mov rdi,QWORD PTR [rbx+0x48]
429455: 48 89 ee mov rsi,rbp
429458: e8 b3 10 0e 00 call 50a510
gcc emits code that calls two different clones of the same function, specialized for two different compile-time-constants. (This is from http://endless-sky.github.io/, which desperately needs LTO because even trivial accessor functions for its XY position class are in Point.cpp, not Point.h, so they can only be inlined by LTO.)
LTO can even make .lto_priv static versions of data: like
mov rcx,QWORD PTR [rip+0x412ff7] # 83dbe0 <_ZN12_GLOBAL__N_116playerGovernmentE.lto_priv.898>
So even if you find a function that looks like what you want, calling it from a new place might violate the assumptions that Link-Time-Optimization took advantage of.
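To illustrate the kind of source that can lead to such clones, here is a hypothetical sketch (the names scale, twice and thrice are invented, and the noinline attribute is GCC/Clang-specific). Every caller passes a compile-time constant for factor, so an optimizing GCC build may emit specialized copies with a suffix such as .constprop.0 that no longer take that argument at all. Whether it actually does depends on the compiler version and flags.

```cpp
// Not inlined, but every caller passes a compile-time constant for `factor`.
// With optimization enabled, GCC may replace this with per-constant clones
// whose machine-level signature is effectively int(int).
__attribute__((noinline)) static int scale(int x, int factor)
{
    return x * factor;
}

int twice(int x)  { return scale(x, 2); }
int thrice(int x) { return scale(x, 3); }
```

Someone who located one of those clones in the binary and called it as if it were int scale(int, int) would get the wrong behavior, because the second argument would be silently ignored.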
I'm afraid there is no "safe" way to do so if the library in question does not explicitly export the object (class or function), because you have no idea where the required object is mapped in memory.
However, by using reverse-engineering tools, you can find the offset of the object you are interested in within the library, then add it to the address of any known exported object to obtain the "real" memory location. After that, prepare a matching function prototype and cast the address to it for use in your own code.
The most general way to do this (and it's still a bad idea, as everyone else has pointed out already) is to scan the DLL code at runtime after it's loaded, look for a known, unique section of code in that function, and then use code similar to that in shf301's answer to call it. If you know that the DLL won't ever change, then any solution based on determining the offset in the DLL should work.
To find that unique section of code, disassemble the DLL using a disassembler that can show you the machine code in addition to the assembly language mnemonics (I can't think of anything that won't do that) and watch out for call and jmp instructions.
I actually had to do something similar once to apply a binary patch to a DOS exe; it was a bug fix, and the code wasn't under revision control so that was the only way to fix it.
I'd be really curious to know why you need this, by the way.