How to place a function at a particular address in C? - c++

I want to place a function void loadableSW (void) at a specific location: 0x3FF802. In another function, residentMain(), I will jump to this location using a pointer to function. How should I declare loadableSW to accomplish this? I have attached the skeleton of residentMain for clarity.
Update: The target hardware is a TMS320C620x DSP. Since this is an aerospace project, deterministic behaviour is a desirable design objective. Ideally, they would like to know what portion of memory contains what at any particular time. The solution, as I just learned, is to define a section in memory in the linker file. The section shall start at 0x3FF802 (the location where the function is to be placed). Since the size of the loadableSW function is known, the size of the memory section can also be determined. The directive #pragma CODE_SECTION (function_name, "section_name") can then place that function in the specified section.
Since pragma directives are not permissible in test scripts, I am wondering if there is any other way to do this without using any linker directives.
Besides, I am curious: is there any placement syntax for functions in C++? I know there is one for objects, but what about functions?
void residentMain (void)
{
    void (*loadable_p) (void) = (void (*) (void)) 0x3FF802;
    int hardwareOK = 0;
    /*Code to check hardware integrity. hardwareOK = 1 if success*/
    if (hardwareOK)
    {
        loadable_p (); /*Jump to Loadable Software*/
    }
    else
    {
        dspHalt ();
    }
}

I'm not sure about your OS/toolchain/IDE, but the following answer should work:
How to specify a memory location at which function will get stored?
There is just one way I know of and it is shown in the first answer.
UPDATE
How to define sections in gcc:
variables:
http://mcuoneclipse.com/2012/11/01/defining-variables-at-absolute-addresses-with-gcc/
methods (section ("section-name")): http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Function-Attributes.html#Function%20Attributes
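As a rough illustration of the GCC section attribute linked above, here is a minimal sketch. It assumes GCC or Clang on an ELF target, not the TI toolchain from the question. The section name .loadable_sw, the counter, and residentMainSketch are hypothetical names introduced for the example. The attribute only names the section; the address itself would still come from the linker script, as hinted in the comment.

```cpp
#include <cassert>

// Assumption: GCC/Clang on an ELF target. ".loadable_sw" is a made-up
// section name. The attribute alone does not fix the address; a GNU ld
// linker script would have to pin the section, for example:
//   SECTIONS { .loadable_sw 0x3FF802 : { KEEP(*(.loadable_sw)) } }
#if defined(__GNUC__) && defined(__ELF__)
#define IN_LOADABLE_SECTION __attribute__((section(".loadable_sw"), used, noinline))
#else
#define IN_LOADABLE_SECTION /* no section placement on other toolchains */
#endif

// Hypothetical counter so the effect of the call can be observed.
static int loadable_calls = 0;

// The function that should live in the dedicated section.
extern "C" IN_LOADABLE_SECTION void loadableSW(void) { ++loadable_calls; }

// Mirrors the shape of residentMain, but takes the function's address from
// the symbol instead of a hard-coded constant so it can run on a host.
int residentMainSketch(bool hardwareOK) {
    void (*loadable_p)(void) = &loadableSW;
    if (hardwareOK) {
        loadable_p();  // jump to the loadable software
    }
    return loadable_calls;
}
```

On a host build, objdump -h or readelf -S on the object file should list the extra section, which is a quick way to check that the attribute took effect.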

How to place a function at a particular address in C?
Since pragma directives are not permissible in test scripts, I am wondering if there is any other way to do this without using any linker directives.
If your target supports PC-relative addressing and you can ensure it is pure, then you can use a memcpy() to relocate the routine.
How to run code from RAM... has some hints on this. If you cannot generate PC-relative/relocatable code, then you absolutely cannot do this without the help of the linker. That is the definition of a linker/loader: to fix up addresses.
Which can take you to a different concept. Do not fully link your code. Instead defer the address fixup until loading. Then you must write a loader to place the code at run-time; but from your aerospace project comment, I think that complexity and analysis are also important so I don't believe you would accept that. You also need double the storage, etc.

Related

C++ LinkTime/CompileTime Generate Function Offset From Start Of .Text Section Or Other Reference Point

So I need a way to get the offset of a function from the start of its PE file's .text region (or whatever section it is in), or relative to another function within the file.
I'd like to do something similar:
void func_two()
{
    /*...*/
}
void call_our_function()
{
    /*...*/
}
void main_loop()
{
    constexpr int offset_of_two = (int)&func_two - (int)&call_our_function;
    // calls func_two
    (decltype(&func_two)(offset_of_two + (int)&call_our_function))();
    /* OR : */
    void* text_region = find_pe_text_region_start();
    constexpr int offset_from_text = get_offset_from_linker_somehow();
    // calls func_two
    (decltype(&func_two)(offset_from_text + (int)text_region))();
}
constexpr doesn't allow this. I'm assuming it's because the linker sets these values for function addresses etc. at link time. However, I know that the linker could theoretically do this, otherwise export tables and RVAs in the PE file wouldn't work. I know I could export them and parse the export table, but that doesn't particularly work for my use case.
Anybody know of any ways to solve this problem, without calculating them at runtime? Maybe a plugin for the linker, however I doubt MSVC supports that. Very specific use I have here.
Function pointers are a separate class of pointers, and you can only cast them to other function pointers. They may be larger than uintptr_t and certainly will be larger than int on common 64-bit architectures. Using int is totally UB. Using uintptr_t would at least bring it up to implementation-defined behavior.
But you are right that the values are only going to be available at link time. Until you link the compiler has no idea where in memory the functions will end up and thus can't know the offsets between them.
So there is no way of making this constexpr. It should become link time evaluated though. The object format (at least ELF) allows encoding the difference between 2 symbols and other simple math and the linker will compute the actual value at link time. There should be no runtime overhead for this.
PS: declare the offsets global and check if the resulting binary contains them as constants or computes them in the init_array / ctors. The local variables might compute them at runtime because that doesn't require defining an extra constant.
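To make the suggestion above concrete, here is a minimal sketch that assumes a platform where function pointers fit in std::uintptr_t (true for mainstream GCC, Clang, and MSVC targets, but not guaranteed by the standard). The function bodies and call_via_offset are placeholders added for the example. The offset is a namespace-scope constant rather than a constexpr; whether it ends up as a link-time constant or is computed by a startup initializer depends on the toolchain, which is what inspecting the binary would tell you.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder functions; the different return values keep a linker from
// folding them into one body.
int func_two() { return 2; }
int call_our_function() { return 1; }

// Global offset between the two functions. Unsigned arithmetic wraps
// around, so the subtraction is well defined even if func_two sits at the
// lower address. Assumption: function pointers convert to uintptr_t.
static const std::uintptr_t offset_of_two =
    reinterpret_cast<std::uintptr_t>(&func_two) -
    reinterpret_cast<std::uintptr_t>(&call_our_function);

// Rebuilds the address of func_two from the reference function plus the
// stored offset, then calls it.
int call_via_offset() {
    const std::uintptr_t base =
        reinterpret_cast<std::uintptr_t>(&call_our_function);
    auto fn = reinterpret_cast<decltype(&func_two)>(base + offset_of_two);
    return fn();
}
```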

ELF INIT section code to prepopulate objects used at runtime

I'm fairly new to c++ and am really interested in learning more. Have been reading quite a bit. Recently discovered the init/fini elf sections.
I started to wonder if and how one would use the init section to prepopulate objects that would be used at runtime. Say, for example, you wanted to add performance measurements to your code, recording the time, filename, line number, and maybe some ID (a monotonically increasing int, for example) or name.
You would place for example:
PROBE(0,"EventProcessing",__FILE__,__LINE__)
...... //process event
PROBE(1,"EventProcessing",__FILE__,__LINE__)
......//different processing on same event
PROBE(2,"EventProcessing",__FILE__,__LINE__)
The PROBE could be some macro that populates a struct containing this data (maybe on an array/list, etc using the id as an indexer).
Would it be possible to have code in the init section that could prepopulate all of this data for each PROBE (except for the time of course), so only the time would need to be retrieved/copied at runtime?
As far as I know the __attribute__((constructor)) can not be applied to member functions?
My initial idea was to create some kind of linked list with each node pointing to a probe, and code in the init section could iterate over it, populating the id, file, line, etc. But that idea assumed I could use a member function that runs in the "init" section, and that does not seem possible. Any tips appreciated!
As far as I understand it, you do not actually need an ELF constructor here. Instead, you could emit descriptors for your probes using extended asm statements (using data, instead of code). This also involves switching to a dedicated ELF section for the probe descriptors, say __probes.
The linker will concatenate all the probes into an array and generate the special symbols __start___probes and __stop___probes, which you can use from your program to access these probes. See the last paragraph in Input Section Example.
Systemtap implements something quite similar for its userspace probes:
User Space Probe Implementation
Adding User Space Probing to an Application (heapsort example)
Similar constructs are also used within the Linux kernel for its self-patching mechanism.
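A minimal sketch of the section-based idea follows. It swaps the extended asm statements mentioned above for GCC's section attribute, which is simpler to show, and it assumes GCC or Clang with a GNU-style linker on ELF. The Probe struct, the PROBE macro, and the helper functions are hypothetical. The section holds pointers to the descriptors rather than the descriptors themselves, so the entries are guaranteed to be packed back to back.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical descriptor; the timestamp would be filled in at run time.
struct Probe {
    int id;
    const char* name;
    const char* file;
    int line;
};

// Symbols generated by the linker for a section named "__probes".
// Assumption: GNU ld, gold, or lld on an ELF target.
extern "C" const Probe* __start___probes[];
extern "C" const Probe* __stop___probes[];

// Emits a static descriptor plus a pointer to it in the "__probes"
// section. No constructor runs; the data is laid out at link time.
#define PROBE(ID, NAME)                                                   \
    do {                                                                  \
        static const Probe desc = {ID, NAME, __FILE__, __LINE__};         \
        static const Probe* ptr                                           \
            __attribute__((section("__probes"), used)) = &desc;           \
        (void)ptr;                                                        \
    } while (0)

void process_event() {
    PROBE(0, "EventProcessing");
    PROBE(1, "EventProcessing");
}

// Number of probes the linker collected.
std::size_t probe_count() {
    return static_cast<std::size_t>(__stop___probes - __start___probes);
}

// Linear search over the collected descriptors.
const Probe* find_probe(int id) {
    for (const Probe** p = __start___probes; p != __stop___probes; ++p) {
        if ((*p)->id == id) return *p;
    }
    return nullptr;
}
```

The descriptors exist even if process_event is never called, because they are static data rather than the result of running code.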
There's a pretty simple way to have code run on module load time: Use the constructor of a global variable:
struct RunMeSomeCode
{
    RunMeSomeCode()
    {
        // your code goes here
    }
} do_it;
The .init/.fini sections basically exist to implement global constructors/destructors as part of the ABI on some platforms. Other platforms may use different mechanisms such as _start and _init functions or .init_array/.fini_array and .preinit_array. There are lots of subtle differences between all these methods, and which one to use for what is a question that can really only be answered by the documentation of your target platform. Not all platforms use ELF to begin with…
The main point to understand is that things like the .init/.fini sections in an ELF binary happen way below the level of C++ as a language. A C++ compiler may use these things to implement certain behavior on a certain target platform. On a different platform, a C++ compiler will probably have to use different mechanisms to implement that same behavior. Many compilers will give you tools in the form of language extensions like __attribute__ or #pragmas to control such platform-specific details. But those generally only make sense and will only work with that particular compiler on that particular platform.
You don't need a member function (which gets a this pointer passed as an arg); instead you can simply create constructor-like functions that reference a global array, like
#define PROBE(id, stuff, more_stuff) \
__attribute__((constructor)) void \
probeinit##id(){ probes[id] = {id, stuff, 0/*to be written later*/, more_stuff}; }
The trick is having this macro work in the middle of another function. GNU C / C++ allows nested functions, but IDK if you can make them constructors.
You don't want to declare a static int dummy##id = something because then you're adding overhead to the function you profile. (gcc has to emit a thread-safe run-once locking mechanism.)
Really what you'd like is some kind of separate pass over the source that identifies all the PROBE macros and collects up their args to declare
struct probe global_probes[] = {
    {0, "EventName", 0 /*placeholder*/, filename, linenum},
    {1, "EventName", 0 /*placeholder*/, filename, linenum},
    ...
};
I'm not confident you can make that happen with CPP macros; I don't think it's possible to #define PROBE such that every time it expands, it redefines another macro to tack on more stuff.
But you could easily do that with an awk/perl/python / your fave scripting language program that scans your program and constructs a .c that declares an array with static storage.
Or better (for a single-threaded program): keep the runtime timestamps in one array, and the names and stuff in a separate array, so the cache footprint of the probes is smaller. For a multi-threaded program, stores to the same cache line from different threads are called false sharing and create cache-line ping-pong.
So you'd have #define PROBE(id, evname, blah blah) do { probe_times[id] = now(); }while(0)
and leave the handling of the later args to your separate preprocessing.
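A minimal sketch of that split layout is below. The probe_info table stands in for what a separate script would generate, so its file name and line numbers are made up. now_ns is a hypothetical stand-in for now(), built on std::chrono::steady_clock, and the PROBE macro here takes only the id.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Cold data: descriptions that never change at run time.
struct ProbeInfo {
    int id;
    const char* event;
    const char* file;
    int line;
};

// Stand-in for the generated table; the file name and line numbers are
// invented for illustration.
static const ProbeInfo probe_info[] = {
    {0, "EventProcessing", "events.cpp", 10},
    {1, "EventProcessing", "events.cpp", 14},
};

// Hot data: only the timestamps are touched on the measured path.
static std::int64_t probe_times[2] = {0, 0};

// Hypothetical clock helper standing in for now().
static inline std::int64_t now_ns() {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               std::chrono::steady_clock::now().time_since_epoch())
        .count();
}

// Records only the time; everything else lives in probe_info.
#define PROBE(id) do { probe_times[id] = now_ns(); } while (0)

void process_event() {
    PROBE(0);
    // ... process event ...
    PROBE(1);
}

// Looks up the static description that belongs to a timestamp slot.
const ProbeInfo& info_for(int id) { return probe_info[id]; }
```

Reporting code can join the two arrays by id after the measured run, so the hot path only ever writes one 8-byte slot per probe.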

Changing what a function points to

I have been playing around with pointers and function pointers in C/C++. Since you can get the address of a function, can you change where a function call actually ends up?
I tried getting the memory address of a function and then writing a second function's address to that location, but it gave me an access violation error.
Regards,
Function pointers are variables, just like ints and doubles. The address of a function is something different. It is the location of the beginning of the function in the .text section of the binary. You can assign the address of a function to a function pointer of the same type; however, the .text section is read-only and therefore you can't modify it. Writing to the address of a function would attempt to overwrite the code at the beginning of the function and is therefore not allowed.
Note:
If you want to change, at runtime, where function calls end up, you can create something called a virtual dispatch table, or vtable. This is a structure containing function pointers and is used in languages such as C++ for polymorphism.
e.g.:
struct VTable {
    int (*foo)(void);
    int (*bar)(int);
} vTbl;
At runtime you can change the values of vTbl.foo and vTbl.bar to point to different functions and any calls made to vTbl.foo() or .bar will be directed to the new functions.
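A minimal sketch of that retargeting is shown here. foo_v1, foo_v2, bar_double, call_foo, and retarget_foo are hypothetical names added for the example; only the VTable struct comes from the answer above.

```cpp
#include <cassert>

// Table of function pointers, as in the answer above.
struct VTable {
    int (*foo)(void);
    int (*bar)(int);
};

// Two interchangeable implementations of foo, plus one for bar.
static int foo_v1() { return 1; }
static int foo_v2() { return 2; }
static int bar_double(int x) { return 2 * x; }

// The table starts out pointing at the first implementation.
static VTable vTbl = {foo_v1, bar_double};

// Callers always go through the table, never to a function directly.
int call_foo() { return vTbl.foo(); }

// Swapping the pointer redirects every later call made through the table.
void retarget_foo() { vTbl.foo = foo_v2; }
```

Only calls made through vTbl are redirected; code that calls foo_v1 by name is unaffected, which is the distinction the answer draws between a function pointer and a function.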
If the function you're trying to call is inlined, then you're pretty much out of luck. However, if it's not inlined, then there may be a way:
On Unix systems there's a common feature of the dynamic linker called LD_PRELOAD which allows you to override functions in shared libraries with your own versions. See the question What is the LD_PRELOAD trick? for some discussion of this. If the function you're trying to hijack is not loaded from a shared library (i.e. if it's part of the executable or if it's coming from a statically linked library), you're probably out of luck.
On Windows, there are other attack vectors. If the function to be hooked is exported by some DLL, you could use Import Address Table Patching to hijack it without tinkering with the code of the function. If it's not exported by the DLL but you can get the address of it (i.e. by taking the address of a function) you could use something like the free (and highly recommended) N-CodeHook project.
In some environments, it is possible to "patch" the beginning instructions of a function to make the call go somewhere else. This is an unusual technique and is not used for normal programming. It is sometimes used if you have an existing compiled program and need to change how it interacts with the operating system.
Microsoft Detours is an example of a library that has the ability to do this.
You can change what a function pointer points to, but you can't change a normal function nor can you change what the function contains.
You generally can't find where a function ends. There's no such standard functionality in the language, and the compiler can optimize code in such ways that the function's code isn't contiguous and has no single end point. To find where the code ends, one would need to either use some non-standard tools or disassemble the code and make sense of it, which isn't something you can easily write a program to do automatically.

Determine size of functions/stub/namespace in memory

I have a couple of functions in a namespace called stub.
I have to determine the exact start address of the namespace and the end address, or at least the size of the namespace in memory (to copy these functions into another process).
This worked perfectly in Visual C++ 2008 by adding a
void stub_end() { }
at the end of the namespace and using
size_t size = reinterpret_cast<ULONG_PTR>(stub_end) - reinterpret_cast<ULONG_PTR>(stub_start);
to determine the size of the stub.
This worked because Visual C++ preserved the function order as it is in the .cpp file, however that does not seem to be the case in Visual C++ 2010 anymore.
How can I find out the size of the functions or the whole namespace/stub by using pragma directives, compiler/linker facilities or similar?
With the new push in security these days (heap randomization, layout randomization, etc..) I think this is going to be much more difficult. You may end up having to just copy each function individually.
You can try to place each function in a different section, using the VC++ equivalent of GCC's __attribute__((section("name"))) http://www.delorie.com/gnu/docs/gcc/gcc_62.html and then use your technique, or you could place each function in a different source file.
The C++ language provides no guarantees for finding addresses or sizes of namespaces. That said, venture into assembly language and linker instructions.
Many assembly languages have an opcode or mnemonic for placing code at specific addresses. This allows a label to be set up to indicate the start of a memory area. Some linkers have variables for obtaining segment starting addresses and sizes. These would be user defined logical addresses.
In summary, use your assembly and linker tools to define public symbols for the namespace start and length or optionally the end of the segment. In your C++ program, access these labels as extern.
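For comparison, here is a minimal sketch of the "public symbols for start and end" idea using GCC or Clang with a GNU-style linker on ELF, not Visual C++. The section name stub_code, the two functions, and stub_size are hypothetical. The linker generates __start_stub_code and __stop_stub_code on its own because the section name is a valid C identifier.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: GCC/Clang on ELF. All stub functions go into one dedicated
// section so their combined extent can be measured.
#define IN_STUB __attribute__((section("stub_code"), used, noinline))

// Start and end symbols provided by the linker for the "stub_code" section.
extern "C" const char __start_stub_code[];
extern "C" const char __stop_stub_code[];

namespace stub {
IN_STUB int first(int x) { return x + 1; }
IN_STUB int second(int x) { return x * 2; }
}  // namespace stub

// Size in bytes of everything the linker placed in the section,
// independent of the order in which the functions were emitted.
std::size_t stub_size() {
    return static_cast<std::size_t>(__stop_stub_code - __start_stub_code);
}
```

Because the bounds come from the section rather than from marker functions, the result does not rely on the compiler preserving source order. Whether the copied bytes still work in another process is a separate problem (relocations, calls out of the section) that this sketch does not address.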

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = &Foo;
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming functions are contiguous in memory and that one address will map to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA APIs to figure this out. For instance, SymFromAddr. There may be some ambiguity here, as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support --- ChrisW is spot on with map files, but also see if your C compiler's standard library provides anything for this low-level stuff -- most compilers provide tools for analysing the stack etc., for instance, even though it's not standard. Otherwise, try just using inline assembly, or wrapping the C function with an assembly file and an empty wrapper function with those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
    FOO_is_running = 1;
    /* do the work */
    FOO_is_running = 0;
    return 0;
}
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere
Edit: Formatting is giving me fits. Time for another beer.