How to jump the program execution to a specific address in C? - c++

I want the program to jump to a specific address in memory and continue execution from that address. I thought about using goto but I don't have a label rather just an address in memory.
There is no need to worry about returning from the jump address.
edit: using GCC compiler

Inline assembly might be the easiest and most "elegant" solution, although doing this is highly unusual, unless you are writing a debugger or some specialized introspective system.
Another option might be to declare a pointer to a void function (void (*foo)(void)), then set the pointer to contain your address, and then invoke it:
void (*foo)(void) = (void (*)())0x12345678;
foo();
There will be things pushed on the stack since the compiler thinks you are doing a subroutine call, but since you don't care about returning, this might work.

gcc has an extension that allows jumping to an arbitrary address:
void *ptr = (void *)0x1234567; // a random memory address
goto *ptr; // jump there -- probably crash
This is pretty much the same as using a function pointer that you set to a fixed value, but it will actually use a jump instruction rather than a call instruction (so the stack won't be modified)
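For a variant of the same extension that can actually be run safely, here is a minimal sketch that jumps through addresses of labels taken with GCC's `&&label` syntax instead of a made-up numeric address. The function name `sum_to` is purely illustrative, and the code assumes GCC or a compatible compiler such as Clang; label addresses are only meaningful inside the function that contains them.

```cpp
// Sums n + (n-1) + ... + 1 using GCC's "labels as values" extension.
// Not standard C++: assumes GCC or a compatible compiler such as Clang.
int sum_to(int n) {
    int total = 0;
    void *target = &&loop;   // address of a label inside this function
loop:
    if (n <= 0) {
        target = &&done;     // select the exit label
    } else {
        total += n;
        --n;
    }
    goto *target;            // indirect jump to whichever label was selected
done:
    return total;
}
```

Calling `sum_to(4)` walks the loop label four times and then jumps to `done`.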

#include <stdio.h>
#include <stdlib.h>
void go(unsigned int addr) {
(&addr)[-1] = addr;
}
int sub() {
static int i;
if(i++ < 10) printf("Hello %d\n", i);
else exit(0);
go((unsigned int)sub);
}
int main() {
sub();
}
Of course, this invokes undefined behavior, is platform-dependent, assumes that code addresses are the same size as int, etc, etc.

It should look something like this:
unsigned long address=0x80;
void (*func_ptr)(void) = (void (*)(void))address;
func_ptr();
However, this is not a very safe operation: jumping to some unknown address will probably result in a crash!
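To try the cast without crashing, one option is to feed it an address that is known to hold code. The sketch below does that by taking the address of a real function first; `answer`, `address_of_answer` and `call_at` are hypothetical names, and it assumes a platform where a function address fits in `std::uintptr_t` (true on mainstream desktop and embedded targets, but not promised by the standard).

```cpp
#include <cstdint>

// Stand-in for the code that lives at the target address (hypothetical).
static int answer() { return 42; }

// Produces a numeric address, as a map file or datasheet would supply it.
std::uintptr_t address_of_answer() {
    return reinterpret_cast<std::uintptr_t>(&answer);
}

// Converts a raw numeric address into a function pointer and calls through it.
int call_at(std::uintptr_t address) {
    auto fn = reinterpret_cast<int (*)()>(address);
    return fn();
}
```

`call_at(address_of_answer())` then behaves like a direct call to `answer()`; passing any address that does not hold a function with this signature is undefined behavior.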

Since the question has a C++ tag, here's an example of a C++ call to a function with a signature like main()--int main(int argc, char* argv[]):
int main(int argc, char* argv[])
{
auto funcAddr = 0x12345678; //or use &main...
auto result = reinterpret_cast<int (*)(int, char**)>(funcAddr)(argc, argv);
}

Do you have control of the code at the address that you intend to jump to? Is this C or C++?
I hesitantly suggest setjmp() / longjmp() if you're using C and can run setjmp() where you need to jump back to. That being said, you've got to be VERY careful with these.
As for C++, see the following discussion about longjmp() shortcutting exception handling and destructors. This would make me even more hesitant to suggest its use in C++.
C++: Safe to use longjmp and setjmp?
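For reference, a minimal sketch of the setjmp()/longjmp() pattern mentioned above. The names `recovery_point`, `deep_work` and `run_with_recovery` are made up for illustration, and the code deliberately keeps objects with non-trivial destructors out of the frames that get skipped, since those are the cases that cause trouble in C++.

```cpp
#include <csetjmp>

static std::jmp_buf recovery_point;  // saved execution context

// Recurses a few frames deep, then jumps straight back to the setjmp site.
static void deep_work(int depth) {
    if (depth == 0) {
        std::longjmp(recovery_point, 7);  // never returns
    }
    deep_work(depth - 1);
}

// Returns 0 if deep_work returns normally, 7 if it bailed out via longjmp.
int run_with_recovery() {
    switch (setjmp(recovery_point)) {
    case 0:   // direct return from setjmp: start the work
        deep_work(3);
        return 0;
    case 7:   // arrived here through longjmp(recovery_point, 7)
        return 7;
    default:  // any other longjmp value
        return -1;
    }
}
```

Note that this only lets you jump back to a point that has already executed setjmp(); it does not jump to an arbitrary address.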

This is what I am using for my bootstrap loader (MSP430AFE253, compiler = gcc, Code Composer Studio):
#define API_RESET_VECT 0xFBFE
#define JUMP_TO_APP() {((void (*)()) (*(uint16_t*)API_RESET_VECT)) ();}

I propose this code:
asm(
"LDR R0,=0x0a0000\n\t" /* Or 0x0a0000 for the base Addr. */
"LDR R0, [R0, #4]\n\t" /* Vector+4 for PC */
"BX R0"
);

Related

Can I be sure that the binary code of the functions will be copied sequentially?

Sorry if this question already exists; I expect this approach is used somewhere, but I just don't know what it is called. My goal is to execute a sequence of functions from memory, and for this I compute the size of the code from the addresses of the first and last functions.
This is my first try:
source.cpp
void func1(int var1, int var2)
{
func2();
func3();
//etc.
}
void func2(...){...}
void func3(...){...}
void funcn(){return 123;}//last func as border, I do not use it
//////////////////////////////////////////////////
main.cpp
#include"source.cpp"
long long size= (long long)funcn-(long long)func1;// i got size of binary code of this funcs;
// and then i can memcpy it to file or smth else and execute by adress of first
At first this worked correctly, but after I updated my functions it crashed: the size had become negative.
Then I tried to pin the functions in memory more firmly:
source.cpp
extern void(*pfunc1)(int, int);
extern void(*pfuncn)();
void(*pfunc1)(int , int) = &func1;
void(*pfuncn)() = &funcn;
static void __declspec(noinline) func1(int var1, int var2)
{
//the same impl
}
static void __declspec(noinline) func2(...){...}
static void __declspec(noinline) func3(...){...}
static void __declspec(noinline) funcn(...){return 123;}
//////////////////////////////////
main.cpp
#include"source.cpp"
long long size= (long long) pfuncn - (long long) pfunc1;
//same impl
This worked after my first update, but then I had to update it again, and now it gives me the wrong size. The size was around 900+ bytes; I changed some functions and the size became 350+ bytes, even though I hadn't changed that much.
I disabled optimizations and inlining.
So my question is: how can I be sure that func1 will be at a lower address than the last function, funcn, and what could change their locations in memory? Thank you for your attention.
// and then i can memcpy it to file or smth else and execute by adress of first
That is, copy it into allocated memory and then call it at the address of the allocation.
This needs to be stated:
You cannot copy code from one location to another and hope for it to work.
There's no guarantee that all the code required to call a function is located in a contiguous block.
There's no guarantee that the function pointer actually points to the beginning of the needed code.
There's no guarantee that you can effectively write to executable memory. To the OS, you'd look a lot like a virus.
There's no guarantee that the code is relocatable (able to work after being moved to a different location); for that, it must use only relative addresses.
In short: unless you have supporting tools that go beyond the scope of standard C++, don't even think about it.
GCC family only!
You can force the compiler to put the whole function in a separate section. Then you know the memory area where the function resides.
int __attribute__((section(".foosection"))) foo()
{
/* some code here */
}
In the linker script, inside the .text output section, you need to add:
.text :
{
/* ... */
__foosection_start = .;
*(*foosection)
*(.foosection*)
__foosection_end = .;
/* .... */
and in the place where you want to know or use it
extern unsigned char __foosection_start[];
extern unsigned char __foosection_end[];
void printfoo()
{
printf("foosection start: %p, foosection end: %p\n ", (void *)__foosection_start, (void *)__foosection_end);
}
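If editing the linker script is not an option, a different route to the same information is worth knowing: when the section name is a valid C identifier, GNU ld defines `__start_<name>` and `__stop_<name>` symbols automatically. The sketch below relies on that behavior and is therefore an assumption about the toolchain (GCC or Clang targeting ELF with GNU ld or a compatible linker); the section name `foosection` without the leading dot and the helper names are chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// The section name is a valid C identifier (no leading dot), so GNU ld
// defines __start_foosection / __stop_foosection on its own.
// "used" and "noinline" keep the function from being dropped or inlined away.
extern "C" __attribute__((section("foosection"), used, noinline))
int foo() {
    return 7;
}

extern "C" {
extern const unsigned char __start_foosection[];  // provided by the linker
extern const unsigned char __stop_foosection[];   // provided by the linker
}

// Size in bytes of everything placed in foosection.
std::size_t foosection_size() {
    auto start = reinterpret_cast<std::uintptr_t>(__start_foosection);
    auto stop = reinterpret_cast<std::uintptr_t>(__stop_foosection);
    return static_cast<std::size_t>(stop - start);
}

// True if foo's entry point lies inside the section bounds.
bool foo_is_inside_foosection() {
    auto addr = reinterpret_cast<std::uintptr_t>(&foo);
    auto start = reinterpret_cast<std::uintptr_t>(__start_foosection);
    auto stop = reinterpret_cast<std::uintptr_t>(__stop_foosection);
    return addr >= start && addr < stop;
}
```

This only tells you where the bytes are. It does not make the code relocatable, so the warnings above about copying code still apply.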
This is probably not possible because of a requirement you did not mention, but why not use an array of function pointers?
std::function<void()> funcs[] = {
func2,
func3,
[](){ /* and an inline lambda, because why not */ },
};
// Call them in sequence like so:
for (auto& func: funcs) {
func();
}

I can call a function imported with dlsym() with a wrong signature, why?

host.cpp has:
int main (void)
{
void * th = dlopen("./p1.so", RTLD_LAZY);
void * fu = dlsym(th, "fu");
((void(*)(int, const char*)) fu)(2, "rofl");
return 0;
}
And p1.cpp has:
#include <iostream>
extern "C" bool fu (float * lol)
{
std::cout << "fuuuuuuuu!!!\n";
return true;
}
(I intentionally left error checks out)
When executing host, “fuuuuuuuu!!!” is printed correctly, even though I typecasted the void pointer to the symbol with a completely different function signature.
Why did this happen and is this behavior consistent between different compilers?
This happened because UB, and this behaviour isn't consistent with anything, at all, ever, for any reason.
Because there's no information about the function signature in a void pointer, or any information besides the address. You might get in trouble if you started to use the parameters, though.
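Since the void pointer carries nothing but the address, any signature check has to be added by hand. One possible way to do that, sketched here under the assumption that both the plugin and the host can share a small header, is to ship a type tag next to the address. `TypedSymbol`, `export_symbol` and `import_symbol` are hypothetical helpers, not part of dlfcn or any other library, and the function-pointer to `void *` cast is only conditionally supported by the standard (POSIX platforms support it).

```cpp
#include <typeindex>
#include <typeinfo>

// An address plus a runtime tag describing the function pointer type.
struct TypedSymbol {
    void *address;
    std::type_index signature;
};

// Wraps a function pointer together with its exact type.
template <typename Fn>
TypedSymbol export_symbol(Fn *fn) {
    return {reinterpret_cast<void *>(fn), std::type_index(typeid(Fn *))};
}

// Returns the typed pointer, or nullptr if the requested signature differs.
template <typename Fn>
Fn *import_symbol(const TypedSymbol &sym) {
    if (sym.signature != std::type_index(typeid(Fn *))) {
        return nullptr;
    }
    return reinterpret_cast<Fn *>(sym.address);
}

// Same signature as fu in the question; the body is illustrative.
static bool fu(float *value) { return value != nullptr; }
```

With this, `import_symbol<void(int, const char *)>(export_symbol(&fu))` yields a null pointer instead of a silently mismatched call, while `import_symbol<bool(float *)>` returns a usable pointer.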
This actually isn't a very good example of creating a case that will fail since:
You never use the arguments from the function fu.
Your function fu has fewer arguments (or the activation frame itself is smaller memory-footprint-wise) than the function pointer type you're casting to, so you're never going to end up with a situation where fu attempts to access memory outside the activation record set up by the caller.
In the end, what you're doing is still undefined behavior, but you don't do anything to create a violation that could cause issues, so therefore it ends up as a silent error.
is this behavior consistent between different compilers?
No. If your platform/compiler used a calling convention that required the callee to clean-up the stack, then oops, you're most likely hosed if there's a mis-match in the size of the activation record between what the callee and caller expect... upon return of the callee, the stack pointer would be moved to the wrong spot, possibly corrupting the stack, and completely messing up any stack-pointer relative addressing.
It just so happens that
C uses the cdecl calling convention (so the caller cleans up the stack)
your function does not use the given arguments
so your call seems to work correctly.
But the behavior is actually undefined. Changing the signature or using the arguments may make your program crash.
ADD:
For example, consider the stdcall calling convention, where the callee must clean up the stack. In this case, even if you declare the correct calling convention for both caller and callee, your program will still crash because the stack gets corrupted: the callee cleans it up according to its own signature, but the caller fills it according to another signature:
#include <iostream>
#include <string>
extern "C" __attribute__((stdcall)) __attribute__((noinline)) bool fu (float * lol)
{
std::cout << "fuuuuuuuu!!!\n";
return true;
}
void x()
{
(( __attribute__((stdcall)) void(*)(int, const char*)) fu)(2, "rofl");
}
int main (void)
{
void * th = reinterpret_cast<void*>(&fu);
std::string s = "hello";
x();
std::cout << s;
return 0;
}

a "general function signature" pointer that points to an arbitrary function

I'll try to explain better what I want to do.
I read a file with function signatures, and I want to create a pointer to each function.
For example, a file that looks like this:
something.dll;int f(char* x, int y, SOMESTRUCT z)
something.dll;void g(void)
something.dll;SOMESTRUCT l(longlong w)
Now, during runtime I want to be able to create pointers to these functions (by loading something.dll and using GetProcAddress to get these functions).
Now, GetProcAddress returns a FARPROC which points to an arbitrary function, but how can I use the FARPROC to call these functions during runtime?
From what I know, I need to cast the FARPROC to the correct signature, but I can't do it during runtime (or at least I don't know how).
Does anyone have any idea how to design that?
Thanks! :-)
Function types are compile-time in C++, so it won't work, unless you can define all the types you're going to use in advance.
It's a matter of pushing the arguments onto the stack (local vars live there too) and calling the function as void (__cdecl *)(void).
With some other kinds of functions (like fastcall, or thiscall) it can be more problematic.
Update: I actually made an example, and it works on codepad:
(Also works with stdcall functions, because of stack restore after aligned stack alloc)
http://codepad.org/0cf0YFRH
#include <stdio.h>
#ifdef __GNUC__
#define NOINLINE __attribute__((noinline))
#define ALIGN(n) __attribute__((aligned(n)))
#else
#define NOINLINE __declspec(noinline)
#define ALIGN(n) __declspec(align(n))
#endif
//#define __cdecl
// Have to be declared __cdecl when its available,
// because some args may be passed in registers otherwise (optimization!)
void __cdecl test( int a, void* b ) {
printf( "a=%08X b=%08X\n", a, unsigned(b) );
}
// actual pointer type to use for function calls
typedef int (__cdecl *pfunc)( void );
// wrapper type to get around codepad's "ISO C++" ideas and gcc being too smart
union funcwrap {
volatile void* y;
volatile pfunc f;
void (__cdecl *z)(int, void*);
};
// gcc optimization workaround - can't allow it to know the value at compile time
volatile void* tmp = (void*)printf("\n");
volatile funcwrap x;
int r;
// noinline function to force the compiler to allocate stuff
// on stack just before the function call
NOINLINE
void call(void) {
// force the runtime stack pointer calculation
// (compiler can't align a function stack in compile time)
// otherwise, again, it gets optimized too hard
// the number of arguments; can be probably done with alloca()
ALIGN(32) volatile int a[2];
a[0] = 1; a[1] = 2; // set the argument values
tmp = a; // tell compiler to not optimize away the array
r = x.f(); // call the function; returned value is passed in a register
// this function can't use any other local vars, because
// compiler might mess up the order
}
int main( void ) {
// again, weird stuff to confuse compiler, so that it won't discard stuff
x.z = test; tmp=x.y; x.y=tmp;
// call the function via "test" pointer
call();
// print the return value (although it didn't have one)
printf( "r=%i\n", r );
}
Once you have a FARPROC, you can cast the FARPROC into a pointer to the appropriate function type. For example, you could say
// hModule is the HMODULE returned by LoadLibrary for something.dll
int (*fPtr)(char*, int, SOMESTRUCT) = (int (*)(char*, int, SOMESTRUCT))GetProcAddress(hModule, "f");
Or, if you want to use typedefs to make this easier:
typedef int (*FType)(char *, int, SOMESTRUCT);
FType fPtr = (FType)GetProcAddress(hModule, "f");
Now that you have the function pointer stored in a function pointer of the appropriate type, you can call f by writing
fPtr("My string!", 137, someStructInstance);
Hope this helps!
The compiler needs to know the exact function signature in order to create the proper setup and teardown for the call. There's no easy way to fake it - every signature you read from the file will need a corresponding compile-time signature to match against.
You might be able to do what you want with intimate knowledge of your compiler and some assembler, but I'd recommend against it.
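To make the "corresponding compile-time signature" idea concrete, here is a small sketch of a dispatcher that matches a signature string read at runtime against a fixed set of signatures compiled in ahead of time. Everything in it is an assumption for illustration: `RawProc` stands in for FARPROC, the signature strings use an invented format, and `sample_add` / `sample_neg` replace functions that would really come from GetProcAddress.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for FARPROC: a generic function pointer type (assumption).
using RawProc = void (*)();

// Stand-ins for functions that would normally be loaded from the DLL.
static int sample_add(int a, int b) { return a + b; }
static int sample_neg(int a) { return -a; }

// Casts the raw pointer back to one of the signatures known at compile time.
// Unknown signatures are rejected instead of being called blindly.
int invoke_by_signature(const std::string &signature, RawProc proc,
                        int a, int b) {
    if (signature == "int(int,int)") {
        return reinterpret_cast<int (*)(int, int)>(proc)(a, b);
    }
    if (signature == "int(int)") {
        return reinterpret_cast<int (*)(int)>(proc)(a);
    }
    throw std::invalid_argument("no compiled-in call for: " + signature);
}
```

A call such as `invoke_by_signature("int(int,int)", reinterpret_cast<RawProc>(&sample_add), 2, 3)` goes through the matching branch. Supporting a new signature from the file means adding a branch and recompiling, which is exactly the limitation described above.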

Print address of virtual member function

I am trying to print the address of a virtual member function.
If I know which class implements the function I can write:
printf("address: %p", &A::func);
But I want to do something like this:
A *b = new B();
printf("address: %p", &b->func);
printf("address: %p", &b->A::func);
However this does not compile. Is it possible to do something like this, perhaps looking up the address in the vtable at runtime?
Currently there is no standard way of doing this in C++ although the information must be available somewhere. Otherwise, how could the program call the function? However, GCC provides an extension that allows us to retrieve the address of a virtual function:
void (A::*mfp)() = &A::func;
printf("address: %p", (void*)(b->*mfp));
...assuming the member function has the prototype void func().
This can be pretty useful when you want to cache the address of a virtual function or use it in generated code. GCC will warn you about this construct unless you specify -Wno-pmf-conversions. It's unlikely that it works with any other compiler.
Pointers to member functions are not always simple memory addresses. See the table in this article showing the sizes of member function pointers on different compilers - some go up to 20 bytes.
As the article outlines a member function pointer is actually a blob of implementation-defined data to help resolve a call through the pointer. You can store and call them OK, but if you want to print them, what do you print? Best to treat it as a sequence of bytes and get its length via sizeof.
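The "sequence of bytes" suggestion can be sketched as follows. The class `A` and the helper `pmf_to_hex` are illustrative names; the output is whatever representation the compiler uses, so it is useful for logging or comparison, not as a callable address.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

struct A {
    virtual void func() {}
    virtual ~A() = default;
};

// Dumps the raw object representation of a member function pointer as hex.
template <typename Pmf>
std::string pmf_to_hex(Pmf pmf) {
    unsigned char bytes[sizeof(Pmf)];
    std::memcpy(bytes, &pmf, sizeof(Pmf));  // copy the opaque blob
    std::string out;
    char buf[3];
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}
```

`pmf_to_hex(&A::func)` returns two hex digits per byte, so its length also shows how large the member function pointer is on the compiler in use.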
I found a way to do this using a disassembler (https://github.com/vmt/udis86). The steps are:
Get a pointer to the virtual function via normal C++ code
Disassemble the jmp instruction at that address
Parse the real address from the disassembled string
Here is how I did it:
// First get the raw pointer to the virtual function
auto myVirtualFuncPtr = &MyClass::myFunc;
void* myVirtualFuncPtrRaw = (void*&)myVirtualFuncPtr;
// Resolve the real function!
void* myFuncPtr = resolveVirtualFunctionAddress(myVirtualFuncPtrRaw);
...
static void* resolveVirtualFunctionAddress(void* address)
{
const int jumpInstructionSize = 5;
static ud_t ud_obj;
ud_init(&ud_obj);
ud_set_mode(&ud_obj, sizeof(void*) * 8);
ud_set_syntax(&ud_obj, UD_SYN_INTEL);
ud_set_pc(&ud_obj, (uint64_t)address);
ud_set_input_buffer(&ud_obj, (const uint8_t*)address, jumpInstructionSize);
std::string jmpInstruction = "";
if (ud_disassemble(&ud_obj))
{
jmpInstruction += ud_insn_asm(&ud_obj);
}
// TODO: Implement startsWith and leftTrim yourself
if (startsWith(jmpInstruction, "jmp "))
{
std::string jumpAddressStr = leftTrim(jmpInstruction, "jmp ");
return hexToPointer(jumpAddressStr);
}
// If the jmp instruction was not found, then we just return the original address
return address;
}
static void* hexToPointer(std::string hexString)
{
void* address;
std::stringstream ss;
ss << std::hex << hexString;
ss >> address;
return address;
}
From what I can tell in the standard, the only time you get dynamic binding is during a virtual function call. And once you've called a function, you're executing the statements within the function (i.e., you can't "stop halfway" into the call and get the address.)
I think it's impossible.
Doesn't make a lot of sense to me. If you have a normal function:
void f( int n ) {
}
then you can take its address:
f
but you cannot take the address of a function call, which is what you seem to want to do.

Creating function pointers to functions created at runtime

I would like to do something like:
for(int i=0;i<10;i++)
addresses[i] = & function(){ callSomeFunction(i) };
Basically, having an array of addresses of functions with behaviours related to a list of numbers.
If it's possible with external classes like Boost.Lambda is ok.
Edit: after some discussion I've come to conclusion that I wasn't explicit enough. Please read Creating function pointers to functions created at runtime
What I really really want to do in the end is:
class X
{
void action();
}
X* objects;
for(int i=0;i<0xFFFF;i++)
addresses[i] = & function(){ objects[i]->action() };
void someFunctionUnknownAtCompileTime()
{
}
void anotherFunctionUnknownAtCompileTime()
{
}
patch someFunctionUnknownAtCompileTime() with assembly to jump to function at addresses[0]
patch anotherFunctionUnknownAtCompileTime() with assembly to jump to function at addresses[1]
sth, I don't think your method will work because of them not being real functions but my bad in not explaining exactly what I want to do.
If I understand you correctly, you're trying to fill a buffer with machine code generated at runtime and get a function pointer to that code so that you can call it.
It is possible, but challenging. You can use reinterpret_cast<> to turn a data pointer into a function pointer, but you'll need to make sure that the memory you allocated for your buffer is flagged as executable by the operating system. That will involve a system call (VirtualAlloc() with an executable page protection such as PAGE_EXECUTE_READWRITE on Windows, mmap() with PROT_EXEC on Unix) rather than a "plain vanilla" malloc/new call.
Assuming you've got an executable block of memory, you'll have to make sure that your machine code respects the calling convention indicated by the function pointer you create. That means pushing/popping the appropriate registers at the beginning of the function, etc.
But, once you've done that, you should be able to use your function pointer just like any other function.
It might be worth looking at an open source JVM (or Mono) to see how they do it. This is the essence of JIT compilation.
Here is an example I just hacked together:
int func1( int op )
{
printf( "func1 %d\n", op );
return 0;
}
int func2( int op )
{
printf( "func2 %d\n", op );
return 0;
}
typedef int (*fp)(int);
int main( int argc, char* argv[] )
{
fp funcs[2] = { func1, func2 };
int i;
for ( i = 0; i < 2; i++ )
{
(*funcs[i])(i);
}
}
The easiest way should be to create a bunch of boost::function objects:
#include <boost/bind.hpp>
#include <boost/function.hpp>
// ...
std::vector< boost::function<void ()> > functors;
for (int i=0; i<10; i++)
functors.push_back(boost::bind(callSomeFunction, i));
// call one of them:
functors[3]();
Note that the elements of the vector are not "real functions" but objects with an overloaded operator(). Usually this shouldn't be a disadvantage and actually be easier to handle than real function pointers.
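With C++11 or later the same idea works without Boost, using std::function and a lambda that captures the loop index by value. This is a sketch rather than a drop-in replacement: `make_functors` is a made-up helper, and `callSomeFunction` is given an arbitrary body and return value here so the result can be observed.

```cpp
#include <functional>
#include <vector>

// Placeholder body; the real callSomeFunction comes from the question.
static int callSomeFunction(int i) { return i * 10; }

// Builds one callable per index, each remembering its own value of i.
std::vector<std::function<int()>> make_functors(int count) {
    std::vector<std::function<int()>> functors;
    for (int i = 0; i < count; ++i) {
        functors.push_back([i] { return callSomeFunction(i); });
    }
    return functors;
}
```

`make_functors(10)[3]()` calls `callSomeFunction(3)`. As with boost::function, these are objects and not plain function addresses, so they cannot be used as a raw jump target.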
You can do that simply by defining those functions by some arbitrary names in the global scope beforehand.
This is basically what is said above but modifying your code would look something like this:
std::vector<int (*) (int)> addresses(10); // sized up front so addresses[i] is valid
for(int i=0;i<10;i++) {
addresses[i] = &myFunction;
}
I'm not horribly clear by what you mean when you say functions created at run time... I don't think you can create a function at run time, but you can assign what function pointers are put into your array/vector at run time. Keep in mind using this method all of your functions need to have the same signature (same return type and parameters).
You can't invoke a member function by itself without the this pointer. All instances of a class have the function stored in one location in memory. When you call p->Function() the value of p is stored somewhere (can't remember if it's a register or the stack) and that value is used as the base offset to calculate the locations of the member variables.
So this means you have to store the function pointer and the pointer to the object if you want to invoke a function on it. The general form for this would be something like this:
class MyClass {
public:
void DoStuff();
};
//On the left-hand side is a declaration of a pointer to a member function of MyClass taking no parameters and returning void.
//On the right-hand side we initialize the function pointer to DoStuff.
void (MyClass::*pVoid)() = &MyClass::DoStuff;
MyClass* pMyClass = new MyClass();
//Here we have a pointer to MyClass and we call the function pointed to by pVoid.
(pMyClass->*pVoid)();
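For the objects[i]->action() case from the question, storing the object pointer together with the member function pointer could look like the sketch below. The class `X` is given an `id_` member and a return value purely so the effect is visible, and `bind_actions` is a hypothetical helper name.

```cpp
#include <functional>
#include <vector>

class X {
public:
    explicit X(int id) : id_(id) {}
    int action() const { return id_; }  // illustrative body

private:
    int id_;
};

// Pairs every object with the member function to call on it.
// The returned callables hold raw pointers, so `objects` must outlive them.
std::vector<std::function<int()>> bind_actions(std::vector<X> &objects) {
    std::vector<std::function<int()>> calls;
    int (X::*method)() const = &X::action;
    for (X &object : objects) {
        X *target = &object;
        calls.push_back([target, method] { return (target->*method)(); });
    }
    return calls;
}
```

Each entry can then be invoked with no arguments, for example `calls[1]()`, which runs `action()` on the second object.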
As I understand the question, you are trying to create functions at runtime (just as we can do in Ruby). If that is the intention, I'm afraid that it is not possible in compiled languages like C++.
Note: If my understanding of question is not correct, please do not downvote :)