What are Call Instructions - c++

I'm currently reading "Programming: Principles and Practice using C++", and the author mentioned that writing the definition of a member function within the class definition can make a function inline. I wasn't entirely sure what that meant, so I looked on https://www.geeksforgeeks.org/inline-functions-cpp/ for a more concrete understanding. I can't seem to understand what "instruction" means in the context of this sentence:
When the program executes the function call instruction the CPU stores the memory address of the instruction following the function call.
I googled, and it looks like call instructions just pass control to another part of the program or to another application. If that's what they mean, shouldn't they say that "the CPU stores the memory address of the call instruction of the function call"?
This question may sound weird or nit-picky, but I am new to CS and really want to get a solid understanding of CS.

They do in fact mean that the memory address of the instruction following the function call is stored. This is because of the way instructions work at the machine code level. After the function call is completed the program needs a way to get back to where it was. It does this through a jump instruction to the stored memory address which causes execution to jump to that instruction. If the memory address pointed to the function call, it would loop forever.

First of all, the page you link to is talking about the behaviour on a particular class of systems. It isn't describing Standard C++.
The page is talking about what happens in assembly language that could be generated by a C++ compiler. By "function call instruction" it means the assembly language (or machine code) instruction which performs the function call. In x86 syntax that instruction is call. Example.
You could find out more information about this by searching "x86 call instruction" or similar terms.
The address being stored, which is usually called the return address, is the address of the next instruction after the call. When execution of the function reaches the ret assembly instruction, execution jumps to the return address.

No, what it is saying (and this is particular to x86 chips, others may do it differently) is that the CPU stores the address of the instruction following the call (on top of the stack) and then jumps to the address that is the operand of the call instruction. When the called function executes a 'ret' instruction that stored address is read and execution jumps to that point.

When an x86 CALL instruction is executed, the contents of the program counter, i.e. the address of the instruction following CALL, are pushed onto the stack and program control is transferred to the subroutine.
(x86's program-counter register (IP / EIP / RIP) isn't normally directly accessible, but it's defined as pointing to the next instruction while the current one is executing.)
On completing the execution of the subroutine, the RET instruction is executed, which loads the stack contents, i.e. the address of the instruction following the CALL instruction, back into the program counter.
Thus execution is resumed in the caller at the instruction following the call.

Related

Base of Global Call Stack in C/C++

I have read that each function invocation leads to a stack frame being pushed onto the global call stack, and that once the function call is completed the frame is popped off and control passes to the address we get from the popped-off stack frame. If a called function calls yet another function, it will push another return address onto the top of the same call stack, and so on, with the information stacking up and unstacking as the program dictates.
I was wondering what's at the base of global call stack in a C or C++ program?
I did some searching on the internet, but none of the sources explicitly mention it. Is the call stack empty when our program starts, so that call stack usage only begins once a function is called? Or does the address that main() has to return to get implicitly pushed as the base of our call stack, as a stack frame of its own? I expect main() would also have a stack frame in our call stack, since we always return something at the end of main() and there needs to be some address to return to. Or is this dependent on the compiler/OS and different for each implementation?
It would be helpful if someone has some informative links about this or could provide details on the process that goes into it.
main() is invoked by the libc code that handles setting up the environment for the executable etc. So by the time main() is called, the stack already has at least one frame created by the caller.
I'm not sure if there is a universal answer, as stack is something that may be implemented differently per architecture. For example a stack may grow up (i.e. stack position pointer value increases when pushing onto the stack) or grow downwards.
Exiting main() is usually done by calling an operating system function to indicate that the program wishes to terminate (with the specified return code), so I don't expect a return address for main() to be present on the stack, but this may differ per operating system and even per compiler.
I'm not sure why you need to know this, as this is typically something you leave up to the system.
First of all, there is no such thing as a "global call stack". Each thread has a stack, and the stack for the main thread often looks quite different from the stack of any thread spawned later on. And mostly, each of these "stacks" is just an arbitrary memory segment currently declared to be used as such, sub-allocated from any arbitrary suitable memory pool.
And due to compiler optimizations, many function calls will not even end up on the stack, usually. Meaning there isn't necessarily a distinguishable stack frame. You are only guaranteed that you can reference variables you put on the stack, but not that the compiler must preserve anything you didn't explicitly reference.
There is not even a guarantee that the memory layout of your call stack is organized in distinguishable frames. Return addresses are never guaranteed to be part of the stack frame; that just happens to be an implementation detail on architectures where data and code addresses can co-exist in one address space. (There are architectures that require return addresses to be stored in a different address space from the data used in the call stack.)
That aside, yes, there is code which is executed outside of the main() function. Specifically initializers for global static variables, code to set up the runtime environment (env, call parameters, stdin/stdout) etc.
E.g. when having linked to libc, there is __libc_start_main which will call your main function after initialization is done. And clean up when your main function returns.
__libc_start_main is about the point where the "stack" starts being used, as far as you can see from within the program. That's not actually true though: some loader code has already been executed in kernel space to reserve memory for your process to operate in initially (including memory for the future stack), initialize registers and memory to well-defined values, etc.
Right before actually "starting" your process, after dropping out of kernel mode, arbitrary pointers to a future stack, and the first instruction of your program, are loaded into the corresponding processor registers. Effectively, that's where __libc_start_main (or any other initialization function, depending on your runtime) starts running, and the stack visible to you starts building up.
Getting back into the kernel usually involves an interrupt now, which doesn't follow the stack either, but may just directly access processor registers to simply swap the contents of the corresponding processor registers. (E.g. if you call a function from the kernel, the memory required by the call stack inside the function call is not allocated from your stack, but from one you don't even have access to.)
Either way, everything that happens before main() is called, and whenever you enter a syscall, is implementation dependent, and you are not guaranteed any specific observable behavior. And messing around with processor registers, and thereby altering the program flow, is also far outside defined behavior as far as a pure C / C++ runtime is concerned.
On every system I have seen, a stack is already set up when main() is called. It has to be, or just declaring a variable inside main would fail. A stack is set up once a thread or process is created, so any thread of execution has a stack. Further, in every assembly language I know, a register or fixed memory location is used to hold the current value of the stack pointer, so the concept of a stack always exists (the stack pointer might be bad, but stack operations always exist since they are built into every mainstream assembly language).

Intel Pin: how to obtain return address of system call

In Intel Pin you can get the return address of a routine call using IARG_RETURN_IP as one of the arguments of RTN_InsertCall.
I wanted to do the same with a system call, instrumented using PIN_AddSyscallEntryFunction and PIN_AddSyscallExitFunction.
So at first I thought about getting the value of the instruction pointer after the call using
ADDRINT returnIp = PIN_GetContextReg(ctx, REG_INST_PTR);
in the function passed as argument to PIN_AddSyscallExitFunction.
However, I noticed that if I get the value of REG_INST_PTR in the same way, but this time before the system call is executed, I always get the same two values for the instruction pointer.
For example, I would always get 2003266482 before and 2003266484 after.
So I was wondering why this is the case and whether I am doing something wrong.
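This has to do with the way system calls are executed in libc: there is a single assembly stub that actually does what needs to be done to pass control to the kernel and back, and all system calls go through it. The instruction pointer you read at syscall entry and exit therefore always points into that same stub, whichever system call is being made.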

Pop{pc} in assembly

This may be a stupid question, but in my assembly code, during debugging, I have
pop{r2-r6,pc}
and I think it is giving me a hard fault exception. I understand what pop does, but I am unsure what the pc part means. I cannot find it explained anywhere on the internet and it is not a variable anywhere in my code.
I am using keil on an stm32 in c++
pc or r15 is the program counter, the register which gives the address that the processor fetches instructions from. Changing it to another address makes the program execution jump to that address.
In this case, the address is read off the stack to return from a function call; the return address would have been pushed onto the stack (from the link register lr or r14) at the start of the function.
If that's causing a crash, then it's probably because the address on the stack has been corrupted. Perhaps you're writing outside the bounds of a local array, or overflowing the stack with too deep a function call level.
The PC register is the program counter; it holds the address of the next instruction to be executed on an ARM architecture (STM32 uses the ARM architecture).
The default in ARM assembly is to simply overwrite the PC register when a function is to return. What you are seeing with the pop statement is just a direct way to return, see here.
The rest of your question is neatly explained in Mike's post.

How does the compiler know where control should return to after a function call?

Consider the following functions:
void func1();  // declared up front so main() can call it
void func2();
int main()
{
    // statement(s);
    func1();
    // statement(s);
}
void func1()
{
    // statement(s);
    func2();
    // statement(s);
}
void func2()
{
    // statement(s);
}
How does the compiler know where to return to after func2 has performed all its operations? I know that control transfers back to func1 (and to exactly which statement), but how does the compiler know it? What tells the compiler where to return to?
This is typically implemented using a call stack:
When control is being transferred to a function, the address to return to is pushed onto the stack.
When the function finishes, the address is popped off the stack and used to transfer control back to the caller.
The details are typically mandated by the hardware architecture for which the code is being compiled.
Actually, the compiler doesn't run the code; the machine does. When the machine calls a function, it stores on the stack the address of the instruction that follows the call, so that when the function returns it can pop that address back into the Instruction Pointer (IP) and resume from there.
I've simplified things a bit for the sake of explanation.
When a function is called, the correct return address in the calling function is placed somewhere reserved for precisely that purpose: usually the stack, though the standard does not mandate that.
It is the compiler's duty to ensure that its calling conventions are such that unless something goes wrong (for example, a stack overflow), then the called function knows how to return to the calling function.
The runtime makes use of something called a 'call stack', which basically holds the address of the next statement to execute after the function being called returns. So when a function call is made, and before control jumps to the new instruction address, the address of the next instruction in the calling function is pushed onto the stack. This process is repeated for every subsequent call to any function. Now why a stack? Because execution has to get back to the point where it left off, which is basically 'last in, first out' behavior, and a stack is the data structure that does that. You can actually look at this call stack when you are debugging a program in Visual Studio: there's a separate window called 'Call Stack' which shows the entries placed in the call stack.

What happens in assembly language when you call a method/function?

If I have a program in C++/C that (language doesn't matter much, just needed to illustrate a concept):
#include <cstdio>  // declares printf
void foo() {
    printf("in foo");
}
int main() {
    foo();
    return 0;
}
What happens in the assembly? I'm not actually looking for assembly code as I haven't gotten that far in it yet, but what's the basic principle?
In general, this is what happens:
Arguments to the function are stored on the stack. In platform specific order.
Location for return value is "allocated" on the stack
The return address for the function is also stored in the stack or in a special purpose CPU register.
The function (or actually, the address of the function) is called, either through a CPU specific call instruction or through a normal jmp or br instruction (jump/branch)
The function reads the arguments (if any) from the stack and then runs the function code
Return value from function is stored in the specified location (stack or special purpose CPU register)
Execution jumps back to the caller and the stack is cleared (by restoring the stack pointer to its initial value).
The details of the above vary from platform to platform and even from compiler to compiler (see e.g. STDCALL vs CDECL calling conventions). For instance, in some cases, CPU registers are used instead of storing stuff on the stack. The general idea is the same though.
You can see it for yourself:
Under Linux 'compile' your program with:
gcc -S myprogram.c
And you'll get a listing of the program in assembler (myprogram.s).
Of course you should know a little bit about assembler to understand it (but it's worth learning because it helps to understand how your computer works). Calling a function (on x86 architecture) is basically:
put variable a on stack
put variable b on stack
put variable n on stack
jump to address of the function
load variables from stack
do stuff in function
clean stack
jump back to main
What happens in the assembly?
A brief explanation: The current stack state is saved, a new stack frame is created, and the code for the function to be executed is loaded and run. This involves inconveniencing a few registers of your microprocessor and some frantic to-and-fro reads and writes to memory; once done, the calling function's stack state is restored.
What happens? In x86, the first line of your main function might look something like:
call foo
The call instruction will push the return address on the stack and then jmp to the location of foo.
Arguments are pushed onto the stack and a "call" instruction is executed.
Call is a simple "jmp" that also pushes the address of the next instruction onto the stack (the "ret" at the end of a method pops it and jumps to it).
I think you want to take a look at call stack to get a better idea what happens during a function call: http://en.wikipedia.org/wiki/Call_stack
A very good illustration:
http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.pdf
What happens?
C mimics what will occur in assembly...
It is so close to the machine that you can see what will occur
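void foo() {
    printf("in foo");
    /*
    Roughly (32-bit x86, MASM-style syntax, cdecl):
    mystring db 'in foo', 0     ; string data, NUL-terminated
    mov eax, offset mystring    ; address of the string
    mov edx, offset _printf     ; address of printf
    push eax                    ; push the single argument
    call edx                    ; pushes the return address, then jumps
    add esp, 4                  ; caller removes the one 4-byte argument
    ret
    that's it
    */
}
int main() {
    foo();
    return 0;
}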
1- a calling context is established on the stack
2- parameters are pushed on the stack
3- a "call" is performed to the method
The general idea is that you need to
Save the current local state
Pass the arguments to a function
Call the actual function. This involves putting the return address somewhere so the RET instruction knows where to continue.
The specifics vary from architecture to architecture. And the even more specific specifics might vary between various languages. Although there usually are ways of controlling this to some extent to allow for interoperability between different languages.
A pretty useful starting point is the Wikipedia article on calling conventions. On x86 for example the stack is almost always used for passing arguments to functions. On many RISC architectures, however, registers are mainly used while the stack is only needed in exceptional cases.
The common idea is that the registers that are used in the calling method are pushed on the stack (the stack pointer is in the ESP register); this process is called "pushing the registers". Sometimes they're also zeroed, but that depends. Assembly programmers tend to free more registers than the common four (EAX, EBX, ECX and EDX on x86) to have more possibilities within the function.
When the function ends, the same happens in the reverse: the stack is restored to the state from before calling. This is called "popping the registers".
Update: this process does not necessarily have to happen. Compilers can optimize it away and inline your functions.
Update: normally the parameters of the function are pushed on the stack in reverse order, so that when they are retrieved from the stack they appear in normal order. This order is not guaranteed by C. (ref: Inner Loops by Rick Booth)