Pop{pc} in assembly - c++

This may be a stupid question, but in my assembly code, during debugging, I have
pop{r2-r6,pc}
and I think it is giving me a hard fault exception. I understand what pop does, but I am unsure what the pc part means. I cannot find it explained anywhere on the internet and it is not a variable anywhere in my code.
I am using Keil on an STM32 in C++.

pc or r15 is the program counter, the register which gives the address that the processor fetches instructions from. Changing it to another address makes the program execution jump to that address.
In this case, the address is read off the stack to return from a function call; the return address would have been pushed onto the stack (from the link register lr or r14) at the start of the function.
If that's causing a crash, then it's probably because the address on the stack has been corrupted. Perhaps you're writing outside the bounds of a local array, or overflowing the stack with too deep a function call level.
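To make the mechanism concrete, here is a minimal sketch of a toy register file and stack. It is not real ARM code: `ToyCpu`, `push_r2_r6_lr`, `pop_r2_r6_pc` and `simulate_return` are hypothetical names invented for this illustration. It only models how `push {r2-r6, lr}` at function entry pairs with `pop {r2-r6, pc}` at exit, and what happens to pc if the saved slot is overwritten in between.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Toy model (an assumption for illustration): 16 registers, r14 = lr, r15 = pc.
struct ToyCpu {
    std::array<std::uint32_t, 16> r{};
    std::vector<std::uint32_t> stack;  // back() is the top of the stack
};

constexpr int LR = 14;
constexpr int PC = 15;

// Models `push {r2-r6, lr}`: lr is stored deepest, r2 ends up on top.
void push_r2_r6_lr(ToyCpu& cpu) {
    cpu.stack.push_back(cpu.r[LR]);
    for (int i = 6; i >= 2; --i) cpu.stack.push_back(cpu.r[i]);
}

// Models `pop {r2-r6, pc}`: restores r2..r6, then loads the saved lr into pc.
void pop_r2_r6_pc(ToyCpu& cpu) {
    for (int i = 2; i <= 6; ++i) {
        cpu.r[i] = cpu.stack.back();
        cpu.stack.pop_back();
    }
    cpu.r[PC] = cpu.stack.back();  // this load is the "return"
    cpu.stack.pop_back();
}

// Returns the pc value after the simulated function returns.
// If `corrupt` is true, the slot holding the saved lr is overwritten first,
// mimicking a write past the end of a local array.
std::uint32_t simulate_return(std::uint32_t return_address, bool corrupt) {
    ToyCpu cpu;
    cpu.r[LR] = return_address;
    push_r2_r6_lr(cpu);
    if (corrupt) cpu.stack.front() = 0xDEADBEEF;  // front() is the saved lr slot
    pop_r2_r6_pc(cpu);
    return cpu.r[PC];
}
```

With an intact stack, `simulate_return(0x08000123, false)` yields the original return address. With the slot overwritten, pc ends up holding the garbage value, which on real hardware would typically lead to a fault.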

The PC register is the program counter; it holds the address of the next instruction to be executed on an ARM architecture (STM32 uses the ARM architecture).
The default in ARM assembly is to simply overwrite the PC register when a function is to return. What you are seeing with the pop statement is just a direct way to return, see here.
The rest of your question is neatly explained in Mike's post.

Related

arm64 - how to debug EXC_BAD_ACCESS with LLDB

Hello I use MacBook M1 (OSX) and I have a dangling pointer which it seems I can't catch.
I am using CLion and LLDB as a debugger.
When I run my code I get:
Exception: EXC_BAD_ACCESS (code=1, address=0x18)
However, this does not really show me where exactly the bad pointer is, or at least I can't work it out from this.
I am attaching also screenshot of my editor and the debugger window:
I have read something about zombie objects which when enabled allows you to catch dangling pointers. How can I do that?
EXC_BAD_ACCESS (defined in /usr/include/mach/exception_types.h) has a code (which is a kern_return_t) and a subcode. kern_return_t is defined in /usr/include/mach/kern_return.h and 1 means KERN_INVALID_ADDRESS, so this was not a protection problem but an actual invalid address access. The subcode (0x18) is the address accessed.
A small number like 0x18 usually means that your code accessed a field that was 0x18 bytes into an object, but the object pointer was null. So the first thing to do is look at all the accesses in that line of code (or around it if you are debugging optimized code) and make sure none of them are null. This might also be a null vtable, and the 0x18 the vtable offset, i.e. one of the methods of the object, so look for calls as well. However, I didn't see any suspicious looking pointer values in your locals, so maybe it's some subobject?
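As a rough illustration of why a null object pointer produces a small fault address, the sketch below uses a hypothetical `Quote` struct (not from the question) whose `price` field sits 0x18 bytes into the object. The helper `faulting_address` is also an invented name; it simply computes base plus field offset, which is what the hardware does for `q->price`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout, chosen so that `price` lands at offset 0x18.
struct Quote {
    std::uint64_t id;         // offset 0x00
    std::uint64_t timestamp;  // offset 0x08
    std::uint64_t volume;     // offset 0x10
    std::int32_t  price;      // offset 0x18
};

static_assert(offsetof(Quote, price) == 0x18, "price is 0x18 bytes into Quote");

// Address that a load of `q->price` would touch, given the object's base address.
constexpr std::uintptr_t faulting_address(std::uintptr_t object_base) {
    return object_base + offsetof(Quote, price);
}
```

For a null base, `faulting_address(0)` is 0x18, matching the address in the exception. The same arithmetic applies to a virtual call through a null vtable pointer, where the offset is the slot of the method being called.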
If it isn't obvious from there which pointer is bad, you could run your code under ASAN (address sanitizer). If the bad pointer access is caused by a use after free, ASAN will often find it quickly. Note that zombie objects are an Objective-C-only feature, which doesn't look relevant to your code.
If that doesn't get it, the most straightforward way to diagnose this sort of error is to look at the disassembly, for instance just run:
(lldb) disassemble
The current PC will be marked in the output. That instruction will be some kind of memory access, often dereferencing a register with an offset or something like that. For instance:
ldr w9, [x9, #0x18]
is loading memory 0x18 bytes off from the value in register x9. If this were the instruction, the next question is what program entity is currently occupying x9? lldb might know, you can ask it by doing:
(lldb) image lookup -va $pc
That will tell you everything lldb knows about that pc, among other things the last set of entries will be where all the known variables are currently located. Look for one that is in x9. If there isn't one listed in x9, then maybe one of the currently visible variables was temporarily copied into x9, in which case you have to look up in the instruction stream to see what was the last value that got copied into x9.

Filter out breaks based on stack trace

I want to break in a function, but only if it was NOT called from a specific other function. That's because there's one or two functions that amount for most of the calls, but I'm not interested in debugging them.
I noticed that breakpoints have a Filter option:
Is that something that could be used to filter the stack trace and break based on its contents?
I don't think you can use the filters for that, based on this: Use breakpoints in the Visual Studio debugger Specifically, the breakpoint filters are meant for concurrent programs, and you can filter on:
MachineName, ProcessId, ProcessName, ThreadId, or ThreadName.
One suggestion I would make to get something like what you want, is to add an extra parameter with a default value to the function you want to break in. Then set the value to something different in the places you don't want to monitor, and use a "Conditional Expression" in the breakpoint to make it only break on the default value.
Of course, this requires you to make debugging-only changes to your code (and then revert them when done), so it is a pretty ugly approach.
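A minimal sketch of that default-parameter trick follows. All names here (`Caller`, `process`, `hot_path`, `rare_path`, and the counters) are hypothetical and exist only for the illustration; the counters stand in for "the breakpoint would have fired".

```cpp
#include <string>

// Hypothetical debugging-only tag identifying uninteresting call sites.
enum class Caller { Default, HotPath };

int g_calls = 0;              // every call to process()
int g_interesting_calls = 0;  // calls on which a conditional breakpoint would stop

// The extra parameter has a default, so existing callers compile unchanged.
void process(const std::string& item, Caller caller = Caller::Default) {
    ++g_calls;
    // In the debugger: put a breakpoint here with the condition
    //     caller == Caller::Default
    if (caller == Caller::Default) ++g_interesting_calls;
    (void)item;  // real work would go here
}

// The noisy caller opts out explicitly.
void hot_path() {
    for (int i = 0; i < 1000; ++i) process("tick", Caller::HotPath);
}

// Every other caller is left untouched and keeps the default.
void rare_path() { process("config"); }
```

Running `hot_path()` followed by `rare_path()` calls `process` 1001 times, but only the single call from `rare_path` satisfies the breakpoint condition.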
If you know the address of the code location where the function is called from, you could make the breakpoint condition depend on the return address stored on the call stack.
Therefore, you should be able to set the breakpoint as a condition of the value *(DWORD*)ESP (32-bit code) or *(QWORD*)RSP (64-bit code). I haven't tested it though.
However, my above example will only work if the breakpoint is set at the very start of the function, before the called function pushes any values on the stack or modifies the stack pointer. I'm not sure where Visual Studio sets the breakpoint if you place it on the first instruction of a function. Therefore, you may have to either set the breakpoint in the disassembly window to the first assembler instruction of the function or you might have to compensate for the function having modified the stack pointer in the function prolog.
Alternatively, if a proper stack frame has been set up using the EBP register (or RBP for 64-bit), then you could use that instead.
Please note that it is not the address of the CALL instruction that is placed on the stack, but rather the return address, which is the address of the next assembler-level instruction of the calling function.
I suggest you first set an unconditional breakpoint where you want it and then inspect the stack using the memory viewer in the debugger, specifically to see where the values of ESP/RSP and EBP/RBP are pointing and where the return address is stored on the stack.

What are Call Instructions

I'm currently reading "Programming: Principles and Practice using C++", and the author mentioned that writing the definition of a member function within the class definition can make a function inline. I wasn't entirely sure what that meant, so I looked on https://www.geeksforgeeks.org/inline-functions-cpp/ for a more concrete understanding. I can't seem to understand what "instruction" means in the context of this sentence:
When the program executes the function call instruction the CPU stores the memory address of the instruction following the function call.
I googled, and it looks like call instructions just pass control to another part of the program or to another application. If that's what they mean, shouldn't they say that "the CPU stores the memory address of the call instruction of the function call"?
This question may sound weird or nit-picky, but I am new to CS and really want to get a solid understanding of CS.
They do in fact mean that the memory address of the instruction following the function call is stored. This is because of the way instructions work at the machine code level. After the function call is completed the program needs a way to get back to where it was. It does this through a jump instruction to the stored memory address which causes execution to jump to that instruction. If the memory address pointed to the function call, it would loop forever.
First of all, the page you link to is talking about the behaviour on a particular class of systems. It isn't describing Standard C++.
The page is talking about what happens in assembly language that could be generated by a C++ compiler. By "function call instruction" it means the assembly language (or machine code) instruction which performs the function call. In x86 syntax that instruction is call. Example.
You could find out more information about this by searching "x86 call instruction" or similar terms.
The address being stored, which is usually called the return address, is the address of the next instruction after the call. When execution of the function reaches the ret assembly instruction, execution jumps to the return address.
No, what it is saying (and this is particular to x86 chips, others may do it differently) is that the CPU stores the address of the instruction following the call (on top of the stack) and then jumps to the address that is the operand of the call instruction. When the called function executes a 'ret' instruction that stored address is read and execution jumps to that point.
When an x86 CALL instruction is executed, the contents of program counter i.e. address of instruction following CALL, are stored in the stack and the program control is transferred to subroutine.
(x86's program-counter register (IP / EIP / RIP) isn't normally directly accessible, but it's defined as pointing to the next instruction while the current one is executing.)
On completing the execution of the subroutine, the RET instruction is executed which loads back the stack contents i.e. address of the instruction following CALL instruction, into the program counter.
Thus execution resumes in the caller at the instruction following the call.
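The call/ret behaviour described in the answers above can be sketched with a toy interpreter. This is not x86; the instruction set (`Op`, `Instr`) and the functions `run` and `demo` are made up for the illustration. The point is only that `Call` stores `pc + 1`, the address of the instruction after the call, and `Ret` jumps back to it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical four-instruction machine.
enum class Op { Print, Call, Ret, Halt };

struct Instr {
    Op op;
    std::size_t target;  // used by Call only
    char tag;            // used by Print only
};

// Executes the program and returns the characters printed, in order.
std::string run(const std::vector<Instr>& program) {
    std::vector<std::size_t> stack;  // holds return addresses
    std::string trace;
    std::size_t pc = 0;
    while (pc < program.size()) {
        const Instr& in = program[pc];
        switch (in.op) {
        case Op::Print:
            trace += in.tag;
            ++pc;
            break;
        case Op::Call:
            stack.push_back(pc + 1);  // address of the instruction AFTER the call
            pc = in.target;
            break;
        case Op::Ret:
            if (stack.empty()) return trace;  // nothing to return to
            pc = stack.back();
            stack.pop_back();
            break;
        case Op::Halt:
            return trace;
        }
    }
    return trace;
}

std::string demo() {
    const std::vector<Instr> program = {
        {Op::Print, 0, 'a'},  // 0: caller, before the call
        {Op::Call,  4, 0},    // 1: call the subroutine at index 4
        {Op::Print, 0, 'b'},  // 2: the return address points here
        {Op::Halt,  0, 0},    // 3
        {Op::Print, 0, 'f'},  // 4: subroutine body
        {Op::Ret,   0, 0},    // 5: jump back to the stored address
    };
    return run(program);
}
```

`demo()` prints `a`, then `f` from the subroutine, then `b` after returning. Had `Call` stored `pc` instead of `pc + 1`, `Ret` would land on the call again and the program would never reach `b`.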

What happens in assembly language when you call a method/function?

If I have a program in C++/C that (language doesn't matter much, just needed to illustrate a concept):
#include <cstdio>
void foo() {
    std::printf("in foo");
}
int main() {
    foo();
    return 0;
}
What happens in the assembly? I'm not actually looking for assembly code as I haven't gotten that far in it yet, but what's the basic principle?
In general, this is what happens:
Arguments to the function are stored on the stack. In platform specific order.
Location for return value is "allocated" on the stack
The return address for the function is also stored in the stack or in a special purpose CPU register.
The function (or actually, the address of the function) is called, either through a CPU specific call instruction or through a normal jmp or br instruction (jump/branch)
The function reads the arguments (if any) from the stack and then runs the function code
Return value from function is stored in the specified location (stack or special purpose CPU register)
Execution jumps back to the caller and the stack is cleared (by restoring the stack pointer to its initial value).
The details of the above vary from platform to platform and even from compiler to compiler (see e.g. STDCALL vs CDECL calling conventions). For instance, in some cases, CPU registers are used instead of storing stuff on the stack. The general idea is the same though
You can see it for yourself:
Under Linux 'compile' your program with:
gcc -S myprogram.c
And you'll get a listing of the program in assembler (myprogram.s).
Of course you should know a little bit about assembler to understand it (but it's worth learning because it helps to understand how your computer works). Calling a function (on x86 architecture) is basically:
put variable a on stack
put variable b on stack
put variable n on stack
jump to address of the function
load variables from stack
do stuff in function
clean stack
jump back to main
What happens in the assembly?
A brief explanation: The current stack state is saved, a new stack frame is created, and the code of the called function is run. This involves inconveniencing a few registers of your microprocessor, some frantic to and fro read/writes to the memory and once done, the calling function's stack state is restored.
What happens? In x86, the first line of your main function might look something like:
call foo
The call instruction will push the return address on the stack and then jmp to the location of foo.
Arguments are pushed onto the stack and a "call" instruction is executed.
Call is simply a "jmp" that first pushes the address of the next instruction onto the stack ("ret" at the end of the function pops that address and jumps to it).
I think you want to take a look at call stack to get a better idea what happens during a function call: http://en.wikipedia.org/wiki/Call_stack
A very good illustration:
http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.pdf
What happens?
C mimics what will occur in assembly...
It is so close to machine that you can realize what will occur
void foo() {
printf("in foo");
/*
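mystring db 'in foo', 0
mov eax, offset mystring   ; address of the string
mov edx, offset _printf    ; address of printf
push eax                   ; one 4-byte argument
call edx
add esp, 4                 ; caller removes the one argument (cdecl)
ret
// that's it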
*/
}
int main() {
foo();
return 0;
}
1- a calling context is established on the stack
2- parameters are pushed on the stack
3- a "call" is performed to the method
The general idea is that you need to
Save the current local state
Pass the arguments to a function
Call the actual function. This involves putting the return address somewhere so the RET instruction knows where to continue.
The specifics vary from architecture to architecture. And the even more specific specifics might vary between various languages. Although there usually are ways of controlling this to some extent to allow for interoperability between different languages.
A pretty useful starting point is the Wikipedia article on calling conventions. On x86 for example the stack is almost always used for passing arguments to functions. On many RISC architectures, however, registers are mainly used while the stack is only needed in exceptional cases.
The common idea is that the registers that are used in the calling method are pushed on the stack (stack pointer is in ESP register), this process is called "push the registers". Sometimes they're also zeroed, but that depends. Assembly programmers tend to free more registers than the common 4 (EAX, EBX, ECX and EDX on x86) to have more possibilities within the function.
When the function ends, the same happens in the reverse: the stack is restored to the state from before calling. This is called "popping the registers".
Update: this process does not necessarily have to happen. Compilers can optimize it away and inline your functions.
Update: normally parameters of the function are pushed on the stack in reverse order, when they are retrieved from the stack, they appear as if in normal order. This order is not guaranteed by C. (ref: Inner Loops by Rick Booth)

What can modify the frame pointer?

I have a very strange bug cropping up right now in a fairly massive C++ application at work (massive in terms of CPU and RAM usage as well as code length - in excess of 100,000 lines). This is running on a dual-core Sun Solaris 10 machine. The program subscribes to stock price feeds and displays them on "pages" configured by the user (a page is a window construct customized by the user - the program allows the user to configure such pages). This program used to work without issue until one of the underlying libraries became multi-threaded. The parts of the program affected by this have been changed accordingly. On to my problem.
Roughly once in every three executions the program will segfault on startup. This is not necessarily a hard rule - sometimes it'll crash three times in a row then work five times in a row. It's the segfault that's interesting (read: painful). It may manifest itself in a number of ways, but most commonly what will happen is function A calls function B and upon entering function B the frame pointer will suddenly be set to 0x000002. Function A:
result_type emit(typename type_trait<T_arg1>::take _A_a1) const
{ return emitter_type::emit(impl_, _A_a1); }
This is a simple signal implementation. impl_ and _A_a1 are well-defined within their frame at the crash. On actual execution of that instruction, we end up at program counter 0x000002.
This doesn't always happen on that function. In fact it happens in quite a few places, but this is one of the simpler cases that doesn't leave that much room for error. Sometimes what will happen is a stack-allocated variable will suddenly be sitting on junk memory (always on 0x000002) for no reason whatsoever. Other times, that same code will run just fine. So, my question is, what can mangle the stack so badly? What can actually change the value of the frame pointer? I've certainly never heard of such a thing. About the only thing I can think of is writing out of bounds on an array, but I've built it with a stack protector which should catch any instances of that happening. I'm also well within the bounds of my stack here. I also don't see how another thread could overwrite the variable on the stack of the first thread since each thread has its own stack (this is all pthreads). I've tried building this on a Linux machine and while I don't get segfaults there, roughly one out of three times it will freeze up on me.
Stack corruption, 99.9% definitely.
The smells you should be looking carefully for are:-
Use of 'C' arrays
Use of 'C' strcpy-style functions
memcpy
malloc and free
thread-safety of anything using pointers
Uninitialised POD variables.
Pointer Arithmetic
Functions trying to return local variables by reference
I had that exact problem today and was knee-deep in gdb mud and debugging for a straight hour before it occurred to me that I had simply written past the boundaries of a C array (where I least expected it).
So, if possible, use vectors instead, because any decent STL implementation will give useful diagnostics if you try that in debug mode (whereas C arrays punish you with segfaults).
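As a small sketch of the difference, `std::vector::at` is bounds-checked in every build mode and throws `std::out_of_range` rather than silently writing past the end. The helper name `write_checked` is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Writes `value` at `index` if it is in range; reports failure instead of
// corrupting neighbouring memory when it is not.
bool write_checked(std::vector<char>& buffer, std::size_t index, char value) {
    try {
        buffer.at(index) = value;  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

For a ten-element vector, index 9 succeeds and index 10 is rejected. Note that plain `operator[]` on a vector is only checked when the standard library's debug mode is enabled, so `at()` is the portable choice while hunting a bug like this.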
I'm not sure what you're calling a "frame pointer", as you say:
On actual execution of that instruction, we end up at program counter 0x000002
Which makes it sound like the return address is being corrupted. The frame pointer is a pointer that points to the location on the stack of the current function call's context. It may well point to the return address (this is an implementation detail), but the frame pointer itself is not the return address.
I don't think there's enough information here to really give you a good answer, but some things that might be culprits are:
incorrect calling convention. If you're calling a function using a calling convention different from how the function was compiled, the stack may become corrupted.
RAM hit. Anything writing through a bad pointer can cause garbage to end up on the stack. I'm not familiar with Solaris, but most thread implementations have the threads in the same process address space, so any thread can access any other thread's stack. One way a thread can get a pointer into another thread's stack is if the address of a local variable is passed to an API that ultimately deals with the pointer on a different thread. Unless you synchronize things properly, this will end up with the pointer accessing invalid data. Given that you're dealing with a "simple signal implementation", it seems like it's possible that one thread is sending a signal to another. Maybe one of the parameters in that signal has a pointer to a local?
There's some confusion here between stack overflow and stack corruption.
Stack Overflow is a very specific issue caused by trying to use more stack than the operating system has allocated to your thread. The three normal causes look like this.
void foo()
{
    foo(); // endless recursion - whoops!
}
void foo2()
{
    char myBuffer[A_VERY_BIG_NUMBER]; // The stack can't hold that much.
}
class bigObj
{
    char myBuffer[A_VERY_BIG_NUMBER];
};
void foo2(bigObj big1) // pass by value of a big object - whoops!
{
}
In embedded systems, thread stack size may be measured in bytes and even a simple calling sequence can cause problems. By default on windows, each thread gets 1 Meg of stack, so causing stack overflow is much less of a common problem. Unless you have endless recursion, stack overflows can always be mitigated by increasing the stack size, even though this usually is NOT the best answer.
Stack Corruption simply means writing outside the bounds of the current stack frame, thus potentially corrupting other data - or return addresses on the stack.
At its simplest:-
void foo()
{
char message[10];
message[10] = '!'; // whoops! beyond end of array
}
That sounds like a stack overflow problem - something is writing beyond the bounds of an array and trampling over the stack frame (and probably the return address too) on the stack. There's a large literature on the subject. "The Shellcoder's Handbook" (2nd Edition) has SPARC examples that may help you.
With C++, uninitialized variables and race conditions are likely suspects for intermittent crashes.
Is it possible to run the thing through Valgrind? Perhaps Sun provides a similar tool. Intel VTune (Actually I was thinking of Thread Checker) also has some very nice tools for thread debugging and such.
If your employer can spring for the cost of the more expensive tools, they can really make these sorts of problems a lot easier to solve.
It's not hard to mangle the frame pointer - if you look at the disassembly of a routine you will see that it is pushed at the start of a routine and pulled at the end - so if anything overwrites the stack it can get lost. The stack pointer is where the stack is currently at - and the frame pointer is where it started at (for the current routine).
Firstly I would verify that all of the libraries and related objects have been rebuilt clean and all of the compiler options are consistent - I've had a similar problem before (Solaris 2.5) that was caused by an object file that hadn't been rebuilt.
It sounds exactly like an overwrite - and putting guard blocks around memory isn't going to help if it is simply a bad offset.
After each core dump examine the core file to learn as much as you can about the similarities between the faults. Then try to identify what is getting overwritten. As I remember the frame pointer is the last stack pointer - so anything logically before the frame pointer shouldn't be modified in the current stack frame - so maybe record this and copy it elsewhere and compare upon return.
Is something meaning to assign a value of 2 to a variable but instead is assigning its address to 2?
The other details are lost on me but "2" is the recurring theme in your problem description. ;)
I would second that this definitely sounds like a stack corruption due to out of bound array or buffer writing. Stack protector would be good as long as the writing is sequential, not random.
I second the notion that it is likely stack corruption. I'll add that the switch to a multi-threaded library makes me suspicious that what has happened is that a lurking bug has been exposed. Possibly the buffer overflow was previously landing on unused memory, and now it's hitting another thread's stack. There are many other possible scenarios.
Sorry if that doesn't give much of a hint at how to find it.
I tried Valgrind on it, but unfortunately it doesn't detect stack errors:
"In addition to the performance penalty an important limitation of Valgrind is its inability to detect bounds errors in the use of static or stack allocated data."
I tend to agree that this is a stack overflow problem. The tricky thing is tracking it down. Like I said, there's over 100,000 lines of code to this thing (including custom libraries developed in-house - some of it going as far back as 1992) so if anyone has any good tricks for catching that sort of thing, I'd be grateful. There's arrays being worked on all over the place and the app uses OI for its GUI (if you haven't heard of OI, be grateful) so just looking for a logical fallacy is a mammoth task and my time is short.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
No one asked this, but I built with gcc-4.2. Also, I can guarantee ABI safety here so that's also not the issue. As for the "garbage at the end of the stack" on the RAM hit, the fact that it is universally 2 (though in different places in the code) makes me doubt that as garbage tends to be random.
It is impossible to know, but here are some hints that I can come up with.
In pthreads you must allocate the stack and pass it to the thread. Did you allocate enough? There is no automatic stack growth like in a single threaded process.
If you are sure that you don't corrupt the stack by writing past stack-allocated data, check for rogue pointers (mostly uninitialized pointers).
One of the threads could overwrite some data that others depend on (check your data synchronisation).
Debugging is usually not very helpful here. I would try to create lots of log output (traces for entry and exit of every function/method call) and then analyze the log.
The fact that the error manifests itself differently on Linux may help. What thread mapping are you using on Solaris? Make sure you map every thread to its own LWP to ease the debugging.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
If you pass anything on the stack by reference or by address, this would most certainly happen if another thread tried to use it after the first thread returned from a function.
You might be able to repro this by forcing the app onto a single processor. I don't know how you do that with Sparc.