Related
I have read that each function invocation leads to pushing of a stack frame in the global call stack and once the function call is completed the call stack is popped off and the control passes to the address that we get from the popped of stack frame. If a called function calls on to yet another function, it will push another return address onto the top of the same call stack, and so on, with the information stacking up and unstacking as the program dictates.
I was wondering what's at the base of global call stack in a C or C++ program?
I did some searching on the internet but none of the sources explicitly mention about it. Is the call stack empty when our program starts and only once a function is called, the call stack usage starts? OR Is the address where main() function has to return, gets implicitly pushed as the base of our call stack and is a stack frame in our call stack? I expect the main() would also have a stack frame in our call stack since we are always returning something at end of our main() function and there needs to be some address to return to. OR is this dependent on compiler/OS and differs according to implementation?
It would be helpful if someone has some informative links about this or could provide details on the process that goes into it.
main() is invoked by the libc code that handles setting up the environment for the executable etc. So by the time main() is called, the stack already has at least one frame created by the caller.
I'm not sure if there is a universal answer, as stack is something that may be implemented differently per architecture. For example a stack may grow up (i.e. stack position pointer value increases when pushing onto the stack) or grow downwards.
Exiting main() is usually done by calling an operating function to indicate the program wishes to to terminate (with the specified return code), so I don't expect a return address for main() to be present on the stack, but this may differ per operating system and even compiler.
I'm not sure why you need to know this, as this is typically something you leave up to the system.
First of all, there is no such thing as a "global call stack". Each thread has a stack, and the stack for the main thread is often looking quite different from the thread of any thread spawned later on. And mostly, each of these "stacks" is just an arbitrary memory segment currently declared to be used as such, sub-allocated from any arbitrary suitable memory pool.
And due to compiler optimizations, many function calls will not even end up on the stack, usually. Meaning there isn't necessarily a distinguishable stack frame. You are only guaranteed that you can reference variables you put on the stack, but not that the compiler must preserve anything you didn't explicitly reference.
There is not even a guarantee that the memory layout for your call stack must even be organized in distinguishable frames. Function pointers are never guaranteed to be part of the stack frame, just happens to be an implementation detail in architectures where data and function pointers may co-exist in the address space. (As there are architectures which require return addresses to be stored in a different address space than the data used in the call stack.)
That aside, yes, there is code which is executed outside of the main() function. Specifically initializers for global static variables, code to set up the runtime environment (env, call parameters, stdin/stdout) etc.
E.g. when having linked to libc, there is __libc_start_main which will call your main function after initialization is done. And clean up when your main function returns.
__libc_start_main is about the point where "stack" starts being used, as far as you can see from within the program. That's not actually true though, there has already been some loader code been executed in kernel space, for reserving memory for your process to operate in initially (including memory for the future stack), initializing registers and memory to well defined values etc.
Right before actually "starting" your process, after dropping out of kernel mode, arbitrary pointers to a future stack, and the first instruction of your program, are loaded into the corresponding processor registers. Effectively, that's where __libc_start_main (or any other initialization function, depending on your runtime) starts running, and the stack visible to you starts building up.
Getting back into the kernel usually involves an interrupt now, which doesn't follow the stack either, but may just directly access processor registers to simply swap the contents of the corresponding processor registers. (E.g. if you call a function from the kernel, the memory required by the call stack inside the function call is not allocated from your stack, but from one you don't even have access to.)
Either way, everything that happens before main() is called, and whenever you enter a syscall, is implementation dependent, and you are not guaranteed any specific observable behavior. And messing around with processor registers, and thereby alternating the program flow, is also far outside defined behavior as far as a pure C / C++ run time is concerned.
Every system I have seen, when main() is called a stack is setup. It has to be or just declaring a variable inside main would fail. A stack is setup once a thread or process is created. Thus any thread of execution has a stack. Further in every assembly language i know, a register or fixed memory location is used to indicate the current value of the stack pointer, so the concept of a stack always exists (the stack pointer might be bad, but stack operations always exist since they are built into the every mainstream assembly language).
I have a hypothesis here, but it's a little tough to verify.
Is there a unique stack frame for each calling thread when two threads invoke the same method of the same object instance? In a compiled binary, I understand a class to be a static code section filled with function definitions in memory and the only difference between different objects is the this pointer which is passed beneath the hood.
But therefore the thread calling it must have its own stack frame, or else two threads trying to access the same member function of the same object instance, would be corrupting one another's local variables.
Just to reiterate here, I'm not referring to whether or not two threads can corrupt the objects data by both modifying this at the same time, I'm well aware of that. I'm more getting at whether or not, in the case that two threads enter the same method of the same instance at the same time, whether or not the local variables of that context are the same places in memory. Again, my assumption is that they are not.
You are correct. Each thread makes use of its own stack and each stack makes local variables distinct between threads.
This is not specific to C++ though. It's just the way processors function. (That is in modern processors, some older processors had only one stack, like the 6502 that had only 256 bytes of stack and no real capability to run threads...)
Objects may be on the stack and shared between threads and thus you can end up modifying the same object on another thread stack. But that's only if you share that specific pointer.
you are right that different threads have unique stacks. That is not a feature of c++ or cpp but something provided by the OS. class objects won't necessary be different. This depends on how they are allocated. Different threads could share heap objects which might lead to concurrent problem.
Local variables of any function or class method are stored in each own stack (actually place in thread's stack, stack frame), so it is doesn't matter from what thread you're calling method - it will use it's own stack during execution for each call
a little different explanation: each method call creates its own stack (or better stack frame)
NOTE: static variables will be the same
of course there exists techniques to get access to another's method's stack memory during execution, but there are kinda hacks
Lets say I have allocated some memory in my background thread, that is, a thread stack is holding a pointer to that memory. Now I want do terminate background thread execution by calling pthread_cancel on it. Will that memory be released or not? (My platform is iOS, compiler is gcc 4.2)
Each thread by necessity requires its own stack; however there is typically only one heap per process. When a thread is destroyed, there is no automatic mechanism to free the memory allocated on the heap. All you end up with is a memory leak.
As a general rule, avoid using pthread_cancel since it is hard to ensure that pthread_cancel will run safely. Rather build in some mechanism where you can pass a message to the thread to destroy itself (after freeing any resources that it owns).
The thread stack will be removed once the thread exits. But there will be no process or code that will look into your thread stack and release any references for the objects that you've allocated on the heap. Also, typically, thread stacks don't hold any references to the memory, the thread stack is an independent space that is given to the thread to use for generic program stack, any reference will only be on the stack for as long as you are inside the function that pushed such a reference onto the stack, typically because you are referencing it with a local variable.
by default, no -- see other answers, which are more specific to the answer you seek. there is however, such a thing as a thread-specific allocator; if you were using one, you'd know.
No, it won't be deleted or freed automatically. If you're very lucky, it might be garbage collected sometime if you're running a collector. File handles, shared memory ids, mutexes etc. won't be released/deallocated either. Async cancellation is safe for e.g. pure maths calculations on data still owned by another thread, but very risky in general - that's why some threading APIs have experimented with and removed the function completely.
If I've registered my very own vectored exception handler (VEH) and a StackOverflow exception had occurred in my process, when I'll reach to the VEH, will I'll be able to allocate more memory on the stack? will the allocation cause me to override some other memory? what will happen?
I know that in .Net this is why the entire stack is committed during the thread's creation, but let's say i'm writing in native and such scenario occurs ... what will i able to do inside the VEH? what about memory allocation..?
In the case of a stack overflow, you'll have a tiny bit of stack to work with. It's enough stack to start a new thread, which will have an entirely new stack. From there, you can do whatever you need to do before terminating.
You cannot recover from a stack overflow, it would involve unwinding the stack, but your entire program would be destroyed in the progress. Here's some code I wrote for a stack-dumping utility:
// stack overflows cannot be handled, try to get output then quit
set_current_thread(get_current_thread());
boost::thread t(stack_fail_thread);
t.join(); // will never exit
All this did was get the thread's handle so the stack dumping mechanism knew which thread to dump, start a new thread to do the dumping/logging, and wait for it to finish (which won't happen, the thread calls exit()).
For completeness, get_current_thread() looked like this:
const HANDLE process = GetCurrentProcess();
HANDLE thisThread = 0;
DuplicateHandle(process, GetCurrentThread(), process,
&thisThread, 0, true, DUPLICATE_SAME_ACCESS);
All of these are "simple" functions that don't require a lot of room to work (and keep in mind, the compiler will inline these msot likely, removing a function call). You cannot, contrarily, throw an exception. Not only does that require much more work, but destructors can do quite a bit of work (like deallocating memory), which tend to be complex as well.
Your best bet is to start a new thread, save as much information about your application as you can or want, then terminate.
No, you can't allocate memory in vectored exception handler.
MSDN says it explicitly:
"The handler should not call functions that acquire synchronization objects or allocate memory, because this can cause problems. Typically, the handler will simply access the exception record and return."
Stacks need to be contiguous, so you can't just allocate any random memory but have to allocate the next part of the address space.
If you are willing to preallocate the address space (i.e. just reserve a range of addresses without actually allocating memory), you can use VirtualAlloc. First you call it with the MEM_RESERVE flag to set aside the address space. Later, in your exception handler you can call it again with MEM_COMMIT to allocate physical memory to your pre-reserved address space.
I have a very strange bug cropping up right now in a fairly massive C++ application at work (massive in terms of CPU and RAM usage as well as code length - in excess of 100,000 lines). This is running on a dual-core Sun Solaris 10 machine. The program subscribes to stock price feeds and displays them on "pages" configured by the user (a page is a window construct customized by the user - the program allows the user to configure such pages). This program used to work without issue until one of the underlying libraries became multi-threaded. The parts of the program affected by this have been changed accordingly. On to my problem.
Roughly once in every three executions the program will segfault on startup. This is not necessarily a hard rule - sometimes it'll crash three times in a row then work five times in a row. It's the segfault that's interesting (read: painful). It may manifest itself in a number of ways, but most commonly what will happen is function A calls function B and upon entering function B the frame pointer will suddenly be set to 0x000002. Function A:
result_type emit(typename type_trait<T_arg1>::take _A_a1) const
{ return emitter_type::emit(impl_, _A_a1); }
This is a simple signal implementation. impl_ and _A_a1 are well-defined within their frame at the crash. On actual execution of that instruction, we end up at program counter 0x000002.
This doesn't always happen on that function. In fact it happens in quite a few places, but this is one of the simpler cases that doesn't leave that much room for error. Sometimes what will happen is a stack-allocated variable will suddenly be sitting on junk memory (always on 0x000002) for no reason whatsoever. Other times, that same code will run just fine. So, my question is, what can mangle the stack so badly? What can actually change the value of the frame pointer? I've certainly never heard of such a thing. About the only thing I can think of is writing out of bounds on an array, but I've built it with a stack protector which should come up with any instances of that happening. I'm also well within the bounds of my stack here. I also don't see how another thread could overwrite the variable on the stack of the first thread since each thread has it's own stack (this is all pthreads). I've tried building this on a linux machine and while I don't get segfaults there, roughly one out of three times it will freeze up on me.
Stack corruption, 99.9% definitely.
The smells you should be looking carefully for are:-
Use of 'C' arrays
Use of 'C' strcpy-style functions
memcpy
malloc and free
thread-safety of anything using pointers
Uninitialised POD variables.
Pointer Arithmetic
Functions trying to return local variables by reference
I had that exact problem today and was knee-deep in gdb mud and debugging for a straight hour before occurred to me that I simply wrote over array boundaries (where I didn't expect it the least) of a C array.
So, if possible, use vectors instead because any decend STL implementation will give good compiler messages if you try that in debug mode (whereas C arrays punish you with segfaults).
I'm not sure what you're calling a "frame pointer", as you say:
On actual execution of that
instruction, we end up at program
counter 0x000002
Which makes it sound like the return address is being corrupted. The frame pointer is a pointer that points to the location on the stack of the current function call's context. It may well point to the return address (this is an implementation detail), but the frame pointer itself is not the return address.
I don't think there's enough information here to really give you a good answer, but some things that might be culprits are:
incorrect calling convention. If you're calling a function using a calling convention different from how the function was compiled, the stack may become corrupted.
RAM hit. Anything writing through a bad pointer can cause garbage to end up on the stack. I'm not familiar with Solaris, but most thread implementations have the threads in the same process address space, so any thread can access any other thread's stack. One way a thread can get a pointer into another thread's stack is if the address of a local variable is passed to an API that ultimately deals with the pointer on a different thread. unless you synchronize things properly, this will end up with the pointer accessing invalid data. Given that you're dealing with a "simple signal implementation", it seems like it's possible that one thread is sending a signal to another. Maybe one of the parameters in that signal has a pointer to a local?
There's some confusion here between stack overflow and stack corruption.
Stack Overflow is a very specific issue cause by try to use using more stack than the operating system has allocated to your thread. The three normal causes are like this.
void foo()
{
foo(); // endless recursion - whoops!
}
void foo2()
{
char myBuffer[A_VERY_BIG_NUMBER]; // The stack can't hold that much.
}
class bigObj
{
char myBuffer[A_VERY_BIG_NUMBER];
}
void foo2( bigObj big1) // pass by value of a big object - whoops!
{
}
In embedded systems, thread stack size may be measured in bytes and even a simple calling sequence can cause problems. By default on windows, each thread gets 1 Meg of stack, so causing stack overflow is much less of a common problem. Unless you have endless recursion, stack overflows can always be mitigated by increasing the stack size, even though this usually is NOT the best answer.
Stack Corruption simply means writing outside the bounds of the current stack frame, thus potentially corrupting other data - or return addresses on the stack.
At it's simplest:-
void foo()
{
char message[10];
message[10] = '!'; // whoops! beyond end of array
}
That sounds like a stack overflow problem - something is writing beyond the bounds of an array and trampling over the stack frame (and probably the return address too) on the stack. There's a large literature on the subject. "The Shell Programmer's Guide" (2nd Edition) has SPARC examples that may help you.
With C++ unitialized variables and race conditions are likely suspects for intermittent crashes.
Is it possible to run the thing through Valgrind? Perhaps Sun provides a similar tool. Intel VTune (Actually I was thinking of Thread Checker) also has some very nice tools for thread debugging and such.
If your employer can spring for the cost of the more expensive tools, they can really make these sorts of problems a lot easier to solve.
It's not hard to mangle the frame pointer - if you look at the disassembly of a routine you will see that it is pushed at the start of a routine and pulled at the end - so if anything overwrites the stack it can get lost. The stack pointer is where the stack is currently at - and the frame pointer is where it started at (for the current routine).
Firstly I would verify that all of the libraries and related objects have been rebuilt clean and all of the compiler options are consistent - I've had a similar problem before (Solaris 2.5) that was caused by an object file that hadn't been rebuilt.
It sounds exactly like an overwrite - and putting guard blocks around memory isn't going to help if it is simply a bad offset.
After each core dump examine the core file to learn as much as you can about the similarities between the faults. Then try to identify what is getting overwritten. As I remember the frame pointer is the last stack pointer - so anything logically before the frame pointer shouldn't be modified in the current stack frame - so maybe record this and copy it elsewhere and compare upon return.
Is something meaning to assign a value of 2 to a variable but instead is assigning its address to 2?
The other details are lost on me but "2" is the recurring theme in your problem description. ;)
I would second that this definitely sounds like a stack corruption due to out of bound array or buffer writing. Stack protector would be good as long as the writing is sequential, not random.
I second the notion that it is likely stack corruption. I'll add that the switch to a multi-threaded library makes me suspicious that what has happened is a lurking bug has been exposed. Possibly the sequencing the buffer overflow was occurring on unused memory. Now it's hitting another thread's stack. There are many other possible scenarios.
Sorry if that doesn't give much of a hint at how to find it.
I tried Valgrind on it, but unfortunately it doesn't detect stack errors:
"In addition to the performance penalty an important limitation of Valgrind is its inability to detect bounds errors in the use of static or stack allocated data."
I tend to agree that this is a stack overflow problem. The tricky thing is tracking it down. Like I said, there's over 100,000 lines of code to this thing (including custom libraries developed in-house - some of it going as far back as 1992) so if anyone has any good tricks for catching that sort of thing, I'd be grateful. There's arrays being worked on all over the place and the app uses OI for its GUI (if you haven't heard of OI, be grateful) so just looking for a logical fallacy is a mammoth task and my time is short.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
No one asked this, but I built with gcc-4.2. Also, I can guarantee ABI safety here so that's also not the issue. As for the "garbage at the end of the stack" on the RAM hit, the fact that it is universally 2 (though in different places in the code) makes me doubt that as garbage tends to be random.
It is impossible to know, but here are some hints that I can come up with.
In pthreads you must allocate the stack and pass it to the thread. Did you allocate enough? There is no automatic stack growth like in a single threaded process.
If you are sure that you don't corrupt the stack by writing past stack allocated data check for rouge pointers (mostly uninitialized pointers).
One of the threads could overwrite some data that others depend on (check your data synchronisation).
Debugging is usually not very helpful here. I would try to create lots of log output (traces for entry and exit of every function/method call) and then analyze the log.
The fact that the error manifest itself differently on Linux may help. What thread mapping are you using on Solaris? Make sure you map every thread to it's own LWP to ease the debugging.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
If you pass anything on the stack by reference or by address, this would most certainly happen if another thread tried to use it after the first thread returned from a function.
You might be able to repro this by forcing the app onto a single processor. I don't know how you do that with Sparc.