Can ordinary code corrupt the call stack in C/C++?
I don't mean a hack or an exploit, just an oversight or mistake; and not a random one, but something that corrupts the stack every time.
Someone told me that an ex-colleague managed to do this, but I don't think it is possible.
Does anyone have experience with this?
Yes, easy. One of the very common issues, in fact. Consider this:
void foo()
{
    int i;
    int *p = &i;
    p -= 5; // now points who knows where; the arithmetic itself is already undefined behavior
    *p = 0; // boom: different compilers will produce various bad results,
            // including potentially trashing the call stack
}
Many out-of-bounds accesses to a local array or buffer end up trashing the stack.
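For contrast, a bounds-checked access fails loudly instead of silently overwriting whatever sits next to the buffer. This is only a minimal sketch: checked_write is a hypothetical helper name, and it relies on std::array::at, which throws std::out_of_range for a bad index.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: writes value at index, or reports failure instead of
// scribbling over neighbouring stack memory.
bool checked_write(std::array<int, 4>& buffer, std::size_t index, int value) {
    try {
        buffer.at(index) = value;  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;              // the buffer and the stack are left untouched
    }
}
```

With a local std::array<int, 4>, checked_write(a, 2, 7) succeeds, while checked_write(a, 1000000, 0) returns false rather than corrupting the frame.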
Yes. On many platforms, local variables are stored along with the call stack; in that case, writing outside a local array is a very easy way to corrupt it:
#include <algorithm> // std::fill
void evil() {
    int array[1];
    std::fill(array, array+1000000, 0); // writes far past the end of array
    return; // BOOM!
}
More subtly, returning a reference to a local variable could corrupt the stack of a function that's called later on:
int & evil() {
int x;
return x;
}
void good(int & x) {
x = 0;
return; // BOOM!
}
void innocent() {
good(evil());
}
Note that neither of these (nor anything else that could corrupt the stack) is legal: both have undefined behavior, and the compiler doesn't have to diagnose them. Luckily, many compilers will spot the simpler cases, as long as you enable the appropriate warnings.
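A well-defined version of the second example might look like the sketch below. The names make_value and reset are hypothetical, chosen for illustration; the point is only that returning by value gives the caller its own object, so writing through a reference to it is fine.

```cpp
// Return by value: the caller receives its own copy, so nothing dangles.
int make_value() {
    int x = 42;
    return x;
}

// Writing through a reference is fine when the referenced object is still alive.
void reset(int& x) {
    x = 0;
}

int innocent() {
    int owned = make_value();  // lives in innocent()'s own frame
    reset(owned);              // modifies a live object
    return owned;
}
```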
It's common knowledge that returning a pointer to a stack variable is generally a bad idea:
int* foo() {
int i = 0;
return &i;
}
int main() {
int* p = foo();
}
In the example above, my understanding is that the int is destroyed and so p is a dangling pointer.
I am wondering about the extent to which this applies to the newly introduced coroutines of C++20:
generator<span<byte>> read(stream& s) {
array<byte, 4096> b;
while (s.is_open()) {
const size_t n = s.read_some(b);
co_yield span(b, n);
}
}
int main() {
stream s;
for (span<byte> v : read(s)) {
/* ... */
}
}
In this example, the coroutine read yields a span view into the local buffer b. Internally, that view stores a pointer to the buffer. Will that pointer ever be dangling when v is used within the body of the range-for loop?
For context, the coroutine code in the second example is modeled after the code in my own project. There, AddressSanitizer ends the program with a "use-after-free" error. Ordinarily I'd consider that enough to answer my question, but since coroutine development is still coming along at this point in time (my project is using boost::asio::experimental::coro, emphasis on "experimental"), I was wondering if the error was caused by a bug with generator's implementation or if returning pointers in this way is fundamentally incorrect (similar to the first example).
With language coroutines, this has to be safe: b lives in the coroutine frame, which persists across suspension points, so its lifetime continues until the generator finishes or is destroyed, and pointers to it remain usable that long.
I am quite new to C++ programming, but this question keeps spinning in my head. I understand that returning a reference to a local variable from a function is illegal, i.e. compiling this code snippet:
inline int& funref() {
int a = 8;
return a; // not OK!
}
results in a warning from the compiler and then a runtime error. But then, why does this piece of code get compiled without any warnings and run without error?
inline int& funref() {
int a = 8;
int& refa = a;
return refa; // OK!
}
int main() {
int& refa = funref();
cout << refa;
}
My compiler is g++ on Linux Fedora platform.
It's still wrong, it just happens to be working by (un)happy coincidence.
This code has undefined behaviour with all the usual caveats (it might always work, it might always work until it's too late to fix, it might set fire to your house and run away with your betrothed).
The compiler isn't required to issue a diagnostic (warning or error message) for every possible mistake, just because it isn't always possible to do so. Here, at least your current version of g++ hasn't warned. A different compiler, or a different version of g++, or even the same version with different flags, might warn you.
You can't safely return a reference to a local variable because the variable's lifetime ends when your function returns, leaving the reference dangling. Simply put, the compiler's warning is there to stop you from referencing garbage data.
However, the compiler isn't bulletproof (as shown in your example #2).
It does work for retrieving a singleton instance, though.
inline int& funref()
{
static int* p_a = nullptr;
if (nullptr == p_a)
p_a = new int(8);
return *p_a;
}
This case is valid because the memory pointed to by p_a remains valid after the function returns.
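A variant of the same idea that avoids the heap allocation is a function-local static. This is a sketch of that alternative, not the code from the answer above: the variable has static storage duration, so the returned reference stays valid for the rest of the program.

```cpp
// Function-local static: initialised once, on the first call,
// and kept alive until the program exits.
inline int& funref() {
    static int a = 8;
    return a;  // OK: 'a' has static storage duration, not automatic
}
```

Every call returns a reference to the same int, so an assignment such as funref() = 9; is visible to later callers.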
It's my first year of using C++ and learning on the way. I'm currently reading up on Return Value Optimizations (I use C++11 btw). E.g. here https://en.wikipedia.org/wiki/Return_value_optimization, and immediately these beginner examples with primitive types spring to mind:
int& func1()
{
int i = 1;
return i;
}
//error, 'i' was declared with automatic storage (in practice on the stack(?))
//and is undefined by the time function returns
...and this one:
int func1()
{
int i = 1;
return i;
}
//perfectly fine, 'i' is copied... (to previous stack frame... right?)
Now, I get to this and try to understand it in the light of the other two:
Simpleclass func1()
{
return Simpleclass();
}
What actually happens here? I know most compilers will optimise this; what I am asking is not 'if' but:
1. how the optimisation works (the accepted response)
2. does it interfere with storage duration: stack/heap (Old: Is it basically random whether I've copied from stack or created on heap and moved (passed the reference)? Does it depend on created object size?)
3. is it not better to use, say, explicit std::move?
You won't see any effect of RVO when returning ints.
However, when returning large objects like this:
struct Huge { ... };
Huge makeHuge() {
    Huge h { x, y, z };
    h.doSomething();
    return h;
}
The following code...
auto h = makeHuge();
... after RVO would be implemented something like this (pseudo code) ...
h_storage = allocate_from_stack(sizeof(Huge));
makeHuge(addressof(h_storage));
auto& h = *properly_aligned(h_storage);
... and makeHuge would compile to something like this...
void makeHuge(Huge* h_storage) // in fact this address can be
                               // inferred from the stack pointer
                               // (or just 'known' when inlining).
{
    Huge* phuge = new (h_storage) Huge(x, y, z); // placement new: construct in the caller's storage
    phuge->doSomething();
}
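One way to observe this, and to check whether an explicit std::move helps, is to count constructor calls. This is a minimal sketch; Tracked, g_counts, make_prvalue and make_moved are hypothetical names, and before C++17 the exact counts depend on whether the compiler elides (it is allowed to, not required to).

```cpp
#include <utility>

struct Counts { int copies = 0; int moves = 0; };
Counts g_counts;  // hypothetical global tally of constructor calls

struct Tracked {
    Tracked() = default;
    Tracked(const Tracked&) { ++g_counts.copies; }
    Tracked(Tracked&&) noexcept { ++g_counts.moves; }
};

// Returning a temporary: the object is built directly in the caller's storage.
Tracked make_prvalue() {
    return Tracked();
}

// Explicit std::move on a local: this forces a move construction
// and prevents the compiler from eliding it.
Tracked make_moved() {
    Tracked t;
    return std::move(t);
}
```

Since C++17, make_prvalue() must construct the result in place, so nothing is counted. make_moved() always counts at least one move, which is why a plain return t; is usually the better choice; some compilers warn about the std::move form as a pessimization.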
When it comes to massively-recursive method calls, the call-stack size has to be extended by modifying the appropriate compiler parameters in order to avoid stack-overflow.
Let's consider writing a portable application whose layout is simple enough so that its users need only possess minimal technical knowledge, so manual virtual memory configuration is out of question.
Running massively-recursive methods (behind the scenes obviously) may result in the call-stack limit being surpassed, especially if the machine the application is running on is limited memory-wise.
Enough chit-chat: In C++ is it possible to manually extend the call-stack to disk in case memory is (almost) full?
It may just barely be possible.
Use a coroutine library. With that, you allocate your own stack out of the heap. Restructure your code to track how deep it is in its callstack, and when it gets dangerously deep, create a new cothread and switch into that instead. When you run out of heap memory, freeze old cothreads and free their memory. Of course, you better be sure to unfreeze them to the same address--so I suggest you allocate their stacks yourselves out of your own arena that you can control. In fact it may be easier to just reuse the same piece of memory for the cothread stack and swap them in and out one at a time.
It's certainly easier to rewrite your algorithm to be non-recursive.
This may be an example of the cothread approach working, or it may just print the right answer by accident:
#include <stdio.h>
#include "libco.h"
//byuu's libco has been modified to use a provided stack; it's a simple mod, but needs to be done per platform
//x86.c:
////if(handle = (cothread_t)malloc(size)) {
//handle = (cothread_t)stack;
//here we're going to have a stack on disk and have one recursion's stack in RAM at a time
//I think it may be impossible to do this without a main thread controlling the coroutines, but I'm not sure.
#define STACKSIZE (32*1024)
char stack[STACKSIZE];
FILE* fpInfiniteStack;
cothread_t co_mothership;
#define RECURSING 0
#define EXITING 1
int disposition;
volatile int recurse_level;
int call_in_cothread( int (*entrypoint)(int), int arg);
int fibo_b(int n);
int fibo(int n)
{
if(n==0)
return 0;
else if(n==1)
return 1;
else {
int a = call_in_cothread(fibo,n-1);
int b = call_in_cothread(fibo_b,n-2);
return a+b;
}
}
int fibo_b(int n) { printf("fibo_b\n"); return fibo(n); } //just to make sure we can call more than one function
long filesize;
void freeze()
{
fwrite(stack,1,STACKSIZE,fpInfiniteStack);
fflush(fpInfiniteStack);
filesize += STACKSIZE;
}
void unfreeze()
{
fseek(fpInfiniteStack,filesize-STACKSIZE,SEEK_SET);
int read = fread(stack,1,STACKSIZE,fpInfiniteStack);
filesize -= STACKSIZE;
fseek(fpInfiniteStack,filesize,SEEK_SET);
}
struct
{
int (*proc)(int);
int arg;
} thunk, todo;
void cothunk()
{
thunk.arg = thunk.proc(thunk.arg);
disposition = EXITING;
co_switch(co_mothership);
}
int call_in_cothread(int (*proc)(int), int arg)
{
if(co_active() != co_mothership)
{
todo.proc = proc;
todo.arg = arg;
disposition = RECURSING;
co_switch(co_mothership);
//we land here after unfreezing. the work we wanted to do has already been done.
return thunk.arg;
}
NEXT_RECURSE:
thunk.proc = proc;
thunk.arg = arg;
cothread_t co = co_create(stack,STACKSIZE,cothunk);
recurse_level++;
NEXT_EXIT:
co_switch(co);
if(disposition == RECURSING)
{
freeze();
proc = todo.proc;
arg = todo.arg;
goto NEXT_RECURSE;
}
else
{
recurse_level--;
unfreeze();
if(recurse_level==0)
return thunk.arg; //return from initial level of recursion
goto NEXT_EXIT;
}
return -666; //this should not be possible
}
int main(int argc, char**argv)
{
fpInfiniteStack = fopen("infinite.stack","w+b");
co_mothership = co_active();
printf("result: %d\n",call_in_cothread(fibo,10));
}
Now you just need to detect how much memory's in the system, how much of it is available, how big the callstack is, and when the callstack's exhausted, so you know when to deploy the infinite stack. That's not easy stuff for one system, let alone doing it portably. It might be better to learn how the stack is actually meant to be used instead of fighting it.
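As one small piece of that puzzle, on POSIX systems the stack size limit can be queried with getrlimit. This is a sketch assuming a POSIX platform (it will not compile on Windows); soft_stack_limit is a hypothetical helper name, and the value describes the limit for the process's initial stack, so other threads may be sized differently.

```cpp
#include <sys/resource.h>  // POSIX only: getrlimit, RLIMIT_STACK

// Stores the soft stack limit in bytes and returns true,
// or returns false if the query fails.
// An unlimited stack is reported as RLIM_INFINITY.
bool soft_stack_limit(rlim_t& bytes) {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return false;
    }
    bytes = lim.rlim_cur;
    return true;
}
```

This only says how large the stack may grow, not how much of it is already in use, so it covers just one of the things listed above.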
It's feasible.
You need a bit of assembly to manipulate the stack pointer as there's no standardized way of accessing it from C++ directly (as far as I know). Once you are there you can point to your memory page and take care of swapping memory in and out. There are already libraries out there doing it for you.
On the other hand, if the system provider decided that paging or other virtual memory techniques would not work or would not be worthwhile on the platform, they probably had a very good reason (most likely it would be incredibly slow). Try to get your solution to work without recursion, or change it so the recursion fits into what's available. Even a less efficient implementation would end up faster than disk-paged recursion.
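As a sketch of what "without the recursion" can look like, the naive Fibonacci recursion from the question's domain can be driven by an explicit stack held in a std::vector, so the pending work lives on the heap instead of the call stack. The function name fibo_explicit_stack is hypothetical, and the algorithm is deliberately the same exponential one, just without recursive calls.

```cpp
#include <cstdint>
#include <vector>

// Same result as the recursive fibo(n), assuming n >= 0,
// but pending subproblems are kept in a heap-allocated vector.
std::uint64_t fibo_explicit_stack(int n) {
    std::uint64_t total = 0;
    std::vector<int> pending;  // replaces the call stack
    pending.push_back(n);
    while (!pending.empty()) {
        int k = pending.back();
        pending.pop_back();
        if (k < 2) {
            total += static_cast<std::uint64_t>(k);  // base cases: fibo(0) = 0, fibo(1) = 1
        } else {
            pending.push_back(k - 1);  // fibo(k) = fibo(k-1) + fibo(k-2)
            pending.push_back(k - 2);
        }
    }
    return total;
}
```

The vector never holds more than about n entries here, and if it did grow large, the operating system's ordinary virtual memory would page it like any other heap data.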
I have the following structure:
struct CountCarrier
{
int *CurrCount;
};
And this is what I want to do:
int main()
{
CountCarrier carrier = CountCarrier();
*(carrier.CurrCount) = 2; // initialize the *(carrier.CurrCount) to 2
IncreaseCount(&carrier); // should increase the *(carrier.CurrCount) to 3
}
void IncreaseCount(CountCarrier *countCarrier)
{
int *currCounts = countCarrier->CurrCount;
(*currCounts)++;
}
So, my intention is specified in the comments.
However, I couldn't get this to work. For starters, the program throws an exception at this line:
*(carrier.CurrCount) = 2;
And I suspect the following line won't work as well. Anything I did wrong?
struct CountCarrier
{
int *CurrCount; //No memory assigned
};
You need to allocate some valid memory to the pointer inside the structure to be able to put data in this.
Unless you do so, you are attempting to write to an invalid address, which is undefined behavior; luckily, in this case it shows up as an exception.
Resolution:
struct CountCarrier
{
    int *CurrCount; // now points at an int allocated in the constructor
    CountCarrier():CurrCount(new(int))
    {
    }
};
Suggestion:
Stay away from dynamic allocations as long as you can.
Whenever you think of using a pointer, ask whether you really need one. In this case it doesn't seem that you do; a simple int member would be just fine.
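If a pointer member really is required, a sketch using std::unique_ptr (assuming C++14 for std::make_unique) keeps the layout from the question while releasing the memory automatically; the resolution above never frees the int it allocates. The initial value 0 is an assumption made for this example.

```cpp
#include <memory>

struct CountCarrier {
    std::unique_ptr<int> CurrCount;  // owns the int and deletes it automatically
    CountCarrier() : CurrCount(std::make_unique<int>(0)) {}
};

void IncreaseCount(CountCarrier* countCarrier) {
    ++(*countCarrier->CurrCount);  // increment the pointed-to int
}
```

With this in place, *(carrier.CurrCount) = 2; followed by IncreaseCount(&carrier); leaves the count at 3, as the question intended.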
You need to make the pointer point at something, i.e. carrier.CurrCount = new int;
*(carrier.CurrCount)
This is dereferencing the pointer carrier.CurrCount, but you never initialized it. I suspect this is what you want:
carrier.CurrCount = new int(2);
I seriously doubt that your program throws an exception at the line:
*(carrier.CurrCount) = 2;
While throwing an exception is certainly allowed behaviour, it seems much more likely that you encountered an access violation that caused the process to be killed by the operating system.
The problem is that you are using a pointer, but your pointer is not initialised to point at anything. This means that the result of the pointer dereference is undefined.
In this situation there does not seem to be any advantage to using a pointer at all. Your CurrCount member would work just as well if it was just a plain int.
If you are using C++, then you should take advantage of its facilities. Instead of correcting your code, I am showing here how the code should look:
struct CountCarrier
{
int CurrCount; // simple data member
CountCarrier(int count) : CurrCount(count) {} // constructor
CountCarrier& operator ++ () // overloaded operator
{
++ CurrCount;
return *this;
}
};
We are overloading operator ++ because you have only one data member. You can also replace it with a named method, such as void IncrementCount().
CountCarrier carrier(2);
++ carrier;
As Als said, you need to provide some memory for the code to work.
But why make it so complicated? You don't need any pointers for the code you have to work. The "modern C++" way looks more like this:
struct CountCarrier
{
public:
CountCarrier(int currCount) : currCount(currCount) {}
void IncreaseCount() { ++currCount; }
int GetCount() const { return currCount; }
private:
int currCount;
};
int main()
{
CountCarrier carrier(2); // Initialize carrier.currCount to 2
carrier.IncreaseCount(); // Increment carrier.currCount to 3
}
Note how much cleaner and less error prone that is. Like I said, pick up a good introductory C++ book and read through it.