Modifying the call stack - c++

Is it possible to modify the call stack in C++? (I realize this is a horrible idea and am really just wondering; I don't plan on actually doing this.)
For example:
void foo(){
other();
cout << "You never see this" << endl; //The other() function modifies the stack to
//point to what ever called this function...so this is not displayed
}
void other(){
//modify the stack pointer here somehow to go down 2 levels
}
//Elsewhere
foo();

When a function calls another one in typical C implementations, the processor stack is used together with the call opcode. Its effect is to push the address of the next instruction to execute (the return address) onto the processor stack. Usually, besides the return address, the value of the stack frame pointer is saved there as well.
So the stack contains:
...free_space... [local_variables] [framePtr] [returnAddr] PREVIOUS_STACK.
So, to change the return address (whose size you need to know; if you compile with -m64, for example, it is 64 bits), you can take the address of a local variable and adjust it by an offset to reach the slot that holds the return address, then change it.
The code below was compiled with g++ in -m64 mode.
If it happens to work for you too, you will see the effect.
#include <stdio.h>
void changeRetAddr(long* p){
p-=2;
*p+=0x11;
}
void myTest(){
long a=0x1122334455667788;
changeRetAddr(&a);
printf("hi my friend\n");
printf("I didn't showed the salutation\n");
}
int main(int argc, char **argv)
{
myTest();
return 0;
}
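For comparison, here is a minimal sketch of a well-defined way to get the effect the question asks about (returning past foo() to whoever called it). It swaps in a different technique, setjmp/longjmp from the standard library, rather than patching the return address by hand. The names caller_context, reached_after_other, and foo_was_cut_short are made up for this illustration, and the sketch assumes no objects with non-trivial destructors live in the frames that are skipped.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf caller_context;        // saved state of foo()'s caller (hypothetical name)
static bool reached_after_other = false;   // set only if foo() continues past other()

void other() {
    // Unwind straight back to the point where setjmp was called,
    // skipping the rest of other() and the rest of foo().
    std::longjmp(caller_context, 1);
}

void foo() {
    other();
    reached_after_other = true;            // "You never see this"
}

// Returns true when foo() was abandoned before it could finish.
bool foo_was_cut_short() {
    reached_after_other = false;
    if (setjmp(caller_context) == 0) {
        foo();                             // first pass: call foo() normally
    }
    // second pass: longjmp landed here with a non-zero value
    return !reached_after_other;
}
```

Calling foo_was_cut_short() from main() should report that the statement after other() was skipped. In ordinary C++ code, throwing an exception from other() and catching it in the caller is usually the more idiomatic way to get the same control flow.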

Related

A variable changes his address with no reason

Maybe this question is a bit complicated, and maybe I'm missing something stupid.
I'll try to explain without any source code, because my project is big and I don't know how or where to start.
I have:
bool result = false;
bool* pointer = &result;
these variables are stored in some classes.. (not as in the above code).
When result is created, its address is something like 0x28fddc,
and the pointer variable takes this address.
Suddenly, without any (apparent) reason, its address is no longer 0x28fddc but something like 0x3a8c6e4.
With the pointer variable, I am trying to change the result variable by doing:
*pointer = true;
But obviously this doesn't work (and it doesn't give me any error). It does not change result, because result is now at another address.
I don't know why this happens.
Can you only tell me how could this happen? And i'll try to fix.
(These classes are updated repeatedly by functions that take their parameters by reference.)
For example:
void update_class(obj_class &obj);
(These names are only an example).
I hope i've been clear and if not, i'll just delete this topic.
Sorry for bad english but i'm italian :)
EDIT:
Now i'll try to provide some code..
button.h
class button
{
public:
void check_tap(SDL_Event* e);
bool* done;
};
messagebox.h:
class messagebox
{
public:
messagebox();
bool result_ok;
button btn_ok;
};
void check_tap(std::vector<messagebox> &msgbox, SDL_Event* e) {
for(unsigned int k=0; k<msgbox.size(); k++) {
msgbox[k].btn_ok.check_tap(e);
// check_tap is a function that I created to check whether the user is tapping the button with a finger. When the user presses the button and releases it, the done variable should become true, but result_ok does not seem to be affected, because its address here is different. This problem occurs only in this case, with messagebox. I created several other buttons outside it and they all work perfectly.
}
}
messagebox.cpp:
messagebox::messagebox() {
// Initializing things
btn_ok.done = &result_ok;
// Here, btn_ok.done gets the address of result_ok..
}
main.cpp:
std::vector<messagebox> msgbox;
msgbox.push_back(messagebox());
No, the addresses of variables do not change during their lifetime.
However, storing the address of a variable is problematic if the variable ceases to exist. A simple example is
#include <iostream>
int *p;
void f()
{
int i;
p = &i;
}
int main()
{
f();
std::cout << (void *)p << '\n';
// other code here
f();
std::cout << (void *)p << '\n';
}
In the above case, the two values of p may be the same, or they may differ. This is not because the addresses of variables change. It is because a variable, i, is created each time f() is called, and ceases to exist when f() returns. The variable i in the first call of f() is, as far as your program is concerned, a distinct variable from i during the second call of f().
Depending on what happens with "other code here" in the above, the memory occupied by i in the first call of f() may be unavailable (e.g. used for another variable) during the second call of f(), so during the second call of f(), i will have a different address than during the first. There are no guarantees: you may also get lucky (or unlucky, depending on how you look at it) and the addresses printed will be the same.
If you are getting behaviour that suggests, to you, that the address of a variable is changing, then somewhere in your code there is a bug of some form. Typically, this will involve storing the address of a variable in a pointer, and using (or accessing the value of) the pointer after the variable ceases to exist. Any dereferencing of that pointer (e.g. to access the variable pointed at) then has undefined behaviour.
For example, any usage of *p like
*p = 42;
or
std::cout << *p << '\n';
in the main() I have given above will give undefined behaviour.
The act of assigning a pointer to contain the address of that variable does not change the lifetime of that variable.
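One way a bug of this kind can arise in code shaped like the question's is copying an object that stores a pointer to one of its own members: the copy keeps pointing into the original. The sketch below is a simplified stand-in, not the asker's real classes; MsgBox, FixedMsgBox, and the two helper functions are hypothetical names, and the "fix" shown (re-pointing in the copy constructor) is only one possible approach.

```cpp
#include <cassert>
#include <vector>

struct Button {
    bool* done = nullptr;                  // points at some flag owned elsewhere
};

struct MsgBox {
    bool result_ok = false;
    Button btn_ok;
    MsgBox() { btn_ok.done = &result_ok; } // points at our own member
    // The implicit copy constructor copies btn_ok.done verbatim,
    // so a copy still points at the ORIGINAL object's result_ok.
};

// True when the copy's pointer targets the original, not the copy itself.
bool copy_points_at_original() {
    MsgBox original;
    MsgBox copy = original;
    return copy.btn_ok.done == &original.result_ok &&
           copy.btn_ok.done != &copy.result_ok;
}

struct FixedMsgBox {
    bool result_ok = false;
    Button btn_ok;
    FixedMsgBox() { btn_ok.done = &result_ok; }
    // Re-point at our own member instead of copying the pointer.
    FixedMsgBox(const FixedMsgBox& other) : result_ok(other.result_ok) {
        btn_ok.done = &result_ok;
    }
    FixedMsgBox& operator=(const FixedMsgBox& other) {
        result_ok = other.result_ok;       // btn_ok.done keeps pointing at *this
        return *this;
    }
};

// True when an element stored in a vector points at its own member.
bool fixed_copy_points_at_itself() {
    std::vector<FixedMsgBox> boxes;
    boxes.push_back(FixedMsgBox());        // the temporary is copied into the vector
    return boxes[0].btn_ok.done == &boxes[0].result_ok;
}
```

With the unfixed type, pushing a temporary into a std::vector leaves the stored element's pointer aimed at the destroyed temporary, which matches the "different address" symptom; writing through that pointer would be undefined behaviour, so the sketch only compares addresses.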

Is it necessary to clean up stack contents?

We are undergoing PCI PA-DSS certification, and one of its requirements is to avoid writing the clear PAN (card number) to disk. The application does not write such information to disk, but if the operating system (Windows, in this case) needs to swap, the memory contents are written to the page file. Therefore the application must clean up the memory to prevent RAM-capture services from reading sensitive data.
There are three situations to handle:
heap allocation (malloc): before freeing the memory, the area can be cleaned up with memset
static or global data: after being used, the area can be cleaned up using memset
local data (function member): the data is put on stack and is not accessible after the function is finished
For example:
void test()
{
char card_number[17];
strcpy(card_number, "4000000000000000");
}
After test executes, the memory still contains the card_number information.
A single statement could zero the variable card_number at the end of test, but the same would have to be done in every function in the program:
memset(card_number, 0, sizeof(card_number));
Is there a way to clean up the stack at some point, like right before the program finishes?
Cleaning the stack right when the program finishes might be too late; it could already have been swapped out at any point during its runtime. You should keep your sensitive data only in memory locked with VirtualLock so it does not get swapped out. The locking has to happen before said sensitive data is read.
There is a small limit on how much memory you can lock like this, so you probably cannot lock the whole stack and should avoid storing sensitive data on the stack at all.
I assume you want to get rid of this situation below:
#include <cstring> // strcpy
#include <iostream>
using namespace std;
void test()
{
char card_number[17];
strcpy(card_number, "1234567890123456");
cout << "test() -> " << card_number << endl;
}
void test_trash()
{
// don't initialize, so get the trash from previous call to test()
char card_number[17];
cout << "trash from previous function -> " << card_number << endl;
}
int main(int argc, const char * argv[])
{
test();
test_trash();
return 0;
}
Output:
test() -> 1234567890123456
trash from previous function -> 1234567890123456
You CAN do something like this:
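#define __STDC_WANT_LIB_EXT1__ 1 // request memset_s (optional C11 Annex K; not available on every platform)
#include <string.h> // strncpy, memset_s
#include <iostream>
using namespace std;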
class CardNumber
{
char card_number[17];
public:
CardNumber(const char * value)
{
strncpy(card_number, value, sizeof(card_number));
}
virtual ~CardNumber()
{
// as suggested by @piedar, memset_s(), so the compiler
// doesn't optimize it away.
memset_s(card_number, sizeof(card_number), 0, sizeof(card_number));
}
const char * operator()()
{
return card_number;
}
};
void test()
{
CardNumber cardNumber("1234567890123456");
cout << "test() -> " << cardNumber() << endl;
}
void test_trash()
{
// don't initialize, so get the trash from previous call to test()
char card_number[17];
cout << "trash from previous function -> " << card_number << endl;
}
int main(int argc, const char * argv[])
{
test();
test_trash();
return 0;
}
Output:
test() -> 1234567890123456
trash from previous function ->
You can do something similar to clean up memory on the heap or static variables.
Obviously, we assume the card number will come from a dynamic source instead of the hard-coded thing...
AND YES: to explicitly answer the title of your question: the stack will not be cleaned automatically... you have to clean it yourself.
I believe it is necessary, but this is only half of the problem.
There are two issues here:
In principle, nothing prevents the OS from swapping your data while you are still using it. As pointed out in the other answer, you want VirtualLock on windows and mlock on linux.
You need to prevent the optimizer from optimizing out the memset. This also applies to global and dynamically allocated memory. I strongly suggest to take a look at cryptopp SecureWipeBuffer.
In general, you should avoid doing it manually, as it is an error-prone procedure. Instead, consider using a custom allocator or a custom class template for secure data that is wiped in the destructor.
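As a rough illustration of the "class template that wipes itself in the destructor" idea, here is a minimal sketch. secure_zero and SecretBuffer are hypothetical names invented for this example, not part of cryptopp or any other library. Writing through a volatile pointer is a commonly used idiom to discourage the optimizer from dropping the wipe, but the standard gives no hard guarantee, so prefer a vetted primitive (memset_s, SecureZeroMemory, or a library routine) where one is available. The sketch also does nothing about swapping; that still needs VirtualLock or mlock.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Overwrite n bytes at p with zeros through a volatile pointer,
// so the stores are less likely to be optimized away.
inline void secure_zero(void* p, std::size_t n) {
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    while (n--) {
        *bytes++ = 0;
    }
}

// Fixed-size buffer for sensitive text that wipes itself on destruction.
template <std::size_t N>
class SecretBuffer {
    char data_[N];
public:
    SecretBuffer() { secure_zero(data_, N); }
    ~SecretBuffer() { secure_zero(data_, N); }

    // Copying would leave extra copies of the secret around, so forbid it.
    SecretBuffer(const SecretBuffer&) = delete;
    SecretBuffer& operator=(const SecretBuffer&) = delete;

    // Store at most N - 1 characters and always null-terminate.
    void assign(const char* value) {
        std::strncpy(data_, value, N - 1);
        data_[N - 1] = '\0';
    }

    const char* c_str() const { return data_; }
    void wipe() { secure_zero(data_, N); }
};
```

A function such as test() could then declare SecretBuffer<17> card; call card.assign(...) with the number, and rely on the destructor to wipe the stack slot when the function returns.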
The stack is "cleaned up" merely by moving the stack pointer, not by erasing the values on it; the only thing actually popped is the return address. So you must do the wiping yourself. Also, volatile can help you avoid optimizations on a per-variable basis. You could wipe the stack manually, but you would need assembler for that, and manipulating the stack is not simple: it is not really your resource, since as far as you are concerned the compiler owns it.

Polymorphism in mixin classes - virtual functions

I'm currently reading about mixin classes and I think I understand everything more or less. The only thing I don't understand is why I don't need virtual functions anymore. (See here and here)
E.g. greatwolf writes in his answer here that virtual functions are not needed. Here is the example: (I just copied the essential parts)
struct Number
{
typedef int value_type;
int n;
void set(int v) { n = v; }
int get() const { return n; }
};
template <typename BASE, typename T = typename BASE::value_type>
struct Undoable : public BASE
{
typedef T value_type;
T before;
void set(T v) { before = BASE::get(); BASE::set(v); }
void undo() { BASE::set(before); }
};
typedef Undoable<Number> UndoableNumber;
int main()
{
UndoableNumber mynum;
mynum.set(42); mynum.set(84);
cout << mynum.get() << '\n'; // 84
mynum.undo();
cout << mynum.get() << '\n'; // 42
}
But what happens now if I do something like this:
void foo(Number *n)
{
n->set(84); //Which function is called here?
}
int main()
{
UndoableNumber mynum;
mynum.set(42);
foo(&mynum);
mynum.undo();
cout << mynum.get() << '\n'; // 42 ???
}
What value does mynum have and why? Does the polymorphism work in foo()?!?
n->set(84); //Which function is called here?
Number::set will be called here.
Does the polymorphism work in foo()?!?
No, not without virtual. If you try the code, you'll get an unspecified value, because before never receives a meaningful value: it is assigned from the uninitialized n.
I compiled your code in VS 2013, and it gives an unspecified number.
You have no constructor in your struct, which means that the variable before is not initialized.
Your code example invokes undefined behaviour, because you try to read from the int variable n while it is not in a valid state. The question is not what value will be printed. Your program is not required to print anything, or do anything that makes sense, although you are likely using a machine on which the undefined behaviour will only present itself as a seemingly random value in n, or on which it will mostly appear as 0.
Your compiler likely gives you an important hint if you allow it to detect such problems, for example:
34:21: warning: 'mynum.Number::n' is used uninitialized in this function [-Wuninitialized]
However, the undefined behaviour starts even before that. Here's how it happens, step by step:
UndoableNumber mynum;
This also creates the Number sub-object with an uninitialised n. That n is of type int and can thus have its individual bits set to a so-called trap representation.
mynum.set(42);
This calls the derived-class set function. Inside of set, an attempt is made to set the before member variable to the uninitialised n value with the possible trap representation:
void set(T v) { before = BASE::get(); BASE::set(v); }
But you cannot safely do that. The before = BASE::get() part is already wrong, because BASE::get() copies the int with the possible trap representation. This is already undefined behaviour.
Which means that from this point on, C++ as a programming language no longer defines what will happen. Reasoning about the rest of your program is moot.
Still, let's assume for a moment that the copy would be fine. What else would happen afterwards?
BASE::set is called, setting n to a valid value. before remains in its previous invalid state.
Now foo is called:
void foo(Number *n)
{
n->set(84); //Which function is called here?
}
The base-class set is called because n is of type Number* and set is non-virtual.
set happily sets the n member variable to 84. The derived-class before remains invalid.
Now the undo function is called and does the following:
BASE::set(before);
After this assignment, n is no longer 84 but is set to the invalid before value.
And finally...
cout << mynum.get() << '\n';
get returns the invalid value. You try to print it. This will yield unspecified results even on a machine which does not have trap representation for ints (you are very likely using such a machine).
Conclusion:
C++ as a language does not define what your program does. It may print something, print nothing, crash or do whatever it feels like, all because you copy an uninitialised int.
In practice, crashing or doing whatever it feels like is unlikely on a typical end-user machine, but it's still undefined what will be printed.
If you want your derived-class set to be called when invoked on a Number*, then you must make set a virtual function in Number.
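A minimal sketch of that change, applied to the question's types: set is made virtual in Number, and the members get initial values so that nothing uninitialised is ever read. The member initialisers, the virtual destructor, and the helper run_demo are additions made for this illustration.

```cpp
#include <cassert>
#include <iostream>

struct Number {
    typedef int value_type;
    int n = 0;                               // initialised, so get() is always safe
    virtual ~Number() = default;
    virtual void set(int v) { n = v; }       // virtual: calls through Number* reach overrides
    int get() const { return n; }
};

template <typename BASE, typename T = typename BASE::value_type>
struct Undoable : public BASE {
    typedef T value_type;
    T before = T();                          // value-initialised
    void set(T v) override { before = BASE::get(); BASE::set(v); }
    void undo() { BASE::set(before); }       // qualified call: plain Number::set
};

typedef Undoable<Number> UndoableNumber;

void foo(Number* n) {
    n->set(84);                              // now dispatches to Undoable<Number>::set
}

// Runs the scenario from the question and returns the final value.
int run_demo() {
    UndoableNumber mynum;
    mynum.set(42);                           // before = 0,  n = 42
    foo(&mynum);                             // before = 42, n = 84
    mynum.undo();                            // n = 42
    return mynum.get();
}
```

With this version run_demo() returns 42, because the call inside foo() records the old value in before. The price is the usual one for virtual functions: a vtable pointer in every object and an indirect call, which is part of why mixin examples often do without them.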

In C++, how can I get the current thread's call stack?

I'm writing this error handler for some code I'm working in, in C++. I would like to be able to make some sort of reference to whatever I have on the stack, without it being explicitly passed to me. Specifically, let's say I want to print the names of the functions on the call stack, in order. This is trivial in managed runtime environments like the JVM, probably not so trivial with 'simple' compiled code. Can I do this?
Notes:
Assume for simplicity that I compile my code with debugging information and no optimization.
I want to write something that is either platform-independent or multi-platform. Much prefer the former.
If you think I'm trying to reinvent the wheel, just link to the source of the relevant wheel and I'll look there.
Update:
I can't believe how much you need to bend over backwards to do this... almost makes me pine for another language which shall not be mentioned.
There is a way to get a back-trace in C++, though it is not portable. I cannot speak for Windows, but on Unix-like systems there is a backtrace API that consists primarily of the following functions:
int backtrace(void** array, int size);
char** backtrace_symbols(void* const* array, int size);
void backtrace_symbols_fd(void* const* array, int size, int fd);
You can find up to date documentation and examples on GNU website here. There are other sources, like this manual page for OS X, etc.
Keep in mind that there are a few problems with getting a backtrace using this API. Firstly, there are no file names and no line numbers. Secondly, you cannot get a backtrace at all in certain situations, such as when the frame pointer is omitted entirely (the default behavior of recent GCC compilers on x86_64 platforms), or when the binary doesn't have any debug symbols whatsoever. On some systems, you also have to specify the -rdynamic flag when compiling your binary (which has other, possibly undesirable, effects).
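To make the API above concrete, here is a small sketch that uses the first two functions. It assumes a platform that ships <execinfo.h> (glibc on Linux, or macOS); print_backtrace and the 64-frame limit are choices made for this example. How readable the symbol strings are depends on the build, for instance on whether -rdynamic was used.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <execinfo.h>

// Print the current call stack to stderr and return the number of frames found.
int print_backtrace() {
    void* frames[64];                                  // room for up to 64 return addresses
    int count = backtrace(frames, 64);

    // backtrace_symbols allocates one block holding all the strings;
    // the caller frees that block with a single free().
    char** symbols = backtrace_symbols(frames, count);
    if (symbols != nullptr) {
        for (int i = 0; i < count; ++i) {
            std::fprintf(stderr, "#%d %s\n", i, symbols[i]);
        }
        std::free(symbols);
    }
    return count;
}
```

Calling print_backtrace() from an error handler prints one line per frame, innermost first. Inside a signal handler, backtrace_symbols_fd is the safer choice because it does not allocate memory.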
Unfortunately, there is no built-in way of doing this with the standard C++. You can construct a system of classes to help you build a stack tracer utility, but you would need to put a special macro in each of the methods that you would like to trace.
I've seen it done (and even implemented parts of it) using the strategy outlined below:
Define your own class that stores the information about a stack frame. At the minimum, each node should contain the name of the function being called, file name / line number info being close second.
Stack frame nodes are stored in a linked list, which is reused if it exists, or created if it does not exist
A stack frame is created and added to the list by instantiating a special object. Object's constructor adds the frame node to the list; object's destructor deletes the node from the list.
The same constructor/destructor pair are responsible for creating the list of frames in thread local storage, and deleting the list that it creates
The construction of the special object is handled by a macro. The macro uses special preprocessor tokens to pass function identification and location information to the frame creator object.
Here is a rather skeletal proof-of-concept implementation of this approach:
#include <iostream>
#include <list>
using namespace std;
struct stack_frame {
const char *funName;
const char *fileName;
int line;
stack_frame(const char* func, const char* file, int ln)
: funName(func), fileName(file), line(ln) {}
};
thread_local list<stack_frame> *frames = 0;
struct entry_exit {
bool delFrames;
entry_exit(const char* func, const char* file, int ln) {
if (!frames) {
frames = new list<stack_frame>();
delFrames = true;
} else {
delFrames = false;
}
frames->push_back(stack_frame(func, file, ln));
}
~entry_exit() {
frames ->pop_back();
if (delFrames) {
delete frames;
frames = 0;
}
}
};
void show_stack() {
for (list<stack_frame>::const_iterator i = frames->begin() ; i != frames->end() ; ++i) {
cerr << i->funName << " - " << i->fileName << " (" << i->line << ")" << endl;
}
}
#define FUNCTION_ENTRY entry_exit _entry_exit_(__func__, __FILE__, __LINE__);
void foo() {
FUNCTION_ENTRY;
show_stack();
}
void bar() {
FUNCTION_ENTRY;
foo();
}
void baz() {
FUNCTION_ENTRY;
bar();
}
int main() {
baz();
return 0;
}
The above code compiles with C++11 and prints this:
baz - prog.cpp (52)
bar - prog.cpp (48)
foo - prog.cpp (44)
Functions that do not have that macro would be invisible on the stack. Performance-critical functions should not have such macros.
Here is a demo on ideone.
It is not easy. The exact solution depends very much on the OS and Execution environment.
Printing the stack is usually not that difficult, but finding symbols can be quite tricky, since it usually means reading debug symbols.
An alternative is to use an intrusive approach and add some "where am I" type code to each function (presumably for "debug builds only"):
#include <iostream>
#include <vector>
#ifdef DEBUG
struct StackEntry
{
const char *file;
const char *func;
int line;
StackEntry(const char *f, const char *fn, int ln) : file(f), func(fn), line(ln) {}
};
// A vector rather than std::stack, so the entries can be iterated when dumping.
std::vector<StackEntry> call_stack;
class FuncEntry
{
public:
FuncEntry(const char *file, const char *func, int line)
{
call_stack.push_back(StackEntry(file, func, line));
}
~FuncEntry()
{
call_stack.pop_back();
}
// Static, so it can be called from anywhere as FuncEntry::DumpStack().
static void DumpStack()
{
for (const StackEntry &se : call_stack)
{
std::cout << se.file << ":" << se.line << ": " << se.func << "\n";
}
}
};
// A named object, so that it lives until the end of the enclosing function.
#define FUNC() FuncEntry func_entry_(__FILE__, __func__, __LINE__)
#else
#define FUNC()
#endif
void somefunction()
{
FUNC();
// ... more code here.
}
I have used this technique in the past, but I just typed this code in, so it may not compile; I think it's clear enough, though. One major benefit is that you don't HAVE to put it in every function, just the "important ones". [You could even have different types of FUNC macros that are enabled or disabled based on different levels of debugging.]

How does the Visual C++ compiler pass the this ptr to the called function?

I'm learning C++ using Eckel's "Thinking in C++". It states the following:
If a class contains virtual methods, a virtual function table is created for that class etc. The workings of the function table are explained roughly. (I know a vtable is not mandatory, but Visual C++ creates one.)
The calling object is passed to the called function as an argument. (This might not be true for Visual C++ (or any compiler).) I'm trying to find out how VC++ passes the calling object to the function.
To test both points in Visual C++, I've created the following class (using Visual Studio 2010, WinXP Home 32bit):
ByteExaminer.h:
#pragma once
class ByteExaminer
{
public:
short b[2];
ByteExaminer(void);
virtual void f() const;
virtual void g() const;
void bruteFG();
};
ByteExaminer.cpp:
#include "StdAfx.h"
#include "ByteExaminer.h"
using namespace std;
ByteExaminer::ByteExaminer(void)
{
b[0] = 25;
b[1] = 26;
}
void ByteExaminer::f(void) const
{
cout << "virtual f(); b[0]: " << hex << b[0] << endl;
}
void ByteExaminer::g(void) const
{
cout << "virtual g(); b[1]: " << hex << b[1] << endl;
}
void ByteExaminer::bruteFG(void)
{
int *mem = reinterpret_cast<int*>(this);
void (*fg[])(ByteExaminer*) = { (void (*)(ByteExaminer*))(*((int *)*mem)), (void (*)(ByteExaminer*))(*((int *)(*mem + 4))) };
fg[0](this);
fg[1](this);
}
The navigation through the vtable in bruteFG() works: when I call fg[0](this), f() is called. What does NOT work, however, is the passing of this to the function, meaning that this->b[0] is not printed correctly (garbage comes out instead; I'm actually lucky this doesn't produce a segfault).
So the actual output for
ByteExaminer be;
be.bruteFG();
is:
virtual f(); b[0]: 1307
virtual g(); b[1]: 0
So how should I proceed to get the correct result? How are the this pointers passed to functions in VC++?
(Nota bene: I'm NOT going to program this way seriously, ever. This is "for the lulz"; or for the learning experience. So don't try to convert me to proper C++ianity :))
Member functions in Visual Studio have a special calling convention, __thiscall, where this is passed in a special register. Which one, I don't recall, but MSDN will say. You will have to go down to assembler if you want to call a function pointer which is in a vtable.
Of course, your code exhibits massively undefined behaviour: it's only OK to alias an object using a char or unsigned char pointer, and definitely not an int pointer, even ignoring the whole vtable assumptions thing.
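For contrast, the well-defined way to call a virtual function indirectly is a pointer to member function, which lets the compiler take care of passing this under whatever calling convention applies. The sketch below is a cut-down variant of the question's class: f() and g() return the values instead of printing them, and call_through_pointer is a hypothetical helper added for this illustration.

```cpp
#include <cassert>

struct ByteExaminer {
    short b[2];
    ByteExaminer() { b[0] = 25; b[1] = 26; }
    virtual ~ByteExaminer() = default;
    virtual int f() const { return b[0]; }
    virtual int g() const { return b[1]; }
};

// Call a member function through a pointer to member.
// The compiler generates the vtable lookup and passes `this` correctly.
int call_through_pointer(const ByteExaminer& be, int (ByteExaminer::*fn)() const) {
    return (be.*fn)();
}
```

For example, call_through_pointer(be, &ByteExaminer::f) yields 25 and the same call with &ByteExaminer::g yields 26. This does not reveal how this is passed, but it shows the result the vtable experiment is trying to reproduce.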
OK using DeadMG's hint I've found a way without using assembler:
1) Remove the ByteExaminer* arg from the functions in the fg[] array
2) Add a function void callfunc(void (*)()); to ByteExaminer:
void ByteExaminer::callfunc(void (*func)())
{
func();
}
... this apparently works because func() is the first thing to be used in callfunc, so ecx is apparently not changed before. But this is a dirty trick (as you can see in the code above, I'm always on the hunt for clean code). I'm still looking for better ways.