Debug unintended use of memory "belonging" to parent function - c++

Suppose I have a function foo (in C/C++) that is called from a given software tool.
Function foo may write only to memory that was allocated by foo itself or by one of the functions it calls; it must not write to memory that was allocated by functions executed before foo was called.
I have the strong suspicion that at some place foo writes to memory it is not allowed to.
Is there a way to systematically debug this behavior? Maybe some fancy flag to valgrind?

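The Valgrind manual documents client request macros that your program can call while it runs under Valgrind.
It looks like VALGRIND_MAKE_MEM_NOACCESS (from memcheck.h) may be what you want: mark the caller's memory as inaccessible just before calling foo, and Memcheck will report any access to it.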
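A minimal sketch of how that client request might be used, assuming the Valgrind headers are installed and the program is run under Memcheck. The names foo_under_test and call_with_region_protected are hypothetical; when the header is not found, the macros fall back to no-ops so the code still builds.

```cpp
#include <cstddef>
#include <vector>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif
#ifndef HAVE_MEMCHECK_H
// Fallback when the Valgrind headers are not installed: do nothing.
#  define VALGRIND_MAKE_MEM_NOACCESS(addr, len) ((void)(addr), (void)(len))
#  define VALGRIND_MAKE_MEM_DEFINED(addr, len) ((void)(addr), (void)(len))
#endif

// Hypothetical stand-in for foo: it only touches memory it was handed.
void foo_under_test(std::vector<int>& scratch) {
    for (int& v : scratch) v = 42;
}

// Marks [region, region + len) as off-limits while fn runs. Under Memcheck,
// any read or write of that range inside fn is reported as an invalid access.
template <typename Fn>
void call_with_region_protected(void* region, std::size_t len, Fn&& fn) {
    (void)VALGRIND_MAKE_MEM_NOACCESS(region, len);
    fn();
    // Restore access afterwards. Note that this marks the whole range as
    // defined, even bytes that were uninitialised before the call.
    (void)VALGRIND_MAKE_MEM_DEFINED(region, len);
}
```

Outside Valgrind the macros do nothing, so the wrapper can stay in a debug build; the reports only appear when the program is started with valgrind.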

You could use a custom allocator (Boost Pool comes to mind) to make sure all the memory that you want to 'protect' is allocated contiguously.
Next, set a hardware watchpoint that triggers when any data in that memory region is changed.
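A variant of the same idea, sketched under the assumption of a POSIX system: instead of a hardware watchpoint, keep the protected data in one page-aligned arena and make the pages read-only with mprotect while foo runs. This swaps in page protection for the watchpoint; ProtectedArena and the arena_* functions are hypothetical names, not part of Boost Pool.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// A bump allocator over whole pages, so the region can be write-protected.
struct ProtectedArena {
    void* base = nullptr;
    std::size_t size = 0;   // total bytes mapped (multiple of the page size)
    std::size_t used = 0;   // bytes handed out so far
};

bool arena_create(ProtectedArena& a, std::size_t bytes) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t rounded = (bytes + page - 1) / page * page;
    void* p = mmap(nullptr, rounded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return false;
    a.base = p;
    a.size = rounded;
    a.used = 0;
    return true;
}

// Hands out 16-byte aligned chunks; returns nullptr when the arena is full.
void* arena_alloc(ProtectedArena& a, std::size_t bytes) {
    const std::size_t aligned = (bytes + 15) & ~static_cast<std::size_t>(15);
    if (aligned > a.size - a.used) return nullptr;
    void* p = static_cast<char*>(a.base) + a.used;
    a.used += aligned;
    return p;
}

// Call with writable=false before foo and writable=true after it returns.
bool arena_set_writable(ProtectedArena& a, bool writable) {
    const int prot = writable ? (PROT_READ | PROT_WRITE) : PROT_READ;
    return mprotect(a.base, a.size, prot) == 0;
}

void arena_destroy(ProtectedArena& a) {
    if (a.base != nullptr) munmap(a.base, a.size);
    a = ProtectedArena{};
}
```

While the arena is read-only, a stray write raises SIGSEGV, and a debugger stops on the offending instruction. Unlike a watchpoint, this is not limited to a few bytes.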

I'd write a GDB script that sets a breakpoint on your function, then sets a hardware watch on the memory you suspect is being altered, then continues.
If function foo is modifying that memory the hardware watch will trigger on that instruction doing it.
The GDB script might look like:
break foo
commands
up
watch array
down
continue
end
I didn't test that and it may need tweaking, especially the watch expression. You might be limited to watching only one array element. I believe a hardware watchpoint can watch only one integer-sized block: 4 bytes on 32-bit or 8 bytes on 64-bit.

The only way foo() can write to the memory outside its scope is if that memory is global, i.e. extern variable, or if foo() had one or more arguments which were meant to be read only but somehow they got modified.
To verify if the calling arguments are getting modified, you can create a structure to hold the arguments and just before returning compare original with saved arguments.
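#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct foo_args {
    int a;
    char *b;
};

void
foo(int a, char *b)
{
    /* Save copies of the arguments on entry. */
    struct foo_args args;
    args.a = a;
    args.b = strdup(b);   /* copies the string contents, not just the pointer */

    /* The rest of the foo() code. */

    /* Compare the current arguments against the saved copies. */
    if (args.a != a || strcmp(args.b, b) != 0) {
        printf("error - args got modified\n");
    }
    free(args.b);
}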
If the above doesn't catch it, then the likely scenario is that global, stack or heap memory is getting corrupted.
Running the whole tool under valgrind may not be practical. In that case you will need to create a 'wrapper' for foo() and use valgrind or something similar on it to make sure foo() is not doing what it is not supposed to do. The other option is to use a debugging library that tracks memory usage and flags memory errors as they occur.
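One simple form such a wrapper could take, as a sketch only: copy the caller's memory region before calling foo() and compare it afterwards. RegionSnapshot is a hypothetical helper; it tells you that a region changed and where, but not which instruction did it.

```cpp
#include <cstddef>
#include <vector>

// Keeps a byte-for-byte copy of a memory region taken at construction time.
class RegionSnapshot {
public:
    RegionSnapshot(const void* region, std::size_t len)
        : region_(static_cast<const unsigned char*>(region)),
          saved_(region_, region_ + len) {}

    // Offset of the first byte that differs from the saved copy,
    // or -1 if the region is unchanged.
    std::ptrdiff_t first_changed_offset() const {
        for (std::size_t i = 0; i < saved_.size(); ++i) {
            if (region_[i] != saved_[i]) return static_cast<std::ptrdiff_t>(i);
        }
        return -1;
    }

private:
    const unsigned char* region_;        // the live memory being watched
    std::vector<unsigned char> saved_;   // its contents when the snapshot was taken
};
```

Take one snapshot per suspect region right before the call, and check first_changed_offset() right after foo() returns. Once an offset is known, a watchpoint on that exact address can find the writer.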

Related

stack exception handling and destructors unwinding. How to use the information

In my program I have a garbage collector, and I need to track which objects are held in local variables so that they are not garbage collected while they are still alive.
I have been using a linked list built from a composite type, but I realized after years that the C++ implementation must keep a similar list for its own purposes, namely running destructors during exception handling.
So I am thinking of simplifying my code by using the information kept by the C++ exception handler. Is there a portable way to do it?
If not, is there information on that at least for g++ and clang?
By the way, as I am using multitasking, I should be able to do it for every task (the tasks are waiting while the garbage collector runs).
What exactly I need is to traverse the local variables that have a destructor set up (and to do that in a non-destructive way).
This is what a destructor is for. If you need to know when an object, whether on the heap, or on the stack, goes out of scope, then you would use a destructor for this.
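A sketch of what that could look like in portable C++: a small handle whose constructor links it into a list of roots and whose destructor unlinks it. StackRoot is a hypothetical name, and the list head is assumed to be per thread; with user-level tasks, each task would need its own head instead.

```cpp
#include <cstddef>

// Registers an object as a GC root for as long as the handle is in scope.
// Automatic variables are destroyed in reverse order of construction, so a
// singly linked stack is enough, and exception unwinding keeps it consistent.
class StackRoot {
public:
    explicit StackRoot(void* object) : object_(object), prev_(head()) {
        head() = this;
    }
    ~StackRoot() { head() = prev_; }

    StackRoot(const StackRoot&) = delete;
    StackRoot& operator=(const StackRoot&) = delete;

    // Visits every live root, most recently created first, without
    // modifying the list.
    template <typename Visitor>
    static void for_each(Visitor visit) {
        for (const StackRoot* r = head(); r != nullptr; r = r->prev_) {
            visit(r->object_);
        }
    }

    static std::size_t live_count() {
        std::size_t n = 0;
        for_each([&n](void*) { ++n; });
        return n;
    }

private:
    // Assumption: one root list per thread.
    static StackRoot*& head() {
        thread_local StackRoot* h = nullptr;
        return h;
    }

    void* object_;
    StackRoot* prev_;
};
```

The collector would call StackRoot::for_each to mark everything reachable from local variables; nothing here relies on compiler internals.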
By examining the code emitted by the GNU gcc compiler, I found that the information about which destructors are to be executed is "saved" in code form, that is, the code itself knows what it has to destruct and takes care of it. So it is too difficult to get that information back.
The implementation could have been different: the code could have pushed the address of the destructor code, along with the address of the struct instance, onto the processor stack as a return address.
So I was ready to abandon the effort but I found another hacky route.
I noticed that the gnu compiler, aligns structures on the stack on boundaries of 16 bytes.
As my structure is 8 bytes, there are 8 more unused bytes.
So I thought I could use them in order to keep track of objects allocated on the stack.
It must be said that things change depending on compilation flags such as -m64 and -mx32, and those changes must be taken care of.
Anyway, I managed to get the code below to compile and work in both -m64 and -mx32 modes.
It isn't pure "C++" (exactly the opposite) but the code works and has the intended effect.
#include <stdio.h>
#define ONSTACK(x) ((((size_t)(x))&0xf0000000)==0xf0000000)
struct A;
void* P=nullptr;
struct A{
static int i;
long long a;
A(){
if(ONSTACK(this)){
*(void**)(this+1)=P;
P=(void*)(this+1);
}
printf("this=%p\n",this);
a=0x1111111111111111*i;i++;
}
~A(){
if(ONSTACK(this))
P=*(void**)(this+1);
*(void**)(this+1)=nullptr;
a=0xFFFFFFFFFFFFFFFF;}
};
int A::i=0xA;
void* p;
void r1(){
A x;int xx=0x66666666;
A y;int yy=0x77777777;
A z;int zz=0x88888888;
int b=0x22222222;
printf("hello world %p\n",&b);
for(void* i=P;i!=nullptr;i=*(void**)i){
printf("a=%llX\n",(((A*)i)-1)->a);
}
}
void r2(){
long long r2a;
r2a=0x3333333333333333;
r1();
}
A A0;
int main(int argc, char **argv)
{ int i;p=(char*)&i-0x100;
printf("sizeof(P)=%zu\n",sizeof(P));
r2();
}

Subtle Memory Leak, and is this common practice?

I think I might be creating a memory leak here:
void commandoptions(){
cout<< "You have the following options: \n 1). Buy Something.\n 2).Check you balance. \n3). See what you have bought.\n4.) Leave the store.\n\n Enter a number to make your choice:";
int input;
cin>>input;
if (input==1) buy();
//Continue the list of options.....
else
commandoptions(); //MEMORY LEAK IF YOU DELETE THE ELSE STATEMENTS!
}
inline void buy(){
//buy something
commandoptions();
}
Let's say commandoptions has just executed for the first time since the program started. The user selects '1', meaning the buy() subroutine is executed by the commandoptions() subroutine.
After buy() executes, it calls commandoptions() again.
Does the first commandoptions() ever return? Or did I just make a memory leak?
If I make a subroutine that does nothing but call itself, it will cause a stack overflow because the other 'cycles' of that subroutine never exit. Am I doing that, or close to doing that, here?
Note that I used the inline keyword on buy... does that make any difference?
I'd happily ask my professor, he just doesn't seem available. :/
EDIT: I can't believe it didn't occur to me to use a loop, but thanks, I learned something new about my terminology!
A memory leak is where you have allocated some memory using new like so:
char* memory = new char[100]; //allocate 100 bytes
and then forget, after using this memory, to delete it:
delete[] memory; //return used memory back to system.
If you forget to delete, the memory stays marked as in use for as long as your program runs and cannot be reused for anything else. Since memory is a limited resource, doing this millions of times without the program terminating would eventually leave you with no memory to use.
This is why we clean up after ourselves.
In C++ you'd use an idiom like RAII to prevent memory leaks.
class RAII
{
public:
RAII() { memory = new char[100]; }
~RAII() { delete[] memory; }
//other functions doing stuff
private:
char* memory;
};
Now you can use this RAII class, as so
{ // some scope
RAII r; // allocate some memory
//do stuff with r
} // end of scope destroys r and calls destructor, deleting memory
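One caveat with a hand-written class like the one above: the compiler-generated copy constructor would copy the raw pointer, and both copies would then delete the same memory. A sketch of the same idea built on std::unique_ptr, which makes the class move-only without extra code (Buffer is a hypothetical name):

```cpp
#include <cstddef>
#include <memory>

// Owns a zero-initialised char array; the array is released automatically
// when the Buffer goes out of scope.
class Buffer {
public:
    explicit Buffer(std::size_t size)
        : size_(size), memory_(new char[size]()) {}

    char* data() { return memory_.get(); }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::unique_ptr<char[]> memory_;  // calls delete[] in its destructor
};
```

Because std::unique_ptr cannot be copied, an accidental copy of Buffer is a compile error rather than a double delete at run time.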
Your code doesn't show any memory allocations, therefore has no visible leak.
Your code does seem to have endless recursion, without a base case that will terminate the recursion.
Inline keyword won't cause a memory leak.
If this is all the code you have, there shouldn't be a memory leak. It does look like you have infinite recursion though. If the user types '1' then commandoptions() gets called again inside of buy(). Suppose they type '1' in that one. Repeat ad infinitum, and you eventually crash because the stack got too deep.
Even if the user doesn't type '1', you still call commandoptions() again inside of commandoptions() at the else, which will have the exact same result -- a crash because of infinite recursion.
I don't see a memory leak with the exact code given however.
This is basically a recursion without a base case. So, the recursion will never end (until you run out of stack space that is).
For what you're trying to do, you're better off using a loop, rather than recursion.
And to answer your specific questions :
No, commandoptions never returns.
If you use a very broad definition of a memory leak, then this is a memory leak, since you're creating stack frames without ever removing them again. Most people wouldn't label it as such though (including me).
Yes, you are indeed gonna cause a stack overflow eventually.
The inline keyword won't make a difference in this.
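A sketch of the loop-based version suggested above. The function name run_store_menu is hypothetical, buy() is reduced to a counter, and the streams are passed in so the menu can be driven from a string as well as from cin.

```cpp
#include <istream>
#include <ostream>
#include <sstream>  // only needed when driving the menu from a string

// Shows the menu until the user picks option 4 or the input ends.
// Returns the number of purchases made.
int run_store_menu(std::istream& in, std::ostream& out) {
    int purchases = 0;
    for (;;) {
        out << "1) Buy something  2) Check your balance  "
               "3) See what you have bought  4) Leave the store\n"
               "Enter a number to make your choice: ";
        int input = 0;
        if (!(in >> input)) break;    // end of input or not a number: stop
        if (input == 4) break;        // the exit condition the recursion lacked
        if (input == 1) ++purchases;  // stand-in for buy()
        // Other options would be handled here; anything else just re-prompts.
    }
    return purchases;
}
```

Each pass through the loop reuses the same stack frame, so the stack does not grow no matter how many choices the user makes. Calling run_store_menu(std::cin, std::cout) gives the interactive behaviour; a std::istringstream works for testing.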
This is not about memory leak, you are making infinite calls to commandoptions function no matter what the value of input is, which will result in stack crash. You need some exit point in your commandoptions function.
There is no memory leak here. What does happen (at least it looks that way in that butchered code snippet of yours) is that you get into an infinite loop. You might run out of stack space if tail call optimization doesn't kick in or isn't supported by your compiler (it's a bit hard to see whether or not your calls actually are in tail position though).

LLVM exceptions; how to unwind

At the moment, I'm inserting variables into the beginning of block scope using CreateEntryBlockAlloca:
template <typename VariableType>
static inline llvm::AllocaInst *CreateEntryBlockAlloca(BuilderParameter& buildParameters,
const std::string &VarName) {
HAssertMsg( 1 != 0 , "Not Implemented");
};
template <>
inline llvm::AllocaInst *CreateEntryBlockAlloca<double>(BuilderParameter& buildParameters,
const std::string &VarName) {
llvm::Function* TheFunction = buildParameters.dag.llvmFunction;
llvm::IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
TheFunction->getEntryBlock().begin());
return TmpB.CreateAlloca(llvm::Type::getDoubleTy(buildParameters.getLLVMContext()), 0,
VarName.c_str());
}
Now, I want to add Allocas for non-POD types (that might require a destructor/cleanup function at exit). However, it is not enough to add destructor calls at the end of the exit scope block, since it is not clear how to have them invoked when a regular DWARF exception is thrown (for the purpose of this argument, let's say that exceptions are thrown from call points that invoke C++ functions which only throw a POD type, so no, in my case, ignorance is bliss, and I would like to stay away from intrinsic LLVM exceptions unless I understand them better).
I was thinking that maybe I could have a table with the stack offsets of the Alloca registers, and have the exception handler (at the bottom of the stack, at the invocation point of the JIT function) walk over those offsets in the table and call destructors appropriately.
The thing I don't know is how to query the offset of the Alloca'ed registers created with CreateAlloca. How can I do that reliably?
Also, if you think there is a better way to achieve this, please enlighten me on the path of the LLVM.
Technical comment: the JIT code is called inside a boost::context, which only invokes the JIT code inside a try/catch and does nothing in the catch; it just exits from the context and returns to the main execution stack. The idea is that if I handle the unwinding on the main execution stack, any function I call (say, for cleaning up stack variables) will not overwrite the stack contents of the terminated JIT context, so they will not be corrupted. I hope I'm making enough sense.
The thing i don't know is how to query the offset of the Alloca'ed registers created with CreateAlloca. How can i do that reliably?
You can use the address of an alloca directly... there isn't any simple way to get its offset into the stack frame, though.
Why exactly do you not want to use the intrinsic LLVM exceptions? They really are not that hard to use, especially in the simple case where your code never actually catches anything. You can basically just take the code clang generates in the simple case, and copy-paste it.
Edit:
To see how to use exceptions in IR in the simple case, try pasting the following C++ code into the demo page at http://llvm.org/demo/:
class X { public: ~X() __attribute((nothrow)); };
void a(X* p);
void b() { X x; a(&x); }
It's really not that complicated.

How can I get the size of a memory block allocated using malloc()? [duplicate]

Possible Duplicates:
How can I get the size of an array from a pointer in C?
Is there any way to determine the size of a C++ array programmatically? And if not, why?
I get a pointer to a chunk of allocated memory out of a C style function.
Now, it would be really interesting for debugging purposes to know how
big the allocated memory block that this pointer points to is.
Is there anything more elegant than provoking an exception by blindly running over its boundaries?
Thanks in advance,
Andreas
EDIT:
I use VC++2005 on Windows, and GCC 4.3 on Linux
EDIT2:
I have _msize under VC++2005
Unfortunately it results in an exception in debug mode....
EDIT3:
Well. I have tried the way I described above with the exception, and it works.
At least while I am debugging and ensuring that immediately after the call
to the library exits I run over the buffer boundaries. Works like a charm.
It just isn't elegant and in no way usable in production code.
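It's not standard, but if your library has an msize()-style function, that will give you the size (for example _msize() in the Microsoft CRT, or malloc_usable_size() in glibc, which may report more than was requested).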
A common solution is to wrap malloc with your own function that logs each request along with the size and resulting memory range, in the release build you can switch back to the 'real' malloc.
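A minimal sketch of such a logging wrapper, assuming a single-threaded debug build. The names debug_malloc, debug_free and debug_block_size are hypothetical, and the registry is a plain map from pointer to requested size.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <unordered_map>

// Debug registry: maps each live block to the size that was requested.
// Not thread-safe; a real version would need a lock.
static std::unordered_map<void*, std::size_t>& live_blocks() {
    static std::unordered_map<void*, std::size_t> blocks;
    return blocks;
}

// Allocates like malloc, logs the request and remembers its size.
void* debug_malloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) {
        live_blocks()[p] = size;
        std::fprintf(stderr, "malloc(%zu) -> [%p, %p)\n", size, p,
                     static_cast<void*>(static_cast<char*>(p) + size));
    }
    return p;
}

// Frees like free and forgets the block.
void debug_free(void* p) {
    if (p != nullptr) live_blocks().erase(p);
    std::free(p);
}

// Returns the requested size of a live block, or 0 if the pointer is unknown.
std::size_t debug_block_size(void* p) {
    const auto it = live_blocks().find(p);
    return it == live_blocks().end() ? 0 : it->second;
}
```

In a release build the calls can be mapped straight back to malloc and free, for example through a pair of macros or inline functions selected by NDEBUG.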
If you don't mind sleazy violence for the sake of debugging, you can #define macros to hook calls to malloc and free and store the size in a small header placed in front of each block.
To the tune of
void *malloc_hook(size_t size) {
    /* Reserve room for a size_t header in front of the user block. */
    size_t *ptr = (size_t *) malloc(size + sizeof (size_t));
    if (ptr == NULL)
        return NULL;
    *ptr = size;    /* remember the size the caller asked for */
    return ptr + 1;
}
void free_hook (void *ptr) {
    if (ptr != NULL)
        free(((size_t *) ptr) - 1);
}
size_t report_size(void *ptr) {
    return * (((size_t *) ptr) - 1);
}
then
#define malloc(x) malloc_hook(x)
and so on
The C runtime library does not provide such a function. Furthermore, deliberately provoking an exception will not tell you how big the block is either.
Usually the way this problem is solved in C is to maintain a separate variable which keeps track of the size of the allocated block. Of course, this is sometimes inconvenient but there's generally no other way to know.
Your C runtime library may provide some heap debug functions that can query allocated blocks (after all, free() needs to know how big the block is), but any of this sort of thing will be nonportable.
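For illustration, a sketch that wraps the nonportable query functions of three common C libraries behind one hypothetical helper, allocated_block_size. It assumes the pointer came from that same library's malloc; the value returned is the usable size of the block, which can be larger than what was requested.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(_MSC_VER)
#  include <malloc.h>          // _msize
#elif defined(__APPLE__)
#  include <malloc/malloc.h>   // malloc_size
#elif defined(__GLIBC__)
#  include <malloc.h>          // malloc_usable_size
#endif

// Returns the usable size of a malloc'ed block, or 0 when the pointer is
// null or this platform has no known query function.
std::size_t allocated_block_size(void* p) {
    if (p == nullptr) return 0;
#if defined(_MSC_VER)
    return _msize(p);
#elif defined(__APPLE__)
    return malloc_size(p);
#elif defined(__GLIBC__)
    return malloc_usable_size(p);
#else
    return 0;
#endif
}
```

This is only suitable for debugging: the result depends on allocator internals and must not be used to decide how much of the block may be written in production code.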
With gcc and the GNU linker, you can easily wrap malloc
#include <stdlib.h>
#include <stdio.h>
void* __real_malloc(size_t sz);
void* __wrap_malloc(size_t sz)
{
void *ptr;
ptr = __real_malloc(sz);
fprintf(stderr, "malloc of size %zu yields pointer %p\n", sz, ptr);
/* if you wish to save the pointer and the size to a data structure,
then remember to add wrap code for calloc, realloc and free */
return ptr;
}
int main()
{
char *x;
x = malloc(103);
return 0;
}
and compile with
gcc a.c -o a -Wall -Werror -Wl,--wrap=malloc
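(Of course, this will also work with C++ code compiled with g++, and with the new operator (through its mangled name) if you wish.)
Note that --wrap only redirects references that are resolved in this link step: object files and static libraries linked into the executable will call your __wrap_malloc, but shared libraries that were built separately keep calling the real malloc.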
No, and you can't rely on an exception when overrunning its boundaries, unless it's in your implementation's documentation. It's part of the stuff you really don't need to know about to write programs. Dig into your compiler's documentation or source code if you really want to know.
There is no standard C function to do this. Depending on your platform, there may be a non-portable method - what OS and C library are you using?
Note that provoking an exception is unreliable - there may be other allocations immediately after the chunk you have, and so you might not get an exception until long after you exceed the limits of your current chunk.
Memory checkers like Valgrind's memcheck and Google's TCMalloc (the heap checker part) keep track of this sort of thing.
You can use TCMalloc to dump a heap profile that shows where things got allocated, or you can just have it check to make sure your heap is the same at two points in program execution using SameHeap().
Partial solution: on Windows you can use the PageHeap to catch a memory access outside the allocated block.
PageHeap is an alternate memory manager present in the Windows kernel (in the NT varieties but nobody should be using any other version nowadays). It takes every allocation in a process and returns a memory block that has its end aligned with the end of a memory page, then it makes the following page inaccessible (no read, no write access). If the program tries to read or write past the end of the block, you'll get an access violation you can catch with your favorite debugger.
How to get it: Download and install the package Debugging Tools for Windows from Microsoft: http://www.microsoft.com/whdc/devtools/debugging/default.mspx
then launch the GFlags utility, go to the 3rd tab and enter the name of your executable, then Hit the key. Check the PageHeap checkbox, click OK and you're good to go.
The last thing: when you're done with debugging, don't ever forget to launch GFlags again, and disable PageHeap for the application. GFlags enters this setting into the Registry (under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\), so it is persistent, even across reboots.
Also, be aware that using PageHeap can increase the memory needs of your application tremendously.
The way to do what you want is to BE the allocator. If you filter all requests, and then record them for debugging purposes, then you can find out what you want when the memory is free'd.
Additionally, you can check at the end of the program to see if all allocated blocks were freed, and if not, list them. An ambitious library of this sort could even take FUNCTION and LINE parameters via a macro to let you know exactly where you are leaking memory.
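A small sketch of such a tracker, assuming a single-threaded program. The names tracked_malloc, tracked_free, report_leaks and the TRACKED_MALLOC macro are hypothetical; the macro is what captures the file and line of each call site.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <map>

struct AllocationRecord {
    std::size_t size;
    const char* file;
    int line;
};

// All blocks that have been allocated but not yet freed. Not thread-safe.
static std::map<void*, AllocationRecord>& allocation_table() {
    static std::map<void*, AllocationRecord> table;
    return table;
}

void* tracked_malloc(std::size_t size, const char* file, int line) {
    void* p = std::malloc(size);
    if (p != nullptr) allocation_table()[p] = AllocationRecord{size, file, line};
    return p;
}

void tracked_free(void* p) {
    if (p != nullptr) allocation_table().erase(p);
    std::free(p);
}

// Lists every block that is still allocated and returns how many there are.
std::size_t report_leaks(std::FILE* out) {
    for (const auto& entry : allocation_table()) {
        std::fprintf(out, "leak: %zu bytes at %p, allocated at %s:%d\n",
                     entry.second.size, entry.first,
                     entry.second.file, entry.second.line);
    }
    return allocation_table().size();
}

// Records the call site automatically.
#define TRACKED_MALLOC(size) tracked_malloc((size), __FILE__, __LINE__)
```

Calling report_leaks(stderr) at the end of main, or from a function registered with atexit, prints whatever was never freed together with the place it was allocated.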
Finally, Microsoft's MSVCRT provides a debuggable heap that has many useful tools that you can use in your debug version to find memory problems: http://msdn.microsoft.com/en-us/library/bebs9zyz.aspx
On Linux, you can use valgrind to find many errors. http://valgrind.org/

_heapwalk reports _HEAPBADNODE, causes breakpoint or loops endlessly

I use _heapwalk to gather statistics about the Process' standard heap.
Under certain circumstances I observe unexpected behaviours like:
_HEAPBADNODE is returned
some breakpoint is triggered inside _heapwalk, telling me the heap might have been corrupted
access violation inside _heapwalk.
I saw different behaviours on different computers. On one Windows XP 32-bit machine everything looked fine, whereas on two Windows XP 64-bit machines I saw the mentioned symptoms.
I saw this behaviour only if LowFragmentationHeap was enabled.
I played around a bit.
I walked the heap several times, one right after another, inside my program. The first time I did nothing in between the subsequent calls to _heapwalk (everything was fine). Then I walked it again, this time doing some stuff (for gathering statistics) in between two subsequent calls to _heapwalk. Depending upon what I did there, I sometimes got the described symptoms.
Here finally a question:
What exactly is safe and what is not safe to do in between two subsequent calls to _heapwalk during a complete heap walk run?
Naturally, I must not manipulate the heap. Therefore I double-checked that I don't call new and delete.
However, my observation is that function calls with some parameter passing already cause my heap walk run to fail. I successively added function calls and increased the number of parameters passed to them. My impression was that two function calls with two parameters each were already enough to break it.
However I would like to know why.
Any ideas why this does not happen on some machines?
Any ideas why this only happens if LowFragmentationHeap is enabled?
Sample Code finally:
#include <malloc.h>
void staticMethodB( int a, int b )
{
}
void staticMethodA( int a, int b, int c)
{
staticMethodB( 3, 6);
return;
}
...
_HEAPINFO hinfo;
hinfo._pentry = NULL;
while( ( heapstatus = _heapwalk( &hinfo ) ) == _HEAPOK )
{
//doing nothing here works fine
//however if i call functions here with parameters, this causes
//_HEAPBADNODE or something else
staticMethodA( 3,4,5);
}
switch( heapstatus )
{
...
case _HEAPBADNODE:
assert( false );
/*ERROR - bad node in heap */
break;
...
Use HeapWalk, not _heapwalk.
Try locking the heap during your heap enumeration with HeapLock and HeapUnlock.
It certainly sounds like your function calls are modifying the Heap and invalidating the enumeration. Some vague advice, perhaps you can create a new Heap specifically for any memory needed by these function calls. This might require significant reworking of these static functions, I know.