warning #13212: Reference to ebx in function requiring stack alignment - c++

I am trying to compile the below code with ICC 2018:
__asm {
mov ebx, xx ;xx address to registers
}
where xx is of type int16. This is the first instruction inside my function.
I get the below warning with the above assembly code:
warning #13212: Reference to ebx in function requiring stack alignment
Surprisingly, when I replaced ebx with eax or esi, the warning went away. I do not understand why I am seeing the issue only with ebx; as far as I know, ebx and eax are the same kind of register (32-bit general-purpose registers).
Also, I didn't see the warning when I compiled the same code with ICC 2013.
Can anyone help me resolve this warning?
Thanks!

The compiler on the platform of choice (ICC as it mimics MSVC's behavior) uses EBX to save the original stack pointer value if additional alignment is required. Therefore you cannot overwrite it safely.
The program's behavior would become undefined. The compiler warning just tells you about that.
To help with saving/restoring all registers affected by an assembly block, an extended syntax with so-called clobber lists is recommended. Your example uses the MSVC-style __asm{...} syntax, where the compiler itself detects which registers you touch and saves/restores them for you.
ICC also supports GCC-like notation for extended asm with clobber lists: asm("...":::). It also supports simpler GCC asm("...") without the clobber list part. See this question for more details (thanks Peter Cordes for the link and explanation).
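As an illustration (a sketch only, not the OP's exact code; the function name and the choice to let the compiler pick the destination register are assumptions), the original block could be expressed in that extended notation like this:

#include <cstdint>

void example(std::int16_t xx)
{
    int value;
    asm volatile("movswl %1, %0"   // sign-extend the 16-bit value into a 32-bit register
                 : "=r"(value)     // output: a general register chosen by the compiler
                 : "m"(xx));       // input: xx, read from memory
    (void)value;
}

If an instruction really has to use EBX, that register would additionally be named in the clobber list (the part after the third colon), which is exactly what lets the compiler save it, or complain if it cannot.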
Documentation that I found useful when I was learning to use clobber lists (I actually use it all the time because it is impossible to remember its rather human-unfriendly syntax):
https://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s5
https://software.intel.com/en-us/node/694235
The simple inline assembly blocks without clobber lists can be safely used only in the following situations:
The instructions in the block do not modify registers defined by the ABI. That means GPRs, the stack pointer and flags must be left untouched; if there are floating-point calculations in the function, FPU/vector registers are off limits as well. Even memory writes can lead to bugs, because the compiler relies on known values residing in memory. In contrast, one can issue INT3, HLT, WRMSR and similar instructions, which either touch no registers or affect only system registers that the compiler does not use. However, the majority of such instructions are privileged and cannot be used in user applications. One can also read any available register, provided the read has no side effects.
The assembler block is the only statement in the function's body. In this case, it has to abide by the calling convention of the chosen platform: how the function's arguments are passed, where its return value should be placed, etc. The block will also need to cope with compiler-generated prologue and epilogue code, which has its own assumptions about registers. That code is not stable, not portable, and not guaranteed to be the same at different optimization levels. With GCC on x86, I was unable to disable prologue/epilogue generation, so there is still some risk of violating compiler assumptions.
You save all clobbered registers yourself and restore them afterwards. This is relatively easy because you can see your own assembler code and can tell whether a register gets modified by it or not. However, make a mistake and the compiler will not be there to point it out. It is very nice of ICC 2018 to actually give a warning, even though it could have just treated the asm block as a black box.
You "stole" a register from compiler. GCC allows doing that with register asm statement (do not remember if the same trick works with other compilers). You can thus declare that a variable is bound to a certain register. Be aware that such technique reduces number of registers available to compiler for its register allocation phase, and that will degrade quality of code it generates. Ask for too many registers, and the compiler will be helpless and refuse to work. Similarly, one cannot ask for registers with a dedicated role to be taken away from a compiler, such as stack pointer or program counter.
That said, the extended asm syntax with clobber lists provides a nice alternative. It turns an asm section from a black box into something of an inline internal "function" that declares its own inputs, outputs and the resources it overwrites that are shared with the outer function.

Related

Patching arm64 binary to replace all 'call' instructions to point to a specific function

How do I replace all the function calls in an arm64 binary with a call to a specific function? The intent is to 'insert' an indirection so that I can log all function calls.
Example:
mov x29, sp
mov w0, #10
bl bar(int)
...
# Replace "bl bar" with my_func. my_func will now take all the parameters and forward it to foo.
mov x29, sp
mov w0, #10
bl my_func(...)
The replacement function prints the pointer to the function, and then invokes the callee with the provided arguments. I'm also not sure how this forwarding will work in all cases, but the intent is to have something like this:
template<class F, class... Args>
void my_func(F&& f, Args&&... args) {
    printf("calling: %p", f);
    std::invoke(std::forward<F>(f), std::forward<Args>(args)...);
}
TL:DR: write asm wrapper functions that call a C++ void logger(void *fptr) which returns. Don't try to tailcall from C++ because that's not possible in the general case.
An alternate approach might be to "hook" every callee, instead of redirecting at the call site. But then you'd miss calls to functions in libraries you weren't instrumenting.
I don't think C++ lets you forward any/all args without knowing what they are. That's easy to do in asm for a specific calling convention, since the final invocation of the real function can be a tailcall jump, with the return address, all the arg-passing registers, and the stack pointer set up exactly as they were. But only if you're not trying to remove an arg.
So instead of having C++ do the tailcall to the real function, have asm wrappers just call a logging function. Either printf directly, or a function like extern "C" void log_call(void *fptr); which returns. It is compiled normally, so it'll follow the ABI, and the hand-written asm trampoline / wrapper function knows what it needs to restore before jumping.
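For example, the C++ side of that could be as simple as the following (the name log_call comes from the sentence above; the body is just an illustrative guess):

#include <cstdio>

extern "C" void log_call(void *fptr)
{
    std::printf("calling: %p\n", fptr);   // ordinary ABI-following function; it simply returns
}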
Capturing the target address
bl my_func won't put the address of bar anywhere.
For direct calls you could use the return address (in lr) to look up the target, e.g. in a hash table. Otherwise you'd need a separate trampoline for every function you're hooking. (Modifying the code to hook at the target function instead of the call sites wouldn't have this problem, but you'd have to replace the first instruction with a jump somewhere which logs and then returns. And which does whatever that replaced first instruction did. Or replace the first couple instructions with one that saves the return address and then calls.)
But any indirect calls like blr x8 will need a special stub.
Probably one trampoline stub for each different possible register that holds a function address.
Those stubs will need to be written in asm.
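As a rough sketch of one such stub (GNU assembler syntax in a file-scope asm block; the stub name, the assumption that the patched call was blr x8, and the decision not to save the FP/SIMD argument registers q0-q7 are all mine, so treat it as an outline rather than a drop-in solution):

extern "C" void log_call(void *fptr);   // ordinary C++ logger, defined elsewhere

asm(R"(
        .text
        .globl  hook_blr_x8
        .type   hook_blr_x8, %function
hook_blr_x8:
        stp     x29, x30, [sp, #-16]!   // frame pointer + return address
        stp     x0, x1,   [sp, #-16]!   // integer argument registers
        stp     x2, x3,   [sp, #-16]!
        stp     x4, x5,   [sp, #-16]!
        stp     x6, x7,   [sp, #-16]!
        str     x8,       [sp, #-16]!   // the call target itself
        mov     x0, x8                  // pass the target pointer to the logger
        bl      log_call                // ordinary call; log_call must return
        ldr     x8,       [sp], #16
        ldp     x6, x7,   [sp], #16
        ldp     x4, x5,   [sp], #16
        ldp     x2, x3,   [sp], #16
        ldp     x0, x1,   [sp], #16
        ldp     x29, x30, [sp], #16
        br      x8                      // tailcall the real target with args intact
)");

The call site would be patched from blr x8 to bl hook_blr_x8; because x30 is saved and restored around the logger call, the final br x8 lets the real target return straight to the original caller. If any hooked functions take floating-point arguments, q0-q7 would need the same save/restore treatment.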
If you were trying to call a wrapper in C++ the way you imagined, it would be tricky because the real args might be using all the register-arg slots. And changing the stack pointer to add a stack arg makes it a new 9th arg (the first stack-arg slot) or something weird. So it works much better just to call a C++ function to do the logging, then restore all the arg-passing registers which you saved on the stack (16 bytes at a time with stp).
That also avoids the problem of trying to make a transparent function in C++.
Removing one arg and forwarding the rest
Your design requires my_func to remove one arg and then forward an unknown number of other args of unknown type to another function. That's not even possible in ARM64 asm, therefore not surprising that C++ doesn't have syntax that would require the compiler to do it.
If the arg was actually a void* or function pointer, it would take one register, so removing it would mean moving the next 7 regs down (x1 to x0, etc.), and the first stack arg would then go in x7. But the stack has to stay 16-byte aligned, so you can't just load that one stack arg and leave the later stack args in the right place.
A workaround for that in some cases would be to make that f arg 16 bytes, so it takes two registers. Then you can move x2..x7 down to x0..x5 and ldp 16 bytes of stack args into x6/x7. Except what if that arg was one that always gets passed in memory, not registers, e.g. part of an even larger object, or non-POD, or whatever the criterion is for the C++ ABI to make sure it always has an address.
So maybe f could be 32 bytes so it goes on the stack, and can be removed without touching arg-passing registers or needing to pull any stack args back into registers.
Of course in the real case you didn't have a C++ function that can add a new first arg and then pass on all the rest either. That's something you could again only do in special cases, like passing on an f.
It's something you could do in asm on 32-bit x86 with a pure stack-args calling convention and no stack-alignment requirement; you can move the return address up one slot and jump, so you eventually return to the original call-site with the stack pointer restored to how it was before calling the trampoline that added a new first arg and copied the return address lower.
But C++ won't have any constructs that impose requirements on ABIs beyond what C does.
Scanning a binary for bl instructions
That will miss any tailcalls that use b instead of bl. That might be ok, but if not I don't see a way to fix it. Unconditional b (not bl) will be all over the place inside functions. (With some heuristics for identifying functions, a b outside the current function can be assumed to be a tailcall, while others aren't, since compilers usually make all the code for a single function contiguous; the exception is when some blocks go in a .text.cold section because the compiler identifies them as unlikely.)
AArch64 has fixed-width instructions that require alignment, so consistent disassembly of the compiler-generated instructions is easy, unlike x86. So you can identify all the bl instructions.
But if AArch64 compilers mix in any constant data between functions, like 32-bit ARM compilers do (literal pools for PC-relative loads), false positives are possible even if you limit it to looking at parts of the binary that are in executable ELF sections. (Or program segments if section headers have been stripped.)
I don't think bl gets used for anything other than function calls in compiler-generated code. (e.g. not to private helper functions the compiler invented.)
You might want a library to help parse ELF headers and find the right binary offsets. Looking for bl instructions might be something you do by scanning the machine code, not disassembly.
If you're modifying compiler asm output before it is even assembled, that would make things easier; you could add instructions at the call sites. But for existing binaries you can't recompile from source.
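If you do scan the machine code directly, the fixed 32-bit encoding keeps it simple. A minimal sketch (assuming you have already pulled the words of an executable section into memory and know its virtual address; it finds only direct bl, with all the false-positive caveats above):

#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// bl imm26 has the fixed pattern 100101 in bits [31:26].
void find_bl(const std::vector<uint32_t>& text, uint64_t section_vaddr)
{
    for (std::size_t i = 0; i < text.size(); ++i) {
        uint32_t insn = text[i];                      // assumes little-endian 32-bit words
        if ((insn & 0xFC000000u) != 0x94000000u)
            continue;
        int64_t imm26 = insn & 0x03FFFFFF;            // signed word offset
        if (imm26 & 0x02000000)
            imm26 -= 0x04000000;                      // sign-extend 26 bits
        uint64_t pc = section_vaddr + i * 4;
        uint64_t target = pc + static_cast<uint64_t>(imm26 * 4);
        std::printf("bl at 0x%" PRIx64 " -> 0x%" PRIx64 "\n", pc, target);
    }
}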

Why address-of operator ('&') can be used with objects that are declared with the register storage class specifier in C++?

In the C programming language we are not allowed to use the address-of operator (&) with variables that are declared with the register storage class specifier.
It gives error: address of register variable ‘var_name’ requested
But if we write a C++ program and do the same thing (i.e. use & with a register storage variable) it doesn't give us any error.
eg.
#include <iostream>
using namespace std;

int main()
{
    register int a;
    int * ptr;
    a = 5;
    ptr = &a;
    cout << ptr << endl;
    return 0;
}
Output :-
0x7ffcfed93624
Well, this must be an extra feature of C++, but the question is about the difference between the register storage class in C and C++.
The restriction on taking the address was deliberately removed in C++ - there was no benefit to it, and it made the language more complicated. (E.g. what would happen if you bound a reference to a register variable?)
The register keyword hasn't been much use for many years - compilers are very good at figuring out what to put in registers by themselves. Indeed in C++ the keyword is currently deprecated and will eventually be removed.
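For instance (a tiny illustration, valid before C++17, where the keyword was still only deprecated):

int main()
{
    register int a = 5;
    int & r = a;    // fine in C++; C would reject taking the address of a register variable
    r = 7;          // modifies a through the reference
    return a;       // returns 7
}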
The register storage class originally hinted to the compiler that the variable so qualified was to be used so frequently that keeping its value in memory would be a performance drawback. The vast majority of CPU architectures (maybe not SPARC? Not even certain there's a counterexample) cannot perform any operation between two variables without first loading one or both from memory into its registers. Loading variables from memory into registers and writing them back to memory once operated upon takes many times more CPU cycles than the operations themselves. Thus, if a variable is used frequently, one can achieve a performance gain by setting aside a register for it and not bothering with memory at all.
Doing so, however, has a variety of requirements. Many are different for every CPU architecture:
All processors have a fixed number of registers, but each processor model has a different number. In the 80s you might have had 4 that could reasonably be used for a register variable.
Most processors do not support the use of every register for every instruction. In the 80s it was not uncommon to have only one register that you could use for addition and subtraction, and you probably couldn't use that same register as a pointer.
Calling conventions dictated differing sets of registers that could be expected to be overwritten by subroutines i.e. function calls.
The size of a register differs between processors, so there are cases where a register variable will not fit in a register.
Because C is intended to be independent of platform, these restrictions could not be enforced by the standard. In other words, while it may be impossible to compile a procedure with 20 register variables for a system that only had 4 machine registers, the C program itself should not be "wrong", as there is no logical reason a machine cannot have 20 registers. Thus, the register storage class was always just a hint that the compiler could ignore if the specific target platform would not support it.
The inability to reference a register is different. A register is specifically not kept updated in memory and not kept current if changes are made to memory; that's the whole point of the storage class. Since they are not intended to have a guaranteed representation in memory, they cannot logically have an address in memory that will be meaningful to external code that may obtain the pointer. Registers have no address to their own CPU, and they almost never have an address accessible to any coprocessor. Therefore, any attempt to obtain a reference to a register is always a mistake. The C standard could comfortably enforce this rule.
As computing evolved, however, some trends developed that weakened the purpose of the register storage class itself:
Processors came with greater numbers of registers. Today you probably have at least 16, and they can probably all be used interchangeably for most purposes.
Multi-core processors and distributed code execution have become very common; only one core has access to any one register, and cores never share registers without involving memory anyway.
Algorithms for allocating registers to variables became very effective.
Indeed, compilers are now so good at allocating variables to registers that they will usually do a better job at optimization than any human. They certainly know which ones you are using most frequently without you telling them. It would be more complicated for the compiler (i.e. not for the standard or for the programmer) to produce these optimizations if they were required to honor your manual register hints. It became increasingly common for compilers to categorically ignore them. By the time C++ existed, it was obsolete. It is included in the standard for backward compatibility, to keep C++ as close as possible to a proper superset of C. The requirements of a compiler to honor the hint and thus the requirements to enforce the conditions under which the hint could be honored were weakened accordingly. Today, the storage class itself is deprecated.
Therefore, even though it is still the case today (and will be until computers don't even have registers) that you cannot logically have a reference to a CPU register, the expectation that the register storage class will be honored is so long gone that it is unreasonable for the standard to require compilers to require you to be logical in your use of it.
A referenced register would be the register itself. If the calling function passed ESI as a referenced parameter, then the called function would use ESI as the parameter. As pointed out by Alan Stokes, the issue is if another function also calls the same function, but this time with EDI as the same referenced parameter.
In order for this to work, two separate instances of the called function would need to be created, one taking ESI as the parameter and one taking EDI. I don't know whether any actual C++ compiler implements such an optimization in general, but that is how it could be done.
One example of register-by-reference is the way std::swap() gets optimized (both parameters are references), which often ends up as inlined code. Sometimes no actual swap takes place: for std::swap(a, b), the compiler may simply swap the roles of a and b in the code that follows (references to what was a become references to b and vice versa).
Otherwise, a reference parameter will force the variable to be located in memory instead of a register.

would you recommend using assembly to access arguments in this exceptional case?

consider the following function that won't get inlined and assume x86 as platform:
void doSomething(int & in) {
    // do something
}
Firstly, I'm not sure such a scenario would ever happen, but since I think it is possible, I'm going to ask. Suppose that at every call site the argument happens to lie exactly at the top of the caller's stack frame, so that inside the called function it could be reached through the ebp register (after the callee has moved esp into ebp). In that exceptional case, would you suggest not declaring the parameter at all and using assembly to access the argument, or leaving the function definition as it is and letting the compiler do what it does? I haven't read anywhere that the compiler would take such an exceptional case into account when choosing a calling convention, and I think it will simply generate code to pass a pointer to the argument, either on the callee's stack frame or in a register.
First of all, it's SO easy for this to break: for example, you get a different version of the compiler that generates code differently, or you change optimisation settings. Never mind the situation where you suddenly need to use doSomething in a different place and then it won't work, because the variable is no longer at the top of the stack.
Second, assuming that the code inside the function is short enough, it's highly likely that the compiler will inline the function, so you don't "lose" anything at all.
Third, with modern compilers a single argument is typically passed in a register anyway, so there is no benefit in this when optimisation is enabled.
If you really think there is worthwhile benefit in this, and the compiler won't inline or otherwise optimise the code [have you looked at the generated code?], then try using forceinline or always_inline or whatever it is called in your compiler (most compilers have such an option). If that doesn't work, use a macro to inline it by hand. Or simply move the code to where it is called by "copy-n-paste".
Your note "the argument to be supplied lies exactly at the top of the caller stack frame so that in the called function access to that through ebp register" contains a factual misunderstanding.
That's because of the following things:
you're assuming a stack-based calling convention, i.e. function arguments being pushed onto the stack by the caller before calling the function. That's not generally the case; even on 32-bit x86, there are non-stack-based calling conventions (for example, Windows fastcall or the GNU GCC ones used in the 32-bit Linux kernel). If such a convention is used, the argument wouldn't be found on top of the stack, but rather in ... whatever register is used to hold the first argument.
But even if you have stack-based parameter passing ... still:
you've missed that on x86, at the very least, the call instruction pushes a return address onto the top of the stack, so that when the first instruction of a function reached that way executes, ESP will not point to the first arg of that function, but to the return address.
you've missed that EBP is a callee-saved (preserved over function calls) register, and is not initialized on your behalf by the architecture; the generated code has to set it up explicitly. A function which wants to use it (even if only as a frame pointer) is therefore obliged to save it somewhere before using it. That means the normal prologue will have push EBP; mov EBP, ESP (you cannot do just MOV EBP, ESP, because that would overwrite the caller's EBP, which you may not do). Therefore, if you want to refer to the first argument of the function, you need [ EBP + 8 ], not [ EBP ].
If you're not using frame pointers, then the first argument (because the call which was used to reach the function pushed a return address) is at [ ESP + 4 ], not [ ESP ].
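Purely as an illustration of that layout (32-bit, stack-passed arguments, frame pointer in EBP, MSVC-style __asm as in the first question on this page), reading the first argument would look roughly like this; it is exactly the kind of fragile code being advised against:

// [ebp] = saved EBP, [ebp+4] = return address, [ebp+8] = first argument
int readFirstArg(int & in)
{
    int value;
    __asm {
        mov eax, [ebp + 8]   // address of 'in' (a reference is passed as a pointer)
        mov eax, [eax]       // dereference it
        mov value, eax
    }
    return value;
}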
I hope this clarifies a little.
I agree with the other posters that clarifying the question would help, what exactly you want to achieve and why you think assembly language might be useful here.
No, I would not. Calling conventions may vary (between x86 and x86_64); parameters could be pushed onto the stack or put into registers, and I'm not sure you can know for certain where they'll be.
Writing this in assembly, unless you really know what you're doing, is likely to lead to code with undefined behavior.

C++ (nested) function call instructions - registers

In C++ FAQ:
Assuming a typical C++ implementation that has registers and a stack,
the registers and parameters get written to the stack just before the
call to g(), then the parameters get read from the stack inside g()
and read again to restore the registers while g() returns to f().
regarding the nested function call
void f()
{
    int x = /*...*/;
    int y = /*...*/;
    int z = /*...*/;
    ...code that uses x, y and z...
    g(x, y, z);
    ...more code that uses x, y and z...
}
1/ Do all implementations of C++ use registers and a stack? Does this depend on the compiler/processor/computer architecture?
2/ What is the sequence of instructions (no assembly language, just the big picture) when I call f()? I have read diverging things on this topic, and I don't remember registers being mentioned, only the stack.
3/ What additional points should be underlined when dealing with nested function calls?
thanks
For number 2, this depends on many things, including the compiler and the platform. Basically, the different ways of passing arguments to functions and returning results are called calling conventions. The article "Calling conventions on the x86 platform" goes into some detail on the sequence of operations, and you can see how ugly and complicated it gets with just this small combination of platforms and compilers, which is most likely why you have heard all sorts of different scenarios. "The gen on function calling conventions" covers a wider set of scenarios, including 64-bit platforms, but is harder to read. It gets even more complicated because gcc may not actually push and pop the stack but instead manipulate the stack pointer directly; we can see an example of this, albeit in assembly, here. It is hard to generalize about calling conventions: if the number of arguments is small enough, many calling conventions avoid using the stack at all and use registers exclusively.
As to number 3, nested function calls do not change anything; the procedure is simply repeated for the next call.
As to number 1, as Sean pointed out, .NET compiles to byte code which performs all its operations on the stack. The Wikipedia page on Common Intermediate Language has a good example.
The x86-64 ABI document is another great document if you want to understand how one specific calling convention works in detail. Figures 3.5 and 3.6 are neat since they give a nice example of a function with many parameters and how each parameter is passed using a combination of general-purpose registers, floating-point registers and the stack. This sort of diagram is a rare find in documents that cover calling conventions.
1. Although register/stack implementations are the most common underlying implementations of a C++ compiler there's nothing to stop you using a different architecture. For example, you could write a compiler to generate Java bytecode or .NET byte code, in which case you'd have a stack based C++ compiler.
2. When you call f() the typical approach is:
Push the return address on the stack and jump to f()
In f():
Allocate space for the locals x,y and z. This is normally done on the stack. Take a look at this article on call stacks.
When you get to g(x,y,z) the compiler will generate code to push the values onto the stack by reading them from the stack frame of f(). Note that with the common C/C++ (cdecl) convention, parameters are pushed from right to left.
When you get to the end of f() the compiler inserts a return instruction. The top of the stack holds the address to return to (it was pushed by the call to f()).
3. There's nothing special about nested functions as everything follows the same basic template:
To call a function - push parameters and call the function.
Within the function - allocate space for local variables within a stack
Now this is the general approach. Compilers will introduce their own optimizations to improve performance; for example, a compiler may choose to pass the first 2 parameters in registers.
NOTE: Although parameter passing on the stack is by far the most common approach, there are others. Take a look at this article on register windows if you're interested in finding out more.
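To make the general approach concrete, here is the FAQ example annotated with what a typical unoptimized 32-bit, stack-based implementation might do at each step (a sketch only; real compilers and calling conventions differ, as noted above):

void g(int, int, int) {}       // placeholder callee

void f()
{
    int x = 1, y = 2, z = 3;   // prologue reserved stack space, e.g. x at [ebp-4],
                               // y at [ebp-8], z at [ebp-12]
    g(x, y, z);                // push z; push y; push x; call g
                               //   call pushes the return address,
                               //   g's prologue saves EBP and builds its own frame,
                               //   after g returns the caller removes the three arguments
    x = x + y + z;             // f reloads x, y and z from its own stack frame
}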

Visual Studio not able to show the value of 'this' in release mode (with debug information)

Original question:
Why is the this pointer 0 in a VS c++ release build?
When breaking in a Visual Studio 2008 SP1 release build with the /Zi (Compiler: Debug Information Format - Program Database) and /DEBUG (Linker: Generate Debug Info, yes) options, why are 'this'-pointers always 0x00000000?
EDIT: Rephrased question:
My original question was quite unclear, sorry for that. When using the Visual Studio 2008 debugger to step through a program I can see all variables except the local object's member variables. This is probably because the debugger derives these from the this pointer, but VS always says it is 0x00000000, so it cannot derive the current object's member variables (it does not know the memory location of the object).
When loading a megadump (like a Windows minidump, but containing the entire memory space of the process), I can look at all my local variables (defined in the function) and entire tree structures on the heap that I have pointers to.
For example: when breaking in A::foo() in Release mode
'this' will have value 0x00000000
'f_' will show garbage
Somehow this information needs to be available to the process. Is this a missing feature in VS2008? Any other debugger that does handle this properly?
class A
{
    void foo() { /* break here */ }
    int f_;
};
As some others have mentioned, compiling in Release mode makes certain optimizations (especially eliminating the use of ebp/rbp as a frame pointer) that break assumptions on which the debugger relies for figuring out your local variables. However, knowing why it happens isn't very helpful for debugging your program!
Here's a way you can work around it: at the very beginning of a method call (breaking on the first line of the function, not the opening brace), the this pointer will always be found in a specific register (ecx on 32-bit systems or rcx on 64-bit systems). The debugger knows that and so you should be able to see the value of this right at the start of your method call. You can then copy the address from the Value column and watch that specifically (as (MyObject *)0x003f00f0 or whatever), which will allow you to see into this later in the method.
If that's not good enough (for example, because you only want to stop when a bug manifests itself, which is a very small percentage of the time the given method is called), you can try this slightly more advanced (and less reliable) trick. Usually, the this pointer is taken out of ecx/rcx very early in a function call, because that is a "caller-saves" register, meaning that its value may be clobbered and not restored by function calls your method makes (it's also needed for some instructions that can only use that register for their operand, like REP* and some of the shift instructions). However, if your method uses the this pointer a lot (including the implicit use of referring to member variables or calling virtual member functions), the compiler will probably have saved this in another register, a "callee-saves" register (meaning that any function that clobbers it must restore it before returning).
The practical upshot of this is that, in your watch window, you can try looking at (MyObject *) ebp, (MyObject *) esi, and so on with other registers, until you find that you're looking at a pointer that is probably the correct one (because the member variables line up with your expectation of the contents of this at the time of your breakpoint). On x86, the callee-saved registers are ebp, esi, edi, and ebx. On x86-64, they are rbp, rsi, rdi, rbx, r12, r13, r14, and r15. If you don't want to search all those, you could always try looking at the disassembly of your function prologue to see what ecx (or rcx) is being copied into.
Local variables (including this) when viewed in the Locals window cannot be relied upon in the Release build in the way that they can in Debug builds. Whether the variable value shown is correct at any given instruction depends on how the underlying register is being used at that point. If the code runs OK in Debug it's most unlikely that the value is actually 0.
Optimization in Release builds makes values in the Locals window a crap shoot, to the naked eye. Without concurrent display and correlation of the Disassembly window, you cannot be sure that the Locals window is telling you the actual value of the variable. If you step through the code (maybe in Disassembly not Source) to a line that actually uses this, it's more likely that you will see a valid value there.
Because you wrote a bugged program and called a member function on a NULL pointer.
Edit: Reread your question. Most likely, it's because the optimizer did a number on your code and the debugger can't read it anymore. If you have a problem specific to Release build, then it's a hint that your code has a dodgy #ifdef in it, or you invoked UB that just happens to work in Debug mode. Else, debug with Debug build. However, that's not terribly helpful if you actually have a problem in Release mode you can't find.
Your function foo is inline (it's declared in the class definition, so is implicitly inline), and doesn't access any members. Therefore the optimizer will likely not actually pass the this pointer at all when it compiles the code, so it is not available to the debugger.
In release builds, the optimizer will rearrange code quite substantially in order to improve performance, particularly with inline functions (though it does optimize other functions too, especially if whole program optimization is enabled). Rather than passing this, it may instead pass a pointer to a used member directly, or even just pass the member's value in a register that it loaded for a previous function call.
Sometimes the debug info is enough that the debugger can actually piece together a this pointer, and the values of local variables. Often, it is not, and the this pointer shown in the watch window (and consequently the member variables) are nonsense.
Because it is a release build. The entire point in optimizations is to change the implementation details of the program, while preserving the overall functionality.
Does the program still work? Then it doesn't matter that the this pointer is seemingly null.
In general, when you're working with a release build, you should expect that the debugger is going to get confused. Code is going to be reordered, variables removed entirely, or containing weird unexpected values.
When optimizations are enabled, no guarantees are given about any of these things. But the compiler won't break your program. If it worked without optimizations, it'll still work with optimizations. If it suddenly doesn't work, it's because you have a bug that was only exposed because the compiler optimized and modified the code.
Are they "const" functions?
A const function is one which is declared with the keyword const, and this indicates that it will not change any of the members, only read them (like accessor functions).
An optimising compiler may not bother passing the 'this' pointer to some const functions if the function doesn't even read non-static member variables.
An optimising compiler may also search for functions which could be const, treat them as const, and then not pass a this pointer into them, leaving the debugger unable to find the hook.
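As a hedged illustration of that point: a const member function that never touches member data gives the optimizer no reason to materialize a this pointer at all, so the debugger may have nothing to show (the names below are made up):

struct Widget {
    int answer() const { return 42; }   // reads no members at all
    int value_ = 0;
};

int main()
{
    Widget w;
    return w.answer();   // likely inlined; in a release build no usable 'this' may exist
}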
It isn't the this pointer that is NULL, but rather the pointer you are using to call a member function:
class A
{
public:
    void f() {}
};

int main()
{
    A* a = NULL;
    a->f(); // D'OH! NULL pointer access ...

    // FIX
    a = new A;
    a->f(); // Aha!
}
As others have already said, you should make sure that the compiler does not do anything that can confuse the debugger, which optimizations are likely to do.
The fact that you have a NULL this pointer can happen if you call the function through a null object pointer, like:
A* b = NULL;
b->foo();
The function is not static here, but the call is resolved statically, so no valid object is needed to reach it.
The best spot to find the real this pointer is to take a look at the stack (or the registers). For non-static member functions, the this pointer is passed as the first (hidden) argument of the function, in a register or on the stack depending on the calling convention.
class A
{
    void foo() { }   // this is really "void foo(A *this)"
    int f_;
};
If your this pointer is null here, then you have a problem before the function is even called. If the pointer is correct here, then your debugger is kinda messed up.
I've been using Code::Blocks with MinGW for years now, with the built-in debugger (gdb).
I've only had problems with the this pointer when optimizations were turned on; otherwise the debugger always knows the this pointer and can dereference it at any time.