Compiler mov'es this pointer to wrong address - c++

I have a simple polymorphic construction, with one pure virtual function Foo.
The only flaw in the big project it's used in is that the project uses a couple of global statics for centralized parameter loading and for event logging (can't easily get rid of that legacy code).
Project info:
Platform toolset: v110_xp
MFC in static library
MBCS charset
calling convention: __cdecl
All optimizations disabled
Warning level 4, no warnings on the whole project
Code:
#include <cmath>

class Base
{
public:
    Base(){}
    virtual ~Base(void){}
    virtual void Foo(void) = 0;
};
class Derived
    : public Base
{
public:
    Derived(void) : Base(){}
    virtual void Foo(void) override
    {
        double a = sqrt(4.9);
        double b = -a;
    }
};
Calling code (doesn't really matter, same behaviour everywhere)
BOOL MainMFCApp::InitInstance()
{
Derived* d = new Derived();
d->Foo();
delete d;
...
}
The problem is that when run in debug (not tested with release), and when we end up inside function Foo, the this pointer is 'corrupted':
this = 0xcccccccc
this.__vfptr = <unable to read memory>
When I dive into the assembly code entering the function I see the following:
13:
14:
15: void Derived::Foo(void)
16: {
015E3570 55 push ebp
015E3571 8B EC mov ebp,esp
015E3573 83 E4 F8 and esp,0FFFFFFF8h
015E3576 81 EC EC 00 00 00 sub esp,0ECh
015E357C 53 push ebx
015E357D 56 push esi
015E357E 57 push edi
015E357F 51 push ecx
015E3580 8D BD 14 FF FF FF lea edi,[ebp-0ECh]
015E3586 B9 3B 00 00 00 mov ecx,3Bh
015E358B B8 CC CC CC CC mov eax,0CCCCCCCCh
015E3590 F3 AB rep stos dword ptr es:[edi]
015E3592 59 pop ecx
015E3593 89 8C 24 F0 00 00 00 mov dword ptr [esp+0F0h],ecx
17: double a = sqrt(4.9);
015E359A F2 0F 10 05 00 55 09 02 movsd xmm0,mmword ptr ds:[2095500h]
015E35A2 E8 72 D4 FD FF call __libm_sse2_sqrt_precise (015C0A19h)
015E35A7 F2 0F 11 84 24 E0 00 00 00 movsd mmword ptr [esp+0E0h],xmm0
18: double b = -a;
015E35B0 F2 0F 10 84 24 E0 00 00 00 movsd xmm0,mmword ptr [esp+0E0h]
015E35B9 66 0F 57 05 10 55 09 02 xorpd xmm0,xmmword ptr ds:[2095510h]
015E35C1 F2 0F 11 84 24 D0 00 00 00 movsd mmword ptr [esp+0D0h],xmm0
19: return;
20: }
015E35CA 5F pop edi
015E35CB 5E pop esi
015E35CC 5B pop ebx
015E35CD 8B E5 mov esp,ebp
015E35CF 5D pop ebp
015E35D0 C3 ret
--- No source file -------------------------------------------------------------
015E35D1 CC int 3
...
015E35EF CC int 3
Breakpoint at line 17, right before entering the function body: Using the watch window to inspect the object behind register ecx (with a cast to Derived*) shows that ecx contains the pointer I need (to the object), but for some reason it is mov'ed to the seemingly random address [esp+0F0h].
And now the really interesting/flabbergasting part: When I change
double b = -a;
to
double b = -1.0 * a;
and compile again, everything magically works. The function assembly has now changed to:
13:
14:
15: void Derived::Foo(void)
16: {
00863570 55 push ebp
00863571 8B EC mov ebp,esp
00863573 81 EC EC 00 00 00 sub esp,0ECh
00863579 53 push ebx
0086357A 56 push esi
0086357B 57 push edi
0086357C 51 push ecx
0086357D 8D BD 14 FF FF FF lea edi,[ebp-0ECh]
00863583 B9 3B 00 00 00 mov ecx,3Bh
00863588 B8 CC CC CC CC mov eax,0CCCCCCCCh
0086358D F3 AB rep stos dword ptr es:[edi]
0086358F 59 pop ecx
00863590 89 4D F8 mov dword ptr [this],ecx
17: double a = sqrt(4.9);
00863593 F2 0F 10 05 00 55 31 01 movsd xmm0,mmword ptr ds:[1315500h]
0086359B E8 79 D4 FD FF call __libm_sse2_sqrt_precise (0840A19h)
008635A0 F2 0F 11 45 E8 movsd mmword ptr [a],xmm0
18: double b = -1.0 * a;
008635A5 F2 0F 10 05 10 55 31 01 movsd xmm0,mmword ptr ds:[1315510h]
008635AD F2 0F 59 45 E8 mulsd xmm0,mmword ptr [a]
008635B2 F2 0F 11 45 D8 movsd mmword ptr [b],xmm0
19: return;
20: }
008635B7 5F pop edi
008635B8 5E pop esi
008635B9 5B pop ebx
008635BA 81 C4 EC 00 00 00 add esp,0ECh
008635C0 3B EC cmp ebp,esp
008635C2 E8 32 CD FC FF call __RTC_CheckEsp (08302F9h)
008635C7 8B E5 mov esp,ebp
008635C9 5D pop ebp
008635CA C3 ret
--- No source file -------------------------------------------------------------
008635CB CC int 3
...
008635EF CC int 3
Now the generated code correctly moves the pointer in register ecx to this. Other differences:
different memory addresses/offsets
mulsd instead of xorpd to negate the variable
the and esp,0FFFFFFF8h instruction disappeared (is it used to align the stack pointer esp?)
more cleanup after the function body (add, cmp, call __RTC_CheckEsp)
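On the alignment guess above: masking a value with 0FFFFFFF8h clears its lowest three bits, which rounds it down to a multiple of 8. The following is a minimal sketch of just that arithmetic. The function name align_down and the sample address 0x0012FF74 are hypothetical and not taken from the project, and the sketch does not by itself explain the wrong this value.

```cpp
#include <cstdint>

// Round addr down to a multiple of alignment (assumed to be a power of two).
constexpr std::uint32_t align_down(std::uint32_t addr, std::uint32_t alignment) {
    return addr & ~(alignment - 1u);
}

// "and esp,0FFFFFFF8h" corresponds to alignment == 8, because ~(8 - 1) == 0xFFFFFFF8.
static_assert(~(8u - 1u) == 0xFFFFFFF8u, "mask for 8-byte alignment");

// Hypothetical stack pointer values, for illustration only.
static_assert(align_down(0x0012FF74u, 8u) == 0x0012FF70u, "rounded down to a multiple of 8");
static_assert(align_down(0x0012FF70u, 8u) == 0x0012FF70u, "already aligned, unchanged");
```

After such an adjustment the distance between ebp and esp is no longer a compile-time constant, which is consistent with the first listing addressing a and b relative to esp while the second listing addresses them relative to ebp.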
The assembly part where the parameters get pushed to the stack is the same for both situations:
53: d->Foo();
011A500B 8B 45 E0 mov eax,dword ptr [d]
011A500E 8B 10 mov edx,dword ptr [eax]
011A5010 8B F4 mov esi,esp
011A5012 8B 4D E0 mov ecx,dword ptr [d]
011A5015 8B 42 04 mov eax,dword ptr [edx+4]
011A5018 FF D0 call eax
Of course, when I try to replicate this with a Minimal, Complete, and Verifiable example, everything works as intended. But in my big project, it fails consistently.
I'm not sure which parameters can influence compilation, and I don't know enough assembly to even see what's going on there;
therefore I'm asking here in the hope that someone has seen this before or recognizes this behaviour.
Note: it also works again, when I remove the sqrt call.
Update:
No problems in release
VS2012 SP4 (v11.0.61030.00)
problem persists when referencing member variables (instead of no member references)
TODO: try without global statics

C++ corrupts registers [duplicate]

When calling a C++ function, the RAX and RCX registers are changed.
How can I force the compiler to preserve the values of the registers?
I could use push/pop around the call, but I plan to call more complex functions and it is important for me that the registers are not corrupted (even for floating point numbers).
Architecture: amd64
IDE: Visual Studio 2019
.data
extern TestCall: proto
.code
HookFunc proc
mov rax, [rbp + 8h]
call TestCall
ret
HookFunc endp
end
extern "C" void TestCall()
{
cout << "TestCall" << endl;
}
Disassembler
extern "C" __declspec(dllexport) void TestCall()
{
00007FFAA9C76FB0 40 55 push rbp
00007FFAA9C76FB2 57 push rdi
00007FFAA9C76FB3 48 81 EC E8 00 00 00 sub rsp,0E8h
00007FFAA9C76FBA 48 8D 6C 24 20 lea rbp,[rsp+20h]
00007FFAA9C76FBF 48 8D 0D 70 B0 01 00 lea rcx,[__ED185583_dllmain@cpp (07FFAA9C92036h)]
00007FFAA9C76FC6 E8 DD A7 FF FF call __CheckForDebuggerJustMyCode (07FFAA9C717A8h)
cout << "TestCall" << endl;
00007FFAA9C76FCB 48 8D 15 EE EF 00 00 lea rdx,[string "TestCall" (07FFAA9C85FC0h)]
00007FFAA9C76FD2 48 8B 0D 7F 82 01 00 mov rcx,qword ptr [__imp_std::cout (07FFAA9C8F258h)]
00007FFAA9C76FD9 E8 FE A0 FF FF call std::operator<<<std::char_traits<char> > (07FFAA9C710DCh)
00007FFAA9C76FDE 48 8D 15 66 A0 FF FF lea rdx,[std::endl<char,std::char_traits<char> > (07FFAA9C7104Bh)]
00007FFAA9C76FE5 48 8B C8 mov rcx,rax
00007FFAA9C76FE8 FF 15 92 82 01 00 call qword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (07FFAA9C8F280h)]
}
00007FFAA9C76FEE 48 8D A5 C8 00 00 00 lea rsp,[rbp+0C8h]
00007FFAA9C76FF5 5F pop rdi
00007FFAA9C76FF6 5D pop rbp
00007FFAA9C76FF7 C3 ret

C++ explicit template definition - code is still duplicated

I have some template code that implements pretty heavy computations, but I only need it for floats and doubles. The goal is that the template instantiation is only done once in one compilation unit and not repeated for every file.
I tried to follow the ideas from the following Stackoverflow posts:
using extern template (C++11)
Can I put `extern template` into a header file?
separating constructor implementation with template from header file
and similar duplicate questions. I came up with the following test to illustrate the issue:
A.h
#pragma once
#include <cmath>
template<typename T>
struct A
{
static T foo(T a, T b)
{
//do some heavy computations
T v1 = pow(a, b);
return pow(v1, b);
}
};
//explicit template instantiations, the declaration
extern template struct A<float>;
extern template struct A<double>;
A.cpp
#include "A.h"
//explicit template instantiations, the definition
template struct A<float>;
template struct A<double>;
Main.cpp
#include "A.h"
int main()
{
//use A
float result = A<float>::foo(0, 0);
return (int)result; //return it so that it doesn't get optimized away
}
When I now look at the generated .obj file (dumpbin /DISASM), I get the following output:
A.obj
Dump of file A.obj
File Type: COFF OBJECT
?foo@?$A@M@@SAMMM@Z (public: static float __cdecl A<float>::foo(float,float)):
0000000000000000: F3 0F 11 4C 24 10 movss dword ptr [rsp+10h],xmm1
0000000000000006: F3 0F 11 44 24 08 movss dword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
0000000000000031: F3 0F 10 85 00 01 movss xmm0,dword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call ?pow@@YAMMM@Z
000000000000003E: F3 0F 11 45 04 movss dword ptr [rbp+4],xmm0
0000000000000043: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
000000000000004B: F3 0F 10 45 04 movss xmm0,dword ptr [rbp+4]
0000000000000050: E8 00 00 00 00 call ?pow@@YAMMM@Z
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret
?foo@?$A@N@@SANNN@Z (public: static double __cdecl A<double>::foo(double,double)):
0000000000000000: F2 0F 11 4C 24 10 movsd mmword ptr [rsp+10h],xmm1
0000000000000006: F2 0F 11 44 24 08 movsd mmword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F2 0F 10 8D 08 01 movsd xmm1,mmword ptr [rbp+108h]
00 00
0000000000000031: F2 0F 10 85 00 01 movsd xmm0,mmword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call pow
000000000000003E: F2 0F 11 45 08 movsd mmword ptr [rbp+8],xmm0
0000000000000043: F2 0F 10 8D 08 01 movsd xmm1,mmword ptr [rbp+108h]
00 00
000000000000004B: F2 0F 10 45 08 movsd xmm0,mmword ptr [rbp+8]
0000000000000050: E8 00 00 00 00 call pow
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret
....
Main.obj
Dump of file Main.obj
File Type: COFF OBJECT
?foo@?$A@M@@SAMMM@Z (public: static float __cdecl A<float>::foo(float,float)):
0000000000000000: F3 0F 11 4C 24 10 movss dword ptr [rsp+10h],xmm1
0000000000000006: F3 0F 11 44 24 08 movss dword ptr [rsp+8],xmm0
000000000000000C: 55 push rbp
000000000000000D: 57 push rdi
000000000000000E: 48 81 EC 18 01 00 sub rsp,118h
00
0000000000000015: 48 8D 6C 24 30 lea rbp,[rsp+30h]
000000000000001A: 48 8B FC mov rdi,rsp
000000000000001D: B9 46 00 00 00 mov ecx,46h
0000000000000022: B8 CC CC CC CC mov eax,0CCCCCCCCh
0000000000000027: F3 AB rep stos dword ptr [rdi]
0000000000000029: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
0000000000000031: F3 0F 10 85 00 01 movss xmm0,dword ptr [rbp+100h]
00 00
0000000000000039: E8 00 00 00 00 call ?pow@@YAMMM@Z
000000000000003E: F3 0F 11 45 04 movss dword ptr [rbp+4],xmm0
0000000000000043: F3 0F 10 8D 08 01 movss xmm1,dword ptr [rbp+108h]
00 00
000000000000004B: F3 0F 10 45 04 movss xmm0,dword ptr [rbp+4]
0000000000000050: E8 00 00 00 00 call ?pow@@YAMMM@Z
0000000000000055: 48 8D A5 E8 00 00 lea rsp,[rbp+0E8h]
00
000000000000005C: 5F pop rdi
000000000000005D: 5D pop rbp
000000000000005E: C3 ret
....
A::foo is instantiated in A.obj as expected. But the code is again put into Main.obj as well, completely ignoring the extern keyword.
How can I tell the compiler (Visual Studio 2017, Release mode) to NOT inline the method, but to use the version from A.obj?
You can do that with __declspec(noinline).
But the inlined version will likely be faster. If you worry about binary size, your .exe file will only have a single instance of that function. The code from A.obj is unused and will be discarded by the linker during the dead code elimination step.
Update: Put this in your A.h:
static __declspec( noinline ) T foo( T a, T b )
{
//do some heavy computations
T v1 = pow( a, b );
return pow( v1, b );
}
I've built with Visual C++ 2017 15.6.7, Release, 32 and 64 bits. For both platforms, Main.cpp compiles to this:
; Line 5
call ?foo@?$A@M@@SAMMM@Z ; A<float>::foo
; Line 6
cvttss2si eax, xmm0
However, if you're doing this to decrease compilation time, I'm not sure noinline is going to help. Instead, remove the function body from A.h (leave only the declaration) and move it into A.cpp. Ideally, also remove the Eigen headers from A.h (or leave the bare minimum that defines the data structures) and include the Eigen headers in A.cpp.
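The split described in the last paragraph could look roughly like the sketch below. It is written as a single listing so that it is self-contained; the comments mark which part would go into A.h and which into A.cpp, and that file layout is an assumption based on the question rather than something tested with Visual Studio 2017. Because the body of foo is no longer visible to translation units that only include A.h, they cannot instantiate or inline it themselves and have to link against the instantiations from A.cpp.

```cpp
#include <cmath>

// ---- would live in A.h ----------------------------------------------
template <typename T>
struct A
{
    static T foo(T a, T b);   // declaration only, no body in the header
};

// explicit instantiation declarations
extern template struct A<float>;
extern template struct A<double>;

// ---- would live in A.cpp (after #include "A.h") ----------------------
template <typename T>
T A<T>::foo(T a, T b)
{
    // the heavy computation from the question
    T v1 = std::pow(a, b);
    return std::pow(v1, b);
}

// explicit instantiation definitions: the only place the code is generated
template struct A<float>;
template struct A<double>;
```

With this layout, using A with any type other than float or double fails at link time, which is usually the intended trade-off when only those two types are needed.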

C++ assembly code analysis (compiled with clang)

I am trying to figure out what the C++ binary code looks like, especially for virtual function calls. I have come across a few curious things. I have the following C++ code:
#include <iostream>
using namespace std;
class Base {
public:
virtual void print() { cout << "from base" << endl; }
};
class Derived : public Base {
public:
virtual void print() { cout << "from derived" << endl; }
};
int main() {
Base *b;
Derived d;
d.print();
b = &d;
b->print();
return 0;
}
I compiled it with clang++, and then used objdump:
00000000004008b0 <main>:
4008b0: 55 push rbp
4008b1: 48 89 e5 mov rbp,rsp
4008b4: 48 83 ec 20 sub rsp,0x20
4008b8: 48 8d 7d e8 lea rdi,[rbp-0x18]
4008bc: c7 45 fc 00 00 00 00 mov DWORD PTR [rbp-0x4],0x0
4008c3: e8 28 00 00 00 call 4008f0 <Derived::Derived()>
4008c8: 48 8d 7d e8 lea rdi,[rbp-0x18]
4008cc: e8 5f 00 00 00 call 400930 <Derived::print()>
4008d1: 48 8d 7d e8 lea rdi,[rbp-0x18]
4008d5: 48 89 7d f0 mov QWORD PTR [rbp-0x10],rdi
4008d9: 48 8b 7d f0 mov rdi,QWORD PTR [rbp-0x10]
4008dd: 48 8b 07 mov rax,QWORD PTR [rdi]
4008e0: ff 10 call QWORD PTR [rax]
4008e2: 31 c0 xor eax,eax
4008e4: 48 83 c4 20 add rsp,0x20
4008e8: 5d pop rbp
4008e9: c3 ret
4008ea: 66 0f 1f 44 00 00 nop WORD PTR [rax+rax*1+0x0]
My question is why in assembly code, we have the following code:
4008b8: 48 8d 7d e8 lea rdi,[rbp-0x18]
4008d1: 48 8d 7d e8 lea rdi,[rbp-0x18]
The local variable d in main() is stored at location [rbp-0x18]. This is in the automatic storage allocated on the stack for main().
lea rdi,[rbp-0x18]
This line loads the address of d into the rdi register. By convention, member functions of Derived treat rdi as the this pointer.
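To make the role of rdi more concrete, the virtual call at 4008dd (mov rax,[rdi] followed by call [rax]) can be imitated by hand. The sketch below is only a model: Object, VTable, PrintFn and call_print are hypothetical names, the functions return strings instead of printing, and a real compiler's object and vtable layout is defined by its ABI, not by this code.

```cpp
#include <string>

struct Object;                                    // stand-in for a polymorphic class
using PrintFn = std::string (*)(const Object*);   // each slot receives the object as its first argument

struct VTable { PrintFn print; };                 // slot 0, the entry reached by "call [rax]"

struct Object {
    const VTable* vptr;                           // first member, the value read by "mov rax,[rdi]"
};

std::string base_print(const Object*)    { return "from base"; }
std::string derived_print(const Object*) { return "from derived"; }

const VTable base_vtable{base_print};
const VTable derived_vtable{derived_print};

// Hand-written equivalent of: mov rax,[rdi] ; call [rax]
std::string call_print(const Object* self) {
    const VTable* table = self->vptr;   // load the table pointer stored in the object
    return table->print(self);          // call slot 0, passing the object itself as "this"
}
```

The pointer passed as self plays the part of rdi: it is needed both to find the table and as the hidden this argument of the function that is finally called, which is why the address of d is loaded into rdi before each call in main.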

Why does GAS inline assembly wrapped in a function generate different instructions for the caller than a pure assembly function

I've been writing some basic functions using GCC's asm to practice for an actual application.
My functions pretty, wrap, and pure generate the same instructions to unpack a 64 bit integer into a 128 bit vector. add1 and add2 which call pretty and wrap respectively also generate the same instructions. But add3 differs by saving its xmm0 register by pushing it to the stack rather than by copying it to another xmm register. This I don't understand because the compiler can see the details of pure to know none of the other xmm registers will be clobbered.
Here is the C++
#include <immintrin.h>
__m128i pretty(long long b) { return (__m128i){b,b}; }
__m128i wrap(long long b) {
asm ("mov qword ptr [rsp-0x10], rdi\n"
"vmovddup xmm0, qword ptr [rsp-0x10]\n"
:
: "r"(b)
);
}
extern "C" __m128i pure(long long b);
asm (".text\n.global pure\n\t.type pure, @function\n"
"pure:\n\t"
"mov qword ptr [rsp-0x10], rdi\n\t"
"vmovddup xmm0, qword ptr [rsp-0x10]\n\t"
"ret\n\t"
);
__m128i add1(__m128i in, long long in2) { return in + pretty(in2);}
__m128i add2(__m128i in, long long in2) { return in + wrap(in2);}
__m128i add3(__m128i in, long long in2) { return in + pure(in2);}
Compiled with g++ -c so.cpp -march=native -masm=intel -O3 -fno-inline and disassembled with objdump -d -M intel so.o | c++filt.
so.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <pure>:
0: 48 89 7c 24 f0 mov QWORD PTR [rsp-0x10],rdi
5: c5 fb 12 44 24 f0 vmovddup xmm0,QWORD PTR [rsp-0x10]
b: c3 ret
c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
0000000000000010 <pretty(long long)>:
10: 48 89 7c 24 f0 mov QWORD PTR [rsp-0x10],rdi
15: c5 fb 12 44 24 f0 vmovddup xmm0,QWORD PTR [rsp-0x10]
1b: c3 ret
1c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
0000000000000020 <wrap(long long)>:
20: 48 89 7c 24 f0 mov QWORD PTR [rsp-0x10],rdi
25: c5 fb 12 44 24 f0 vmovddup xmm0,QWORD PTR [rsp-0x10]
2b: c3 ret
2c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
0000000000000030 <add1(long long __vector(2), long long)>:
30: c5 f8 28 c8 vmovaps xmm1,xmm0
34: 48 83 ec 08 sub rsp,0x8
38: e8 00 00 00 00 call 3d <add1(long long __vector(2), long long)+0xd>
3d: 48 83 c4 08 add rsp,0x8
41: c5 f9 d4 c1 vpaddq xmm0,xmm0,xmm1
45: c3 ret
46: 66 2e 0f 1f 84 00 00 nop WORD PTR cs:[rax+rax*1+0x0]
4d: 00 00 00
0000000000000050 <add2(long long __vector(2), long long)>:
50: c5 f8 28 c8 vmovaps xmm1,xmm0
54: 48 83 ec 08 sub rsp,0x8
58: e8 00 00 00 00 call 5d <add2(long long __vector(2), long long)+0xd>
5d: 48 83 c4 08 add rsp,0x8
61: c5 f9 d4 c1 vpaddq xmm0,xmm0,xmm1
65: c3 ret
66: 66 2e 0f 1f 84 00 00 nop WORD PTR cs:[rax+rax*1+0x0]
6d: 00 00 00
0000000000000070 <add3(long long __vector(2), long long)>:
70: 48 83 ec 18 sub rsp,0x18
74: c5 f8 29 04 24 vmovaps XMMWORD PTR [rsp],xmm0
79: e8 00 00 00 00 call 7e <add3(long long __vector(2), long long)+0xe>
7e: c5 f9 d4 04 24 vpaddq xmm0,xmm0,XMMWORD PTR [rsp]
83: 48 83 c4 18 add rsp,0x18
87: c3 ret
GCC does not understand assembly language.
Since pure is an external function as far as the compiler is concerned, GCC cannot determine which registers it alters, so according to the ABI it has to assume that all the xmm registers are changed.
wrap has undefined behaviour: the asm statement overwrites xmm0 and the memory at [rsp-0x10], neither of which is listed as an output or a clobber, and the function has no return statement, so the returned value may or may not depend on b.
Edit: The ABI does not apply to inline assembly. I expect your program will not work if you remove -fno-inline from the command line.
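For comparison, here is a sketch of how wrap could be written so that the compiler knows what the asm statement produces. It is an illustration under several assumptions: the name wrap_fixed and the helper lane are hypothetical, it targets x86-64 with GCC or Clang, it uses the default AT&T operand order (so it is not meant to be compiled with -masm=intel), and it uses SSE2 instructions instead of vmovddup so that no memory below rsp is touched.

```cpp
#include <immintrin.h>

// Broadcast a 64-bit integer into both lanes of a 128-bit vector.
// The result is declared as an output operand, so the compiler picks the
// register and knows which value the statement defines.
__m128i wrap_fixed(long long b) {
    __m128i out;
    asm("movq %1, %0\n\t"        // low lane of out = b
        "punpcklqdq %0, %0"      // high lane of out = low lane of out
        : "=x"(out)              // output: any SSE register
        : "r"(b));               // input: any general-purpose register
    return out;                  // explicit return, unlike the original wrap
}

// Helper for inspecting one lane (i must be 0 or 1).
long long lane(__m128i v, int i) {
    long long tmp[2];
    _mm_storeu_si128(reinterpret_cast<__m128i*>(tmp), v);
    return tmp[i];
}
```

Because nothing outside the listed operands is modified, no clobber list is needed here, and the function should keep working when it is inlined into a caller.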

Is accessing c++ member class through "this->member" faster/slower than implicit call to "member"

After some searching on our friend Google, I could not get a clear view on the following point.
I'm used to accessing class members with this->. Even if it is not needed, I find it more explicit, as it helps when maintaining some heavy piece of algorithm with loads of variables.
As I'm working on a supposed-to-be-optimised algorithm, I was wondering whether using this-> would alter runtime performance or not.
Does it?
No, the call is exactly the same in both cases.
It doesn't make any difference. Here's a demonstration with GCC. The source is a simple class, but I've restricted this post to the difference for clarity.
% diff -s with-this.cpp without-this.cpp
7c7
< this->x = 5;
---
> x = 5;
% g++ -c with-this.cpp without-this.cpp
% diff -s with-this.o without-this.o
Files with-this.o and without-this.o are identical
The answer has already been given by zennehoy; here is the assembly code (generated by the Microsoft C++ compiler) for a simple test class:
class C
{
int n;
public:
void boo(){n = 1;}
void goo(){this->n = 2;}
};
int main()
{
C c;
c.boo();
c.goo();
return 0;
}
Disassembly Window in Visual Studio shows that assembly code is the same for both functions:
class C
{
int n;
public:
void boo(){n = 1;}
001B2F80 55 push ebp
001B2F81 8B EC mov ebp,esp
001B2F83 81 EC CC 00 00 00 sub esp,0CCh
001B2F89 53 push ebx
001B2F8A 56 push esi
001B2F8B 57 push edi
001B2F8C 51 push ecx
001B2F8D 8D BD 34 FF FF FF lea edi,[ebp-0CCh]
001B2F93 B9 33 00 00 00 mov ecx,33h
001B2F98 B8 CC CC CC CC mov eax,0CCCCCCCCh
001B2F9D F3 AB rep stos dword ptr es:[edi]
001B2F9F 59 pop ecx
001B2FA0 89 4D F8 mov dword ptr [ebp-8],ecx
001B2FA3 8B 45 F8 mov eax,dword ptr [this]
001B2FA6 C7 00 01 00 00 00 mov dword ptr [eax],1
001B2FAC 5F pop edi
001B2FAD 5E pop esi
001B2FAE 5B pop ebx
001B2FAF 8B E5 mov esp,ebp
001B2FB1 5D pop ebp
001B2FB2 C3 ret
...
--- ..\main.cpp -----------------------------
void goo(){this->n = 2;}
001B2FC0 55 push ebp
001B2FC1 8B EC mov ebp,esp
001B2FC3 81 EC CC 00 00 00 sub esp,0CCh
001B2FC9 53 push ebx
001B2FCA 56 push esi
001B2FCB 57 push edi
001B2FCC 51 push ecx
001B2FCD 8D BD 34 FF FF FF lea edi,[ebp-0CCh]
001B2FD3 B9 33 00 00 00 mov ecx,33h
001B2FD8 B8 CC CC CC CC mov eax,0CCCCCCCCh
001B2FDD F3 AB rep stos dword ptr es:[edi]
001B2FDF 59 pop ecx
001B2FE0 89 4D F8 mov dword ptr [ebp-8],ecx
001B2FE3 8B 45 F8 mov eax,dword ptr [this]
001B2FE6 C7 00 02 00 00 00 mov dword ptr [eax],2
001B2FEC 5F pop edi
001B2FED 5E pop esi
001B2FEE 5B pop ebx
001B2FEF 8B E5 mov esp,ebp
001B2FF1 5D pop ebp
001B2FF2 C3 ret
And the code in the main:
C c;
c.boo();
001B2F0E 8D 4D F8 lea ecx,[c]
001B2F11 E8 00 E4 FF FF call C::boo (1B1316h)
c.goo();
001B2F16 8D 4D F8 lea ecx,[c]
001B2F19 E8 29 E5 FF FF call C::goo (1B1447h)
The Microsoft compiler uses the __thiscall calling convention by default for class member function calls, and the this pointer is passed in the ECX register.
There are several layers involved in the compilation of a language.
The difference between accessing member as member, this->member, MyClass::member etc... is a syntactic difference.
More precisely, it's a matter of name lookup, and how the front-end of the compiler will "find" the exact element you are referring to. Therefore, you might speed up compilation by being more precise... though it will be unnoticeable (there are much more time-consuming tasks involved in C++, like opening all those includes).
Since (in this case) you are referring to the same element, it should not matter.
Now, an interesting parallel can be drawn with interpreted languages. In an interpreted language, the name lookup is delayed until the moment the line (or function) is executed. Therefore, it could have an impact at runtime (though once again, probably not a really noticeable one).
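The three spellings mentioned above (member, this->member, MyClass::member) can be checked against each other in a few lines. This is a small sketch with a hypothetical class Counter; it only shows that all three expressions name the same object and says nothing about the code a particular compiler generates.

```cpp
struct Counter {
    int n = 0;

    void set_plain(int v)     { n = v; }            // implicit this
    void set_this(int v)      { this->n = v; }      // explicit this
    void set_qualified(int v) { Counter::n = v; }   // qualified name

    // All three spellings refer to the same member of *this.
    // The parentheses around Counter::n matter: without them, &Counter::n
    // would form a pointer to member instead of the address of this->n.
    bool spellings_agree() {
        return &n == &this->n && &n == &(Counter::n);
    }
};
```

Which spelling to use is therefore a matter of readability and of name lookup at compile time, as the answers above conclude.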