Following up on this question about static methods in managed code, I'm interested in whether the answers there are relevant to unmanaged code like C++.
I create thousands of instances, and my question is mainly about static methods. Do these methods save memory compared to regular methods?
Thank you, and sorry about my poor English.
All methods require their binary code to be in memory in order to run. The executable code for static and non-static methods is (largely) the same.
Both types of methods require only one place in memory, so they're not replicated with every instance of the class.
Let's now take a look at some code:
class A
{
public:
void foo();
static void goo();
};
void A::foo()
{
004113D0 push ebp
004113D1 mov ebp,esp
004113D3 sub esp,0CCh
004113D9 push ebx
004113DA push esi
004113DB push edi
004113DC push ecx
004113DD lea edi,[ebp-0CCh]
004113E3 mov ecx,33h
004113E8 mov eax,0CCCCCCCCh
004113ED rep stos dword ptr es:[edi]
004113EF pop ecx
004113F0 mov dword ptr [ebp-8],ecx
}
004113F3 pop edi
004113F4 pop esi
004113F5 pop ebx
004113F6 mov esp,ebp
004113F8 pop ebp
004113F9 ret
void A::goo()
{
00411530 push ebp
00411531 mov ebp,esp
00411533 sub esp,0C0h
00411539 push ebx
0041153A push esi
0041153B push edi
0041153C lea edi,[ebp-0C0h]
00411542 mov ecx,30h
00411547 mov eax,0CCCCCCCCh
0041154C rep stos dword ptr es:[edi]
}
0041154E pop edi
0041154F pop esi
00411550 pop ebx
00411551 mov esp,ebp
00411553 pop ebp
00411554 ret
int main()
{
A a;
a.foo();
0041141E lea ecx,[a]
00411421 call foo (4111E5h)
a.goo();
00411426 call A::goo (4111EAh)
return 0;
}
There are only minor differences, such as passing the this pointer in ecx and saving it on the stack for the non-static function, and a decent optimizer will probably reduce the differences even further.
A decision about whether or not to use static functions should be strictly design-driven, not memory-driven.
Static methods are essentially just free functions and so their memory footprint is the same. Member functions have an extra parameter and so the added memory is slightly larger, although it's meaningless to care about such things.
The amount of memory a function takes up is per-class, not per-instance. You shouldn't be concerned.
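To make the per-class point concrete, here is a minimal sketch (the struct names are made up for illustration) showing that adding static or non-virtual member functions does not change the size of each instance:

```cpp
#include <cstddef>

// Hypothetical types, for illustration only.
struct DataOnly {
    int value;
};

struct WithMethods {
    int value;
    void foo() { ++value; }         // non-static member function
    static int goo() { return 7; }  // static member function
};

// Member functions are stored once, with the program's code, not inside
// each object, so both types have the same per-instance size.
constexpr std::size_t data_only_size = sizeof(DataOnly);
constexpr std::size_t with_methods_size = sizeof(WithMethods);
static_assert(sizeof(DataOnly) == sizeof(WithMethods),
              "member functions add no per-instance storage");
```

Virtual functions are a different matter: implementations typically add a hidden table pointer to every instance, but that cost is the same no matter how many virtual functions the class has.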
Short answer: No. A method is a function with an implicit first argument that points to the object it is called on, and a static function lacks this first argument. The situation is just the same as in garbage-collected languages, so the answers to the other question apply fully.
The difference between a static and an instance method is just the first parameter. In C++ all instance methods compile to a normal function with an implicit first parameter called this, which is a pointer to the object on which the method was called.
On 64-bit architectures this is an 8-byte value (4 bytes on 32-bit ones), so it's not really significant unless you're doing some very resource-strict embedded systems coding.
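As a rough sketch of that equivalence (Counter and its members are hypothetical names, not from the question), the same operation can be written as an instance method with an implicit this and as a static method that takes the object explicitly:

```cpp
#include <functional>

// Hypothetical class, for illustration only.
class Counter {
public:
    explicit Counter(int start) : count_(start) {}

    // Instance method: the object arrives through the implicit this pointer.
    int add(int n) { count_ += n; return count_; }

    // Static method: no this, so the object is passed as an ordinary parameter.
    static int add_static(Counter* self, int n) {
        self->count_ += n;
        return self->count_;
    }

private:
    int count_;
};

int call_both() {
    Counter c(10);
    int via_member = c.add(1);                    // count_ becomes 11
    int via_static = Counter::add_static(&c, 1);  // count_ becomes 12
    // std::mem_fn exposes the hidden first argument at the call site.
    int via_mem_fn = std::mem_fn(&Counter::add)(&c, 1);  // count_ becomes 13
    return via_member + via_static + via_mem_fn;
}
```

Either form occupies one copy of code in the program; the only per-call difference is one extra pointer argument.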
Do most modern compilers end up optimizing the following code so that extra instructions aren't used for the object inner?
func Test(TypeObject *object):
InnerTypedObject *inner = object->inner
print(inner->a)
print(inner->b)
print(inner->c)
I figured that compilers would be able to figure out that inner->a and object->inner.a refer to the same thing, so it would avoid allocating inner altogether. I figured the local variable is probably saved on a register, so I'm not really concerned about performance. Mainly wanted to know if we'd get the same generated machine code.
Thanks to Jerry Coffin for the comment - my original answer was actually quite wrong...
For this code:
struct TypeObject {
int a;
int b;
int c;
};
void print(int x);
void test(TypeObject *object) {
print(object->a);
print(object->b);
print(object->c);
}
https://godbolt.org/g/SrNWkp produces something like this:
test(TypeObject*):
push rbx // save the rbx register
mov rbx, rdi // copy the parameter (which is "object") to rbx
mov edi, DWORD PTR [rbx] // copy object->a to edi
call print(int)
mov edi, DWORD PTR [rbx+4] // copy object->b to edi
call print(int)
mov edi, DWORD PTR [rbx+8] // copy object->c to edi
pop rbx // restore rbx
jmp print(int) // tail call: print returns straight to test's caller
And for this code:
struct InnerTypedObject {
int a;
int b;
int c;
};
struct TypeObject {
InnerTypedObject * inner;
};
void print(int x);
void test(TypeObject *object) {
InnerTypedObject *inner = object->inner;
print(inner->a);
print(inner->b);
print(inner->c);
}
https://godbolt.org/g/NC2pa3 produces something like this:
test(TypeObject*):
push rbx // save the rbx register
mov rbx, QWORD PTR [rdi] // copy "*object" (which is "inner") to rbx
mov edi, DWORD PTR [rbx] // copy inner->a to edi
call print(int)
mov edi, DWORD PTR [rbx+4] // copy inner->b to edi
call print(int)
mov edi, DWORD PTR [rbx+8] // copy inner->c to edi
pop rbx // restore rbx
jmp print(int) // tail call: print returns straight to test's caller
So the code is still dereferencing object - it stores the pointer once and then uses it three times just like the original code did. The reason for not being able to optimize it better is that what is stored in a pointer is extremely hard to track so the optimizer has to assume it doesn't know what is in there for sure.
Even though both bits of assembly have the same number of instructions, there is an extra memory dereference in the one with "inner" so it could be expensive if the data isn't already in the cache.
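To illustrate why the optimizer has to be careful about what a pointer refers to, here is a small sketch. It assumes a print() with a side effect and a couple of global objects; all of these are hypothetical and exist only for the demonstration. Because print() is an opaque call that may change object->inner, re-reading object->inner after each call and caching it in a local are not always the same program:

```cpp
#include <vector>

struct InnerTypedObject { int a; int b; int c; };
struct TypeObject { InnerTypedObject* inner; };

// Hypothetical globals used only for this illustration.
static InnerTypedObject first{1, 2, 3};
static InnerTypedObject second{10, 20, 30};
static TypeObject global_object{&first};
static std::vector<int> printed;

// A print() with a side effect: it records x, then retargets the inner pointer.
void print(int x) {
    printed.push_back(x);
    global_object.inner = &second;
}

// Re-reads object->inner before every member access.
void test_reload(TypeObject* object) {
    print(object->inner->a);
    print(object->inner->b);
    print(object->inner->c);
}

// Reads object->inner once into a local, as in the question.
void test_cached(TypeObject* object) {
    InnerTypedObject* inner = object->inner;
    print(inner->a);
    print(inner->b);
    print(inner->c);
}

// Resets the globals, runs one variant, and returns what was printed.
std::vector<int> run(void (*fn)(TypeObject*)) {
    printed.clear();
    global_object.inner = &first;
    fn(&global_object);
    return printed;
}
```

With this print(), run(test_reload) records 1, 20, 30 while run(test_cached) records 1, 2, 3, so a compiler can only merge the two forms when it can prove nothing modifies the pointer in between.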
I have a question about performance. I think this can also apply to other languages (not only C++).
Imagine that I have this function:
int addNumber(int a, int b){
int result = a + b;
return result;
}
Is there any performance improvement if I write the code above like this?
int addNumber(int a, int b){
return a + b;
}
I have this question because the second function doesn't declare a third variable. But would the compiler detect this in the first version?
To answer this question you can look at the generated assembler code. With -O2, x86-64 gcc 6.2 generates exactly the same code for both methods:
addNumber(int, int):
lea eax, [rdi+rsi]
ret
addNumber2(int, int):
lea eax, [rdi+rsi]
ret
Only without optimization turned on, there is a difference:
addNumber(int, int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-20], edi
mov DWORD PTR [rbp-24], esi
mov edx, DWORD PTR [rbp-20]
mov eax, DWORD PTR [rbp-24]
add eax, edx
mov DWORD PTR [rbp-4], eax
mov eax, DWORD PTR [rbp-4]
pop rbp
ret
addNumber2(int, int):
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], edi
mov DWORD PTR [rbp-8], esi
mov edx, DWORD PTR [rbp-4]
mov eax, DWORD PTR [rbp-8]
add eax, edx
pop rbp
ret
However, a performance comparison without optimization is meaningless.
In principle there is no difference between the two approaches. The majority of compilers have handled this type of optimisation for some decades.
Additionally, if the function can be inlined (e.g. its definition is visible to the compiler when compiling code that uses such a function) the majority of compilers will eliminate the function altogether, and simply emit code to add the two variables passed and store the result as required by the caller.
Obviously, the comments above assume compiling with a relevant optimisation setting (e.g. not doing a debug build without optimisation).
Personally, I would not write such a function anyway. It is easier, in the caller, to write c = a + b instead of c = addNumber(a, b), so having a function like that offers no benefit to either programmer (effort to understand) or program (performance, etc). You might as well write comments that give no useful information.
c = a + b; // add a and b and store into c
Any self-respecting code reviewer would complain bitterly about uninformative functions or uninformative comments.
I'd only use such a function if its name conveyed some special meaning (i.e. more than just adding two values) for the application:
c = FunkyOperation(a,b);
int FunkyOperation(int a, int b)
{
/* Many useful ways of implementing this operation.
One of those ways happens to be addition, but we need to
go through 25 pages of obscure mathematical proof to
realise that
*/
return a + b;
}
I know this has been discussed a few times, but my situation is a bit different.
I have a third-party dll exporting some classes. Unfortunately, the header file is not available.
It is still possible to call exported functions. But I cannot get around passing the right 'this' pointer (which is passed in RCX register).
First I use dumpbin /exports to extract the function names (names are changed as the third-party library and function names are confidential).
4873 1308 0018B380 ?GetId@ThirdPartyClass@ThirdPartyNamespace@@QEBAJXZ = ??GetId@ThirdPartyClass@ThirdPartyNamespace@@QEBAJXZ (public: long __cdecl ThirdPartyNamespace::ThirdPartyClass::GetId(void)const )
Now, the API allows me to register my callback that receives a pointer to ThirdPartyNamespace::ThirdPartyClass (there is only forward declaration of ThirdPartyClass).
Here is how I am trying to call ThirdPartyNamespace::ThirdPartyClass::GetId():
long (ThirdPartyNamespace::ThirdPartyClass::*_pFnGetId)() const;
HMODULE hModule = GetModuleHandle("ThirdPartyDLL.dll");
*(FARPROC*)&_pFnGetId = GetProcAddress(hModule, "?GetId@ThirdPartyClass@ThirdPartyNamespace@@QEBAJXZ");
long id = (ptr->*_pFnGetId)();
Everything looks fine (i.e. if I step in, I do end up inside the ThirdPartyClass::GetId method), but the this pointer is wrong. ptr itself is good: if I manually change rcx to ptr in the debugger, it works fine. But for some reason the compiler does not pass ptr. Here is the disassembly:
long id = (ptr->*_pFnGetId)();
000000005C882362 movsxd rax,dword ptr [rdi+30h]
000000005C882366 test eax,eax
000000005C882368 jne MyClass::MyCallback+223h (05C882373h)
000000005C88236A movsxd rcx,dword ptr [rdi+28h]
000000005C88236E add rcx,rsi
000000005C882371 jmp MyClass::MyCallback+240h (05C882390h)
000000005C882373 movsxd r8,dword ptr [rdi+2Ch]
000000005C882377 mov rcx,rax
000000005C88237A mov rax,qword ptr [r8+rsi]
000000005C88237E movsxd rdx,dword ptr [rax+rcx]
000000005C882382 movsxd rcx,dword ptr [rdi+28h]
000000005C882386 lea rax,[r8+rdx]
000000005C88238A add rcx,rax
000000005C88238D add rcx,rsi
000000005C882390 call qword ptr [rdi+20h]
000000005C882393 mov ebp,eax
Before executing these instructions, rsi contains the pointer to the ThirdPartyClass object (i.e. ptr), but instead of passing it in rcx directly, some arithmetic is performed on it, and as a result the this pointer ends up completely wrong.
Here are some traces. I don't understand why the compiler does this, as it ends up calling the non-virtual function ThirdPartyClass::GetId():
000000005C88237A mov rax,qword ptr [r8+rsi]
R8 0000000000000000
RSI 000000004C691AA0 // good pointer to ThirdPartyClass object
RAX 0000000008E87728 // this gets pointer to virtual functions table of ThirdPartyClass
000000005C88237E movsxd rdx,dword ptr [rax+rcx]
RAX 0000000008E87728
RCX FFFFFFFFFFFFFFFF
RDX FFFFFFFFC0F3C600
000000005C882382 movsxd rcx,dword ptr [rdi+28h]
RCX 0000000000000000
RDI 000000005C9BE690
000000005C882386 lea rax,[r8+rdx]
RAX FFFFFFFFC0F3C600
RDX FFFFFFFFC0F3C600
R8 0000000000000000
000000005C88238A add rcx,rax
RAX FFFFFFFFC0F3C600
RCX FFFFFFFFC0F3C600
000000005C88238D add rcx,rsi
RCX 000000000D5CE0A0
RSI 000000004C691AA0
000000005C882390 call qword ptr [rdi+20h]
In my view, it should be as simple as
long id = (ptr->*_pFnGetId)();
mov rcx,rsi
call qword ptr [rdi+20h]
mov ebp,eax
And if I set rcx equal to rsi before the call qword ptr [rdi+20h], it returns the expected value.
Am I doing something completely wrong?
Thanks in advance.
OK, I found a solution by accident (I had already used a similar approach, and it worked in a slightly different situation).
The solution is to trick the compiler by defining a fake class and calling the member method by pointer, while pretending that it is a pointer to a class the compiler knows.
Perhaps it does not matter, but I know that ThirdPartyNamespace::ThirdPartyClass has virtual functions, so I declare the fake class with a virtual function as well.
class FakeCall
{
private:
FakeCall(){}
virtual ~FakeCall(){}
};
The rest is as in the initial code except for one small thing: instead of calling ptr->*_pFnGetId (where ptr is a pointer to the unknown, forward-declared class ThirdPartyNamespace::ThirdPartyClass), I pretend I am calling a member method of my FakeCall class:
FakeCall * fake = (FakeCall*)ptr;
long sico = (fake->*_pFnGetId)();
Disassembly looks exactly as expected:
long sico = (fake->*_pFnGetId)();
000000005A612096 mov rcx,rax
000000005A612099 call qword ptr [r12+20h]
000000005A61209E mov esi,eax
And it works perfectly!
Some observations:
The member method pointer is, as I initially thought, nothing more than a normal function pointer.
The Microsoft compiler (at least VS2008) goes crazy when calling a member method of an undefined class (i.e. one with only a forward declaration of the name).
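A plausible explanation for the second observation, offered as an assumption rather than something confirmed in this thread: when a class is only forward-declared, the Microsoft compiler cannot know its inheritance layout, so it picks its most general pointer-to-member representation, which carries extra this-adjustment fields next to the function address. Writing only the function address through the FARPROC cast leaves those fields unset. The sketch below (Incomplete and Complete are hypothetical names) simply exposes the two sizes so they can be compared on a given compiler:

```cpp
#include <cstddef>

class Incomplete;  // forward declaration only, like ThirdPartyClass in the question

// Hypothetical fully defined class, playing the role of FakeCall.
class Complete {
public:
    virtual ~Complete() {}
    long GetId() const { return 42; }
};

using IncompletePmf = long (Incomplete::*)() const;
using CompletePmf   = long (Complete::*)() const;

// Both sizes are implementation-defined. With the Microsoft compiler the
// incomplete-class pointer is expected to be the larger one; other compilers
// may well use the same size for both.
constexpr std::size_t incomplete_pmf_size = sizeof(IncompletePmf);
constexpr std::size_t complete_pmf_size   = sizeof(CompletePmf);

// Calling through a pointer to member of a fully defined class.
long call_through_pmf(const Complete& object) {
    CompletePmf pmf = &Complete::GetId;
    return (object.*pmf)();
}
```

Printing incomplete_pmf_size and complete_pmf_size in the real project would show whether the two representations actually differ there.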
I noticed that the constructor will move this to eax before returning. Is this a return value or something else?
class CTest {
int val_;
public:
CTest() {
0093F700 push ebp
0093F701 mov ebp,esp
0093F703 sub esp,0CCh
0093F709 push ebx
0093F70A push esi
0093F70B push edi
0093F70C push ecx
0093F70D lea edi,[ebp-0CCh]
0093F713 mov ecx,33h
0093F718 mov eax,0CCCCCCCCh
0093F71D rep stos dword ptr es:[edi]
0093F71F pop ecx
0093F720 mov dword ptr [this],ecx
val_ = 1;
0093F723 mov eax,dword ptr [this]
0093F726 mov dword ptr [eax],1
}
0093F72C mov eax,dword ptr [this]
0093F72F pop edi
0093F730 pop esi
0093F731 pop ebx
0093F732 mov esp,ebp
0093F734 pop ebp
0093F735 ret
VS2012 debug mode
I found that new will use its "return value". It seems to work like if (operator new() == 0) return 0; else return constructor();
class CTest {
int val_;
public:
CTest() {
val_ = 1;
__asm {
mov eax, 0x12345678
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
}
}
};
int main() {
CTest *test = new CTest; // test == 0x12345678
return 0;
}
Your second question disagrees with your first. How can new use if ( operator new() == 0 ) return 0; else return constructor(); if constructor() is producing the condition result?
Anyway…
What the compiler does with registers is the compiler's business. Registers tend to hold whatever information is immediately useful, and if the compiler is written with the belief that every time the constructor is used, the object is used immediately afterwards, it may reasonably choose to put this in a register.
An ABI may require constructors to do this, but I doubt any do. Anyway, such protocols only apply to things exported from libraries, not strictly within programs.
Any new expression does check the result of operator new against 0 before proceeding to initialize an object. operator new may signal failure by returning nullptr (or NULL, etc.).
This can actually be a problem with placement new expressions, because it represents unavoidable runtime overhead as the given pointer is generally already known to be non-null.
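The null check can be observed with a small sketch. It assumes a class-specific, non-throwing operator new that always fails; Tracked and try_allocate are hypothetical names used only here. When such an allocation function returns a null pointer, the new expression yields null and the constructor is never run:

```cpp
#include <cstddef>
#include <new>

// Hypothetical class, for illustration only.
struct Tracked {
    static int constructed;  // counts how many times the constructor ran

    Tracked() { ++constructed; }

    // Non-throwing allocation function that always reports failure.
    static void* operator new(std::size_t, const std::nothrow_t&) noexcept {
        return nullptr;
    }
    // Matching deallocation functions (never reached in this sketch).
    static void operator delete(void*) noexcept {}
    static void operator delete(void*, const std::nothrow_t&) noexcept {}
};

int Tracked::constructed = 0;

// The new expression checks the allocation result before constructing.
Tracked* try_allocate() {
    return new (std::nothrow) Tracked;
}
```

After try_allocate() returns, Tracked::constructed is still 0, which is the check-then-construct order described above.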
This can be a feature by design. In C++ and other languages, returning a reference to a given instance allows a more "idiomatic" use of the features offered by the object itself; in short, it's the Named Parameter Idiom.
But this is just one option. It can be useful sometimes, especially if you are able to design your library so that it only "takes actions" without needing to pass a significant number of parameters, so the chain of method calls stays readable.
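For readers unfamiliar with the idiom, here is a minimal sketch. WindowOptions and its members are invented for the example; each setter returns a reference to the same object so calls can be chained:

```cpp
#include <string>

// Hypothetical options type, for illustration only.
struct WindowOptions {
    int width = 640;
    int height = 480;
    std::string title = "untitled";

    // Each setter returns *this so that calls can be chained.
    WindowOptions& with_width(int w) { width = w; return *this; }
    WindowOptions& with_height(int h) { height = h; return *this; }
    WindowOptions& with_title(const std::string& t) { title = t; return *this; }
};

// Only the options that differ from the defaults are named at the call site.
WindowOptions make_options() {
    return WindowOptions().with_width(800).with_title("demo");
}
```

Note that this is about ordinary member functions returning *this in source code; it is separate from whatever a compiler chooses to leave in eax at the end of a constructor.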
I'm using Visual Studio 2010.
I have a class with the following constructor:
CVideoAnnotation::CVideoAnnotation(std::string aPort, DWORD aBaudRate)
I create an instance of CVideoAnnotation as follows:
CVideoAnnotation cVideoAnnotation("COM3", CBR_9600);
'CBR_9600' is a macro that resolves to 9600.
Down in the constructor, aBaudRate is 9600 as expected. However, aPort does not get passed properly. When I hover the cursor over it, IntelliSense gives a value of <Bad Ptr>.
Does anybody have any thoughts on why the string does not pass properly?
Thanks,
Dave
As an update to my original question, I'm adding the assembly code for the constructor call and the population of locals once inside the constructor.
CVideoAnnotation cVideoAnnotation("COM3", CBR_9600);
0041177D push 2580h
00411782 sub esp,20h
00411785 mov ecx,esp
00411787 mov dword ptr [ebp-174h],esp
0041178D push offset string "COM3" (4198C8h)
00411792 call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> > (41131Bh)
00411797 mov dword ptr [ebp-17Ch],eax
0041179D lea ecx,[ebp-11h]
004117A0 call dword ptr [__imp_CVideoAnnotation::CVideoAnnotation (41D4DCh)]
004117A6 mov dword ptr [ebp-180h],eax
004117AC mov dword ptr [ebp-4],0
CVideoAnnotation::CVideoAnnotation(std::string aPort, DWORD aBaudRate)
{
100137F0 push ebp
100137F1 mov ebp,esp
100137F3 push 0FFFFFFFFh
100137F5 push offset __ehhandler$??0CVideoAnnotation@@QAE@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@K@Z (1001DC82h)
100137FA mov eax,dword ptr fs:[00000000h]
10013800 push eax
10013801 sub esp,164h
10013807 push ebx
10013808 push esi
10013809 push edi
1001380A push ecx
1001380B lea edi,[ebp-170h]
10013811 mov ecx,59h
10013816 mov eax,0CCCCCCCCh
1001381B rep stos dword ptr es:[edi]
1001381D pop ecx
1001381E mov eax,dword ptr [___security_cookie (10026090h)]
10013823 xor eax,ebp
10013825 mov dword ptr [ebp-10h],eax
10013828 push eax
10013829 lea eax,[ebp-0Ch]
1001382C mov dword ptr fs:[00000000h],eax
10013832 mov dword ptr [ebp-18h],ecx
10013835 mov dword ptr [ebp-84h],0
1001383F mov dword ptr [ebp-4],0
If you are implementing CVideoAnnotation in a separate DLL, then you are hitting a well-known issue with crossing DLL boundaries when using STL containers. To verify that this is the case, create a new constructor taking a const char* instead of std::string and try it.
Another thing: instead of std::string, prefer const std::string&.
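Both suggestions can be sketched together. PortConfig below is a hypothetical stand-in for CVideoAnnotation, and unsigned long stands in for DWORD so the example does not need the Windows headers:

```cpp
#include <string>

// Hypothetical stand-in for CVideoAnnotation, for illustration only.
class PortConfig {
public:
    // Takes the string by const reference, avoiding a copy at the call site.
    PortConfig(const std::string& port, unsigned long baud_rate)
        : port_(port), baud_rate_(baud_rate) {}

    // Plain C string overload: no std::string object crosses the call
    // boundary, which makes it a useful test for the DLL theory.
    PortConfig(const char* port, unsigned long baud_rate)
        : port_(port), baud_rate_(baud_rate) {}

    const std::string& port() const { return port_; }
    unsigned long baud_rate() const { return baud_rate_; }

private:
    std::string port_;
    unsigned long baud_rate_;
};

// A string literal selects the const char* overload.
PortConfig from_literal() { return PortConfig("COM3", 9600); }

// An explicit std::string selects the const std::string& overload.
PortConfig from_string() { return PortConfig(std::string("COM3"), 9600); }
```

If the const char* version works across the DLL boundary while the std::string version does not, that points toward a mismatch between the runtimes or build settings of the two modules.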
Passing string to and from DLLs should work just fine. As noted above using STL containers in this way can be problematic but string is fine.
Do you have any evidence beyond IntelliSense that the string is messed up? In Release builds IntelliSense can be sketchy. Try adding cout << aPort << endl; into that constructor to make sure.
If it really is incorrect I would just debug from the caller into the constructor call - see where the string is pushed as a parameter, and see where on the stack it's picked up from for use inside the constructor. Something must be out of sync here, assuming both DLL and app use the same version of the C++ runtimes, and you should be able to tell why by inspecting the assembler code.
"COM3" is not a std::string it is a const char *, try creating a std::string to pass in.
I too was fighting what looked like bad byte alignment resulting in <Bad Ptr>. Based on the suggestions here that crossing library boundaries can be problematic, I suspected (even though I am using a static library instead of a DLL) that I might be dealing with the same sort of problem. I was instantiating a class in one library whose code was in another. When I moved the instantiation into the library where its code was, the problem went away. I don't understand why this worked, but at least I can now move forward.