C++ non-static method folding - c++

Referring to this question, stackoverflow.com/q/14188612, are there situations when the compiler folds the method instantiation of two objects?
Let's say we have the following class with a private "stateless" method add, that does not modify the class members:
class Element
{
public:
    Element(int a, int b) : a_(a), b_(b)
    {
        c_ = add(a, b);
    }
private:
    int add(int a, int b)
    {
        return a + b;
    }
private:
    int a_;
    int b_;
    int c_;
};
int main(void)
{
    Element a(1, 2);
    Element b(3, 4);
}
Can we sometimes expect that add will actually be compiled as a static-like method? Or, to be clearer, can we expect the address of a.add to be equal to that of b.add (add stored only once)?
This is merely a question related to understanding compiler optimizations.

The compiler will always generate one binary function for add, independent of how many objects you have. Anything else isn't just stupid, but impossible:
The compiler can't possibly know how many objects will exist at run time just from the code. While it is possible with your example, more complicated programs will instantiate variables (or not) based on input given at run time (keyboard, files...).
Note that templates can lead to more than one generation, one for each template type used in the code (but for that, the code is enough to know everything, and it has nothing to do with the object count).
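A small sketch of that difference, using a static local as a stand-in for "one copy of the function". Element::calls, Box and demo_instantiations are made-up names for illustration, not part of the question:

```cpp
#include <cassert>

struct Element {
    // A non-template member function: one function, hence one static local,
    // shared by every Element object.
    int calls() { static int n = 0; return ++n; }
};

template <typename T>
struct Box {
    // A member of a class template: one function (and one static local)
    // per distinct T that the program actually uses.
    int calls() { static int n = 0; return ++n; }
};

// Hypothetical driver for the sketch.
void demo_instantiations()
{
    Element a, b;
    assert(a.calls() == 1);
    assert(b.calls() == 2);   // same function as a.calls(), same counter

    Box<int> i1, i2;
    Box<double> d;
    assert(i1.calls() == 1);
    assert(i2.calls() == 2);  // Box<int>::calls is shared by i1 and i2
    assert(d.calls() == 1);   // Box<double>::calls is a separate function
}
```

The number of objects never matters; only the number of distinct template arguments does.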

When you define a method inside the class definition it is implicitly inline, which usually means that the method should be inlined into each caller. The compiler can choose not to, but quite often you might find that the method doesn't exist as a separate function in your output program at all (not true in debug builds, of course).

For non-inline member functions, the standard says:
There shall be at most one definition of a non-inline member function in a program
There is no entity b.add or a.add in C++ whose address you can take. The address-of operator needs a qualified-id of the form &C::m to form a pointer to a member function. In your case, the address of add is
auto ptr = &Element::add;
and is independent from any instance. This gives a member-function pointer, which can only be used to call the function together with an object, e.g. (a.*ptr)(0,1) or (b.*ptr)(2,3) if add were a public method.
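As a sketch of the above, here is a variant of Element in which add is public (an assumption made only so the pointer can be formed outside the class). sum() and demo_member_pointer are hypothetical helpers added for the example:

```cpp
#include <cassert>

class Element {
public:
    Element(int a, int b) : a_(a), b_(b)
    {
        c_ = add(a, b);
    }
    // Public here, unlike in the question, so &Element::add is accessible.
    int add(int a, int b) { return a + b; }
    int sum() const { return c_; }  // hypothetical accessor
private:
    int a_;
    int b_;
    int c_;
};

void demo_member_pointer()
{
    // One pointer value, formed from the class, not from an object.
    int (Element::*ptr)(int, int) = &Element::add;

    Element a(1, 2);
    Element b(3, 4);

    // The object only supplies the hidden this argument at the call.
    assert((a.*ptr)(0, 1) == 1);
    assert((b.*ptr)(2, 3) == 5);
}
```

The same ptr is used with both objects, which is another way of seeing that add exists once per class rather than once per instance.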

Related

How to use a template to create two very similar member functions in C++

I am coding a C++ library and one class has two member functions that only differ in one function call:
int MyClass::member_func_1(int a) {
// ...
int b = some_function();
// ...
}
int MyClass::member_func_2(int a) {
// ...
int b = some_other_function();
// ...
}
Is there a way of not having to duplicate the code of these two functions, while still keeping the two member functions with the same function signatures?
Since it is a library all code must be generated when the library is compiled.
I have ruled out the option of having only one member function with an extra boolean argument to choose between some_function and some_other_function for performance reasons.
I know how to use a macro to do it, but could a template be used, or is there another better way?
A template solution is only a good solution if the call to either of the member functions is executed in a tight loop, where the overhead of testing a boolean or making an indirect function call would be too costly.
In this case, if the C++17 standard is an option, the if constexpr syntax is probably the simplest way, as in:
class MyClass
{
template<bool other> int member_func(int a)
{
//...
int b;
if constexpr (other)
b = some_other_function();
else
b = some_function();
//...
}
};
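To keep the two original signatures, the template can stay private and be wrapped by two one-line members. This is only a sketch: the bodies of some_function and some_other_function and the "shared code" are placeholders, since the question does not show them, and demo_if_constexpr is a hypothetical driver.

```cpp
#include <cassert>

class MyClass {
public:
    // Same signatures as in the question; each forwards to one instantiation.
    int member_func_1(int a) { return member_func<false>(a); }
    int member_func_2(int a) { return member_func<true>(a); }

private:
    // Placeholder implementations, assumed for the example.
    int some_function() { return 1; }
    int some_other_function() { return 2; }

    template <bool other>
    int member_func(int a)
    {
        int b;
        if constexpr (other)      // resolved at compile time (C++17)
            b = some_other_function();
        else
            b = some_function();
        return a + b;             // stands in for the shared code
    }
};

void demo_if_constexpr()
{
    MyClass m;
    assert(m.member_func_1(10) == 11);
    assert(m.member_func_2(10) == 12);
}
```

Both instantiations are generated when the library is compiled, because the wrappers use them inside the class itself.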
Since the only difference in the implementation of your two member functions is the calls to different member functions, that's what you should abstract out of those functions. You can do that by writing a single member function that takes a pointer to member function as an additional argument:
int common_member_func(int a, int (MyClass::*func)())
{
// ...
int b = (this->*func)();
// ...
}
and now the implementation of your two member functions would be:
int member_func_1(int a)
{
return common_member_func(a, &MyClass::some_function);
}
int member_func_2(int a)
{
return common_member_func(a, &MyClass::some_other_function);
}
You may use a boolean for switching the internal method to be called or pass a function pointer / std::function and call that one directly. The performance penalty will be ridiculously negligible.
If you really want to make your code less readable because of that virtually non-existent "performance reason", you can implement the common method as a template with a boolean template parameter. The if() clause testing the template parameter can then be a constexpr one (resolved during compilation).
However, note that this will also lead to the method being effectively doubled in your application's code, and when you have many alternating calls, this might have an adverse effect on your performance similar to the original if() clause, i.e. not noticeable at all.

Capture this in a lambda defined externally

I have a class which has a std::function as a member variable.
class Animal
{
public:
Animal(const std::function<double(const int x)> MakeNoise) : MakeNoise(MakeNoise) {}
void Print(const int x) { std::cout << this->MakeNoise(x) << std::endl; }
private:
const std::function<double(const int x)> MakeNoise;
int a = 4;
int b = 8;
int c = 12;
};
I would like to be able to swap out the MakeNoise function without subclassing Animal by passing various lambdas.
const auto MakeNoise1 = [this](const int x)
{
    return a + b + x;
};
const auto MakeNoise2 = [this](const int x)
{
    return a + b + c + x;
};
Is it possible to capture this if the definition of the function is in a different file?
Is it possible to use [&] (capture by reference) to capture the x passed into Print?
Lastly, is there a better way to define this class so I can swap the function in and out?
If I add this the compiler says error: invalid use of ‘this’ at top level which makes sense since the definition of the lambda is not within a class.
I do not think this is possible to do directly. After all, the lambdas as given don't know anything about Animal. You can work around it by having the function's signature be
(const Animal& animal, const int x) instead and accessing it through animal.
this is a special name inside member functions. If MakeNoise1 wants to capture this, it needs to be in a member of Animal. Your compiler told you so, and you interpreted that message correctly.
That's not a big restriction, since Animal::a is private anyway.
You can define Animal methods in other .cpp files, but you'd still need to declare these methods in class Animal, so this might or might not match your larger design.
If you are defining the lambda outside a member function, then no, you cannot capture this because there is no this to capture. A capture is a way to provide access to variables that are defined at the point where the lambda is defined. A capture cannot capture things that do not exist at the point of capturing.
What you want to do is provide access to variables that are defined at the point where the lambda is invoked. This is the job of parameters, as in this->MakeNoise(this, x) or perhaps MakeNoise(*this, x). (Your Print wrapper can easily provide the extra parameter. In fact, adding parameters is a common motivation for writing wrapper functions.) However, I suspect that this might not be the best approach.
Instead of thinking about how to access this, think about what MakeNoise is supposed to do and what it needs to do that. Does it really need the entire Animal including private data? If so, it probably should be a member function. Bite the bullet and create a plethora of derived classes (and provide protected access to the data). Does it instead need the entire Animal, but only the public interface? If so, a lambda that takes both const Animal & and const int as parameters might be reasonable. Furthermore, it might be reasonable to expand the public interface to accommodate this.
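A minimal sketch of the "public interface" option, assuming Animal grows three accessors. A(), B(), C(), Noise() and the NoiseFn alias are hypothetical names, not part of the question:

```cpp
#include <cassert>
#include <functional>
#include <utility>

class Animal {
public:
    // The callback receives the animal as a parameter instead of capturing this.
    using NoiseFn = std::function<double(const Animal&, int)>;

    explicit Animal(NoiseFn make_noise) : MakeNoise(std::move(make_noise)) {}

    double Noise(int x) const { return MakeNoise(*this, x); }

    // Hypothetical accessors that expand the public interface.
    int A() const { return a; }
    int B() const { return b; }
    int C() const { return c; }

private:
    NoiseFn MakeNoise;
    int a = 4;
    int b = 8;
    int c = 12;
};

// These can live in any file that sees the Animal declaration.
const auto MakeNoise1 = [](const Animal& animal, int x) -> double {
    return animal.A() + animal.B() + x;
};
const auto MakeNoise2 = [](const Animal& animal, int x) -> double {
    return animal.A() + animal.B() + animal.C() + x;
};

void demo_animal_parameter()
{
    Animal first(MakeNoise1);
    Animal second(MakeNoise2);
    assert(first.Noise(1) == 13.0);   // 4 + 8 + 1
    assert(second.Noise(1) == 25.0);  // 4 + 8 + 12 + 1
}
```

The lambdas need no capture at all, so they have nothing to say about this.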
Perhaps, though, you are in the case where MakeNoise does not really need an Animal so much as a few key bits of data. This is the point where you have to look at your design and your levels of abstraction. We cannot do this for you because, as is appropriate for a StackOverflow question, we do not have the complete picture. However, I can present for consideration the possibility that things other than animals can make a noise. Are your MakeNoise lambdas supposed to be abstract enough to not care what is making the noise? If so, you might consider adding specific data as parameters to your lambdas. Your Print function would become something more like the following.
void Print(const int x) { std::cout << MakeNoise(x, a, b, c) << std::endl; }
I am assuming that Animal has been (appropriately) simplified for this question, and that an Animal object really has a lot more data than a, b, and c. If this assumption is false, you are in the case of needing an entire Animal. However, if the parameters you would need to pass to MakeNoise are few in comparison to the data in Animal, this might be a better semantic fit to your design. Might. It all comes back to making design choices that are sensible and consistent. Think abstractly while avoiding over-engineering. Keep in mind that you need to provide the same parameters to each lambda (but the various lambdas can have different captures).
Here is an example lambda that could be used for this last approach, assuming the type of MakeNoise – both of the data member and of the parameter to the constructor – has been updated.
int main()
{
Animal cheetah{ [](int x, int a, int b, int c) -> double
{
return a + b + c + x;
}
};
cheetah.Print(2);
}
If you really want to use const int instead of int, you could. To me, it seems unnecessarily restrictive for a non-reference, but that's more style than substance.
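For completeness, here is one way the updated Animal might look for this last approach. It is a sketch under assumptions: the NoiseFn alias and the Noise() helper (which returns the value instead of printing it) are additions of mine.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <utility>

class Animal {
public:
    // Updated callback type: x plus the specific data the noise depends on.
    using NoiseFn = std::function<double(int x, int a, int b, int c)>;

    explicit Animal(NoiseFn make_noise) : MakeNoise(std::move(make_noise)) {}

    // Hypothetical helper so the result can be inspected without printing.
    double Noise(int x) const { return MakeNoise(x, a, b, c); }

    void Print(int x) const { std::cout << Noise(x) << std::endl; }

private:
    const NoiseFn MakeNoise;
    int a = 4;
    int b = 8;
    int c = 12;
};

void demo_animal_data()
{
    Animal cheetah{ [](int x, int a, int b, int c) -> double
        {
            return a + b + c + x;
        }
    };
    double noise = cheetah.Noise(2);
    assert(noise == 26.0);  // 4 + 8 + 12 + 2
}
```

The lambda never sees an Animal, so it could just as well be reused by something that is not an animal.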

Why Do We Need const methods?

Marking a class member function const tells the compiler that the function will not change a member variable. Thus, a constant object of that type can safely call it. Below is a simple example.
#include <iostream>
using namespace std;
class X {
private:
int a{1};
public:
void PrintA() const {
cout << a << "\n";
}
};
int main() {
const X x;
x.PrintA();
}
We tell the compiler that PrintA is const, so constant objects can safely call it. However, it seems that the compiler is actually smart enough to detect whether a function is read-only, independent of the const keyword. If I add an a=10 to the above code like so
#include <iostream>
using namespace std;
class X {
private:
int a{1};
public:
void PrintA() const {
cout << a << "\n";
a = 10;
}
};
int main() {
const X x;
x.PrintA();
}
I get
exp.cpp: In member function ‘void X::PrintA() const’:
exp.cpp:11:9: error: assignment of member ‘X::a’ in read-only object
a = 10;
In other words, the const keyword can't trick the compiler into allowing the mutation of a constant object. So my question is, why do developers need to declare a method const? It seems like, even without that hint, the compiler distinguishes read-only and non-read-only methods, so it can properly catch attempts to mutate constant objects.
It's not a hint -- it's part of the interface of the method. If you remove the const, the error in PrintA will go away and you'll get an error in main instead. You need const for the same reason you need public and private -- to define the interface you want. The compiler will then check to make sure you don't violate that interface you've declared.
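One way to see that const is part of the interface, without triggering a compile error, is to ask the compiler which calls it would accept. This sketch assumes C++17 for std::is_invocable_v; Get, Set and demo_const_interface are made-up names:

```cpp
#include <cassert>
#include <type_traits>

class X {
public:
    int Get() const { return a; }  // declared usable on const objects
    void Set(int v) { a = v; }     // not declared const
private:
    int a{1};
};

// Each constant answers "would this call compile?" using only declarations.
constexpr bool get_on_const =
    std::is_invocable_v<decltype(&X::Get), const X&>;
constexpr bool set_on_const =
    std::is_invocable_v<decltype(&X::Set), const X&, int>;
constexpr bool set_on_mutable =
    std::is_invocable_v<decltype(&X::Set), X&, int>;

void demo_const_interface()
{
    assert(get_on_const);     // const object, const method: allowed
    assert(!set_on_const);    // const object, non-const method: rejected
    assert(set_on_mutable);   // non-const object: allowed

    const X x{};
    assert(x.Get() == 1);
}
```

Note that the answers depend only on the declarations of Get and Set, never on what their bodies do.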
the compiler distinguishes read-only and non-read-only methods
First consider how easily the compiler can do this with the const designation as it exists today.
To determine if the implementation of PrintA obeys the rules, the compiler only needs to look at that implementation.
To determine if x.PrintA(); is valid for const X x; it only needs the declaration of PrintA.
Now imagine if we didn't have function-level const.
To determine if the implementation of PrintA obeys the rules, the compiler has to determine if it's not acting read-only and then scan across your entire program to find if it ever gets called on a const object.
I'm sure that would massively bloat the link time of large programs.
But then a significant concern is virtual functions. Imagine one derived class overrides with a read-only implementation, but then a different derived class overrides with a non-read-only implementation. Then if such a method is called on a const object, what is the compiler to do, since it may not be able to determine at compile time which implementation is going to be called? Would we just have to rule out virtuals from ever being possible to call on const objects? That would be unfortunately limiting.
Furthermore, this idea wouldn't work when callers vs implementations are separated across DLL boundaries (even for non-virtual functions), since those are only connected together at run-time.
So overall it just seems more difficult/problematic for us to have the ability to declare const objects if we were to leave it to the compiler to figure out whether methods are implemented in a const way or not.

Propagate method staticness to derived class

template<typename T> struct Derived: T
{
/*static*/ int foo(int x) { return T::foo(x) + 1; }
};
If T::foo(int x) is static then Derived<T>::foo(int x) should also be static. Otherwise Derived<T>::foo(int x) should be non-static.
Is there a way to let the compiler take care of this?
No, you cannot propagate staticness in the sense you ask. Incidentally, you could also ask the same thing about const:
int foo(int x) { return bar(x) + 1; } // Infer that foo is const because bar is
C++ specifiers are meant to convey intent about the interface, on which users can rely even if the implementation changes:
static int foo(int x); // This method does not require an object
int foo(int x) const; // This method will not modify the object
In case - through templates, for example - the implementation may vary, your interface must reflect the lowest common denominator. For const, for example, methods need to be non-const. For static, which is your question, you cannot declare static.
Note that this is not a huge imposition. Even if a method is static, you can still call it using object semantics. So, in your case, you'll have to just use object semantics. In particular, regarding your clarification in the comments
If allocator is static then container doesn't need to hold its pointer. So decorators must preserve staticness.
note that decorators can also not preserve staticness, because containers can hold pointers in any case, and call them via object notation, regardless of their constness.
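A sketch of the "lowest common denominator" approach: Derived<T>::foo stays non-static and compiles against both kinds of base. StaticBase, MemberBase and demo_staticness are hypothetical names for the example.

```cpp
#include <cassert>

struct StaticBase {
    static int foo(int x) { return x * 2; }
};

struct MemberBase {
    int offset = 10;
    int foo(int x) { return x + offset; }
};

template <typename T>
struct Derived : T {
    // Non-static on purpose. T::foo(x) is a plain call when T::foo is static
    // and an implicit this->T::foo(x) when it is not.
    int foo(int x) { return T::foo(x) + 1; }
};

void demo_staticness()
{
    Derived<StaticBase> s;
    Derived<MemberBase> m;
    assert(s.foo(3) == 7);   // 3 * 2 + 1
    assert(m.foo(3) == 14);  // 3 + 10 + 1

    // A static method can also be called with object notation.
    StaticBase sb;
    assert(sb.foo(3) == 6);
}
```

The price is that callers of Derived<StaticBase>::foo always need an object, even though the base would not require one.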
Use below construct:
static_assert(std::is_same<decltype(&Derived::foo), decltype(&T::foo)>::value or
(std::is_member_function_pointer<decltype(&Derived::foo)>::value and
std::is_member_function_pointer<decltype(&T::foo)>::value),
"Derived::foo & T::foo are not having same static-ness");
I have tested quickly with my g++ and it works fine.
What it does:
1. Takes the address of both methods; if the two pointer types are the same, then both methods must be static. This means that the method signatures are expected to be the same (as implied in your example).
2. If (1) fails, checks whether both foo are member function pointers of their respective class types. There is no strictness of type here, but you can impose it with some more metaprogramming. I leave that up to you.
3. If both of the above fail, the compiler gives an error. This static_assert can be put within class Derived.
Notes: (1) If Derived::foo is static and T::foo is not, it gives an error anyway. (2) or and and are official keywords of C++. If a compiler such as MSVC doesn't support them, use || and && respectively.

How is inheritance implemented at the memory level?

Suppose I have
class A { public: void print(){cout<<"A"; }};
class B: public A { public: void print(){cout<<"B"; }};
class C: public A { };
How is inheritance implemented at the memory level?
Does C copy the print() code to itself, or does it have a pointer that points somewhere in the A part of the code?
How does the same thing happen when we override the previous definition, for example in B (at the memory level)?
Compilers are allowed to implement this however they choose. But they generally follow CFront's old implementation.
For classes/objects without inheritance
Consider:
#include <iostream>
class A {
public:
    void foo()
    {
        std::cout << "foo\n";
    }
    static int bar()
    {
        return 42;
    }
};
A a;
a.foo();
A::bar();
The compiler changes those last three lines into something similar to:
struct A a = <compiler-generated constructor>;
A_foo(a); // the "a" argument becomes the "this" pointer; there are no objects as far as
// assembly code is concerned, instead member functions (i.e., methods) are
// simply functions that take a hidden this pointer
A_bar(); // since bar() is static, there is no need to pass the this pointer
Once upon a time I would have guessed that this was handled with pointers-to-functions in each A object created. However, that approach would mean that every A object would contain identical information (pointer to the same function) which would waste a lot of space. It's easy enough for the compiler to take care of these details.
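The lowering described above can be imitated by hand. The following is only an illustration of the idea, not what any particular compiler emits; Counter, Counter_data, Counter_add, Counter_get and demo_lowering are all made-up names.

```cpp
#include <cassert>

// C++ version: data plus member functions.
struct Counter {
    int value = 0;
    void add(int n) { value += n; }
    int get() const { return value; }
};

// Hand-written equivalent: plain data and free functions that take the
// object explicitly, the way the hidden this pointer is passed.
struct Counter_data {
    int value;
};
void Counter_add(Counter_data* self, int n) { self->value += n; }
int Counter_get(const Counter_data* self) { return self->value; }

void demo_lowering()
{
    Counter c;
    c.add(5);

    Counter_data d = { 0 };
    Counter_add(&d, 5);

    assert(c.get() == Counter_get(&d));

    // On typical implementations the member functions add nothing to the
    // object: it is exactly as large as the data it holds.
    assert(sizeof(Counter) == sizeof(Counter_data));
}
```

The sizeof comparison is what you would expect in practice rather than something the standard spells out in these words.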
For classes/objects with non-virtual inheritance
Of course, that wasn't really what you asked. But we can extend this to inheritance, and it's what you'd expect:
class B : public A {
public:
    void blarg()
    {
        // who knows, something goes here
    }
    int bar()
    {
        return 5;
    }
};
B b;
b.blarg();
b.foo();
b.bar();
The compiler turns the last four lines into something like:
struct B b = <compiler-generated constructor>;
B_blarg(b);
A_foo(b.A_portion_of_object);
B_bar(b);
Notes on virtual methods
Things get a little trickier when you talk about virtual methods. In that case, each class gets a class-specific array of pointers-to-functions, one such pointer for each virtual function. This array is called the vtable ("virtual table"), and each object created has a pointer to the relevant vtable. Calls to virtual functions are resolved by looking up the correct function to call in the vtable.
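To make the vtable idea concrete, here is a hand-rolled imitation with explicit function pointers. It is a sketch of the general scheme only; real compilers lay things out according to their ABI, and every name below (Shape, ShapeVTable, perimeter_of, and so on) is hypothetical.

```cpp
#include <cassert>

struct Shape;  // forward declaration so the table can mention it

// One slot per "virtual function".
struct ShapeVTable {
    int (*sides)(const Shape* self);
    int (*perimeter)(const Shape* self);
};

// Every object carries one pointer to its class's table, plus its data.
struct Shape {
    const ShapeVTable* vptr;
    int edge;
};

int triangle_sides(const Shape*) { return 3; }
int triangle_perimeter(const Shape* self) { return 3 * self->edge; }
int square_sides(const Shape*) { return 4; }
int square_perimeter(const Shape* self) { return 4 * self->edge; }

// One table per "class", shared by all objects of that class.
const ShapeVTable triangle_vtable = { triangle_sides, triangle_perimeter };
const ShapeVTable square_vtable = { square_sides, square_perimeter };

Shape make_triangle(int edge) { return Shape{ &triangle_vtable, edge }; }
Shape make_square(int edge) { return Shape{ &square_vtable, edge }; }

// A "virtual call": load the table pointer from the object, then the slot.
int perimeter_of(const Shape* s) { return s->vptr->perimeter(s); }

void demo_vtable()
{
    Shape t = make_triangle(2);
    Shape s = make_square(2);
    assert(perimeter_of(&t) == 6);
    assert(perimeter_of(&s) == 8);

    Shape t2 = make_triangle(5);
    assert(t.vptr == t2.vptr);  // same class, same table
}
```

Each object pays for one pointer, no matter how many slots the table has.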
Check out the C++ ABI for any questions regarding the in-memory layout of things. It's labelled "Itanium C++ ABI", but it has become the de facto standard C++ ABI on most platforms other than Windows, where MSVC uses its own.
I don't think the standard makes any guarantees. Compilers can choose to make multiple copies of functions, combine copies that happen to access the same memory offsets on totally different types, etc. Inlining is just one of the more obvious cases of this.
But most compilers will not generate a copy of the code for A::print to use when called through a C instance. There may be a pointer to A in the compiler's internal symbol table for C, but at runtime you're most likely going to see that:
A a; C c; a.print(); c.print();
has turned into something much along the lines of:
A a;
C c;
ECX = &a; /* set up 'this' pointer */
call A::print;
ECX = up_cast<A*>(&c); /* set up 'this' pointer */
call A::print;
with both call instructions jumping to the exact same address in code memory.
Of course, since you've asked the compiler to inline A::print, the code will most likely be copied to every call site (but since it replaces the call A::print, it's not actually adding much to the program size).
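The language itself reflects this: the type of &C::print names A, not C, because C has no print of its own. In the sketch below print returns a string instead of writing to cout (a change made only so the result can be checked), and demo_inherited_print is a hypothetical driver.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class A { public: std::string print() { return "A"; } };
class B : public A { public: std::string print() { return "B"; } };
class C : public A { };

// C declares no print, so C::print is found in A and is A's member function.
static_assert(std::is_same<decltype(&C::print), std::string (A::*)()>::value,
              "C::print is A::print");
// B declares its own print, which hides A's.
static_assert(std::is_same<decltype(&B::print), std::string (B::*)()>::value,
              "B::print is a distinct member of B");

void demo_inherited_print()
{
    A a;
    B b;
    C c;
    assert(a.print() == "A");
    assert(c.print() == "A");     // calls A::print on the A part of c
    assert(b.print() == "B");     // B's own function
    assert(b.A::print() == "A");  // A's version is still there
}
```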
There will not be any information stored in an object to describe a member function.
aobject.print();
bobject.print();
cobject.print();
The compiler will just convert the above statements to direct calls to the function print; essentially nothing is stored in an object.
The pseudo-assembly instruction will look like this:
00B5A2C3 call print(006de180)
Since print is a member function, it has an additional parameter: the this pointer. That will be passed just like every other argument to the function.
In your example here, there's no copying of anything. Generally an object doesn't know what class it's in at runtime -- what happens is, when the program is compiled, the compiler says "hey, this variable is of type C, let's see if there's a C::print(). No, ok, how about A::print()? Yes? Ok, call that!"
Virtual methods work differently, in that pointers to the right functions are stored in a "vtable"* referenced in the object. That still doesn't matter if you're working directly with a C, cause it still follows the steps above. But for pointers, it might say like "Oh, C::print()? The address is the first entry in the vtable." and the compiler inserts instructions to grab that address at runtime and call to it.
* Technically, this is not required to be true. I'm pretty sure you won't find any mention in the standard of "vtables"; it's by definition implementation-specific. It just happens to be the method the first C++ compilers used, and happens to work better all-around than other methods, so it's the one nearly every C++ compiler in existence uses.