After a long time of C-style procedural coding, I am just beginning to 'get' OOP. So I suspect there may be a standard way of dealing with the situation I am facing. I have an application with the class hierarchy shown below:
#include <iostream>
using namespace std;
class A {
public:
virtual int intf() { return 0;} // Only needed by B
virtual double df() {return 0.0;} // Only needed by C
};
class B : public A {
int intf() {return 2;}
// B objects have no use for df()
};
class C : public B {
double df() {return 3.14;}
// C objects have no use for intf()
};
int main(){
// Main needs to instantiate both B and C.
B b;
C c;
A* pa2b = &b;
A* pa2c = &c;
cout << pa2b->intf() << endl;
cout << pa2b->df() << endl;
cout << pa2c->intf() << endl;
cout << pa2c->df() << endl;
return 0;
}
Now this program compiles and runs fine. However, I have a question about its design. Class A is the common interface and does not need to be instantiated. Classes B and C need to be. Regarding the functions: intf() is needed by B but not C, and df() is needed by C but not B. If I make df() pure virtual in A, there is no reasonable definition of df() for B; likewise, if I make intf() pure virtual in A, there is no reasonable definition of intf() for C.
Edit: B and C share some data members and also some member functions other than intf() and df(). I have not shown them in my stripped-down code.
Finally, as is standard, my application needs to access both B and C through a pointer to A. So my question is: Is there a way to 'clean up' this design so that unrequired/empty member function definitions (such as I have done in declaration/definition of A) can be eliminated? There is a clear "IS-A" relationship between the classes. So even though I share every newbie's thrill about inheritance, I don't feel I have stretched my design just so I could use inheritance.
Background in case it helps: I am implementing a regression suite. Class A implements functions and matrices common to every regression (such as dependent and independent variables). Class B is logistic regression with two classes ('0' and '1') and defines cost functions and the training algorithm for two-class logistic regression. Class C is multi-class logistic regression. It extends class B by training for multiple classes using the "one-vs-all" algorithm. So in a sense C is a binary logistic regression if you think of your class of interest as positive examples and all others as negative examples. Then you do it for every class to implement multi-class regression. The functions (intf and df) in question return the output. In case of logistic regression, the return value is a vector, while for multiclass regression, it is a matrix. And, as stated above, B and C don't have any use for each other's return functions. Except that I can't seem to be able to eliminate redundant definitions in A (the regression class).
Thanks for your help.
Look at the Liskov Substitution Principle (http://en.wikipedia.org/wiki/Liskov_substitution_principle). It states that subclasses must fulfill the same contract as the superclass. In your example, neither subclass does this. The "Is-A" relationship is not enough to justify inheritance.
One option would be to use a single template method, something like this:
template <typename T>
class A {
public:
    virtual ~A() {}
    virtual T getValue() = 0;
};
class B : public A<int> {
public:
    int getValue() { return 2; }
};
class C : public A<double> {
public:
    double getValue() { return 3.14; }
};
This would allow the contract to be fulfilled by both subclasses while allowing the return type of the method to vary based on the subclass definition.
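One caveat with the template approach: A<int> and A<double> are unrelated types, so a single pointer type cannot refer to both B and C. A minimal sketch of one possible way around that, assuming a non-template root class is acceptable (RegressionBase, Regression, Binary, MultiClass and name() are hypothetical names, not taken from the question):

```cpp
#include <string>

// Hypothetical non-template root: holds whatever every regression shares
// and gives client code one pointer type to store.
class RegressionBase {
public:
    virtual ~RegressionBase() {}
    virtual std::string name() const = 0;
};

// Template layer: fixes the return type of getValue() per subclass.
template <typename T>
class Regression : public RegressionBase {
public:
    virtual T getValue() const = 0;
};

class Binary : public Regression<int> {
public:
    std::string name() const { return "binary"; }
    int getValue() const { return 2; }
};

class MultiClass : public Regression<double> {
public:
    std::string name() const { return "multi-class"; }
    double getValue() const { return 3.14; }
};
```

Code holding a RegressionBase* can call name() on either object, but it still has to know the concrete type (or the template argument) before it can call getValue().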
If you want to learn more about object-oriented programming "best practices", google "Robert Martin SOLID".
You touched on one of the most controversial points of OOP: the "is-a == derivation" pattern, which results in the "god object" anti-pattern, since everything is ultimately a child of god, with god knowing every method of everyone and having an "answer" (read "default implementation") for everything.
"Is-a" is not enough to justify inheritance where no replaceability exists, but in the real world no object is really fully replaceable with another, otherwise it would not be different.
You are in a no man's land where the substitution principle doesn't work well but, at the same time, virtual functions look like the best tool to implement dynamic dispatch.
The only thing you can do is come to a compromise and sacrifice one of the two.
As far as I can tell from the situation, since B and C have nothing in common (no shared useful methods), simply don't let those methods originate from A.
If you have something to "share", it is probably a runtime mechanism to discover whether the object is a B or a C before entering B-specific or C-specific code.
This is typically done with a common base that has a runtime-type indicator to switch upon, or just a virtual function (typically the destructor) so that dynamic_cast can work.
#include <iostream>
class A
{
public:
virtual ~A() {}
template<class T>
T* is() { return dynamic_cast<T*>(this); }
};
class B: public A
{
public:
int intf() { return 2; }
};
class C: public A
{
public:
double df() { return 3.14; }
};
int main()
{
using namespace std;
B b;
C c;
A* ba = &b;
A* ca = &c;
B* pb = ba->is<B>();
if(pb) cout << pb->intf() << endl;
C* pc = ca->is<C>();
if(pc) cout << pc->df() << endl;
}
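The code above uses the dynamic_cast route. The other route mentioned, a runtime-type indicator to switch upon, could be sketched roughly as follows. The names Kind, kind() and describe() are hypothetical, and std::to_string assumes C++11 or later:

```cpp
#include <string>

// Hypothetical sketch: the base stores a tag set by each derived constructor.
class A {
public:
    enum Kind { KindB, KindC };
    virtual ~A() {}
    Kind kind() const { return kind_; }
protected:
    explicit A(Kind k) : kind_(k) {}
private:
    Kind kind_;
};

class B : public A {
public:
    B() : A(KindB) {}
    int intf() const { return 2; }
};

class C : public A {
public:
    C() : A(KindC) {}
    double df() const { return 3.14; }
};

// Switch on the tag, then downcast. static_cast is only safe here because
// the tag is always set by the matching constructor.
std::string describe(const A& a) {
    switch (a.kind()) {
    case A::KindB:
        return "B:" + std::to_string(static_cast<const B&>(a).intf());
    case A::KindC:
        return "C:" + std::to_string(static_cast<const C&>(a).df());
    }
    return "unknown";
}
```

The trade-off is that every new derived class needs a new tag value in A, whereas the dynamic_cast version leaves A untouched.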
I've just started learning about OOP in C++. I was wondering why the virtual keyword is needed to instruct the compiler to do late binding. Why can't the compiler know at compile time that the pointer is pointing to a derived class?
#include <iostream>
using namespace std;
class A {
public: int f() { return 'A';}
};
class B : public A {
public: int f() { return 'B';}
};
int main() {
A* pa;
B b;
pa = &b;
cout << pa->f() << endl;
}
Regarding not knowing at compile time: it is often the case that the behavior is only known at runtime. Consider this example:
#include <iostream>
struct A {};
struct B : A {};
struct C : A {};
int main()
{
int x;
std::cin >> x;
A* a = x == 1 ? new B : new C;
}
In this example, how could the compiler know whether a will point to a B or a C? It cannot, because the behavior depends on runtime values.
How could it (in full generality)? For example:
#include <cstdlib>
struct Parent {};
struct Child : Parent {};
int main()
{
Parent* p = std::rand() % 2 ? new Parent() : new Child();
}
Let's say you have a simple class hierarchy like
class Animal
{
// Generic animal attributes and properties
};
class Mammal : public Animal
{
// Attributes and properties specific to mammals
};
class Fish : public Animal
{
// Attributes and properties specific to fishes
};
class Cat : public Mammal
{
// Attributes and properties specific to cats
};
class Shark : public Fish
{
// Attributes and properties specific to sharks
};
class Hammerhead : public Shark
{
// Attributes and properties specific to hammerhead sharks
};
[A little long-winded, but I want to have the "concrete" classes to be far away from each other]
Now let's say we have a function like
void do_something_with_animals(Animal* animal);
And finally let's call this function:
Fish *my_fish = new Hammerhead;
Mammal* my_cat = new Cat;
do_something_with_animals(my_fish);
do_something_with_animals(my_cat);
Now if we think a little, in the do_something_with_animals function there is really no way of knowing exactly what the argument animal might point to. Is it a Mammal? A Fish? A specific Fish sub-type?
This is even harder for the compiler if the do_something_with_animals function is defined in a different translation unit, where the definitions of the Mammal and Fish classes (or any of their sub-classes) might not even be available.
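To make the point concrete, here is a minimal sketch of a trimmed-down version of this hierarchy with one virtual function. The name() member and the std::string return type of do_something_with_animals are assumptions added for illustration; the original classes only contain placeholder comments and the original function returns void.

```cpp
#include <string>

// Hypothetical trimmed-down hierarchy; name() is an illustrative member.
class Animal {
public:
    virtual ~Animal() {}
    virtual std::string name() const { return "animal"; }
};
class Mammal : public Animal {};
class Fish : public Animal {};
class Cat : public Mammal {
public:
    std::string name() const { return "cat"; }
};
class Shark : public Fish {
public:
    std::string name() const { return "shark"; }
};
class Hammerhead : public Shark {
public:
    std::string name() const { return "hammerhead"; }
};

// This function only sees an Animal*. Which name() runs is decided at
// run time from the object the pointer actually refers to.
std::string do_something_with_animals(Animal* animal) {
    return "got a " + animal->name();
}
```

Because name() is virtual, the function can be compiled without ever seeing Cat or Hammerhead, and still reach their overrides when called.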
The virtual keyword marks individual functions as late-bound. This isn't about what the compiler can or cannot know about any pointers to the object. It's about communicating programmer intent ("this function is meant to be overridden") and efficiency ("this function needs the late-binding mechanism enabled").
(I started out with some comments on an answer, but decided I should just write up my own answer.)
I've rearranged your code slightly here to make it easier to compile and view the output:
#include <iostream>
#ifdef V
#define VIRTUAL virtual
#else
#define VIRTUAL /*nothing*/
#endif
class A {
public: VIRTUAL char f() { return 'A';}
};
class B : public A {
public: char f() { return 'B';}
};
int main() {
A* pa;
B b;
pa = &b;
std::cout << pa->f() << std::endl;
}
Compiling and running it shows:
$ c++ t.cc && ./a.out
A
$ c++ -DV t.cc && ./a.out
B
which shows that the virtual keyword changes the behavior of the program. This is in fact required by the language standard. Your question could, I think, be best rephrased as Why is the standard written this way (which has a more useful general answer) rather than Can the compiler optimize my code (which has a specific but useless answer: yes, it can, but it's still required to print A, not B).
The language definition doesn't forbid the compiler from doing special optimization tricks. Instead—and especially so in this case, for C++— the language specification specifically tries to make it easier for compiler-writers to optimize. This winds up putting more of a burden on C++ programmers.
If C++ were a different language ...
The feature you're talking about, which is the virtual keyword, specifically exists because of this. The language could have been defined differently (and some other languages are): they could have said that compiler writers must not ever assume that, given some valid A* pa, pa points to some actual instance of type A. Then:
std::cout << pa->f() << std::endl;
would always have to figure out: What is the real underlying type of *pa and hence what function f shall I call here?
In this hypothetical (not-C++) language,[1] a compiler that optimizes could take your code and build it to call B::f() directly, because pa points to an instance of type B. But in this same language, a compiler that tries to optimize heavily could not make assumptions about functions where the underlying type of pa is determined by something not predictable at compile-time:
void f(A* pa) {
std::cout << pa->f() << std::endl;
}
int main(int argc, char **argv) {
A a;
B b;
f(argc > 1 ? &b : &a);
}
This program needs to print A when called with no extra arguments, and B when called with extra arguments. So if our not-C++ language lacks a virtual keyword, or defines it as a no-op, function f—which calls either A::f() or B::f() at run-time—must always figure out which underlying function to call.
[1] It's not C either. The name D is taken. Perhaps P, from the BCPL progression?
Conclusion
Because C++ does have the virtual keyword, the variant we build that has a non-virtual f() in base class A can optimize pa->f() calls by assuming that pa->f() calls A::f(). Hence, instead of actually calling A::f(), an optimizing compiler can just write "A\n" to std::cout. Whether or not the C++ compiler optimizes, the call must produce A rather than B.
The variant with the virtual keyword inserted must not assume that pa->f() calls A::f(). If it can optimize enough to see that pa->f() calls B::f(), and therefore, at compile time, eliminate the call entirely and have the function write "B\n", that's OK! If it can't optimize that much, that's OK too—at least, as far as the language specification goes.
You, as a programmer, are required to know this about the virtual keyword, and to use it whenever you want the compiler to be forced to pick the right function based on the actual runtime class, whether or not the compiler is smart enough to do that at compile-time. If you want to allow and force the compiler to just use the base-class function every time, you can omit the virtual keyword.
I have a classic diamond problem like this in C++
  A
 / \
B   C
 \ /
  D
I know this would normally be solved by making B and C inherit virtually from A.
But my issue is that classes A and B come from a third party library I can't edit and B's inheritance from A is not marked virtual.
Is there a way to solve this?
Thanks for the help ;-)
A simple way to solve this problem is to introduce an Adapter class. This way, the hierarchy becomes
  A
 /
B   AdapterC
 \ /
  D
And the code of AdapterC would look like
class AdapterC
{
public:
explicit AdapterC(C c) : c(std::move(c)) {}
operator C& () { return c; } //Maybe this should be explicit too...
/** Interface of C that you want to expose to D, e.g.
int doSomething(double d) { return c.doSomething(d); }
**/
private:
C c;
};
As the saying goes, "All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections". Of course, it might be a lot of work to write and maintain this Adapter. Hence, I think the people who commented on your question are probably right and that you should revisit your design.
Key design issue
If you can't change the inheritance of A in your library to virtual, there is no way to make a diamond with a single A element at the top. The standard explicitly allows for mixing virtual and non-virtual inheritance of the same base class:
10.1/6: for an object c of class type C, a single subobject of type V is shared by every base subobject of c that has a virtual base class of type V. (...).
10.1/7: A class can have both virtual and non-virtual base classes of a given type.
Example:
namespace mylib { // namespace just to highlight the boundaries of the library
    struct Person { // A
        static int counter;
        int id;
        Person() : id(++counter) {}
        void whoami() { std::cout << "I'm " << id << std::endl; }
    }; // A
    struct Friend : Person {}; // B -> A
    int Person::counter = 0;
}
struct Employee : virtual mylib::Person {}; // C -> A
struct Colleague : Employee, mylib::Friend {}; // D -> (B, C)
...
mylib::Friend p1; // ok !
p1.whoami();
Employee p2; // ok !
p2.whoami();
Colleague p3; // Attention: No diamond !
//p3.whoami(); // ouch !! not allowed: no diamond so for which base
// object has the function to be called ?
p3.Employee::whoami(); // first occurrence of A
p3.mylib::Friend::whoami(); // second occurrence of A
Alternative design
As you have no way to intervene in your external library, you have to organize things differently. But however you do it, it will be sweat and tears.
You could define C (Employee in my example) by using composition of A (Person in my example). The A subobject would either be created or in special cases taken over from another object. You'd need to undertake the effort to replicate A's interface, forwarding the calls to an A subobject.
The general idea would look like:
class Employee {
mylib::Person *a;
bool owna;
protected:
Employee (mylib::Person& x) : a(&x), owna(false) { } // use existing A
public:
Employee () : a(new mylib::Person), owna(true) { } // create A subobject
~Employee () { if (owna) delete a; }
void whoami() { a->whoami(); } // A - forwarding
};
If you do this, you could then define D with multiple inheritance, with a trick in the constructor:
struct Colleague : mylib::Friend, Employee {
Colleague () : mylib::Friend(), Employee(*static_cast<Person*>(this)) {};
using Friend::whoami;
};
The only remaining issue would be the ambiguity of the member functions of A's interface (which have been provided in C as explained above). You therefore have to state with a using declaration that, for A, you go via B and not via C.
In the end, you could use this:
Employee p2;
p2.whoami();
Colleague p3; // Artificial diamond !
p3.whoami(); // YES !!
p3.Employee::whoami(); // first occurrence of A
p3.mylib::Friend::whoami(); // second occurrence of A
// all whoami refer to the same A !!!
It works nicely.
Conclusion
So yes, it's possible to solve this, but it's very tricky. As I said: it will be sweat and tears.
For instance, you have no problem converting a Colleague to a Person. But for Employee, you'd need to provide conversion operators. You have to implement the rule of 3/5 in Employee, and you have to take care of everything that could go wrong (failed allocation, etc.). It will not be a piece of cake.
So it's really worth reconsidering your design, as Lightness Races in Orbit suggested in the comments :-)
First off, I know I cannot do it, and I think it's not a duplicate question (other questions deal with the same problem, but they only want an explanation of why it does not work).
So, I have a similar concept of classes and inheritance and I would, somehow, elegantly, want to do something that's forbidden. Here's a very simple code snippet that reflects what I want to do:
#include <iostream>
#include <vector>
class A{
protected:
int var;
std::vector <double> heavyVar;
public:
A() {var=1;}
virtual ~A() {}
virtual void func() {
std::cout << "Default behavior" << this->var << std::endl;
}
// somewhere along the way, heavyVar is filled with a lot of stuff
};
class B: public A{
protected:
A* myA;
public:
B(A &a) : A() {
this->myA = &a;
this->var = this->myA->var;
// copy some simple data, e.g. flags
// but don't copy a heavy vector variable
}
virtual ~B() {}
virtual void func() {
this->myA->func();
std::cout << "This class is a decorator interface only" << std::endl;
}
};
class C: public B{
private:
int lotsOfCalc(const std::vector <double> &hv){
// do some calculations with the vector contents
}
public:
C(A &a) : B(a) {
// the actual decorator
}
virtual ~C() {}
virtual void func() {
B::func(); // base functionality
int heavyCalc = lotsOfCalc(this->myA->heavyVar); // illegal
// here, I actually access a heavy object (not int), and thus
// would not like to copy it
std::cout << "Expanded functionality " << heavyCalc << std::endl;
}
};
int main(void){
A a;
B b(a);
C c(a);
a.func();
b.func();
c.func();
return 0;
}
The reason for doing this is that I'm actually trying to implement a Decorator Pattern (class B has the myA inner variable that I want to decorate), but I would also like to use some of the protected members of class A while doing the "decorated" calculations (in class B and all of its subclasses). Hence, this example is not a proper example of a decorator (not even a simple one). In the example, I only focused on demonstrating the problematic functionality (what I want to use but I can't). Not even all the classes/interfaces needed to implement a Decorator pattern are used in this example (I don't have an abstract base class interface, inherited by concrete base class instances as well as an abstract decorator interface, to be used as a superclass for concrete decorators). I only mention Decorators for the context (the reason I want an A* pointer).
In this particular case, I don't see much sense in making (my equivalent of) int var public (or even, writing a publicly accessible getter) for two reasons:
the more obvious one, I do not want the users to actually use the information directly (I have some functions that return the information relevant to and/or written in my protected variables, but not the variable value itself)
the protected variable in my case is much heavier to copy than an int (it's a 2D std::vector of doubles), and copying it into the instance of a derived class would be unnecessarily time- and memory-consuming
Right now, I have two different ways of making my code do what I want it to do, but I like neither of them, and I'm searching for a C++ concept that was actually intended for doing something of this sort (I can't be the first person to desire this behavior).
What I have so far and why I don't like it:
1. declaring all the (relevant) inherited classes friends to the base class:
class A{
....
friend class B;
friend class C;
};
I don't like this solution because it would force me to modify my base class every time I write a new subclass, and this is exactly what I'm trying to avoid. (I want to use only the 'A' interface in the main modules of the system.)
2. casting the A* pointer into a pointer of the inherited class and working with that
void B::func(){
B *uglyHack = static_cast<B*>(myA);
std::cout << uglyHack->var + 1 << std::endl;
}
The variable name is pretty suggestive of my feelings about this approach, but this is the one I am using right now. Since I designed these classes, I know how to be careful and to use only the stuff that is actually implemented in class A while treating it as a class B. But if somebody else continues the work on my project, they might not be so familiar with the code. Also, casting a pointer into something that I am very well aware it is not just feels pure evil to me.
I am trying to keep this project's code as nice and cleanly designed as possible, so if anybody has any suggestions towards a solution that does not require the modification of a base class every now and then or usage of evil concepts, I would very much appreciate it.
I do believe that you might want to reconsider the design, but a solution to the specific question of how can I access the member? could be:
class A{
protected:
int var;
static int& varAccessor( A& a ) {
return a.var;
}
};
And then in the derived type call the protected accessor passing the member object by reference:
varAccessor( *this->myA ) = 5; // myA is an A*, so dereference it to bind to A&
Now, if you are thinking of the decorator pattern, I don't think this is the way to go.
The source of the confusion is that most people don't realize that a type has two separate interfaces, the public interface towards users and the virtual interface for implementation providers (i.e. derived types) as in many cases functions are both public and virtual (i.e. the language allows binding of the two semantically different interfaces). In the Decorator pattern you use the base interface to provide an implementation. Inheritance is there so that the derived type can provide the operation for the user by means of some actual work (decoration) and then forwarding the work to the actual object. The inheritance relationship is not there for you to access the implementation object in any way through protected elements, and that in itself is dangerous. If you are passed an object of a derived type that has stricter invariants regarding that protected member (i.e. for objects of type X, var must be an odd number), the approach you are taking would let a decorator (of sorts) break the invariants of that X type that should just be decorated.
I can't find any examples of the decorator pattern being used in this way. It looks like in C++ it's used to decorate and then delegate back to the decoratee's public abstract interface and not accessing non-public members from it.
In fact, I don't see in your example decoration happening. You've just changed the behavior in the child class which indicates to me you just want plain inheritance (consider that if you use your B to decorate another B the effects don't end up chaining like it would in a normal decoration).
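For comparison, a conventional decorator that only uses the decoratee's public interface, and therefore chains, might look roughly like this. This is a sketch with hypothetical names (Component, Plain, Decorator, Brackets, Stars), not a rewrite of the classes in the question:

```cpp
#include <string>

// Hypothetical abstract interface shared by decoratee and decorators.
class Component {
public:
    virtual ~Component() {}
    virtual std::string render() const = 0;
};

class Plain : public Component {
public:
    std::string render() const { return "text"; }
};

// Abstract decorator: holds the decoratee and forwards to its public
// interface only. It never touches non-public members of the decoratee.
class Decorator : public Component {
public:
    explicit Decorator(const Component* inner) : inner_(inner) {}
    std::string render() const { return inner_->render(); }
private:
    const Component* inner_;  // not owned; must outlive the decorator
};

class Brackets : public Decorator {
public:
    explicit Brackets(const Component* inner) : Decorator(inner) {}
    std::string render() const { return "[" + Decorator::render() + "]"; }
};

class Stars : public Decorator {
public:
    explicit Stars(const Component* inner) : Decorator(inner) {}
    std::string render() const { return "*" + Decorator::render() + "*"; }
};
```

Wrapping a Stars in a Brackets, or a Stars in another Stars, stacks the effects, which is the chaining behavior the plain-inheritance version in the question does not give.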
I think I found a nice way to do what I want in the inheritance structure I have.
Firstly, in the base class (the one that is a base for all the other classes, as well as abstract base class interface in the Decorator Pattern), I add a friend class declaration only for the first subclass (the one that would be acting as abstract decorator interface):
class A{
....
friend class B;
};
Then, I add protected access functions in the subclass for all the interesting variables in the base class:
class B : public A{
...
protected:
A *myA;
int getAVar() {return myA->var;}
std::vector <double> &getAHeavyVar() {return myA->heavyVar;}
};
And finally, I can access just the things I need from all the classes that inherit class B (the ones that would be concrete decorators) in a controlled manner (as opposed to static_cast<>) through the access function without the need to make all the subclasses of B friends of class A:
class C : public B{
....
public:
virtual void func() {
B::func(); // base functionality
int heavyCalc = lotsOfCalc(this->getAHeavyVar()); // legal now!
// here, I actually access a heavy object (not int), and thus
// would not like to copy it
std::cout << "Expanded functionality " << heavyCalc << std::endl;
std::cout << "And also the int: " << this->getAVar() << std::endl;
// this time, completely legal
}
};
I was also trying to give only certain functions in the class B a friend access (declaring them as friend functions) but that did not work since I would need to declare the functions inside of class B before the friend declaration in class A. Since in this case class B inherits class A, that would give me circular dependency (forward declaration of class B is not enough for using only friend functions, but it works fine for a friend class declaration).
This might have been asked a million times before or might be incredibly stupid but why is it not implemented?
class A
{
public:
A(){ a = 5;}
int a;
};
class B:public A
{
public:
B(){ a = 0.5;}
float a;
};
int main()
{
A * a = new B();
cout<<a->a;
getch();
return 0;
}
This code will access A::a. How do I access B::a?
To access B::a:
cout << static_cast<B*>(a)->a;
To explicitly access both A::a and B::a:
cout << static_cast<B*>(a)->A::a;
cout << static_cast<B*>(a)->B::a;
(dynamic_cast is sometimes better than static_cast, but it can't be used here because A and B are not polymorphic.)
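As a rough illustration of that remark: if A were given a virtual destructor (a change to the original classes, made here only for the sake of the example), dynamic_cast would become usable and would report a mismatch instead of silently misbehaving. The helper b_value_or is a hypothetical name:

```cpp
struct A {
    A() : a(5) {}
    virtual ~A() {}  // makes A polymorphic, which dynamic_cast requires
    int a;
};

struct B : A {
    B() : a(0.5f) {}
    float a;  // hides A::a, as in the question
};

// Returns B::a if p really points at a B, otherwise the fallback value.
float b_value_or(A* p, float fallback) {
    B* pb = dynamic_cast<B*>(p);  // null when *p is not a B
    return pb ? pb->a : fallback;
}
```

With static_cast, passing a pointer to a plain A would compile and then read memory that is not a B, whereas here the null check catches it.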
As to why C++ doesn't have virtual variables: Virtual functions permit polymorphism; in other words, they let objects of two different classes be treated the same by calling code, with any differences in the internal behavior of those two classes being encapsulated within the virtual functions.
Virtual member variables wouldn't really make sense; there's no behavior to encapsulate with simply accessing a variable.
Also keep in mind that C++ is statically typed. Virtual functions let you change behavior at runtime; your example code is trying to change not only behavior but data types at runtime (A::a is int, B::a is float), and C++ doesn't work that way. If you need to accommodate different data types at runtime, you need to encapsulate those differences within virtual functions that hide the differences in data types. For example (demo code only; for real code, you'd overload operator<< instead):
class A
{
public:
A(){ a = 5;}
int a;
virtual void output_to(ostream& o) const { o << a; }
};
class B:public A
{
public:
B(){ a = 0.5;}
float a;
void output_to(ostream& o) const { o << a; }
};
Also keep in mind that making member variables public like this can break encapsulation and is generally frowned upon.
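The "overload operator<<" remark could be sketched as follows, assuming the data members are made private and a single non-member operator forwards to the virtual function. The to_text helper is a hypothetical addition used only to capture the output as a string:

```cpp
#include <ostream>
#include <sstream>
#include <string>

class A {
public:
    A() : a(5) {}
    virtual ~A() {}
    virtual void output_to(std::ostream& o) const { o << a; }
private:
    int a;
};

class B : public A {
public:
    B() : a(0.5f) {}
    void output_to(std::ostream& o) const { o << a; }
private:
    float a;
};

// One non-member operator<< serves the whole hierarchy by forwarding
// to the virtual function.
std::ostream& operator<<(std::ostream& o, const A& obj) {
    obj.output_to(o);
    return o;
}

// Hypothetical helper: streams an object into a string.
std::string to_text(const A& obj) {
    std::ostringstream s;
    s << obj;
    return s.str();
}
```

Calling code can then write `std::cout << *ptr` for any A* without knowing whether the stored value is an int or a float.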
By not making data public, and accessing them through virtual functions.
Consider for a moment, how what you ask for would have to be implemented. Basically, it would force any access to any data member to go through a virtual function. Remember, you are accessing data through a pointer to an A object, and class A doesn't know what you've done in class B.
In other words, we could make accessing any data member anywhere much slower -- or you could write a virtual method. Guess which C++'s designers chose.
You can't do this and C++ does not support it because it breaks with fundamental C++ principles.
A float is a different type than an int, and name lookup as well as determining what conversions will be needed for a value assignment happens at compile time. However what is really named by a->a including its actual type would only be known at runtime.
You can use templates to parameterize class A
template<typename T>
class A
{
public:
// see also constructor initializer lists
A(T t){ a = t; }
T a;
};
Then you can pass the type, however only at compile time for the above mentioned principle's reason.
A<int> a(5);
A<float> b(5.5f);
(dynamic_cast<B*>(a))->a ?
Why do you need that after all? Are virtual functions not enough?
You can downcast your variable to access B::a.
Something like:
((B*)a)->a
I think it is the same in most OO programming languages. I can't think of any that implements a virtual-variables concept...
You can create such an effect like this:
#include <iostream>
class A {
public:
double value;
A() {}
virtual ~A() {}
virtual void doSomething() {}
};
class B : public A {
public:
void doSomething() {
A::value = 3.14;
}
};
int main() {
A* a = new B();
a->doSomething();
std::cout << a->value << std::endl;
delete a;
return 0;
}
In the example above you could say that the value of A has the same effect as a virtual variable should have.
Edit: This is the actual answer to your question, but seeing your code example I noticed that you're looking for different types in the virtual variable. You could replace double value with a union like this:
union {
int intValue;
float floatValue;
} value;
and access it like:
a->value.intValue = 3;
assert(a->value.intValue == 3); // read back the member written last; reading floatValue here would be undefined behavior in C++
Note, for speed reasons I would avoid this.
Because in C++ the offset of a field within a class or struct is normally fixed at compile time. This also applies when accessing base class fields.
Your example wouldn't work with virtual getters either, as the override requires the same type signature. If that was necessary, your virtual getter would have to return an algebraic type and the receiving code would have to check at run-time if it was of the expected type.
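A minimal sketch of that idea, assuming C++17 is available and using std::variant as the algebraic type. Value, get() and describe() are hypothetical names:

```cpp
#include <string>
#include <variant>

// Hypothetical algebraic return type: either an int or a float.
typedef std::variant<int, float> Value;

class A {
public:
    virtual ~A() {}
    virtual Value get() const { return Value(5); }
};

class B : public A {
public:
    // Same signature as A::get, so it overrides; only the alternative differs.
    Value get() const { return Value(0.5f); }
};

// The receiving code checks at run time which alternative it was given.
std::string describe(const A& obj) {
    const Value v = obj.get();
    if (std::holds_alternative<int>(v)) {
        return "int " + std::to_string(std::get<int>(v));
    }
    return "float " + std::to_string(std::get<float>(v));
}
```

The run-time check in describe() is exactly the cost the answer points at: the type difference does not disappear, it just moves into the caller.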
Leaving aside the argument that virtual methods should be private, virtual methods are intended as an extra layer of encapsulation (encapsulating variations in behavior). Directly accessing fields goes against encapsulation to begin with so it would be a bit hypocritical to make virtual fields. And since fields don't define behavior they merely store data, there isn't really any behavior to be virtualized. The very fact that you have a public int or float is an anti-pattern.
This isn't supported by C++ because it violates the principles of encapsulation.
Your classes should expose and implement a public (possibly virtual) interface that tells class users nothing about the internal workings of your class. The interface should describe operations (and results) that the class can do at an abstract level, not as "set this variable to X".
Since C++ lacks the interface feature of Java and C#, what is the preferred way to simulate interfaces in C++ classes? My guess would be multiple inheritance of abstract classes.
What are the implications in terms of memory overhead/performance?
Are there any naming conventions for such simulated interfaces, such as SerializableInterface?
Since C++ has multiple inheritance unlike C# and Java, yes you can make a series of abstract classes.
As for convention, it is up to you; however, I like to precede the class names with an I.
class IStringNotifier
{
public:
virtual void sendMessage(std::string &strMessage) = 0;
virtual ~IStringNotifier() { }
};
Performance is nothing to worry about compared with C# and Java. Basically you will just have the overhead of a lookup table for your functions (a vtable), just as any sort of inheritance with virtual methods would give you.
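As a rough sketch of how a class would implement more than one such interface: IStringNotifier is repeated from above, while IResettable and MessageLog are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class IStringNotifier {
public:
    virtual void sendMessage(std::string& strMessage) = 0;
    virtual ~IStringNotifier() {}
};

// Hypothetical second interface.
class IResettable {
public:
    virtual void reset() = 0;
    virtual ~IResettable() {}
};

// "Implements" both interfaces through public multiple inheritance.
class MessageLog : public IStringNotifier, public IResettable {
public:
    void sendMessage(std::string& strMessage) { messages_.push_back(strMessage); }
    void reset() { messages_.clear(); }
    std::size_t size() const { return messages_.size(); }
private:
    std::vector<std::string> messages_;
};
```

A MessageLog can be passed wherever an IStringNotifier* or an IResettable* is expected, much like a Java class that implements two interfaces.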
There's really no need to 'simulate' anything as it is not that C++ is missing anything that Java can do with interfaces.
From a C++ point of view, Java makes an "artificial" distinction between an interface and a class. An interface is just a class all of whose methods are abstract and which cannot contain any data members.
Java makes this restriction as it does not allow unconstrained multiple inheritance, but it does allow a class to implement multiple interfaces.
In C++, a class is a class and an interface is a class. extends is achieved by public inheritance and implements is also achieved by public inheritance.
Inheriting from multiple non-interface classes can result in extra complications but can be useful in some situations. If you restrict yourself to only inheriting classes from at most one non-interface class and any number of completely abstract classes then you aren't going to encounter any other difficulties than you would have in Java (other C++ / Java differences excepted).
In terms of memory and overhead costs, if you are re-creating a Java style class hierarchy then you have probably already paid the virtual function cost on your classes in any case. Given that you are using different runtime environments anyway, there's not going to be any fundamental difference in overhead between the two in terms of cost of the different inheritance models.
"What are the implications in terms of memory overhead/performance?"
Usually none except those of using virtual calls at all, although nothing much is guaranteed by the standard in terms of performance.
On memory overhead, the "empty base class" optimization explicitly permits the compiler to lay out structures such that adding a base class which has no data members does not increase the size of your objects. I think you're unlikely to have to deal with a compiler which does not do this, but I could be wrong.
Adding the first virtual member function to a class usually increases objects by the size of a pointer, compared with if they had no virtual member functions. Adding further virtual member functions makes no additional difference. Adding virtual base classes might make a further difference, but you don't need that for what you're talking about.
Adding multiple base classes with virtual member functions probably means that in effect you only get the empty base class optimisation once, because in a typical implementation the object will need multiple vtable pointers. So if you need multiple interfaces on each class, you may be adding to the size of the objects.
On performance, a virtual function call has a tiny bit more overhead than a non-virtual function call, and more importantly you can assume that it generally won't be inlined (unless the compiler can prove the object's dynamic type at the call site). Adding an empty base class doesn't usually add any code to construction or destruction, because the empty base constructor and destructor can be inlined into the derived class constructor/destructor code.
There are tricks you can use to avoid virtual functions if you want explicit interfaces, but you don't need dynamic polymorphism. However, if you're trying to emulate Java then I assume that's not the case.
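One such trick is the curiously recurring template pattern (CRTP). The sketch below is only an illustration of the idea, not something taken from the question; Serializer, Point, doSerialize and save are hypothetical names.

```cpp
#include <string>

// CRTP base: the "interface" is expressed through the template parameter,
// so the call is resolved at compile time and no vtable is involved.
template <typename Derived>
class Serializer {
public:
    std::string serialize() const {
        // Statically dispatch to the derived class's implementation.
        return static_cast<const Derived&>(*this).doSerialize();
    }
};

// Hypothetical implementation of the static interface.
class Point : public Serializer<Point> {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    std::string doSerialize() const {
        return std::to_string(x_) + "," + std::to_string(y_);
    }
private:
    int x_;
    int y_;
};

// Generic code accepts any Serializer<T>; the compiler is free to inline the call.
template <typename T>
std::string save(const Serializer<T>& s) { return s.serialize(); }
```

The trade-off is that there is no common runtime base type: a Serializer<Point> and a Serializer<Other> are unrelated classes, so this only fits when dynamic polymorphism is not needed.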
Example code showing the object sizes involved:
#include <iostream>
// A is an interface
struct A {
virtual ~A() {};
virtual int a(int) = 0;
};
// B is an interface
struct B {
virtual ~B() {};
virtual int b(int) = 0;
};
// C has no interfaces, but does have a virtual member function
struct C {
~C() {}
int c;
virtual int getc(int) { return c; }
};
// D has one interface
struct D : public A {
~D() {}
int d;
int a(int) { return d; }
};
// E has two interfaces
struct E : public A, public B{
~E() {}
int e;
int a(int) { return e; }
int b(int) { return e; }
};
int main() {
E e; D d; C c;
std::cout << "A : " << sizeof(A) << "\n";
std::cout << "B : " << sizeof(B) << "\n";
std::cout << "C : " << sizeof(C) << "\n";
std::cout << "D : " << sizeof(D) << "\n";
std::cout << "E : " << sizeof(E) << "\n";
}
Output (GCC on a 32-bit platform):
A : 4
B : 4
C : 8
D : 8
E : 12
Interfaces in C++ are classes which have only pure virtual functions. For example:
class ISerializable
{
public:
virtual ~ISerializable() = 0; // a pure virtual destructor still needs a definition elsewhere
virtual void serialize( stream& target ) = 0;
};
This is not a simulated interface; it is an interface like the ones in Java, but without their drawbacks.
For example, you can add methods and members without negative consequences:
class ISerializable
{
public:
virtual ~ISerializable() = 0; // a pure virtual destructor still needs a definition elsewhere
virtual void serialize( stream& target ) = 0;
protected:
void serialize_atomic( int i, stream& t );
bool serialized;
};
As for naming conventions, the C++ language itself defines none, so use whichever convention is customary in your environment.
The overhead is one static table per class and, in each object of a derived class that did not already have virtual functions, a pointer to that static table.
In C++ we can go further than the plain behaviour-less interfaces of Java & co.
We can add explicit contracts (as in Design by Contract) with the NVI pattern.
struct Contract1 : noncopyable
{
virtual ~Contract1();
Res f(Param p) {
assert(f_precondition(p) && "C1::f precondition failed");
const Res r = do_f(p);
assert(f_postcondition(p,r) && "C1::f postcondition failed");
return r;
}
private:
virtual Res do_f(Param p) = 0;
};
struct Concrete : virtual Contract1, virtual Contract2
{
...
};
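For readers who want something compilable, here is a minimal sketch of the same NVI idea with a concrete contract. SquareRoot, StdSquareRoot, compute and do_compute are hypothetical names, and the tolerance in the postcondition is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <cmath>

// Contract: compute(x) requires x >= 0 and promises r * r is close to x.
class SquareRoot {
public:
    virtual ~SquareRoot() {}

    // Public and non-virtual: the contract checks wrap the customization point,
    // so implementers cannot bypass them.
    double compute(double x) const {
        assert(x >= 0.0 && "SquareRoot::compute precondition failed");
        const double r = do_compute(x);
        assert(std::fabs(r * r - x) <= 1e-9 * (1.0 + x) &&
               "SquareRoot::compute postcondition failed");
        return r;
    }

private:
    // Private and pure virtual: this is the only part implementers provide.
    virtual double do_compute(double x) const = 0;
};

// One possible implementation of the contract.
class StdSquareRoot : public SquareRoot {
private:
    double do_compute(double x) const { return std::sqrt(x); }
};
```

Callers only ever see compute(); a derived class overrides do_compute() even though it is private in the base, which is what lets the base keep control of the checks.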
Interfaces in C++ can also occur statically, by documenting the requirements on template type parameters.
Templates pattern-match on syntax, so you don't have to specify up front that a particular type implements a particular interface, so long as it has the right members. This is in contrast to Java's <? extends Interface> or C#'s where T : IInterface style constraints, which require the substituted type to know about (I)Interface.
A great example of this is the Iterator family, which are implemented by, among other things, pointers.
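A small sketch of such a static interface follows. The function names (sum, sumArray, sumVector) are made up for this example; the point is that the requirements on the template parameter exist only as documentation.

```cpp
#include <cstddef>
#include <vector>

// Requirements on It (documented, not declared anywhere):
//   *it       yields a value convertible to long
//   ++it      advances to the next element
//   a != b    compares two positions
template <typename It>
long sum(It first, It last) {
    long total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// Plain pointers satisfy the requirements...
long sumArray(const int* data, std::size_t n) { return sum(data, data + n); }

// ...and so do std::vector iterators, with no common base class involved.
long sumVector(const std::vector<int>& v) { return sum(v.begin(), v.end()); }
```

Neither const int* nor the vector's iterator type declares that it "implements" anything; each simply provides the operations that sum() happens to use.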
If you don't use virtual inheritance, the overhead should be no worse than regular inheritance with at least one virtual function. Each abstract class inherited from will add a pointer to each object.
However, if you do something like the Empty Base Class Optimization, you can minimize that:
struct A
{
virtual void func1() = 0;
};
struct B: A
{
virtual void func2() = 0;
};
struct C: B
{
int i;
};
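On a typical implementation the size of C will be two words: one vtable pointer shared by the whole chain, plus the int (padded up to a word where necessary).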
By the way, MSVC 2008 has an __interface keyword.
A Visual C++ interface can be defined as follows:
- Can inherit from zero or more base interfaces.
- Cannot inherit from a base class.
- Can only contain public, pure virtual methods.
- Cannot contain constructors, destructors, or operators.
- Cannot contain static methods.
- Cannot contain data members; properties are allowed.
This feature is Microsoft-specific. Caution: an __interface has no virtual destructor, which is required if you delete objects through pointers to the interface.
There is no good way to implement an interface the way you're asking. The problem with an approach such as a completely abstract ISerializable base class lies in the way that C++ implements multiple inheritance. Consider the following:
class Base
{
};
class ISerializable
{
public:
virtual string toSerial() const = 0; // const so it can be called through a const reference
virtual void fromSerial(const string& s) = 0;
};
class Subclass : public Base, public ISerializable
{
};
void someFunc(fstream& out, const ISerializable& o)
{
out << o.toSerial();
}
Clearly the intent is for the function toSerial() to serialize all of the members of Subclass including those that it inherits from Base class. The problem is that there is no path from ISerializable to Base. You can see this graphically if you execute the following:
void fn(Base& b)
{
cout << (void*)&b << endl;
}
void fn(ISerializable& i)
{
cout << (void*)&i << endl;
}
void someFunc(Subclass& s)
{
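fn(static_cast<Base&>(s)); // explicit cast: a plain fn(s) is ambiguous between the two overloads
fn(static_cast<ISerializable&>(s));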
}
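On a typical implementation, the value output by the first call is not the same as the value output by the second call (an empty Base, as written here, may end up sharing an address with the other subobject, so give Base a data member to see the difference reliably). Even though a reference to s is passed in both cases, the compiler adjusts the address passed to match the proper base class type.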
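The address adjustment can be observed with a sketch like the one below. It assumes a Base with one data member (the Base above is empty), and the helper names addressAsBase and addressAsSerializable are invented for the example.

```cpp
#include <string>

// Assumption: Base carries a data member so that it occupies storage of its own.
class Base {
public:
    Base() : id(42) {}
    int id;
};

class ISerializable {
public:
    virtual ~ISerializable() {}
    virtual std::string toSerial() const = 0;
};

class Subclass : public Base, public ISerializable {
public:
    std::string toSerial() const { return "Subclass"; }
};

// Each helper converts explicitly to one base and returns the address
// of that base subobject inside the same Subclass object.
const void* addressAsBase(const Subclass& s) {
    return static_cast<const Base*>(&s);
}

const void* addressAsSerializable(const Subclass& s) {
    return static_cast<const ISerializable*>(&s);
}
```

For a single Subclass object the two helpers return different addresses, because the Base part and the ISerializable part are distinct, non-overlapping subobjects; which of the two comes first in memory is up to the implementation.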