How do you 'de-serialize' a derived class from serialized data? Or maybe I should say, is there a better way to 'de-serialize' data into derived classes?
For example, suppose you had a pure virtual base class (B) that is inherited by three other classes, X, Y and Z. Moreover, we have a method, serialize(), that will translate X:B, Y:B and Z:B into serialized data.
This way it can be zapped across a socket, a named pipe, etc. to a remote process.
The problem I have is, how do we create an appropriate object from the serialized data?
The only solution I can come up with is to include an identifier in the serialized data that indicates the final derived object type. The receiver first parses the derived type field from the serialized data, and then uses a switch statement (or similar logic) to invoke the appropriate constructor.
For example:
B deserialize( serial_data )
{
parse the derived type from the serial_data
switch (derived type)
case X
return X(serial_data)
case Y
return Y(serial_data)
case Z
return Z(serial_data)
}
So after learning the derived object type we invoke the appropriate derived type constructor.
However, this feels awkward and cumbersome. I'm hoping there is a more elegant way of doing this. Is there?
In fact, it's a more general issue than serialization called Virtual Constructor.
The traditional approach is to use a Factory, which returns the right derived type based on an ID. There are two solutions:
the switch method as you noticed, though you need to allocate on the heap
the prototype method
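The switch method, with heap allocation, could be sketched roughly as follows. The one-byte tag at the start of the data, the TypeTag enum, the name() member and the example() function are all assumptions made for illustration; the question does not specify a wire format.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical one-byte type tags; the values are assumptions for this sketch.
enum class TypeTag : char { X = 'X', Y = 'Y', Z = 'Z' };

struct B {
    virtual ~B() = default;
    virtual std::string name() const = 0;
};

// Each derived class builds itself from the payload (real parsing omitted).
struct X : B {
    explicit X(const std::string&) {}
    std::string name() const override { return "X"; }
};
struct Y : B {
    explicit Y(const std::string&) {}
    std::string name() const override { return "Y"; }
};
struct Z : B {
    explicit Z(const std::string&) {}
    std::string name() const override { return "Z"; }
};

// Assumed format: the first byte is the type tag, the rest is the payload.
// Returns nullptr for empty input or an unknown tag.
std::unique_ptr<B> deserialize(const std::string& serial_data) {
    if (serial_data.empty()) return nullptr;
    const std::string payload = serial_data.substr(1);
    switch (static_cast<TypeTag>(serial_data[0])) {
        case TypeTag::X: return std::make_unique<X>(payload);
        case TypeTag::Y: return std::make_unique<Y>(payload);
        case TypeTag::Z: return std::make_unique<Z>(payload);
    }
    return nullptr;  // unknown tag
}

void example() {
    std::unique_ptr<B> obj = deserialize("Ypayload");
    assert(obj && obj->name() == "Y");
}
```

Returning a smart pointer rather than B by value avoids slicing the derived object down to its base part.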
The prototype method goes like so:
// Cloneability
class Base
{
public:
virtual Base* clone() const = 0;
};
class Derived: public Base
{
public:
virtual Derived* clone() const { return new Derived(*this); }
};
// Factory
class Factory
{
public:
Base* get(std::string const& id) const;
void set(std::string const& id, Base* exemplar);
private:
typedef std::map < std::string, Base* > exemplars_type;
exemplars_type mExemplars;
};
It is somewhat traditional to make the Factory a singleton, but it's another matter entirely.
For deserialization proper, it's easier if you have a virtual method deserialize to call on the object.
EDIT: How does the Factory work ?
In C++ you can't create a type you don't know about. The idea above is therefore that the task of building a Derived object is given to the Derived class, by way of the clone method.
Next comes the Factory. We are going to use a map which will associate a "tag" (for example "Derived") to an instance of an object (say Derived here).
Factory factory;
Derived derived;
factory.set("Derived", &derived);
Now, when we want to create an object whose type we don't know at compile time (because the type is decided on the fly), we pass a tag to the factory and ask for an object in return.
std::unique_ptr<Base> base(factory.get("Derived"));
Under the covers, the Factory will find the Base* associated with the "Derived" tag and invoke the clone method of the object. This will actually (here) create an object of runtime type Derived.
We can verify this by using the typeid operator:
assert( typeid(*base) == typeid(Derived) );
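Putting the pieces together, a minimal sketch of the prototype factory might look like this. It departs from the declarations above in two ways, both of which are assumptions of this sketch: set() clones the exemplar so the factory owns its own copy, and get() returns a std::unique_ptr and yields nullptr for an unknown tag.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <typeinfo>

class Base {
public:
    virtual ~Base() = default;
    virtual Base* clone() const = 0;
};

class Derived : public Base {
public:
    Derived* clone() const override { return new Derived(*this); }
};

class Factory {
public:
    // Returns a fresh copy of the exemplar registered under id,
    // or nullptr if nothing was registered under that tag.
    std::unique_ptr<Base> get(std::string const& id) const {
        auto it = mExemplars.find(id);
        if (it == mExemplars.end()) return nullptr;
        return std::unique_ptr<Base>(it->second->clone());
    }

    // Stores a private copy, so the factory does not depend on the
    // lifetime of the caller's object.
    void set(std::string const& id, Base const& exemplar) {
        mExemplars[id] = std::unique_ptr<Base>(exemplar.clone());
    }

private:
    std::map<std::string, std::unique_ptr<Base>> mExemplars;
};

void demo() {
    Factory factory;
    factory.set("Derived", Derived());
    std::unique_ptr<Base> base = factory.get("Derived");
    assert(base != nullptr);
    assert(typeid(*base) == typeid(Derived));  // dynamic type, not Base
    assert(factory.get("Unknown") == nullptr);
}
```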
inmemory:
--------
type1 {
chartype a;
inttype b;
};
serialize(new type1());
serialized(ignore { and ,):
---------------------------
type1id,len{chartypeid,adata,inttypeid,bdata}
I guess that, in an ideal serialization protocol, every non-primitive type needs to be prefixed with typeid,len. Even if you serialize a single type that is not derived from anything, you would add a type id, because the other end has to know what type it's getting (regardless of inheritance structure). So you have to mention derived class ids in the serialization, because logically they are different types. Correct me if I am wrong.
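As a rough illustration of the typeid,len layout above, the sketch below writes a type id and a one-byte length in front of the fields. The numeric ids, the one-byte length, the big-endian integer encoding and the peek_type_id helper are all assumptions made for this example, not part of any real protocol.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical type ids; any stable numbering agreed on by both ends would do.
constexpr std::uint8_t kCharTypeId = 1;
constexpr std::uint8_t kIntTypeId = 2;
constexpr std::uint8_t kType1Id = 100;

struct Type1 {
    char a;
    std::int32_t b;
};

// Layout: type1id, len, chartypeid, a, inttypeid, b (4 bytes, big-endian).
std::string serialize(const Type1& v) {
    std::string body;
    body.push_back(static_cast<char>(kCharTypeId));
    body.push_back(v.a);
    body.push_back(static_cast<char>(kIntTypeId));
    for (int shift = 24; shift >= 0; shift -= 8) {
        body.push_back(static_cast<char>(
            (static_cast<std::uint32_t>(v.b) >> shift) & 0xFF));
    }
    assert(body.size() <= 255);  // the length field is a single byte here
    std::string out;
    out.push_back(static_cast<char>(kType1Id));
    out.push_back(static_cast<char>(body.size()));
    return out + body;
}

// The receiver reads the leading id first and dispatches on it.
std::uint8_t peek_type_id(const std::string& data) {
    return static_cast<std::uint8_t>(data.at(0));
}
```

With the length stored up front, a receiver that does not recognize a type id can still skip over the record.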
I have an abstract base class named component.
It has derived non-abstract classes like resistor, generator etc...
In my circuit class, I have a heterogeneous std::vector<sim::component*> named component_list, which I use to handle all the components inserted in the circuit.
Then I have the following function :
void circuit::insert(sim::component& comp, std::vector<sim::node*> nodes)
In the function definition, I want to copy the component named comp
in order to insert a pointer to it in my component_list
(so that I can manage its lifetime)
I tried something along those lines :
sim::component *copy = new sim::component(comp);
but of course, sim::component is abstract and I can't instantiate it.
How can I make a copy of an object whose real class is unknown at compile time?
One traditional way to solve it is to let the objects clone themselves, plus a bit of CRTP.
I. First, you make your abstract class clonable:
struct Component {
virtual Component *clone() const = 0;
virtual ~Component() {}
};
Now, every Component should define its own implementation of clone().
II. Which is easily automated via CRTP:
template<class Concrete> struct CompBase: Component {
Component *clone() const {
return new Concrete(static_cast<Concrete const &>(*this));
}
virtual ~CompBase() {}
};
struct Generator: CompBase<Generator> {}; // already has clone() defined
Note that I've used plain pointers in the example, though it is generally recommended to use smart pointers instead. std::unique_ptr would fit quite nicely, along with std::make_unique.
Which creates another opportunity: with unique_ptr you can even forget about cloning and simply pass unique_ptrs as objects, each one with its own concrete class instance inside, and store them in a vector.
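A sketch of the same idea with std::unique_ptr, showing both options: cloning into the container, and handing over an already heap-allocated component. The Circuit class, the Resistor type and the kind() member are hypothetical stand-ins for the question's sim::circuit and its components.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Component {
    virtual ~Component() = default;
    virtual std::unique_ptr<Component> clone() const = 0;
    virtual std::string kind() const = 0;
};

// CRTP helper: implements clone() once for every concrete component.
template <class Concrete>
struct CompBase : Component {
    std::unique_ptr<Component> clone() const override {
        return std::make_unique<Concrete>(static_cast<Concrete const&>(*this));
    }
};

struct Generator : CompBase<Generator> {
    std::string kind() const override { return "generator"; }
};

struct Resistor : CompBase<Resistor> {
    double ohms = 0.0;
    std::string kind() const override { return "resistor"; }
};

// Hypothetical stand-in for the circuit class from the question.
class Circuit {
public:
    // Copies the component; the caller keeps its own object.
    void insert(Component const& comp) {
        component_list.push_back(comp.clone());
    }
    // Takes ownership of an existing object; no cloning needed.
    void insert(std::unique_ptr<Component> comp) {
        assert(comp != nullptr);
        component_list.push_back(std::move(comp));
    }

    std::vector<std::unique_ptr<Component>> component_list;
};
```

Because the vector owns its elements, they are destroyed together with the Circuit and no manual delete is required.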
I am a relatively new C++ programmer.
In writing some code I've created something similar in concept to the code below. When a friend pointed out that this is in fact a factory pattern, I read about the pattern and saw that it is similar.
In all of the examples I've found the factory pattern is always implemented using a separate class such as class BaseFactory{...}; and not as I've implemented it using a static create() member function.
My questions are:
(1) Is this in fact a factory pattern?
(2) The code seems to work. Is there something incorrect in the way I've implemented it?
(3) If my implementation is correct, what are the pros/cons of implementing the static create() function as opposed to the separate BaseFactory class.
Thanks!
class Base {
public:
    ...
    virtual ~Base() {}
    static Base* create(bool type);
};
class Derived0 : public Base {
...
};
class Derived1 : public Base {
...
};
Base* Base::create(bool type) {
if(type == 0) {
return new Derived0();
}
else {
return new Derived1();
}
}
void foo(bool type) {
Base* pBase = Base::create(type);
pBase->doSomething();
}
This is not a typical way to implement the factory pattern, the main reason being that the factory class isn't typically a base of the classes it creates. A common guideline for when to use inheritance is "Make sure public inheritance models "is-a"". In your case this means that objects of type Derived0 or Derived1 should also be of type Base, and the derived classes should represent a more specialised concept than the Base.
However, the factory pattern pretty much always involves inheritance, as the factory will return a pointer to a base type (yours does this too). This means the client code doesn't need to know what type of object the factory created, only that it matches the base class's interface.
With regard to having a static create function, it depends on the situation. One advantage, as your example shows, is that you won't need to create an instance of the factory in order to use it.
Your factory is OK, except for the fact that you merged the factory and the interface, breaking the single responsibility principle (SRP).
Instead of making the create static method in the base class, create it in another (factory) class.
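A minimal sketch of that separation, assuming a BaseFactory class (the name comes from the question's own wording) and a doSomething() that returns a string purely so the result can be observed:

```cpp
#include <cassert>
#include <memory>
#include <string>

// The interface only describes behaviour.
class Base {
public:
    virtual ~Base() = default;
    virtual std::string doSomething() const = 0;
};

class Derived0 : public Base {
public:
    std::string doSomething() const override { return "Derived0"; }
};

class Derived1 : public Base {
public:
    std::string doSomething() const override { return "Derived1"; }
};

// The factory is the only place that knows the concrete types.
class BaseFactory {
public:
    static std::unique_ptr<Base> create(bool type) {
        if (type) {
            return std::make_unique<Derived1>();
        }
        return std::make_unique<Derived0>();
    }
};

std::string foo(bool type) {
    std::unique_ptr<Base> pBase = BaseFactory::create(type);
    assert(pBase != nullptr);
    return pBase->doSomething();
}   // pBase is released here; no manual delete needed
```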
class base {
public:
    .....
    virtual int function1();
    virtual int function2();
};
class derived : public base {
public:
    int function1();
    int function2();
};
int main()
{
    derived d;
    base *b = &d;
    int k = b->function1(); // Why use this instead of the following line?
    int k2 = d.function1(); // With this, the need for virtual functions is gone, right?
}
I am not a CompSci engineer and I would like to know this. Why use virtual functions if we can avoid base class pointers?
The power of polymorphism isn't really apparent in your simple example, but if you extend it a bit it might become clearer.
class vehicle {
public:
    .....
    virtual int getEmission();
};
class car : public vehicle {
public:
    int getEmission();
};
class bus : public vehicle {
public:
    int getEmission();
};
int main()
{
    car a;
    car b;
    car c;
    bus d;
    bus e;
    vehicle *traffic[]={&a,&b,&c,&d,&e};
    int totalEmission=0;
    for(int i=0;i<5;i++)
    {
        // calls car::getEmission or bus::getEmission depending on the object
        totalEmission+=traffic[i]->getEmission();
    }
}
This lets you iterate through a list of pointers and have different methods get called depending on the underlying type. Basically it lets you write code where you don't need to know what the child type is at compile time, but the code will perform the right function anyway.
You're correct, if you have an object you don't need to refer to it via a pointer. You also don't need a virtual destructor when the object will be destroyed as the type it was created.
The utility comes when you get a pointer to an object from another piece of code, and you don't really know what the most derived type is. You can have two or more derived types built on the same base, and have a function that returns a pointer to the base type. Virtual functions will allow you to use the pointer without worrying about which derived type you're using, until it's time to destroy the object. The virtual destructor will destroy the object without you knowing which derived class it corresponds to.
Here's the simplest example of using virtual functions:
base *b = new derived;
b->function1();
delete b;
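To make the role of the virtual destructor visible, here is a small sketch in which the derived destructor records that it ran. The destroyed_ flag and the destroyed_through_base() function are additions for this illustration only.

```cpp
#include <cassert>
#include <memory>

struct base {
    virtual ~base() = default;            // virtual: deleting via base* is safe
    virtual int function1() { return 1; }
};

struct derived : base {
    explicit derived(bool* destroyed) : destroyed_(destroyed) {}
    ~derived() override { *destroyed_ = true; }
    int function1() override { return 2; }
    bool* destroyed_;
};

// Returns true if ~derived() ran when the object was destroyed
// through a pointer to base.
bool destroyed_through_base() {
    bool destroyed = false;
    {
        std::unique_ptr<base> b = std::make_unique<derived>(&destroyed);
        assert(b->function1() == 2);      // dispatches to derived::function1
    }                                     // unique_ptr<base> deletes via base*
    return destroyed;
}
```

Without the virtual keyword on ~base(), deleting a derived object through a base pointer would be undefined behaviour.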
It's to implement polymorphism. Unless you have a base class pointer (or reference)
pointing to a derived object, you cannot have polymorphism here.
One of the key features of derived classes is that a pointer to a
derived class is type-compatible with a pointer to its base class.
Polymorphism is the art of taking advantage of this simple but
powerful and versatile feature, that brings Object Oriented
Methodologies to its full potential.
In C++, a special type/subtype relationship exists in which a base
class pointer or a reference can address any of its derived class
subtypes without programmer intervention. This ability to manipulate
more than one type with a pointer or a reference to a base class is
spoken of as polymorphism.
Subtype polymorphism allows us to write the kernel of our application
independent of the individual types we wish to manipulate. Rather, we
program the public interface of the base class of our abstraction
through base class pointers and references. At run-time, the actual
type being referenced is resolved and the appropriate instance of the
public interface is invoked. The run-time resolution of the
appropriate function to invoke is termed dynamic binding (by default,
functions are resolved statically at compile-time). In C++, dynamic
binding is supported through a mechanism referred to as class virtual
functions. Subtype polymorphism through inheritance and dynamic
binding provide the foundation for object-oriented programming.
The primary benefit of an inheritance hierarchy is that we can program
to the public interface of the abstract base class rather than to the
individual types that form its inheritance hierarchy, in this way
shielding our code from changes in that hierarchy. We define eval(),
for example, as a public virtual function of the abstract Query base
class. By writing code such as
_rop->eval();
user code is shielded from the variety and volatility of our query language. This not only allows for the addition, revision,
or removal of types without requiring changes to user programs, but
frees the provider of a new query type from having to recode behavior
or actions common to all types in the hierarchy itself. This is
supported by two special characteristics of inheritance: polymorphism
and dynamic binding. When we speak of polymorphism within C++, we
primarily mean the ability of a pointer or a reference of a base class
to address any of its derived classes. For example, if we define a
nonmember function eval() as follows,
// pquery can address any of the classes derived from Query
void eval( const Query *pquery ) { pquery->eval(); }
we can invoke it legally, passing in the address of an object of any of the
four query types:
int main()
{
AndQuery aq;
NotQuery notq;
OrQuery *oq = new OrQuery;
NameQuery nq( "Botticelli" );
// ok: each is derived from Query
// compiler converts to base class automatically
eval( &aq );
eval( &notq );
eval( oq );
eval( &nq );
}
whereas an attempt to invoke eval() with the address of an object not derived from Query
results in a compile-time error:
int main()
{
string name( "Scooby-Doo" );
// error: string is not derived from Query
eval( &name );
}
Within eval(), the execution of pquery->eval(); must invoke the
appropriate eval() virtual member function based on the actual class
object pquery addresses. In the previous example, pquery in turn
addresses an AndQuery object, a NotQuery object, an OrQuery object,
and a NameQuery object. At each invocation point during the execution
of our program, the actual class type addressed by pquery is
determined, and the appropriate eval() instance is called. Dynamic
binding is the mechanism through which this is accomplished.
In the object-oriented paradigm, the programmer manipulates an unknown instance of a bound but infinite set of types. (The set of
types is bound by its inheritance hierarchy. In theory, however, there
is no limit to the depth and breadth of that hierarchy.) In C++ this
is achieved through the manipulation of objects through base class
pointers and references only. In the object-based paradigm, the
programmer
manipulates an instance of a fixed, singular type that is completely defined at the point of compilation. Although the
polymorphic manipulation of an object requires that the object be
accessed either through a pointer or a reference, the manipulation of
a pointer or a reference in C++ does not in itself necessarily result
in polymorphism. For example, consider
// no polymorphism
int *pi;
// no language-supported polymorphism
void *pvi;
// ok: pquery may address any Query derivation
Query *pquery;
In C++, polymorphism
exists only within individual class hierarchies. Pointers of type
void* can be described as polymorphic, but they are without explicit
language support — that is, they must be managed by the programmer
through explicit casts and some form of discriminant that keeps track
of the actual type being addressed.
You seem to have asked two questions (in the title and in the end):
Why use base class pointers for derived classes?
This is the very use of polymorphism. It allows you to treat objects uniformly while allowing you to have specific implementation. If this bothers you, then I assume you should ask: Why polymorphism?
Why use virtual destructors if we can avoid base class pointers?
The problem here is you cannot always avoid base class pointers to exploit the strength of polymorphism.
My first foray into C++ is building an audio synthesis library (EZPlug).
The library is designed to make it easy to set up a graph of interconnected audio generator and processor objects. We can call them EZPlugGenerators.
All of the processor units can accept one or more EZPlugGenerators as inputs.
It's important to me that all configuration methods on these EZPlugGenerators are chainable. In other words, the methods used in setting up the synthesis graph should always return a pointer to the parent object. That allows me to use a syntax which very nicely shows the nested nature of the object relationships, like this:
mixer.addGenerator(
a(new Panner())
->setVolume(0.1)
->setSource(
a(new TriggererPeriodic())
->setFrequency(
v(new FixedValue(1), "envTriggerFreq")
)
->setTriggerable(
a(new Enveloper())
->setAllTimes(v(0.0001), v(0.05), v(0.0f, "envSustain"), v(0.01))
->setAudioSource(
a(new SineWaveMod())
->setFrequency(
a(new Adder())
->addGenerator(a(new Adder()))
->addGenerator(v(5000, "sineFreq"))
->addGenerator(
a(new Multiplier())
->addVal(v("sineFreq"))
->addVal(
a(new TriggererPeriodic())
->setFrequency(v("envTriggerFreq"))
->setTriggerable(
a(new Enveloper())
->setAllTimes(0.1, 0.1, 0, 0.0001)
->setAudioSource(v(1, "envAmount"))
)
)
)
)
)
)
)
);
The "a" and "v" functions in the above code store and return references to objects and handle retrieving them and destroying them.
I suspect my approach to C++ looks a little weird, but I'm finding that the language can actually accommodate the way I want to program fairly well.
Now to my question
I'd like to create a common superclass for all EZPlugGenerators which can accept inputs to inherit from. This superclass would have a method, "addInput", which would be overridden by each subclass. The problem comes from the fact that I want "addInput" to return a pointer to an instance of the subclass, not the superclass.
This isn't acceptable:
EZPlugProcessor* addInput(EZPlugGenerator* generator)
because that returns a pointer to an instance of the superclass, not the subclass, destroying the chainability that I'm so happy with.
I tried this:
template<typename T> virtual T* addInput(EZPlugGenerator* obj){
but the compiler tells me I can't create a virtual template function.
I don't HAVE to use inheritance here. I can implement 'addInput' on every single EZPlugGenerator that can take an input. It just seems like gathering all of them under a single parent class will help make it clear that they all have something in common, and will help enforce the fact that addInput is the proper way to plug one object into another.
So, is there a way I can use inheritance to dictate that every member of a group of classes must implement an 'addInput' method, while allowing that method to return a pointer to an instance of the child class?
Virtual functions in C++ can have covariant return types, which means that you can define
virtual EZPlugProcessor *addInput(EZPlugGenerator* generator) = 0;
in the base class, and then
struct MyProcessor : EZPlugProcessor {
virtual MyProcessor *addInput(EZPlugGenerator* generator) {
...
return this;
}
};
As long as the caller knows (by the type they're using) that the object is a MyProcessor, they can chain addInput together with other functions specific to MyProcessor.
If your inheritance hierarchy has more levels, then unfortunately you'll sometimes find yourself writing:
struct MySpecificProcessor : MyProcessor {
virtual MySpecificProcessor *addInput(EZPlugGenerator* generator) {
return static_cast<MySpecificProcessor*>(MyProcessor::addInput(generator));
}
};
because there's no way to specify in EZPlugProcessor that the return type of addInput is "pointer to the most-derived type of the object". Each derived class has to "activate" the covariance for itself.
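A compact sketch of how the covariant return type keeps a chain going. The Mixer class, its setVolume() method and its members are hypothetical; only EZPlugGenerator, EZPlugProcessor and addInput come from the question.

```cpp
#include <cassert>
#include <vector>

struct EZPlugGenerator {
    virtual ~EZPlugGenerator() = default;
};

struct EZPlugProcessor : EZPlugGenerator {
    // Every processor must provide addInput.
    virtual EZPlugProcessor* addInput(EZPlugGenerator* generator) = 0;
};

// Hypothetical concrete processor, used only for illustration.
struct Mixer : EZPlugProcessor {
    // Covariant return type: Mixer* instead of EZPlugProcessor*.
    Mixer* addInput(EZPlugGenerator* generator) override {
        assert(generator != nullptr);
        inputs.push_back(generator);
        return this;
    }
    // A method that exists only on Mixer.
    Mixer* setVolume(double v) {
        volume = v;
        return this;
    }

    std::vector<EZPlugGenerator*> inputs;
    double volume = 1.0;
};
```

Because addInput called on a Mixer yields a Mixer*, a chain such as m.addInput(&a)->setVolume(0.5)->addInput(&b) compiles, while a call through an EZPlugProcessor* still returns the base pointer type.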
Yes, C++ already provides for covariant return types.
class Base
{
public:
virtual Base* add() { return <some base ptr>; }
};
class Child : public Base
{
public:
virtual Child* add() { return <some child ptr>; }
};
On the other hand no one will ever be able to read your code so you might want to consider if there's an alternate way to set up the configuration than writing LISP chaining in C++.
I'm experiencing a challenging problem, which has not been solvable - hopefully until now. I'm developing my own framework and therefore trying to offer the user flexibility with all the code complexity under the hood.
First of all I have an abstract base class which users can implement, obviously simplified:
class IStateTransit
{
public:
    virtual bool ConnectionPossible(void) = 0;
};
// A user defines their own class like so
class MyStateTransit : public IStateTransit
{
public:
    bool ConnectionPossible(void){ return true; }
};
Next, I define a factory class. Users can register their own custom state transit objects and refer to them later by simply using a string identifier they have chosen:
class TransitFactory : public Singleton<TransitFactory>
{
public:
    template<typename T> void RegisterStateTransit(const string& name)
    {
        // If the transit type is not already registered, add it.
        if(transits.find(name) == transits.end())
        {
            transits.insert(pair<string, IStateTransit*>(name, new T()));
        }
    }
    IStateTransit* GetStateTransit(const string& type) const
    {
        // Returns the stored object itself, so every call yields the same pointer.
        return transits.find(type)->second;
    }
private:
    map<string, IStateTransit*> transits;
};
Now the problem is (probably obviously) that whenever a user requests a transit by calling GetStateTransit the system currently keeps returning the same object - a pointer to the same object that is. I want to change this.
PROBLEM: How can I return a new clone of the original IStateTransit object without the user having to define their own copy constructor or virtual constructor? Ideally I would somehow like the GetStateTransit method to be able to cast the IStateTransit object down to the derived type it is at runtime and return a clone of that instance. The biggest hurdle is that I do not want the user to have to implement any extra (and probably complex) methods.
4 hours of Googling and trying has led me nowhere. The one who has the answer is a hero!
The problem is that you don't have the type information to perform the clone as you only have a pointer to base class type and no knowledge as to what derived types have been implemented and are available.
I think there's a reason that 4 hours of googling haven't turned anything up. If you want IStateTransit to be cloneable you have to have an interface where the derived class implementer provides some sort of clone method implementation.
I'm sorry if this isn't what you wanted to hear.
However, implementing a clone method shouldn't be a big burden. Only the class implementor knows how a class can be copied. Given a correct copy constructor, clone can be implemented for a leaf-node class like this:
Base* clone() const
{
return new MyType(*this);
}
You could even macro-alize it; although I wouldn't.
If I understand the problem correctly, you shouldn't insert new T objects into the map, but rather objects that create new T objects.
struct ICreateTransit
{
virtual ~ICreateTransit() {}
virtual IStateTransit* create() const = 0;
};
template <class T>
struct CreateTransit: public ICreateTransit
{
virtual IStateTransit* create() const { return new T(); }
};
And now insert:
transits.insert(pair<string, ICreateTransit*>(name, new CreateTransit<T>()));
And retrieve "copies" with:
return transits.find(type)->second->create(); //hopefully with error handling
It shouldn't be hard to modify CreateTransit<T> so that it holds a T from which to make copies, should the default-constructed one not do.
I think the general name for techniques like this is called "type erasure" (derived types "remember" particular types, although the base class is unaware of those).
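The same idea can be sketched with std::function standing in for the hand-written ICreateTransit hierarchy. This is a swapped-in variant, not the code above: storing a callable per name, returning std::unique_ptr, returning nullptr for unknown names and dropping the Singleton base are all assumptions of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

class IStateTransit {
public:
    virtual ~IStateTransit() = default;
    virtual bool ConnectionPossible() = 0;
};

// What a user writes: no clone method, just a default constructor.
class MyStateTransit : public IStateTransit {
public:
    bool ConnectionPossible() override { return true; }
};

class TransitFactory {
public:
    template <typename T>
    void RegisterStateTransit(const std::string& name) {
        assert(!name.empty());
        // The lambda remembers T; emplace leaves an existing entry untouched.
        creators.emplace(name, [] {
            return std::unique_ptr<IStateTransit>(new T());
        });
    }

    // Builds a new object on every call, or returns nullptr if the
    // name was never registered.
    std::unique_ptr<IStateTransit> GetStateTransit(const std::string& type) const {
        auto it = creators.find(type);
        if (it == creators.end()) return nullptr;
        return it->second();
    }

private:
    std::map<std::string, std::function<std::unique_ptr<IStateTransit>()>> creators;
};
```

Each call to GetStateTransit now yields a distinct object, and the user's class needs nothing beyond a default constructor.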
To me this sounds like a problem where the abstract factory pattern might help. Using this pattern, the library's client can define how your framework builds its types. The client can inject their own subclass of the factory into the framework and define there which types should be built.
What you need (additionally) is:
A base class for the factory
As a client: Derive a concrete factory
A way to inject (as a client) a subtype of the factory into the framework
Call the factory methods to create new objects.
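The four points above might translate into something like the following sketch. ITransitFactory, MyTransitFactory, Framework and their member names are hypothetical; constructor injection is just one possible way to hand the factory to the framework.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class IStateTransit {
public:
    virtual ~IStateTransit() = default;
    virtual bool ConnectionPossible() = 0;
};

// 1. Base class for the factory, defined by the framework.
class ITransitFactory {
public:
    virtual ~ITransitFactory() = default;
    virtual std::unique_ptr<IStateTransit> CreateTransit() const = 0;
};

// 2. Client side: a concrete transit and a concrete factory for it.
class MyStateTransit : public IStateTransit {
public:
    bool ConnectionPossible() override { return true; }
};

class MyTransitFactory : public ITransitFactory {
public:
    std::unique_ptr<IStateTransit> CreateTransit() const override {
        return std::make_unique<MyStateTransit>();
    }
};

// 3. The framework receives the client's factory by injection.
class Framework {
public:
    explicit Framework(std::unique_ptr<ITransitFactory> factory)
        : factory_(std::move(factory)) {
        assert(factory_ != nullptr);
    }

    // 4. The framework calls the factory whenever it needs a new object.
    std::unique_ptr<IStateTransit> NewTransit() const {
        return factory_->CreateTransit();
    }

private:
    std::unique_ptr<ITransitFactory> factory_;
};
```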
Does this help you?