Polymorphic member variable(s) - class design - C++

Wondering whether anyone can help identify a more elegant design approach - or point out shortcomings of the following design.
Currently, I have an abstract Response class that derives from a serializable JSON Object.
//objects.h
struct Object
{
    virtual ~Object() = default;
    [[nodiscard]] std::string serialize() const;
    virtual void deserialize(const Poco::JSON::Object::Ptr &payload) = 0;
    [[nodiscard]] virtual Poco::JSON::Object::Ptr to_json() const = 0;
};
// response.h
class Response : public Object
{
public:
    std::unique_ptr<Data> data;
    std::unique_ptr<Links> links;
};
Here, both the Data and Links member variables are abstract base classes, whose respective subclasses contain various STL containers.
Now the problem I'm facing is one of class design: how to avoid downcasting each member variable depending on the derived Response (and how to arrive at a cleaner hierarchy/design). For instance...
ResponseConcreteA response_a;
response_a.deserialize(object_a);
auto data_a = static_cast<DataConcreteA *>(response_a.data.get());
ResponseConcreteB response_b;
response_b.deserialize(object_b);
auto data_b = static_cast<DataConcreteB *>(response_b.data.get());
The seemingly obvious solution is to abandon the polymorphic member variables and substitute them with the respective concrete types. However, my concern is that this deviates from the inherent relationship of a Response having Data & Links members which are each a particular polymorphic type.
One important thing to note is that the concrete types attributed to Data & Links are determined at compile time - there is no necessity for the derived classes to change at any point. Their construction is governed by the following preprocessor macro:
#define DECLARE_RESPONSE_TYPE(type_name, data_name, links_name) \
struct type_name final : public Response \
{ \
    type_name() \
    { \
        data.reset(new data_name()); \
        links.reset(new links_name()); \
    } \
    ~type_name() = default; \
    void deserialize(const Poco::JSON::Object::Ptr &payload) override; \
    Poco::JSON::Object::Ptr to_json() const override; \
};
Is there a more appropriate approach that avoids these polymorphic member variables and the constant downcasting they require (given that the derived object pointed to is known at compile time)? Thanks!

(I’m adapting one of my recent Reddit comments that answered basically the same question.)
in general
Don’t model serialization with inheritance! It’s a cross-cutting concern you want to attach to arbitrary types. Inheritance is the wrong tool for that. Some problems with the approach:
You force everything serializable to become a full-fledged polymorphic class with all the related overhead.
You need control over the type you want to serialize, which means you cannot use 3rd party types without wrapping them.
Because serialization is cross-cutting you’ll likely run into the inheritance diamond problem at some point.
Fundamental types cannot be serialized that way. You cannot make int derive from Serializable.
Pattern matching is a more flexible approach. In a nutshell, you template your serialization framework and depend on certain functions being available for serializable types. Quick, dirty and naive example:
struct Something {
    // ...
};
// If these two functions exist, a type has serialization support
void serialize(const Something&, SerializedDataStream&);
Something deserialize(SerializedDataStream&);
Now you can make anything serializable without touching the type at all. That’s vastly more flexible than inheritance, but probably makes the serialization framework somewhat trickier to implement. Additionally supporting (de)serialization member functions is a good idea for more complex types that need access to their private data to (de)serialize properly.
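To illustrate that last point, here is a minimal, hedged sketch of how a framework could accept either a member function or a free function - JsonValue, the trait, and the function names are all made up for illustration:
#include <type_traits>
#include <utility>

struct JsonValue {}; // stand-in for whatever the framework produces

// Detect at compile time whether T provides a member to_json().
template <typename T, typename = void>
struct has_member_to_json : std::false_type {};

template <typename T>
struct has_member_to_json<T, std::void_t<decltype(std::declval<const T&>().to_json())>>
    : std::true_type {};

template <typename T>
JsonValue to_json_dispatch(const T& value)
{
    if constexpr (has_member_to_json<T>::value)
        return value.to_json();   // complex type serializes itself
    else
        return to_json(value);    // free overload, found at the call site
}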
Have a look at Boost Serialization or Cereal for real world examples of the pattern matching approach.
in your particular situation
To serialize larger nested structures you split up the serialization functionality. Each type has to know how to serialize itself, but that’s as far as it goes. Serializing a complex member is delegated to that member, because it too has to know how to serialize itself. That way you build the final JSON step by step.
One important thing to note is that the concrete types attributed to Data & Links are determined at compile time
The obvious solution is to turn Response into a template.
template <typename Data, typename Links>
class Response // note: no more base class
{
public:
    Data data;
    Links links;
};
// externalized serialization functions
template <typename Data, typename Links>
void serialize(const Response<Data, Links>&, JSONDataStream&);
template <typename Data, typename Links>
Response<Data, Links> deserialize(JSONDataStream&);
That way you have the correct types available to find the correct overloads of the serialization functions for Data and Links, and delegation boils down to simply calling them. Whether that approach is feasible depends on the larger context. Retrofitting a template into a project that relies on polymorphism can lead to ripple effects throughout the whole code base. In other words, it can be a really expensive change.
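For completeness, a hedged sketch of what that delegation could look like with Poco, assuming free to_json overloads exist for the concrete Data and Links types (the key names and the function shape are illustrative):
#include <Poco/JSON/Object.h>

template <typename Data, typename Links>
Poco::JSON::Object::Ptr to_json(const Response<Data, Links>& response)
{
    Poco::JSON::Object::Ptr root = new Poco::JSON::Object;
    root->set("data", to_json(response.data));   // resolves to the overload for the concrete Data
    root->set("links", to_json(response.links)); // resolves to the overload for the concrete Links
    return root;
}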
The alternative is similar to what you're already doing. Response itself still uses the pattern-matching approach to serialization, but you keep the polymorphism for Data and Links, including the overridden virtual serialization functions. In each concrete derived type we're back to the original idea of "each type knows how to serialize itself". If the concrete Data and Links classes need to be serialized in other contexts (not as members of Response) too, implement the pattern-matching functions for them and call those from the overridden member functions. Otherwise serialization can happen directly in those member functions.
class Data
{
public:
    virtual ~Data() = default;
    virtual void deserialize(const Poco::JSON::Object::Ptr &payload) = 0;
    virtual Poco::JSON::Object::Ptr to_json() const = 0;
    //...
};

class ConcreteData : public Data
{
public:
    ~ConcreteData() override = default;
    void deserialize(const Poco::JSON::Object::Ptr &payload) override
    {
        // ...
    }
    Poco::JSON::Object::Ptr to_json() const override
    {
        // ...
    }
};
// ------
Poco::JSON::Object::Ptr Response::to_json() const
{
    // ...
    auto serializedData = data->to_json();
    // ...
}

Related

How to provide a void* accessor for a templated type?

I have a custom container class that is templated:
template<typename T>
class MyContainer {
    T Get();
    void Put(T data);
};
I would like to pass a pointer to this container to a function that will access the container's data as generic data - i.e. char* or void*. Think serialization. This function is somewhat complicated so it would be nice to not specify it in the header due to the templates.
// Errors of course, no template argument
void DoSomething(MyContainer *container);
I'm ok with requiring users to provide a lambda or subclass or something that performs the conversion. But I can't seem to come up with a clean way of doing this.
I considered avoiding templates altogether by making MyContainer hold a container of some abstract MyData class that has a virtual void Serialize(void *dest) = 0; function. Users would subclass MyData to provide their types and serialization but that seems like it's getting pretty complicated. Also inefficient since it requires storing pointers to MyData to avoid object slicing and MyData is typically pretty small and the container will hold large amounts (a lot of pointer storage and dereferencing).
You don't need any char* or void* or inheritance.
Consider this simplified implementation:
template <class T>
void Serialize (std::ostream& os, const MyContainer<T>& ct) {
    os << ct.Get();
}
Suddenly this works for any T that has a suitable operator<< overload.
What about user types that don't have a suitable operator<< overload? Just tell the users to provide one.
Of course you can use any overloaded function. It doesn't have to be named operator<<. You just need to communicate its name and signature to the users and ask them to overload it.
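For example, a user-defined type might pick up support like this (Sensor is just an illustrative stand-in):
#include <iostream>

struct Sensor { int id; double reading; };

// The user supplies the overload; nothing in MyContainer changes.
std::ostream& operator<<(std::ostream& os, const Sensor& s)
{
    return os << s.id << ' ' << s.reading;
}

// Now Serialize(std::cout, someContainerOfSensors) resolves to this overload.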
You can introduce a non-template base class for the container with a pure virtual function that returns a pointer to raw data and implement it in your container:
class IDataHolder
{
public:
    virtual ~IDataHolder(); // or make the destructor protected to forbid deleting through a pointer to the base class
    virtual const unsigned char* GetData() const = 0;
};

template<typename T>
class MyContainer : public IDataHolder
{
public:
    T Get();
    void Put(T data);
    const unsigned char* GetData() const override { /* cast the internal data to a byte pointer here */ }
};

void Serialize(IDataHolder& container)
{
    const auto* data = container.GetData();
    // do the serialization
}
I would like to pass a pointer to this container to a function that will access the container's data as generic data - i.e. char* or void*. Think serialization.
Can't be done in general, because you don't know anything about T. In general, types cannot be handled (e.g. copied, accessed, etc.) as raw blobs through a char * or similar.
Therefore, you would need to restrict what T can be, ideally enforcing it, otherwise never using it for Ts that would trigger undefined behavior. For instance, you may want to assert that std::is_trivially_copyable_v<T> holds. Still, you will have to consider other possible issues when handling data like that, like endianness and packing.
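A minimal sketch of that kind of guard, with an illustrative ToBytes helper (not a complete serializer):
#include <cstring>
#include <type_traits>
#include <vector>

template <typename T>
std::vector<unsigned char> ToBytes(const T& value)
{
    static_assert(std::is_trivially_copyable_v<T>,
                  "raw-byte serialization requires a trivially copyable type");
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T)); // endianness and padding are still the caller's problem
    return bytes;
}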
This function is somewhat complicated so it would be nice to not specify it in the header due to the templates.
Not sure what you mean by this. Compilers handle headers, and in particular large amounts of template code, very easily. As long as you don't reach the levels of e.g. some Boost libraries, your compile times won't explode.
I considered avoiding templates altogether by making MyContainer hold a container of some abstract MyData class that has a virtual void Serialize(void *dest) = 0; function. Users would subclass MyData to provide their types and serialization but that seems like it's getting pretty complicated. Also inefficient since it requires storing pointers to MyData to avoid object slicing and MyData is typically pretty small and the container will hold large amounts (a lot of pointer storage and dereferencing).
In general, if you want a template, write a template. Using dynamic dispatch for this will probably kill performance, especially if you have to go through dispatches even for simple types.
As a final point, I would suggest taking a look at some available serialization libraries to see how they achieved it, not just in terms of performance, but also in terms of ease of use, integration with existing code, etc. For instance, Boost Serialization and Google Protocol Buffers.

Simplify an extensible "Perform Operation X on Data Y" framework

tl;dr
My goal is to conditionally provide implementations for abstract virtual methods in an intermediate workhorse template class (depending on template parameters), but to leave them abstract otherwise so that classes derived from the template are reminded by the compiler to implement them if necessary.
I am also grateful for pointers towards better solutions in general.
Long version
I am working on an extensible framework to perform "operations" on "data". One main goal is to allow XML configs to determine program flow, and allow users to extend both allowed data types and operations at a later date, without having to modify framework code.
If either one (operations or data types) is kept fixed architecturally, there are good patterns to deal with the problem. If allowed operations are known ahead of time, use abstract virtual functions in your data types (new data have to implement all required functionality to be usable). If data types are known ahead of time, use the Visitor pattern (where the operation has to define virtual calls for all data types).
Now if both are meant to be extensible, I could not find a well-established solution.
My solution is to declare them independently from one another and then register "operation X for data type Y" via an operation factory. That way, users can add new data types, or implement additional or alternative operations and they can be produced and configured using the same XML framework.
If you create a matrix of (all data types) x (all operations), you end up with a lot of classes. Hence, they should be as minimal as possible, and eliminate trivial boilerplate code as far as possible, and this is where I could use some inspiration and help.
There are many operations that will often be trivial, but might not be in specific cases, such as Clone() and some more (omitted here for "brevity"). My goal is to conditionally provide implementations for abstract virtual methods if appropriate, but to leave them abstract otherwise.
Some solutions I considered
As in example below: provide default implementation for trivial operations. Consequence: Nontrivial operations need to remember to override with their own methods. Can lead to run-time problems if some future developer forgets to do that.
Do NOT provide defaults. Consequence: Nontrivial functions need to be basically copy & pasted for every final derived class. Lots of useless copy&paste code.
Provide an additional template class derived from cOperation base class that implements the boilerplate functions and nothing else (template parameters similar to specific operation workhorse templates). Derived final classes inherit from their concrete operation base class and that template. Consequence: both concreteOperationBase and boilerplateTemplate need to inherit virtually from cOperation. Potentially some run-time overhead, from what I found on SO. Future developers need to let their operations inherit virtually from cOperation.
std::enable_if magic. Didn't get the combination of virtual functions and templates to work.
Here is a (fairly) minimal compilable example of the situation:
//Base class for all operations on all data types. Will be inherited from. A lot. Base class does not define any concrete operation interface, nor does it necessarily know any concrete data types it might be performed on.
class cOperation
{
public:
    virtual ~cOperation() {}
    virtual std::unique_ptr<cOperation> Clone() const = 0;
    virtual bool Serialize() const = 0;
    //... more virtual calls that can be either trivial or quite involved ...
protected:
    cOperation(const std::string& strOperationID, const std::string& strOperatesOnType)
        : m_strOperationID(strOperationID)
        , m_strOperatesOnType(strOperatesOnType)
    {
        //empty
    }
private:
    std::string m_strOperationID;
    std::string m_strOperatesOnType;
};
//Base class for all data types. Will be inherited from. A lot. Does not know any operations that might be performed on it.
struct cDataTypeBase
{
    virtual ~cDataTypeBase() {}
};
Now, I'll define an example data type.
//Some concrete data type. Still does not know any operations that might be performed on it.
struct cDataTypeA : public cDataTypeBase
{
    static const std::string& GetDataName()
    {
        static const std::string strMyName = "cDataTypeA";
        return strMyName;
    }
};
And here is an example operation. It defines a concrete operation interface, but does not know the data types it might be performed on.
//Some concrete operation. Does not know all data types it might be expected to work on.
class cConcreteOperationX : public cOperation
{
public:
    virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) = 0;
protected:
    cConcreteOperationX(const std::string& strOperatesOnType)
        : cOperation("concreteOperationX", strOperatesOnType)
    {
        //empty
    }
};
The following template is meant to be the boilerplate workhorse. It implements as much trivial and repetitive code as possible and is provided alongside the concrete operation base class - concrete data types are still unknown, but are meant to be provided as template parameters.
//ConcreteOperationTemplate: absorb as much common/trivial code as possible, so concrete derived classes can have minimal code for easy addition of more supported data types
template <typename ConcreteDataType, typename DerivedOperationType, bool bHasTrivialCloneAndSerialize = false>
class cConcreteOperationXTemplate : public cConcreteOperationX
{
public:
    //Can perform datatype cast here:
    virtual bool doSomeConcreteOperationX(const cDataTypeBase& dataBase) override
    {
        const ConcreteDataType* pCastData = dynamic_cast<const ConcreteDataType*>(&dataBase);
        if (pCastData == nullptr)
        {
            return false;
        }
        return doSomeConcreteOperationXOnCastData(*pCastData);
    }
protected:
    cConcreteOperationXTemplate()
        : cConcreteOperationX(ConcreteDataType::GetDataName()) //requires ConcreteDataType to have a static method returning something appropriate
    {
        //empty
    }
private:
    //Clone can be implemented here via CRTP
    virtual std::unique_ptr<cOperation> Clone() const override
    {
        return std::unique_ptr<cOperation>(new DerivedOperationType(*static_cast<const DerivedOperationType*>(this)));
    }
    //TODO: Some Magic here to enable trivial serializations, but leave non-trivials abstract
    //Problem with current code is that virtual bool Serialize() override will also be overwritten for bHasTrivialCloneAndSerialize == false
    virtual bool Serialize() const override
    {
        return true;
    }
    virtual bool doSomeConcreteOperationXOnCastData(const ConcreteDataType& castData) = 0;
};
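An aside on that TODO: one hedged way to provide the trivial Serialize() only when asked for is to route it through a thin layer specialized on the flag, so that non-trivial operations still inherit the pure virtual function and get a compiler reminder. A minimal, self-contained sketch with illustrative names (in the real code the layer would sit between cConcreteOperationXTemplate and cConcreteOperationX, selected by bHasTrivialCloneAndSerialize):
struct cOperationBase
{
    virtual ~cOperationBase() = default;
    virtual bool Serialize() const = 0;
};

template <bool bTrivialSerialize>
struct cSerializeLayer : cOperationBase {};   // false: Serialize() stays pure virtual

template <>
struct cSerializeLayer<true> : cOperationBase
{
    bool Serialize() const override { return true; }   // trivial default
};

struct cTrivialOp final : cSerializeLayer<true> {};     // compiles as-is
struct cSpecialOp final : cSerializeLayer<false>
{
    bool Serialize() const override { return true; /* real work here */ }
};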
Here are two implementations of the example operation on the example data type. One of them will be registered as the default operation, to be used if the user does not declare anything else in the config, and the other is a potentially much more involved non-default operation that might take many additional parameters into account (these would then have to be serialized in order to be correctly re-instantiated on the next program run). These operations need to know both the operation and the data type they relate to, but could potentially be implemented at a much later time, or in a different software component where the specific combination of operation and data type are required.
//Implementation of operation X on type A. Needs to know both of these, but can be implemented if and when required.
class cConcreteOperationXOnTypeADefault : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeADefault, true>
{
    virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
    {
        //...do stuff...
        return true;
    }
};

//Different implementation of operation X on type A.
class cConcreteOperationXOnTypeASpecialSauce : public cConcreteOperationXTemplate<cDataTypeA, cConcreteOperationXOnTypeASpecialSauce/*, false*/>
{
    virtual bool doSomeConcreteOperationXOnCastData(const cDataTypeA& castData) override
    {
        //...do stuff...
        return true;
    }
    //Problem: Compiler does not remind me that cConcreteOperationXOnTypeASpecialSauce might need to implement this method
    //virtual bool Serialize() override {}
};
int main(int argc, char* argv[])
{
    std::map<std::string, std::map<std::string, std::unique_ptr<cOperation>>> mapOpIDAndDataTypeToOperation;
    //...fill map, e.g. via XML config / factory method...
    const cOperation& requestedOperation = *mapOpIDAndDataTypeToOperation.at("concreteOperationX").at("cDataTypeA");
    //...do stuff...
    return 0;
}
If your data types are not used polymorphically (i.e. for each operation call you know both the operation type and the data type at compile time), you may consider the following approach:
#include <iostream>
#include <string>

template<class T>
void empty(T t){
    std::cout << "warning about missing implementation" << std::endl;
}

template<class T>
void simple_plus(T){
    std::cout << "simple plus" << std::endl;
}

void plus_string(std::string){
    std::cout << "plus string" << std::endl;
}

template<class Data, void Implementation(Data)>
class Operation{
public:
    static void exec(Data d){
        Implementation(d);
    }
};

#define macro_def(OperationName) template<class T> class OperationName : public Operation<T, empty<T>>{};
#define macro_template_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName<TypeName>>{};
#define macro_inst( TypeName, OperationName, ImplementationName ) template<> class OperationName<TypeName> : public Operation<TypeName, ImplementationName>{};

// this part may be generated on base of .xml file and put into .h file, and then just #include generated.h
macro_def(Plus)
macro_template_inst(int, Plus, simple_plus)
macro_template_inst(double, Plus, simple_plus)
macro_inst(std::string, Plus, plus_string)

int main() {
    Plus<int>::exec(2);
    Plus<double>::exec(2.5);
    Plus<float>::exec(2.5);
    Plus<std::string>::exec("abc");
    return 0;
}
A downside of this approach is that you have to compile the project in two steps: 1) transform the .xml into a .h file, 2) compile the project using the generated .h file. On the plus side, the compiler/IDE (I use Qt Creator with MinGW) gives a warning about the unused parameter t in the function
void empty(T t)
together with a trace showing where it was called from.

C++ wrap return type

I am wrapping a library which I did not write to make it more user friendly. There are a huge number of functions which are very basic so it's not ideal to have to wrap all of these when all that is really required is type conversion of the results.
A contrived example:
Say the library has a class QueryService; among others, it has this method:
WeirdInt getId() const;
I'd like a standard int in my interface; however, I can get an int out of WeirdInt no problem, as I know how to do this. In this case let's say that WeirdInt has:
int getValue() const;
This is a very simple example, often the type conversion is more complicated and not always just a call to getValue().
There are literally hundreds of function calls that return types like these, and more are added all the time, so I'd like to reduce the burden of having to add a bajillion methods every time the library does, just to turn WeirdType into type.
I want to end up with a QueryServiceWrapper which has all the same functionality as QueryService, but where I've converted the types. Am I going to have to write an identically named method to wrap every method in QueryService? Or is there some magic I'm missing? There is a bit more to it as well, but that's not relevant to this question.
Thanks
The first approach I'd try is with templates, such that:
you provide a standard implementation for all the wrapper types which have a trivial getValue() method
you specialize the template for all the others
Something like:
class WeirdInt
{
    int v;
public:
    WeirdInt(int v) : v(v) { }
    int getValue() { return v; }
};

class ComplexInt
{
    int v;
public:
    ComplexInt(int v) : v(v) { }
    int getValue() { return v; }
};

template<typename A, typename B>
A wrap(B type)
{
    return type.getValue();
}

template<>
int wrap(ComplexInt type)
{
    int v = type.getValue();
    return v*2;
}

int x = wrap<int, WeirdInt>(WeirdInt(5));
int y = wrap<int, ComplexInt>(ComplexInt(10));
If the wrapper methods for QueryService have a simple pattern, you could also think of generating QueryServiceWrapper with some perl or python script, using some heuristics. Then you need to define some input parameters at most.
Even defining some macros would help in writing this wrapper class.
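A hedged sketch of that macro idea, with stand-in definitions for the library types and a wrap<> conversion similar to the one suggested above (the real QueryService and WeirdInt come from the wrapped library):
struct WeirdInt { int v; int getValue() const { return v; } };
struct QueryService { WeirdInt getId() const { return {42}; } };

template <typename To, typename From>
To wrap(From x) { return x.getValue(); }

// Stamp out one forwarding wrapper method per line.
#define WRAP_METHOD(ReturnType, name) \
    ReturnType name() const { return wrap<ReturnType>(svc_.name()); }

class QueryServiceWrapper
{
public:
    WRAP_METHOD(int, getId)   // expands to: int getId() const { return wrap<int>(svc_.getId()); }
private:
    QueryService svc_;
};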
Briefly, if your aim is to encapsulate the functionality completely, so that WeirdInt and QueryService are not exposed to the 'client' code and you don't need to include any headers which declare them in the client code, then I doubt the approach you take will be able to benefit from much magic.
When I've done this before, my first step has been to use the pimpl idiom so that your header contains no implementation details as follows:
QueryServiceWrapper.h
class QueryServiceWrapperImpl;

class QueryServiceWrapper
{
public:
    QueryServiceWrapper();
    virtual ~QueryServiceWrapper();
    int getId();
private:
    QueryServiceWrapperImpl* impl_;
};
and then in the definition, you can put the implementation details, safe in the knowledge that it will not leach out to any downstream code:
QueryServiceWrapper.cpp
struct QueryServiceWrapperImpl
{
public:
    QueryService svc_;
};

// ...

int QueryServiceWrapper::getId()
{
    return impl_->svc_.getId().getValue();
}
Without knowing what different methods need to be employed to do the conversion, it's difficult to add much more here, but you could certainly use template functions to do the conversion of the most popular types.
The downside here is that you'd have to implement everything yourself. This could be a double-edged sword, as it's then possible to implement only the functionality that you really need. There's generally no point in wrapping functionality that is never used.
I don't know of a 'silver bullet' that will implement the functions - or even empty wrappers for the functions. I've normally done this with a combination of shell scripts to either create the empty classes that I want, or take a copy of the header and use text manipulation with sed or Perl to change the original types to the new types for the wrapper class.
It's tempting in these cases to use public inheritance to enable access to the base functions while allowing functions to be overridden. However, this is not applicable in your case as you want to change return types (not sufficient for an overload) and (presumably) you want to prevent exposure of the original Weird types.
The way forward here has to be to use aggregation, although in such a case there is no way you can easily avoid re-implementing (some of) the interfaces unless you are prepared to automate the creation of the class (using code generation) to some extent.
A more complex approach is to introduce a number of facade classes over the original QueryService, each of which has a limited set of functions for one particular query or query type. I don't know what your particular QueryService does, so here is an imaginary example:
Suppose the original class has a lot of weird methods working with strange types:
struct OriginQueryService
{
    WeirdType1 query_for_smth(...);
    WeirdType1 smth_related(...);
    WeirdType2 another_query(...);
    void smth_related_to_another_query(...);
    // and so on (a lot of other function-members)
};
then you may write some facade classes like this:
struct QueryFacade
{
    OriginQueryService& m_instance;
    QueryFacade(OriginQueryService* qs) : m_instance(*qs) {}
    // Wrap original query_for_smth(), possibly w/ changed types of
    // parameters (if you'd like to convert 'em from C++ native types to
    // some WeirdTypeX)...
    DesiredType1 query_for_smth(...);
    // more wrappers related to this particular query/task
    DesiredType1 smth_related(...);
};

struct AnotherQueryFacade
{
    OriginQueryService& m_instance;
    AnotherQueryFacade(OriginQueryService* qs) : m_instance(*qs) {}
    DesiredType2 another_query(...);
    void smth_related_to_another_query(...);
};
Every method delegates the call to m_instance and is decorated with input/output type conversions in whatever way you want. The type conversion can be implemented as @Jack describes in his post. Or you can provide a set of free functions in your namespace (like Desired fromWeird(const Weird&); and Weird toWeird(const Desired&);) which would be chosen by ADL, so if some new type arises, all you have to do is provide overloads of these two functions... such an approach works quite well in boost::serialization.
Also, you may provide a generic (template) version of these functions, which would call getValue() for example, in case a lot of your Weird types have such a member.
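A hedged sketch of that free-function conversion idea; the namespace, the types, and fromWeird are illustrative stand-ins, not the real library:
#include <string>

namespace weirdlib {
    struct WeirdInt  { int getValue() const { return 0; } };
    struct WeirdName { std::string text; };

    // Dedicated overload for a type that needs different handling; found by ADL.
    inline std::string fromWeird(const WeirdName& n) { return n.text; }
}

// Generic version: works for any Weird type exposing getValue().
template <typename Weird>
auto fromWeird(const Weird& w) -> decltype(w.getValue()) { return w.getValue(); }

int main()
{
    weirdlib::WeirdInt  wi;
    weirdlib::WeirdName wn{"abc"};
    int value        = fromWeird(wi); // generic template, uses getValue()
    std::string name = fromWeird(wn); // dedicated overload, picked via ADL
    (void)value; (void)name;
    return 0;
}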

Approaching serialization without common base class

I am looking for some design advices for the following problem:
I am using Boost Geometry; I have a couple of custom geometry types compatible with it (via traits), but most of the types I am using are typedefs.
class MyPoint
{
    // custom stuff
};
// declare traits for MyPoint for use with boost geometry here

class MyTaggedPoint : public MyPoint
{
    // more custom stuff
};
// declare traits for MyTaggedPoint for use with boost geometry here

// example typedefs
typedef boost::geometry::model::polygon<MyPoint> Polygon;
typedef boost::geometry::model::polygon<MyTaggedPoint> TaggedPolygon;
My problem is when I want to serialize/deserialize my geometries.
Let's say all geometries are stored in a binary field in a database. If I had a base geometry class, I would probably just write g->type() (4 bytes), call g->save(some_outputstream), and write all of that to the binary field. Then when reading the binary field I would simply read the bytes and cast to the appropriate geometry type.
But Boost geometries do not have a common base class.
How do you guys usually approach serialization when there are multiple types that can be stored as binary and you do not have a shared base class ?
I was thinking of maybe having a Serializer class that returns a boost::any, and then the geometry could be cast afterwards using the type stored by the (de)serializer? But then the serializer would need a save method for each geometry type, e.g. Save(myPolygon), Save(myPoint).
Any ideas/experiences?
Boost Serialization supports non-invasive serialization if you do not wish to reinvent the wheel. You may even be able to find library support for their geometry types somewhere. Unfortunately, the interface is somewhat complicated due to XML concerns.
To serialize objects to and from bytes, you ultimately need two functions for EACH type you have to support (primitives, objects, etc.). These are "Load()" and "Store()".
Ideally, you use a fixed interface for the bytes - an iostream, a char*, some buffer object, etc.
For the sake of readability let's call it "ByteBuffer", since logically that's what its role is.
We now have something like template functions for the Serializable concept:
template<typename T>
ByteBuffer Store(const T& object) { /* BUT, what goes here...? */ }

template<typename T>
T Load(const ByteBuffer& bytes);
Okay, this isn't going to work for anything other than the primitive types - even if we made these "visitors" or something, they would literally have to know every detail about the object's internals to do their job. Furthermore, "Load()" is logically a constructor (really, a FACTORY, since it could easily fail). We've got to associate these with the actual objects.
To make Serializable a base class, we need to use the "curiously recurring template" pattern. To do this, we require all derived classes to have a constructor of the form:
T(const ByteBuffer& bytes);
To check for errors, we can provide a protected flag "valid" in the base class that derived constructors can set. Note that your object has to support factory-style construction anyway for Load() to work well with it.
Now we can do this right, providing "Load" as a factory:
template<typename T>
class Serializable // If you do reference-counting on files & such, you can add it here
{
protected:
    bool valid;
    // Require derived to mark as valid upon load
    Serializable() : valid(false) {}
    virtual ~Serializable() { valid = false; }
public:
    static T Load(const ByteBuffer& bytes); // calls a "T(bytes)" constructor
    // Store API
    virtual ByteBuffer Store() = 0; // Interface details are up to you.
};
Now, just derive from the base class like so and you can pick up everything you need:
class MyObject : public Serializable<MyObject>
{
    friend class Serializable<MyObject>; // lets the base factory call the protected constructor
protected:
    // .. some members ...
    MyObject(const ByteBuffer& bytes)
    {
        //... Actual load logic for this object type ...
        // On success only:
        valid = true;
    }
public:
    virtual ByteBuffer Store() {
        //... store logic
    }
};
What's cool is that you can call "MyObject::Load()" and it'll do exactly what you expect. Furthermore, "Load" can be made into the ONLY way to build the object, allowing you clean APIs for read-only files and such.
Extending this to full File APIs takes a little more work, namely adding a "Load()" that can read from a larger buffer (holding other things) and "Store()" that appends to an existing buffer.
As a side note, do NOT use boost's APIs for this. In a good design serializable objects should map 1-to-1 to packed structures of primitive types on disk- that's the only way the resulting files are really going to be usable by other programs or on other machines. Boost gives you a horrible API that mostly enables you to do things you'll regret later.
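To make the "packed on-disk record" idea concrete, a hedged sketch with made-up fields (endianness is still the caller's problem):
#include <cstdint>

#pragma pack(push, 1)
struct PointRecord
{
    std::uint32_t type_tag; // which geometry follows
    double        x;
    double        y;
};
#pragma pack(pop)

static_assert(sizeof(PointRecord) == 4 + 8 + 8, "no padding expected");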

How to create a correct hierarchy of objects in C++

I'm building a hierarchy of objects that wrap primitive types, e.g. integers, booleans, floats, etc., as well as container types like vectors, maps and sets. I'm trying to be able to build an arbitrary hierarchy of objects and to set/get their values with ease. This hierarchy will be passed to another class (not mentioned here) and an interface will be created from this representation. That is the purpose of the hierarchy: to be able to create a GUI representation from these objects. To be more precise, I have something like this:
class ValObject
{
public:
    virtual ~ValObject() {}
};

class Int : public ValObject
{
public:
    Int(int v) : val(v) {}
    void set_int(int v) { val = v; }
    int get_int() const { return val; }
private:
    int val;
};

// other classes for floats, booleans, strings, etc
// ...

class Map : public ValObject
{
public:
    void set_val_for_key(const string& key, ValObject* val);
    ValObject* val_for_key(const string& key);
private:
    map<string, ValObject*> keyvals;
};

// classes for other containers (vector and set) ...
The client should be able to create an arbitrary hierarchy of objects and set and get their values with ease, and I, as a junior programmer, should learn how to correctly create the classes for something like this.
The main problem I'm facing is how to set/get the values through a pointer to the base class ValObject. At first, I thought I could just create lots of functions in the base class, like set_int, get_int, set_string, get_string, set_value_for_key, get_value_for_key, etc., and make them work only for the correct types. But then I would have lots of cases where functions do nothing and just pollute my interface. My second thought was to create various proxy objects for setting and getting the various values, e.g.:
class ValObject
{
public:
    virtual ~ValObject() {}
    virtual IntProxy* create_int_proxy() = 0; // <-- my proxy
};

class Int : public ValObject
{
public:
    Int(int v) : val(v) {}
    IntProxy* create_int_proxy() { return new IntProxy(&val); }
private:
    int val;
};

class String : public ValObject
{
public:
    String(const string& s) : val(s) {}
    IntProxy* create_int_proxy() { return 0; }
private:
    string val;
};
The client could then use this proxy to set and get the values of an Int through an ValObject:
ValObject *val = ... // some object
IntProxy *ipr = val->create_int_proxy();
assert(ipr); // we know that val is an Int (somehow)
ipr->set_val(17);
But with this design, I still have too many classes to declare and implement in the various subclasses. Is this the correct way to go? Are there any alternatives?
Thank you.
Take a look at boost::any and boost::variant for existing solutions. The closest to what you propose is boost::any, and the code is simple enough to read and understand even if you want to build your own solution for learning purposes - if you need the code, don't reinvent the wheel, use boost::any.
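A tiny hedged sketch of what the boost::any route can look like (the keys and values are made up; std::any and std::variant are the modern standard equivalents):
#include <boost/any.hpp>
#include <map>
#include <string>

int main()
{
    std::map<std::string, boost::any> values;
    values["count"] = 42;                      // int
    values["name"]  = std::string("example");  // std::string

    int count = boost::any_cast<int>(values["count"]);          // throws boost::bad_any_cast on mismatch
    auto* name = boost::any_cast<std::string>(&values["name"]); // returns nullptr on mismatch
    (void)count; (void)name;
    return 0;
}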
One of the beauties of C++ is that these kinds of intrusive solutions often aren't necessary, yet unfortunately we still see similar ones being implemented today. This is probably due to the prevalence of Java, .NET, and QT which follows these kinds of models where we have a general object base class which is inherited by almost everything.
By intrusive, what's meant is that the types being used have to be modified to work with the aggregate system (inheriting from a base object in this case). One of the problems with intrusive solutions (though sometimes appropriate) is that they require coupling these types with the system used to aggregate them: the types become dependent on the system. For PODs it is impossible to use intrusive solutions directly as we cannot change the interface of an int, e.g.: a wrapper becomes necessary. This is also true of types outside your control like the standard C++ library or boost. The result is that you end up spending a lot of time and effort manually creating wrappers to all kinds of things when such wrappers could have been easily generated in C++. It can also be very pessimistic on your code if the intrusive solution is uniformly applied even in cases where unnecessary and incurs a runtime/memory overhead.
With C++, a plethora of non-intrusive solutions are available at your fingertips, but this is especially true when we know that we can combine static polymorphism using templates with dynamic polymorphism using virtual functions. Basically we can generate these base object-derived wrappers with virtual functions on the fly only for the cases in which this solution is needed without pessimizing the cases where this isn't necessary.
As already suggested, boost::any is a great model for what you want to achieve. If you can use it directly, you should use it. If you can't (ex: if you are providing an SDK and cannot depend on third parties to have matching versions of boost), then look at the solution as a working example.
The basic idea of boost::any is to do something similar to what you are doing, only these wrappers are generated at compile-time. If you want to store an int in boost::any, the class will generate an int wrapper class which inherits from a base object that provides the virtual interface required to make any work at runtime.
The main problem I'm facing is how to set/get the values through a pointer to the base class ValObject. At first, I thought I could just create lots of functions in the base class, like set_int, get_int, set_string, get_string, set_value_for_key, get_value_for_key, etc., and make them work only for the correct types. But then I would have lots of cases where functions do nothing and just pollute my interface.
As you already correctly deduced, this would generally be an inferior design. One tell-tale sign of inheritance being used improperly is when you have a lot of base functions which are not applicable to your subclasses.
Consider the design of I/O streams. We don't have ostreams with functions like output_int, output_float, output_foo, etc. as being directly methods in ostream. Instead, we can overload operator<< to output any data type we want in a non-intrusive fashion. A similar solution can be achieved for your base type. Do you want to associate widgets with custom types (ex: custom property editor)? We can allow that:
shared_ptr<Widget> create_widget(const shared_ptr<int>& val);
shared_ptr<Widget> create_widget(const shared_ptr<float>& val);
shared_ptr<Widget> create_widget(const shared_ptr<Foo>& val);
// etc.
Do you want to serialize these objects? We can use a solution like I/O streams. If you are adapting your own solution like boost::any, it can expect such auxiliary functions to already be there for the type being stored (the virtual functions in the generated wrapper class can call create_widget(T), e.g.).
If you cannot be this general, then provide some means of identifying the types being stored (a type ID, e.g.) and handle the getting/setting of the various types appropriately in the client code based on this type ID. This way the client can see what's being stored and set/get values on it accordingly.
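A hedged sketch of that type-ID fallback, reusing the names from the question (the enum and the client snippet are illustrative):
enum class ValType { Int, String, Map };

class ValObject
{
public:
    virtual ~ValObject() {}
    virtual ValType type() const = 0; // the client switches on this
};

class Int : public ValObject
{
public:
    explicit Int(int v) : val(v) {}
    ValType type() const override { return ValType::Int; }
    int get_int() const { return val; }
    void set_int(int v) { val = v; }
private:
    int val;
};

// Client code:
// if (obj->type() == ValType::Int)
//     static_cast<Int*>(obj)->set_int(17);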
Anyway, it's up to you, but do consider a non-intrusive approach to this as it will generally be less problematic and a whole lot more flexible.
Use dynamic_cast to cast down the hierarchy. You don't need to provide an explicit interface for this - any reasonable C++ programmer can do that. If they can't, you could try enumerating the different types and creating an integral constant for each, which you can then return from a virtual function, and you can then static_cast down.
Finally, you could consider passing a function object, in double-dispatch style. This has a definite encapsulation advantage.
struct functor {
    void operator()(Int& integral) {
        ...
    }
    void operator()(Bool& boo) {
        ...
    }
};

// intended as a member function of ValObject (hence the use of `this`)
template<typename Functor> void PerformOperationByFunctor(Functor func) {
    if (Int* ptr = dynamic_cast<Int*>(this)) {
        func(*ptr);
    }
    // Repeat for the other concrete types
}
More finally, you should avoid creating types that have basically already been covered. For example, there's little point in providing both a 64-bit integral type and a 32-bit integral type and so on - it's just not worth the hassle. Same with double and float.