C++: Does this pattern have a name, and can it be improved?

The motivation
Let's say I'm writing a Tree class. I will represent nodes of the tree by a Tree::Node class. Methods of the class might return Tree::Node objects and take them as arguments, such as a method which gets the parent of a node: Node getParent(Node).
I'll also want a SpecialTree class. SpecialTree should extend the interface of a Tree and be usable anywhere a Tree is.
Behind the scenes, Tree and SpecialTree might have totally different implementations. For example, I might use a library's GraphA class to implement a Tree, so that Tree::Node is a thin wrapper or a typedef for a GraphA::Node. On the other hand, SpecialTree might be implemented in terms of a GraphB object, so that SpecialTree::Node wraps a GraphB::Node.
I'll later have functions which deal with trees, like a depth-first search function. This function should accept both Tree and SpecialTree objects interchangeably.
The pattern
I will use a templated interface class to define the interface for a tree and a special tree. The template argument will be the implementation class. For example:
template <typename Implementation>
class TreeInterface
{
public:
typedef typename Implementation::Node Node;
virtual Node addNode() = 0;
virtual Node getParent(Node) = 0;
};
class TreeImplementation
{
GraphA graph;
public:
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) { /* ...return the parent of n... */ }
};
class Tree : public TreeInterface<TreeImplementation>
{
TreeImplementation* impl;
public:
Tree() : impl(new TreeImplementation) {}
~Tree() { delete impl; }
virtual Node addNode() { return impl->addNode(); }
virtual Node getParent(Node n) { return impl->getParent(n); }
};
I could then derive SpecialTreeInterface from TreeInterface:
template <typename Implementation>
class SpecialTreeInterface : public TreeInterface<Implementation>
{
public:
virtual void specialTreeFunction() = 0;
};
And define SpecialTree and SpecialTreeImplementation analogously to Tree and TreeImplementation.
My depth-first search function might look like this:
template <typename T>
void depthFirstSearch(TreeInterface<T>& tree);
and since SpecialTree derives from TreeInterface, this will work for Tree objects and SpecialTree objects.
Alternatives
An alternative is to rely more heavily on templates so that SpecialTree isn't a descendant of TreeInterface in the type hierarchy at all. In this case, my DFS function will look like template <typename T> void depthFirstSearch(T& tree). This also throws out the rigidly defined interface describing exactly what methods a Tree or its descendants should have. Since a SpecialTree should always act like a Tree, but provide some additional methods, I like the use of an interface.
Instead of the TreeInterface template parameter being the implementation, I could make it take a "representation" class that defines what a Node looks like (it will also have to define what an Arc looks like, and so on). But since I'll potentially need one of these for each of the implementations, I think I'd like to keep this together with the implementation class itself.
What do I gain by using this pattern? Mostly, a looser coupling. If I'd like to change the implementation behind Tree, SpecialTree doesn't mind at all because it only inherits the interface.
The questions
So, does this pattern have a name? I'm using the handle-body pattern by storing a pointer to TreeImplementation in Tree. But what about the approach of having a templated interface? Does this have a name?
Is there a better way to do this? It does seem that I am repeating myself a lot, and writing a lot of boilerplate code, but those nested Node classes give me trouble. If Tree::Node and SpecialTree::Node had reasonably similar implementations, I could define a NodeInterface interface for a Node in TreeInterface, and override the implementation of the node class in Tree and SpecialTree. But as it is, I can't guarantee that this is true. Tree::Node may wrap a GraphA::Node, and SpecialTree::Node may wrap an integer. So this method won't quite work, but it seems like there might still be room for improvement. Any thoughts?

Looks like a mixture of the Curiously Recurring Template Pattern and the Pimpl idiom.
In the CRTP, we derive Tree from TreeInterface<Tree>; in your code you're deriving Tree from TreeInterface<TreeImplementation>. So it's also as @ElliottFrisch said: it's an application of the strategy pattern. Certain parts of the code care that Tree conforms to TreeInterface, while certain other parts care about the fact that it uses the particular strategy TreeImplementation.
Is there a better way to do this? It does seem that I am repeating myself a lot
Well, it depends what your runtime requirements are. When I look at your code, the thing that jumps out at me is that you're using virtual methods, and every call through one costs an indirect dispatch that the compiler usually can't inline. And your class hierarchy looks like this:
Tree is a child of
TreeInterface<TreeImplementation>
SpecialTree is a child of
TreeInterface<SpecialTreeImplementation>
Notice that the fact that TreeInterface<X>::addNode() happens to be virtual has absolutely no bearing on whether TreeInterface<Y>::addNode() is virtual! So making those methods virtual doesn't gain us any runtime polymorphism; I can't write a function that takes an arbitrary instance of TreeInterfaceBase, because we haven't got a single TreeInterfaceBase. All we've got is a bag of unrelated base classes TreeInterface<T>.
So, why do those virtual methods exist? Aha. You're using virtual to pass information from the derived class back up to the parent: the child can "see" its parent via inheritance, and the parent can "see" the child via virtual. This is the problem that is usually solved via CRTP.
So, if we used CRTP (and thus didn't need the virtual stuff anymore), we'd have just this:
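template <typename Derived>
struct TreeInterface {
// Derived is still incomplete while this base is instantiated, so Derived::Node
// cannot be named at class scope; deduced return types (C++14) defer the lookup.
auto addNode() { return static_cast<Derived*>(this)->addNode(); }
template <typename N>
auto getParent(N n) const { return static_cast<const Derived*>(this)->getParent(n); }
};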
struct ATree : public TreeInterface<ATree> {
GraphA graph;
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
struct BTree : public TreeInterface<BTree> {
GraphB graph;
typedef GraphB::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
template <typename Implementation>
void depthFirstSearch(TreeInterface<Implementation>& tree);
At this point someone would probably remark that we don't need the ugly pointer-casting CRTP at all and we could just write
struct ATree {
GraphA graph;
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
struct BTree {
GraphB graph;
typedef GraphB::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
template <typename Tree>
void depthFirstSearch(Tree& tree);
and personally I would agree with them.
Okay, you're concerned that then there's no way of ensuring through the typesystem that the T the caller passes to depthFirstSearch actually conforms to TreeInterface. Well, I think the most C++11-ish way of enforcing that restriction would be with static_assert. For example:
template<typename Tree>
constexpr bool conforms_to_TreeInterface() {
using Node = typename Tree::Node; // we'd better have a Node typedef
static_assert(std::is_same<decltype(std::declval<Tree>().addNode()), Node>::value, "addNode() has the wrong type");
static_assert(std::is_same<decltype(std::declval<Tree>().getParent(std::declval<Node>())), Node>::value, "getParent() has the wrong type");
return true;
}
template <typename T>
void depthFirstSearch(T& tree)
{
static_assert(conforms_to_TreeInterface<T>(), "T must conform to our defined TreeInterface");
...
}
Notice that my conforms_to_TreeInterface<T>() will actually static-assert-fail if T doesn't conform; it will never actually return false. You could equally well make it return true or false and then hit the static_assert in depthFirstSearch().
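For the "return true or false" variant mentioned above, here is a minimal sketch using SFINAE. The names make_void, void_t, ToyTree, NotATree, depthOf and demoDepth are made up for illustration, and the toy tree simply stores parent indices with -1 marking the root; none of this comes from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// C++11 stand-in for std::void_t (hypothetical helper names).
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: by default a type does not conform.
template <typename T, typename = void>
struct conforms_to_TreeInterface : std::false_type {};

// Selected only when T::Node, addNode() and getParent(Node) are all well-formed;
// the value is true only if both calls also return T::Node.
template <typename T>
struct conforms_to_TreeInterface<T, void_t<
    typename T::Node,
    decltype(std::declval<T&>().addNode()),
    decltype(std::declval<T&>().getParent(std::declval<typename T::Node>()))>>
    : std::integral_constant<bool,
          std::is_same<decltype(std::declval<T&>().addNode()),
                       typename T::Node>::value &&
          std::is_same<decltype(std::declval<T&>().getParent(
                           std::declval<typename T::Node>())),
                       typename T::Node>::value> {};

// Hypothetical toy tree: a node is an index, parents[i] is the parent of node i.
struct ToyTree {
    typedef int Node;
    std::vector<int> parents;
    Node addNode() {
        // The first node becomes the root (parent -1); each later node
        // hangs off the node added just before it.
        parents.push_back(static_cast<int>(parents.size()) - 1);
        return static_cast<int>(parents.size()) - 1;
    }
    Node getParent(Node n) const { return parents[static_cast<std::size_t>(n)]; }
};

// Has no Node typedef, so it must be rejected.
struct NotATree { void addNode() {} };

// Generic function guarded by the trait; relies on the toy's -1 root marker.
template <typename T>
int depthOf(T& tree, typename T::Node n) {
    static_assert(conforms_to_TreeInterface<T>::value,
                  "T must conform to our defined TreeInterface");
    int depth = 0;
    while (tree.getParent(n) != -1) { n = tree.getParent(n); ++depth; }
    return depth;
}

int demoDepth() {
    ToyTree tree;
    tree.addNode();                       // node 0, the root
    tree.addNode();                       // node 1, child of 0
    ToyTree::Node leaf = tree.addNode();  // node 2, child of 1
    assert(leaf == 2);
    return depthOf(tree, leaf);
}
```

Because the trait yields a plain bool, it can also be used to select overloads instead of only failing the build.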
Anyway, that's how I'd approach the problem. Notice that my entire post was motivated by the desire to get rid of those inefficient and confusing virtuals — someone else might latch onto a different aspect of the problem and give a totally different answer.

Related

How to structure the inheritance from generic tree to a-b tree

I am trying to implement an a-b tree, as a derived class from a generic tree.
The generic tree node is as follows:
template<typename T>
struct TreeNode
{
T value;
std::vector<TreeNode*> children;
//Some other trivial stuff
};
The structure of the a-b node is as follows:
template<typename T>
struct ABTreeNode : TreeNode<T>
{
std::vector<T> keys;
//The idea is to omit the T value field of the base node and use that vector for the keys
};
Also in the generic tree class there exists a root field
TreeNode<T> *root;
And the a-b constructor is
template<typename T>
ABTree<T>::ABTree(T value)
{
ABTreeNode<T> *node = new ABTreeNode<T>;
node->keys.push_back(value);
GenericTree<T>::root = node;
}
Now, the way this is made, I need to use down casting in a lot of the a-b tree methods, for example:
template<typename T>
bool ABTree<T>::search(T value)
{
ABTreeNode<T> *node = static_cast<ABTreeNode<T>*>(GenericTree<T>::root); // downcast base to derived
//....
}
As far as I know down casting is a bad practice and indicates bad design. The fact that I use variables defined in the derived struct but declare the node as base struct seems very error prone. What would happen if that node was created as a base node and not derived?
Eg:
//Somewhere:
TreeNode *node = new TreeNode;//Instead of new ABTreeNode
//..
//Somewhere else
node->keys//Shouldn't that be an error?
Is my approach correct? If not how should I structure it better?
PS: spare the raw pointers please.
Sharing code through inheritance is bad design. It is better to use composition; see https://en.wikipedia.org/wiki/Composition_over_inheritance
To share code between different implementations of various trees I would extract common fields into a struct.
template <class T, class ChildT>
struct TreeNodeCommons
{
T nodeValue;
std::vector<ChildT*> children;
// more common fields
};
Then I would attach it to Nodes of different types.
template<typename T>
struct ABTreeNode
{
TreeNodeCommons<T, ABTreeNode<T>> commons;
std::vector<T> keys;
};
You can then write templated algorithms that assume a Node contains a field named commons, and you can write Node-specific algorithms as well. No dynamic_cast is needed.
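As a rough sketch of what such algorithms could look like: countNodes below only touches the commons field and so works for any node type built this way, while nodeHasKey is specific to ABTreeNode. Both function names, and demoCount, are hypothetical and not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <class T, class ChildT>
struct TreeNodeCommons {
    T nodeValue;
    std::vector<ChildT*> children;  // non-owning links to child nodes
};

template <typename T>
struct ABTreeNode {
    TreeNodeCommons<T, ABTreeNode<T>> commons;
    std::vector<T> keys;
};

// Generic algorithm: needs only a `commons` member with a `children` vector.
template <class NodeT>
std::size_t countNodes(const NodeT& node) {
    std::size_t total = 1;  // count this node
    for (const NodeT* child : node.commons.children) {
        total += countNodes(*child);
    }
    return total;
}

// Node-specific algorithm: uses the keys that only ABTreeNode has.
template <typename T>
bool nodeHasKey(const ABTreeNode<T>& node, const T& key) {
    for (const T& k : node.keys) {
        if (k == key) return true;
    }
    return false;
}

std::size_t demoCount() {
    ABTreeNode<int> root{}, left{}, right{};
    root.keys = {10, 20};
    root.commons.children = {&left, &right};
    assert(nodeHasKey(root, 20));
    assert(!nodeHasKey(left, 20));
    return countNodes(root);  // root plus two children
}
```

The children are already stored as ABTreeNode<T>*, which is why the key lookup never has to cast.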

How to improve the self-referencing template implementation?

How to get rid of abstract classes in the given implementation of self-referencing templates?
I just tried to implement a skip-list data structure.
So I wanted to create the template Node such that I may instantiate the class of the next link for different node classes to avoid class casts.
Have found these questions:
Self-referencing Template in Template Argument
How to properly declare a self-referencing template type?
but none of them has a solution. So I made my own, based on two lines of inheritance. One is a sequence of "abstract" templates (to propagate the Next argument). The other instantiates the concrete classes. But I feel it could be improved to do the same without the redundant abstract templates (NodeAbstract, NodeWithKeyAbstract, etc.). After several attempts of my own, I'd like to ask for your help:
template <class Value, class Next >
class NodeAbstract
{
public:
Value m_value;
Next * next;
NodeAbstract () : next(0) {}
Next * getNext() {return next;}
};
template <class Value, class Key, class Next >
class NodeWithKeyAbstract : public NodeAbstract <Value, Next >
{
public:
Key m_key;
};
template <class Value, class Key>
class NodeWithKey : public NodeWithKeyAbstract <Value, Key, NodeWithKey<Value,Key> >
{
};
template <class Value, class Key, int maxlevel, class Next>
class NodeSkipListAbstract : public NodeWithKeyAbstract<Value, Key, Next >
{
public:
Next * nextjump[maxlevel-1];
};
template <class Value, class Key, int maxlevel>
class NodeSkipList : public NodeSkipListAbstract<Value, Key, maxlevel, NodeSkipList<Value, Key, maxlevel> >
{
};
If I understand you correctly, your problem is basically that different maxlevel values would produce different classes, and so you couldn't use one array to store them all (correct me if I'm wrong).
You cannot fully get rid of abstract classes - if you want to have nodes with different max level as different classes (different template specializations) you have to provide some common denominator for them.
The good news is that you can get rid of the Curiously Recurring Template Pattern instead: since you use pointers, you don't have to refer to the exact implementation type (i.e. know the exact template specialization) if your abstraction gives you access to all the information you need. Your code can also be simplified a bit.
Consider this code:
template <class Key, class Value>
class Node {
public:
virtual ~Node() = default;
virtual std::size_t MaxLevel() const = 0;
virtual Node* Skip(size_t level) const = 0;
// add setter as well
Key key;
Value value;
};
template <class Key, class Value, std::size_t max_level>
class NodeImpl : public Node<Key, Value> {
public:
typedef Node<Key, Value> node_type;
NodeImpl() : skips() {}
size_t MaxLevel() const { return max_level; }
node_type* Skip(std::size_t level) const {
return level < max_level ? skips[level] : nullptr;
}
// add setter as well
private:
node_type* skips[max_level];
};
template <class Key, class Value>
class SkipList {
public:
typedef Node<Key, Value> node_type;
node_type* head;
};
Here Node provides you with an abstraction for the "skipping" behavior. NodeImpl is used to generate Nodes with different max levels, but in the end the implementation in use is transparent to you: you only use Node's interface. At the syntax level you also only use the Node* type, so the variety of implementations isn't a problem. The virtual destructor ensures that delete frees all memory, and key and value are always accessible as public fields.
This code can of course be improved. The raw array can be replaced by std::array. The whole idea of max_level as a template parameter can be dropped if you decide to use a std::vector with its size set in the constructor instead of an array (then you'll only have Node and SkipList). As a bonus, creating new nodes would become easier: with the templated version you would have to write a factory covering every NodeImpl specialization from 1 up to some maximum. Additionally, the pointers could be replaced by a smart pointer to avoid memory leaks.
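A minimal sketch of that std::vector variant follows, assuming the level count is passed to the constructor. SetSkip, NewNode, the owned_ member and demoSkip are hypothetical names, and SkipList here only owns and creates nodes; insertion and search are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

template <class Key, class Value>
class Node {
public:
    Node(Key k, Value v, std::size_t maxLevel)
        : key(std::move(k)), value(std::move(v)), skips_(maxLevel, nullptr) {}

    std::size_t MaxLevel() const { return skips_.size(); }

    // Returns nullptr for an unset link or an out-of-range level.
    Node* Skip(std::size_t level) const {
        return level < skips_.size() ? skips_[level] : nullptr;
    }

    // Ignores levels this node does not have.
    void SetSkip(std::size_t level, Node* target) {
        if (level < skips_.size()) skips_[level] = target;
    }

    Key key;
    Value value;

private:
    std::vector<Node*> skips_;  // non-owning forward links, one per level
};

template <class Key, class Value>
class SkipList {
public:
    typedef Node<Key, Value> node_type;

    // No factory over template specializations is needed: the level count
    // is an ordinary runtime argument.
    node_type* NewNode(Key k, Value v, std::size_t maxLevel) {
        owned_.push_back(std::unique_ptr<node_type>(
            new node_type(std::move(k), std::move(v), maxLevel)));
        return owned_.back().get();
    }

    node_type* head = nullptr;

private:
    std::vector<std::unique_ptr<node_type>> owned_;  // frees nodes automatically
};

int demoSkip() {
    SkipList<int, int> list;
    Node<int, int>* a = list.NewNode(1, 10, 2);
    Node<int, int>* b = list.NewNode(2, 20, 1);
    Node<int, int>* c = list.NewNode(3, 30, 2);
    list.head = a;
    a->SetSkip(0, b);  // level 0 visits every node
    b->SetSkip(0, c);
    a->SetSkip(1, c);  // level 1 jumps over b
    assert(a->Skip(0)->Skip(0) == c);
    return list.head->Skip(1)->value;
}
```

Keeping ownership in one container and using raw pointers only as links is one option among several; std::shared_ptr links would work too, at some cost.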

OO design for intrusive data structure

I'm writing an intrusive linked list
class ListAlgorithm {
ListNode& next(ListNode& n) {
//returns an object of type ListNode linked to n.
}
};
Users usually want to add some features (such as some additional data) on ListNode like this:
class UserNode : public ListNode {
void operationOnUserData();
int userData;
};
Then users have to downcast ListNode returned by 'next' into UserNode. It is inconvenient. Thus, I tried to make ListAlgorithm a template class :
//U extends ListNode
template<class U>
class ListAlgorithm {
U& next(U& u);
};
But then I have to upcast u into ListNode inside the method 'next' because class U could accidentally hide some members of ListNode that ListAlgorithm uses. This is error-prone because I could forget the upcast and compiler will not warn about that. I have to downcast ListNode into U again for the return value but it is safe because 'next' takes an instance u of U and the return value is something from u.
Another trial is
//U extends ListNode
template<class U>
class ListAlgorithm {
U& next(ListNode& n);
};
In this case, the upcast problem is not there, but I have to downcast ListNode into U for the return value, and that is not safe because there is no guarantee that n is an instance of U. It could be an instance of another type extending ListNode.
What is the best solution in this case? I think this is a very elementary design problem and I'd like to know what kind of material I have to study for basic OO design like this.
Your actual problem here is that you allow users to subclass ListNode and mess with its semantics by adding arbitrary data and operations to ListNode objects through subclassing. This therefore makes it necessary for the user to interpret the ListNode& return values of actual ListNode methods as something that those return values are not, semantically speaking.
This problem of a semantic nature is reflected in how tedious your code suddenly becomes, with casts and templating of an unrelated class (ListAlgorithm) which is due to your problem "propagating" and infecting other parts of your code.
Here's a solution: a ListNode object should not be allowed to also be a UserNode object. It should, however, be allowed to carry a UserData object that can be retrieved and manipulated.
In other words, your list becomes a simple container template, like std::list, and the users can specify the operations and data members that they need as part of the definition of the class they use as the template argument.
class IListNode
{
public:
// whatever public methods you want here
protected:
// pure virtual methods maybe?
};
class ListNode : public IListNode
{
// List node class, no data
};
template<class UserDataType>
class ListNodeWithData : public IListNode
{
private:
UserDataType data;
public:
explicit ListNodeWithData(const UserDataType &data) :
data(data)
{ }
const UserDataType& getData() const {
return data;
}
};
class ListAlgorithm
{
public:
template<class UserDataType>
ListNodeWithData<UserDataType>& next(const ListNodeWithData<UserDataType>& node) {
// Do stuff
}
ListNode& next(const ListNode& node) {
// Do stuff, which may be very similar to the stuff done above
// in which case you may want to prefer to just define the
// method below, and remove this one and the one above:
}
// You should define either this method or the two above, but having
// them all is possible too, if you find a use for it
IListNode& next(const IListNode& node) {
// Do generic stuff
}
};
As far as the size of the resulting classes is concerned, I just know it will increase if you use virtual methods in IListNode.
As far as the issue you raise goes, any time you want to operate on members of a class and avoid hiding by a derived class, just make sure your operations are on the base, so
template<class U>
class ListAlgorithm {
public:
U& next(U& u) {
return static_cast<U&>(nextNode(u));
}
private:
ListNode& nextNode(ListNode& n);
};
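To make the hiding issue concrete, here is a small self-contained sketch of that wrapper. The nextLink field, the UserNode layout and demoNext are assumptions made for illustration; UserNode deliberately declares a member with the same name as the base link to show that the private helper is unaffected.

```cpp
#include <cassert>
#include <type_traits>

struct ListNode {
    ListNode* nextLink = nullptr;  // hypothetical intrusive link field
};

template <class U>
class ListAlgorithm {
    static_assert(std::is_base_of<ListNode, U>::value, "U must extend ListNode");

public:
    // Precondition (not checked): every node linked into the list is a U,
    // which is what makes the downcast valid.
    U& next(U& u) { return static_cast<U&>(nextNode(u)); }

private:
    // Works purely on the base type, so a member of U that happens to be
    // called nextLink cannot hide the real link here.
    ListNode& nextNode(ListNode& n) { return *n.nextLink; }
};

struct UserNode : ListNode {
    int nextLink = 0;  // deliberately hides ListNode::nextLink
    int userData = 0;
};

int demoNext() {
    UserNode a, b;
    b.userData = 42;
    a.ListNode::nextLink = &b;  // qualified name reaches the hidden base member
    ListAlgorithm<UserNode> algo;
    UserNode& following = algo.next(a);
    assert(&following == &b);
    return following.userData;
}
```

The single static_cast lives in one place, and the implicit conversion from U& to ListNode& at the call to nextNode is the upcast that would otherwise be easy to forget.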
That said, you have a lot of options for this problem set. The Boost library has an "intrusive" library that embeds node information either as base_hook (as a base of the user data) or member_hook (as a member of the class, which avoids some of the problems you describe). Check it out at http://www.boost.org/doc/libs/1_57_0/doc/html/intrusive.html.

Parallel inheritance trees, where classes from one tree have containers of classes from another

I really didn't know how to specify the problem in the title, so here's the gist of it.
I am writing graph classes Graph, Node, and Edge, and then subclassing them into VisGraph, VisNode and VisEdge to obtain a drawable graph (in C++). I then need to further subclass those into specific classes that depend on certain data. So I have a lot of parallel inheritance:
Graph -- VisGraph -- RouteGraph
Node -- VisNode -- RouteNode
Edge -- VisEdge -- RouteEdge
This is pretty ugly and I started out doing this, so that I would implement functionality incrementally, but there are a lot of problems. One of them, for example, is that the base class has a container of all the Node instances in the graph. The problem is, if I am in a function in VisGraph dealing with a VisNode, that requires functionality unique to VisNode, I have to do a dynamic_cast on the Nodes that I get from the container in the base class.
Perhaps I should write a "Vis" class that holds a Graph and draws it?
I found inheritance convenient because each node/edge could easily draw itself instead of me storing extra information outside about position etc. and drawing them all individually.
Do you have any suggestions/design patterns that could make this more elegant?
Thank you in advance.
If in doubt, throw templates at the problem until it surrenders:
template <typename N, typename E>
class Graph {
std::vector<N> nodes;
std::vector<E> edges;
};
typedef Graph<VisNode, VisEdge> VisGraph;
typedef Graph<RouteNode, RouteEdge> RouteGraph;
You lose the inheritance (RouteGraph no longer inherits from VisGraph), but that's normal in C++ for container types, and Graph is somewhat like a container. You can keep the inheritance between Node -> VisNode -> RouteNode, though.
Since nodes and edges are supposed to be of matching types, you could go even further, and give Graph a single template parameter, which itself is a class containing the edge and node types as typedefs. I'm not sure it's worth it, though.
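For reference, the single-parameter version might look roughly like this. VisTypes is a hypothetical bundle class, the VisNode and VisEdge members are invented placeholders, and the small accessor functions on Graph exist only so the sketch can be exercised.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder node and edge types; the fields are assumptions for the sketch.
struct VisNode { double x = 0.0; double y = 0.0; };
struct VisEdge { std::size_t from = 0; std::size_t to = 0; };

// Hypothetical bundle: one class naming the matching node and edge types.
struct VisTypes {
    typedef VisNode Node;
    typedef VisEdge Edge;
};

template <typename Types>
class Graph {
public:
    typedef typename Types::Node Node;
    typedef typename Types::Edge Edge;

    // Returns the index of the newly stored node.
    std::size_t addNode(const Node& n) { nodes.push_back(n); return nodes.size() - 1; }
    void addEdge(const Edge& e) { edges.push_back(e); }

    Node& node(std::size_t i) { return nodes[i]; }
    std::size_t nodeCount() const { return nodes.size(); }
    std::size_t edgeCount() const { return edges.size(); }

private:
    std::vector<Node> nodes;
    std::vector<Edge> edges;
};

typedef Graph<VisTypes> VisGraph;

std::size_t demoGraph() {
    VisGraph g;
    std::size_t a = g.addNode(VisNode());
    std::size_t b = g.addNode(VisNode());
    VisEdge e;
    e.from = a;
    e.to = b;
    g.addEdge(e);
    g.node(b).x = 1.5;  // node() already returns VisNode&, no dynamic_cast
    assert(g.node(b).x == 1.5);
    return g.nodeCount() + g.edgeCount();
}
```

The bundle mostly pays off once more than two associated types have to stay in sync; for just nodes and edges the two-parameter form is arguably simpler.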
Edit
Since you want to successively add functions, you could keep a form of inheritance but lose the polymorphism:
template <typename N, typename E>
class GraphImpl {
std::vector<N> nodes;
std::vector<E> edges;
};
template <typename N, typename E>
class VisGraphImpl : public GraphImpl<N, E> {
// constructors
// extra functions
};
template <typename N, typename E>
class RouteGraphImpl : public VisGraphImpl<N, E> {
// constructors
// extra functions
};
typedef GraphImpl<Node, Edge> Graph;
typedef VisGraphImpl<VisNode, VisEdge> VisGraph;
typedef RouteGraphImpl<RouteNode, RouteEdge> RouteGraph;
There might be a better way, though, by bundling these extra functions up into sensible mixins and using CRTP:
template<typename Derived>
class VisFunctions {
public:
void somefunc() {
Derived& myself = static_cast<Derived&>(*this);
// do stuff
}
};
Then:
class VisGraph : public Graph<VisNode, VisEdge>, public VisFunctions<VisGraph> {
friend class VisFunctions<VisGraph>;
};
class RouteGraph : public Graph<RouteNode, RouteEdge>, public VisFunctions<RouteGraph>, public RouteFunctions<RouteGraph> {
friend class VisFunctions<RouteGraph>;
friend class RouteFunctions<RouteGraph>;
};
Not sure how that'll look for your real circumstances, though. Btw, if you don't want/need the friend declaration for the extra functions, then you don't need those extra functions to be members at all - just make them free functions that take a VisGraph or RouteGraph parameter.
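A compilable sketch of the mixin idea, under some assumptions: Graph keeps its containers protected, VisNode has a visible flag, and the mixin offers a visibleCount() function. Those details, and demoVisible, are invented here to have something concrete to run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename N, typename E>
class Graph {
public:
    void addNode(const N& n) { nodes.push_back(n); }

protected:
    std::vector<N> nodes;
    std::vector<E> edges;
};

// CRTP mixin: extra functions that reach into whichever graph inherits them.
template <typename Derived>
class VisFunctions {
public:
    // Counts the nodes whose (hypothetical) visible flag is set.
    std::size_t visibleCount() const {
        const Derived& self = static_cast<const Derived&>(*this);
        std::size_t count = 0;
        for (const auto& node : self.nodes) {
            if (node.visible) ++count;
        }
        return count;
    }
};

struct VisNode { bool visible; };
struct VisEdge {};

class VisGraph : public Graph<VisNode, VisEdge>, public VisFunctions<VisGraph> {
    // Lets the mixin read the protected containers through a VisGraph.
    friend class VisFunctions<VisGraph>;
};

std::size_t demoVisible() {
    VisGraph g;
    g.addNode(VisNode{true});
    g.addNode(VisNode{false});
    g.addNode(VisNode{true});
    return g.visibleCount();
}
```

If nodes were public, the friend declaration could go, and visibleCount could just as well be a free function template taking the graph.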

c++ handling derived class that's self referencing

So suppose I have a tree class like this in c++
class Node{
public:
void addChild(Node*);
/*obvious stuff*/
protected:
Node* parent;
vector<Node*> children;
};
class specialNode : public Node{
public:
void addChild(specialNode*);
/*obvious stuff*/
/*special stuff*/
};
Now whenever I access the children in specialNode, I obviously get Node*, not specialNode*.
But a specialNode has member variables and functions that Node doesn't have.
I can force specialNode to take only specialNode as children, and otherwise fail at compile time, but I still get Node* when accessing children/parent, and I have to cast it whenever I want to use the special functions, even inside specialNode's own functions.
Is there any clever, or just any better, way to go about this, other than literally casting every time?
If you only need SpecialNode objects in your tree (and just want to encapsulate all generic tree functionality in Node) you can make Node a so called "mix-in" class like
template <class N>
class Node : public N {
public:
void addChild(Node<N>*);
protected:
Node<N>* parent;
vector<Node<N>*> children;
};
class SpecialNodeBase {
// Here comes all "special" data/methods for your "special" tree
};
typedef Node<SpecialNodeBase> SpecialNode;
After that you can construct a tree of SpecialNode objects and use all methods from SpecialNodeBase as well as the additional tree-managing functions from Node.
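Filled in as a small sketch, the mix-in could be used like this. The bodies of addChild, getParent and childAt, the weight member, and demoMixin are assumptions added so the example runs; only the overall shape comes from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <class N>
class Node : public N {
public:
    // Links child under this node; nodes are not owned by the tree.
    void addChild(Node<N>* child) {
        child->parent = this;
        children.push_back(child);
    }
    Node<N>* getParent() const { return parent; }
    Node<N>* childAt(std::size_t i) const { return children[i]; }

protected:
    Node<N>* parent = nullptr;
    std::vector<Node<N>*> children;
};

class SpecialNodeBase {
public:
    int weight = 0;  // hypothetical "special" data
    int doubledWeight() const { return 2 * weight; }
};

typedef Node<SpecialNodeBase> SpecialNode;

int demoMixin() {
    SpecialNode root, leaf;
    leaf.weight = 21;
    root.addChild(&leaf);
    assert(leaf.getParent() == &root);
    // childAt() already returns SpecialNode*, so the special member is
    // reachable without any cast.
    return root.childAt(0)->doubledWeight();
}
```

The limitation is that SpecialNodeBase itself knows nothing about the tree, so its own member functions cannot walk to the parent or children.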
The addChild function in your derived class does not override the base version; it hides it, because the parameter types differ, so calls are not polymorphic. To get polymorphic behavior, make it virtual and give the derived version the same parameter type as the base:
class Node{
public:
virtual void addChild(Node*);
...
};
class specialNode : public Node{
public:
virtual void addChild(Node*);
...
};
Now, it should work.
If you want to access the children variable from the derived class (specialNode), you have to cast it. For example:
specialNode* var = static_cast<specialNode*>(children[i]);
Because addChild is virtual, Node is a polymorphic type, so dynamic_cast is available. Prefer it over static_cast whenever you cannot be sure that children[i] really points to a specialNode; it returns a null pointer when the cast fails:
specialNode* var = dynamic_cast<specialNode*>(children[i]);
if(var != NULL)
{
//...
}
If I understand correctly, the "mix-in" class solution won't allow you to call addChild from functions implemented by SpecialNodeBase.
You can actually do the following:
template <class recursiveT>
class Base {
public:
Base(dataType data) { /* populate children with data */ }
void addChild() { /* something base class appropriate */ }
protected:
std::vector<recursiveT> children;
};
class Derived: public Base<Derived> {
public:
/* note: the constructor here will actually call the
constructor of the base class */
Derived(dataType data) : Base<Derived>(data) {}
/* other special functions go here. */
};
This may look a little crazy, but it compiles cleanly for me on several GCC versions so I'm inclined to believe it's not totally wrong-headed. You should now be able to call the functions of Base from inside Derived.
You will definitely have to cast the Node * to a specialNode * at some point, but you can make this clean and easy to manage by doing this in only one place. You could add a member function, say getParent and override it in specialNode, like this:
class Node {
...
virtual Node *getParent() {
return parent;
}
};
class specialNode : public Node {
...
specialNode *getParent() {
return dynamic_cast<specialNode *>(parent);
}
};
Of course, this is assuming that specialNodes always have other specialNodes as parent/children. If you mix Nodes and specialNodes, this obviously won't work.
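Here is a runnable sketch of that override, including the mixed case where it breaks down. The addChild body, the specialValue member and the demo functions are assumptions for illustration; the override relies on covariant return types, which permit specialNode* where the base declares Node*.

```cpp
#include <cassert>
#include <vector>

class Node {
public:
    virtual ~Node() = default;

    // Links child under this node; nodes are not owned by the tree.
    virtual void addChild(Node* child) {
        child->parent = this;
        children.push_back(child);
    }

    virtual Node* getParent() { return parent; }

protected:
    Node* parent = nullptr;
    std::vector<Node*> children;
};

class specialNode : public Node {
public:
    int specialValue = 0;  // hypothetical "special" data

    // Covariant return type: the one place where the cast happens.
    // Yields nullptr when the parent is missing or is not a specialNode.
    specialNode* getParent() override {
        return dynamic_cast<specialNode*>(parent);
    }
};

int demoCovariant() {
    specialNode root, leaf;
    root.specialValue = 7;
    root.addChild(&leaf);
    // No cast at the call site: getParent() already yields specialNode*.
    return leaf.getParent()->specialValue;
}

bool demoMixedTree() {
    Node plain;
    specialNode child;
    plain.addChild(&child);
    // The parent is a plain Node, so the checked cast reports failure.
    return child.getParent() == nullptr;
}
```

Callers still have to handle the null result, so this tidies the casts rather than removing the underlying type mismatch.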