I am trying to implement an a-b tree, as a derived class from a generic tree.
The generic tree node is as follows:
template<typename T>
struct TreeNode
{
T value;
std::vector<TreeNode*> children;
//Some other trivial stuff
};
The structure of the a-b node is as follows:
template<typename T>
struct ABTreeNode : TreeNode<T>
{
std::vector<T> keys;
//The idea is to omit the T value field of the base node and use that vector for the keys
};
Also in the generic tree class there exists a root field
TreeNode *root;
And the a-b constructor is
template<typename T>
ABTree<T>::ABTree(T value)
{
GenericTree<T>::root = new ABTreeNode;
root->keys.push_back(value);
}
Now, the way this is made, I need to use down casting in a lot of the a-b tree methods, for example:
template<typename T>
bool ABTree<T>::search(T value)
{
ABTreeNode *node = GenericTree<T>::root;
//....
}//Downcast base to derived
As far as I know down casting is a bad practice and indicates bad design. The fact that I use variables defined in the derived struct but declare the node as base struct seems very error prone. What would happen if that node was created as a base node and not derived?
Eg:
//Somewhere:
TreeNode *node = new TreeNode;//Instead of new ABTreeNode
//..
//Somewhere else
node->keys//Shouldn't that be an error?
Is my approach correct? If not how should I structure it better?
PS: spare the raw pointers please.
Sharing code through inheritance is often poor design. Composition is usually the better choice; see https://en.wikipedia.org/wiki/Composition_over_inheritance
To share code between different implementations of various trees I would extract common fields into a struct.
template <class T, class ChildT>
struct TreeNodeCommons
{
T nodeValue;
std::vector<ChildT*> children;
// more common fields
};
Then I would attach it to Nodes of different types.
template<typename T>
struct ABTreeNode
{
TreeNodeCommons<T, ABTreeNode<T>> commons;
std::vector<T> keys;
};
You may then write templated algorithms that assume a Node contains a field named commons, and you may write Node-specific algorithms as well. No dynamic_cast is needed.
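As a minimal sketch of that last point, the two functions below show one generic algorithm that relies only on the commons field and one that is specific to ABTreeNode. The names countNodes and countKeys are hypothetical, and the child pointers are treated as non-owning purely to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Shared fields, as in the answer above.
template <class T, class ChildT>
struct TreeNodeCommons {
    T nodeValue{};
    std::vector<ChildT*> children;  // non-owning in this sketch
};

template <typename T>
struct ABTreeNode {
    TreeNodeCommons<T, ABTreeNode<T>> commons;
    std::vector<T> keys;
};

// Generic algorithm (hypothetical name): works for any node type that
// exposes a `commons` member with a `children` vector. No casts involved.
template <class Node>
std::size_t countNodes(const Node* node) {
    if (node == nullptr) return 0;
    std::size_t total = 1;
    for (const Node* child : node->commons.children)
        total += countNodes(child);
    return total;
}

// Node-specific algorithm (hypothetical name): uses ABTreeNode's keys.
template <typename T>
std::size_t countKeys(const ABTreeNode<T>* node) {
    if (node == nullptr) return 0;
    std::size_t total = node->keys.size();
    for (const ABTreeNode<T>* child : node->commons.children)
        total += countKeys(child);
    return total;
}
```

Because children already holds ABTreeNode<T>* rather than a base-class pointer, both functions see the full node type directly.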
I have this piece of code for a Tree. The BSTnodes contain the actual data. BST is a wrapper around them by inheriting from unique_ptr<BSTnode<Key,Data>>. BST doesn't add any new fields to the class.
The inheritance makes it so that my tree is a unique_ptr<BSTnode>, but is that a correct way of implementing it? The added operations for the BST like rotate() / insert() or remove() are specific to the data structure. You wouldn't and shouldn't expect them for a regular unique_ptr, but this does mean that a BST can't be used interchangeably with a unique_ptr.
If this implementation strategy is incorrect, how should I solve it?
template <class Key, class Data>
class BST : public unique_ptr<BSTnode<Key, Data>>
{
using unique_ptr<BSTnode<Key, Data>>::unique_ptr;
// operations ...
};
template <class Key, class Data>
class BSTnode
{
friend class BST<Key, Data>;
public:
//constructors ...
protected:
Key key;
Data data;
BSTnode<Key, Data> *parent;
BST<Key, Data> left, right;
};
LSP aside, inheriting from standard library classes is generally problematic and not recommended in most cases. In this case, as @SomeProgrammerDude suggests, it's better to use composition and put the pointer inside your class:
template <class Key, class Data>
class BST
{
std::unique_ptr<BSTnode<Key, Data>> root;
// operations ...
};
No one would want to use your BST class in place of a unique_ptr anyway. It's a separate data container that just happens to use unique_ptr to store its data.
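To make the composition idea concrete, here is a rough sketch of a BST that holds a unique_ptr root instead of inheriting from it. The insert and find members are illustrative assumptions (the original operations are elided), the node is simplified to a plain struct, and std::make_unique assumes C++14.

```cpp
#include <memory>
#include <utility>

template <class Key, class Data>
struct BSTnode {
    Key key;
    Data data;
    BSTnode* parent = nullptr;             // non-owning back pointer
    std::unique_ptr<BSTnode> left, right;  // owning child links
    BSTnode(Key k, Data d) : key(std::move(k)), data(std::move(d)) {}
};

template <class Key, class Data>
class BST {
public:
    // Inserts key/data; returns false if the key is already present.
    bool insert(Key key, Data data) {
        BSTnode<Key, Data>* parent = nullptr;
        std::unique_ptr<BSTnode<Key, Data>>* link = &root;
        while (*link) {
            parent = link->get();
            if (key < parent->key)      link = &parent->left;
            else if (parent->key < key) link = &parent->right;
            else return false;
        }
        *link = std::make_unique<BSTnode<Key, Data>>(std::move(key), std::move(data));
        (*link)->parent = parent;
        return true;
    }

    // Returns a pointer to the stored data, or nullptr if the key is absent.
    const Data* find(const Key& key) const {
        const BSTnode<Key, Data>* node = root.get();
        while (node) {
            if (key < node->key)      node = node->left.get();
            else if (node->key < key) node = node->right.get();
            else return &node->data;
        }
        return nullptr;
    }

private:
    std::unique_ptr<BSTnode<Key, Data>> root;  // the tree has a pointer, it is not one
};
```

Callers only see insert and find; nothing of the unique_ptr interface (reset, release and so on) leaks out of the class.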
How to get rid of abstract classes in the given implementation of self-referencing templates?
I just tried to implement a skip-list data structure.
So I wanted to create the template Node such that I may instantiate the class of the next link for different node classes to avoid class casts.
I have found these questions:
Self-referencing Template in Template Argument
How to properly declare a self-referencing template type?
but none of them have a solution. So I came up with my own solution based on two lines of inheritance. One is a sequence of "abstract" templates (for propagating the Next argument). The other instantiates the concrete classes. But I feel it could be improved to do the same without the redundant abstract templates (NodeAbstract, NodeWithKeyAbstract, etc.). After several attempts of my own, I would like to ask for your help:
template <class Value, class Next >
class NodeAbstract
{
public:
Value m_value;
Next * next;
NodeAbstract () : next(0) {}
Next * getNext() {return next;}
};
template <class Value, class Key, class Next >
class NodeWithKeyAbstract : public NodeAbstract <Value, Next >
{
public:
Key m_key;
};
template <class Value, class Key>
class NodeWithKey : public NodeWithKeyAbstract <Value, Key, NodeWithKey<Value,Key> >
{
};
template <class Value, class Key, int maxlevel, class Next>
class NodeSkipListAbstract : public NodeWithKeyAbstract<Value, Key, Next >
{
public:
Next * nextjump[maxlevel-1];
};
template <class Value, class Key, int maxlevel>
class NodeSkipList : public NodeSkipListAbstract<Value, Key, maxlevel, NodeSkipList<Value, Key, maxlevel> >
{
};
If I understand you correctly, your problem is basically that different maxlevel values would produce different classes, and so you couldn't use one array to store them all (correct me if I'm wrong).
You cannot fully get rid of abstract classes - if you want to have nodes with different max level as different classes (different template specializations) you have to provide some common denominator for them.
The good news is that you can get rid of the Curiously Recurring Template Pattern instead: since you use pointers, you don't have to refer to the exact implementation type (i.e. know the exact template specialization) if your abstraction gives you access to all the information you need. Also, your code can be simplified a bit.
Consider this code:
template <class Key, class Value>
class Node {
public:
virtual ~Node() = default;
virtual std::size_t MaxLevel() const = 0;
virtual Node* Skip(std::size_t level) const = 0;
// add setter as well
Key key;
Value value;
};
template <class Key, class Value, std::size_t max_level>
class NodeImpl : public Node<Key, Value> {
public:
typedef Node<Key, Value> node_type;
NodeImpl() : skips() {}
std::size_t MaxLevel() const override { return max_level; }
node_type* Skip(std::size_t level) const override {
return level < max_level ? skips[level] : nullptr;
}
// add setter as well
private:
node_type* skips[max_level];
};
template <class Key, class Value>
class SkipList {
public:
typedef Node<Key, Value> node_type;
node_type* head;
};
Here Node provides you with an abstraction for the "skipping" behavior. NodeImpl would be used to generate Nodes with different max levels, but in the end the implementation used would be transparent to you: you would only use Node's interface. Also, at the syntax level you would only use the Node* type, so the variety of implementations wouldn't be a problem. The virtual destructor ensures that delete frees all memory, and key and value are always accessible as public fields.
This code can of course be improved. The raw array can be replaced by std::array. The whole idea of max_level as a template parameter can be dropped if you decide to use a std::vector with its size set in the constructor instead of an array (then you'll only have Node and SkipList). As a bonus, creating new nodes would be easier, since as it stands you'd have to write some factory with specializations of all NodeImpls from 1 to some value. Additionally, the pointers could be replaced by smart pointers to avoid memory leaks.
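A rough sketch of the std::vector variant mentioned above might look like this. The max level becomes a constructor argument, so a single Node class suffices and no virtual functions are needed. The SetSkip setter is an assumed name standing in for the "add setter as well" comment, and the links are left as non-owning raw pointers for brevity.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

template <class Key, class Value>
class Node {
public:
    Node(Key k, Value v, std::size_t max_level)
        : key(std::move(k)), value(std::move(v)), skips(max_level, nullptr) {}

    std::size_t MaxLevel() const { return skips.size(); }

    // Returns the successor at `level`, or nullptr if the level is out of range.
    Node* Skip(std::size_t level) const {
        return level < skips.size() ? skips[level] : nullptr;
    }

    // Sets the successor at `level`; returns false if the level is out of range.
    bool SetSkip(std::size_t level, Node* next) {
        if (level >= skips.size()) return false;
        skips[level] = next;
        return true;
    }

    Key key;
    Value value;

private:
    std::vector<Node*> skips;  // non-owning links, one per level
};
```

Creating a node of any height is then just a constructor call, with no factory over template specializations.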
As an alternative to the classic pointer representations of binary trees and adjacency lists, what would be a good way to implement graphs and trees using the STL in C++, so as to make them dynamic and to minimize memory leaks and segfaults?
One such implementation of adjacency list I found was by using an STL list<> inside a structure,
struct Node {
int data;
list<int> adj;
};
and then declare an array of struct pointers
struct Node *nodes[10005];
But all this assumes integer data. What if the data to be stored is not an integer? How can I use the STL to its full potential?
Use template classes, for example:
template<class t1>
class node
{
public:
t1 value;
node * link;
void getdata(t1 val)
{
value=val;
}
};
Now you can deal with any type of data.
This greatly depends on what will be done with such structures. Let's start with your second question: to use any type, apply templates.
template<class T>
class Node
{
public:
T value;
std::list<int> nodeIndices;
};
Node<int> *nodes[10005]; // pick the element type, e.g. int
If you wish to use different types in a single graph structure, combine above with inheritance
class BaseNode
{
public:
std::list<int> nodeIndices;
// some functions
};
template<class T>
class Node : public BaseNode
{
public:
T value;
};
struct BaseNode *nodes[10005];
This way you can store objects of different types. Another suggestion is to use std::array instead of a classic array.
I'm not sure whether I should use a struct to create a binary search tree; the other option is to create the nodes from a separate node class with data, left and right members. Which one is better, and why?
Here's my code for the BST:
template <typename T>
class BST : public SearchableADT<T>
{
public:
BST(void){ head = NULL; numnodes = 0; }
virtual ~BST(void);
virtual int loadFromFile(string filename);
virtual void clear(void);
virtual void insertEntry(T info);
virtual void deleteEntry(T info);
virtual bool isThere(T info);
virtual int numEntries(void);
//needed for comparison to AVL
int BST<T>::height(t_node* tPTR);
protected:
struct t_node
{
string data;
t_node *L;
t_node *R;
};
int numnodes;
t_node* head;
t_node* cPTR; //current pointer
t_node* pPTR; //parent pointer
t_node* tPTR; //temporary pointer
}; // end of class BST
I'm not sure if you understand the difference between struct and class, but basically: a struct has public access for all of its members by default, and a class has private access for all of its members by default.
You can achieve the same thing with both of them, but many programmers, including myself, tend to use structs for POD (Plain Old Data) objects with direct member access (it means less to write).
That said, I think you should put your Node class in a separate file, since the BST and Node classes are very different. Since you made your BST class a template, I am assuming that you are going to use more than just the Node class, which is one more reason to separate the files: some projects might not need the Node class. If you aren't going to use more than just a Node class, you might consider removing the template and defining the Node struct/class inside the BST class!
It is better to create two classes, one for the BST and another for the node. They are two different abstractions. A node is a simpler abstraction whose main purpose is to hold the data necessary to define a BST. A BST is a higher level abstraction. It's a collection class with its own constraints and expectations.
The motivation
Let's say I'm writing a Tree class. I will represent nodes of the tree by a Tree::Node class. Methods of the class might return Tree::Node objects and take them as arguments, such as a method which gets the parent of a node: Node getParent(Node).
I'll also want a SpecialTree class. SpecialTree should extend the interface of a Tree and be usable anywhere a Tree is.
Behind the scenes, Tree and SpecialTree might have totally different implementations. For example, I might use a library's GraphA class to implement a Tree, so that Tree::Node is a thin wrapper or a typedef for a GraphA::Node. On the other hand, SpecialTree might be implemented in terms of a GraphB object, and a SpecialTree::Node wraps a GraphB::Node.
I'll later have functions which deal with trees, like a depth-first search function. This function should accept both Tree and SpecialTree objects interchangeably.
The pattern
I will use a templated interface class to define the interface for a tree and a special tree. The template argument will be the implementation class. For example:
template <typename Implementation>
class TreeInterface
{
public:
typedef typename Implementation::Node Node;
virtual Node addNode() = 0;
virtual Node getParent(Node) = 0;
};
class TreeImplementation
{
GraphA graph;
public:
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) { /* ...return the parent... */ }
};
class Tree : public TreeInterface<TreeImplementation>
{
TreeImplementation* impl;
public:
Tree() : impl(new TreeImplementation) {}
~Tree() { delete impl; }
virtual Node addNode() { return impl->addNode(); }
virtual Node getParent(Node n) { return impl->getParent(n); }
};
I could then derive SpecialTreeInterface from TreeInterface:
template <typename Implementation>
class SpecialTreeInterface : public TreeInterface<Implementation>
{
public:
virtual void specialTreeFunction() = 0;
};
And define SpecialTree and SpecialTreeImplementation analogously to Tree and TreeImplementation.
My depth-first search function might look like this:
template <typename T>
void depthFirstSearch(TreeInterface<T>& tree);
and since SpecialTree derives from TreeInterface, this will work for Tree objects and SpecialTree objects.
Alternatives
An alternative is to rely more heavily on templates so that SpecialTree isn't a descendant of TreeInterface in the type hierarchy at all. In this case, my DFS function will look like template <typename T> void depthFirstSearch(T& tree). This also throws out the rigidly defined interface describing exactly what methods a Tree or its descendants should have. Since a SpecialTree should always act like a Tree, but provide some additional methods, I like the use of an interface.
Instead of the TreeInterface template parameter being the implementation, I could make it take a "representation" class that defines what a Node looks like (it will also have to define what an Arc looks like, and so on). But since I'll potentially need one of these for each of the implementations, I think I'd like to keep this together with the implementation class itself.
What do I gain by using this pattern? Mostly, a looser coupling. If I'd like to change the implementation behind Tree, SpecialTree doesn't mind at all because it only inherits the interface.
The questions
So, does this pattern have a name? I'm using the handle-body pattern by storing a pointer to TreeImplementation in Tree. But what about the approach of having a template-ized interface? Does this have a name?
Is there a better way to do this? It does seem that I am repeating myself a lot, and writing a lot of boilerplate code, but those nested Node classes give me trouble. If Tree::Node and SpecialTree::Node had reasonably similar implementations, I could define a NodeInterface interface for a Node in TreeInterface, and override the implementation of the node class in Tree and SpecialTree. But as it is, I can't guarantee that this is true. Tree::Node may wrap a GraphA::Node, and SpecialTree::Node may wrap an integer. So this method won't quite work, but it seems like there might still be room for improvement. Any thoughts?
Looks like a mixture of the Curiously Recurring Template Pattern and the Pimpl idiom.
In the CRTP, we derive Tree from TreeInterface<Tree>; in your code you're deriving Tree from TreeInterface<TreeImplementation>. So it's also as @ElliottFrisch said: it's an application of the strategy pattern. Certain parts of the code care that Tree conforms to TreeInterface, while certain other parts care about the fact that it uses the particular strategy TreeImplementation.
Is there a better way to do this? It does seem that I am repeating myself a lot
Well, it depends what your runtime requirements are. When I look at your code, the thing that jumps out at me is that you're using virtual methods — slooooow! And your class hierarchy looks like this:
Tree is a child of TreeInterface<TreeImplementation>
SpecialTree is a child of TreeInterface<SpecialTreeImplementation>
Notice that the fact that TreeInterface<X>::addNode() happens to be virtual has absolutely no bearing on whether TreeInterface<Y>::addNode() is virtual! So making those methods virtual doesn't gain us any runtime polymorphism; I can't write a function that takes an arbitrary instance of TreeInterfaceBase, because we haven't got a single TreeInterfaceBase. All we've got is a bag of unrelated base classes TreeInterface<T>.
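A small compile-time check can illustrate the point. In this sketch the Node type is reduced to int and the implementation classes are empty stand-ins, so only the shape of the hierarchy is taken from the question.

```cpp
#include <type_traits>

// Simplified stand-ins: the Node type is reduced to int for this sketch.
struct TreeImplementation {};
struct SpecialTreeImplementation {};

template <typename Implementation>
struct TreeInterface {
    virtual ~TreeInterface() = default;
    virtual int addNode() = 0;
};

struct Tree : TreeInterface<TreeImplementation> {
    int addNode() override { return 1; }
};

struct SpecialTree : TreeInterface<SpecialTreeImplementation> {
    int addNode() override { return 2; }
};

// The two instantiations are distinct, unrelated types...
static_assert(!std::is_same<TreeInterface<TreeImplementation>,
                            TreeInterface<SpecialTreeImplementation>>::value,
              "different template arguments give different classes");

// ...so a SpecialTree cannot be bound to a TreeInterface<TreeImplementation>&.
static_assert(!std::is_base_of<TreeInterface<TreeImplementation>, SpecialTree>::value,
              "SpecialTree does not derive from Tree's interface");
```

Both assertions hold, which is why a non-template function taking a single base-class reference cannot accept both trees.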
So, why do those virtual methods exist? Aha. You're using virtual to pass information from the derived class back up to the parent: the child can "see" its parent via inheritance, and the parent can "see" the child via virtual. This is the problem that is usually solved via CRTP.
So, if we used CRTP (and thus didn't need the virtual stuff anymore), we'd have just this:
template <typename Parent>
struct TreeInterface {
using Node = typename Parent::Node;
Node addNode() { return static_cast<Parent*>(this)->addNode(); }
Node getParent(Node n) const { return static_cast<const Parent*>(this)->getParent(n); }
};
struct ATree : public TreeInterface<ATree> {
GraphA graph;
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
struct BTree : public TreeInterface<BTree> {
GraphB graph;
typedef GraphB::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
template <typename Implementation>
void depthFirstSearch(TreeInterface<Implementation>& tree);
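One caveat when turning the CRTP outline above into compiling code: the derived class is still incomplete at the point where TreeInterface<ATree> is instantiated, so a class-scope alias such as typename Parent::Node is typically rejected by the compiler. A possible workaround, sketched below under C++14, is to let the return types be deduced. Everything beyond the names already used above is an assumption: GraphA here is a toy stand-in, the *Impl member names and depthOf are hypothetical, and the template parameter is called Derived rather than Parent.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for the library graph class; purely hypothetical.
struct GraphA {
    using Node = std::size_t;
    std::vector<Node> parents;  // parents[i] is the parent of node i
    Node addNode() {
        Node id = parents.size();
        // Build a chain: node 0 is its own parent (the root),
        // every later node hangs off the previous one.
        parents.push_back(id == 0 ? 0 : id - 1);
        return id;
    }
    Node parentOf(Node n) const { return parents.at(n); }
};

template <typename Derived>
struct TreeInterface {
    // Deduced return types: Derived's members are only looked up when a
    // function is actually called, by which time Derived is complete.
    auto addNode() { return static_cast<Derived*>(this)->addNodeImpl(); }

    template <typename Node>
    auto getParent(Node n) const {
        return static_cast<const Derived*>(this)->getParentImpl(n);
    }
};

struct ATree : TreeInterface<ATree> {
    using Node = GraphA::Node;
    GraphA graph;
    Node addNodeImpl() { return graph.addNode(); }
    Node getParentImpl(Node n) const { return graph.parentOf(n); }
};

// Generic code written against the interface: counts the steps from `node`
// up to the root, where the root is a node that is its own parent.
template <typename Implementation, typename Node>
std::size_t depthOf(const TreeInterface<Implementation>& tree, Node node) {
    std::size_t depth = 0;
    while (true) {
        Node parent = tree.getParent(node);
        if (parent == node) return depth;
        node = parent;
        ++depth;
    }
}
```

Giving the derived hooks distinct names (addNodeImpl rather than addNode) is a design choice: if a derived class forgets to provide one, the result is a compile error instead of the base function calling itself forever.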
At this point someone would probably remark that we don't need the ugly pointer-casting CRTP at all and we could just write
struct ATree {
GraphA graph;
typedef GraphA::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
struct BTree {
GraphB graph;
typedef GraphB::Node Node;
Node addNode() { return graph.addNode(); }
Node getParent(Node n) const { /* ...return the parent... */ }
};
template <typename Tree>
void depthFirstSearch(Tree& tree);
and personally I would agree with them.
Okay, you're concerned that then there's no way of ensuring through the type system that the T the caller passes to depthFirstSearch actually conforms to TreeInterface. Well, I think the most C++11-ish way of enforcing that restriction would be with static_assert. For example:
template<typename Tree>
constexpr bool conforms_to_TreeInterface() {
using Node = typename Tree::Node; // we'd better have a Node typedef
static_assert(std::is_same<decltype(std::declval<Tree>().addNode()), Node>::value, "addNode() has the wrong type");
static_assert(std::is_same<decltype(std::declval<Tree>().getParent(std::declval<Node>())), Node>::value, "getParent() has the wrong type");
return true;
}
template <typename T>
void depthFirstSearch(T& tree)
{
static_assert(conforms_to_TreeInterface<T>(), "T must conform to our defined TreeInterface");
// ...
}
Notice that my conforms_to_TreeInterface<T>() will actually static-assert-fail if T doesn't conform; it will never actually return false. You could equally well make it return true or false and then hit the static_assert in depthFirstSearch().
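For the true-or-false variant, one way to write it is as a type trait built on std::void_t, which assumes C++17. This is only a sketch: the trait name conforms_to_tree_interface and the three sample types are made up for illustration, and the traversal body is left out.

```cpp
#include <type_traits>
#include <utility>

// Primary template: assume the type does not conform.
template <typename Tree, typename = void>
struct conforms_to_tree_interface : std::false_type {};

// Selected only if Tree::Node, addNode() and getParent(Node) are all
// well-formed; the value then also requires both to return Node.
template <typename Tree>
struct conforms_to_tree_interface<
    Tree,
    std::void_t<typename Tree::Node,
                decltype(std::declval<Tree&>().addNode()),
                decltype(std::declval<Tree&>().getParent(
                    std::declval<typename Tree::Node>()))>>
    : std::integral_constant<
          bool,
          std::is_same<decltype(std::declval<Tree&>().addNode()),
                       typename Tree::Node>::value &&
              std::is_same<decltype(std::declval<Tree&>().getParent(
                               std::declval<typename Tree::Node>())),
                           typename Tree::Node>::value> {};

template <typename T>
void depthFirstSearch(T& tree) {
    static_assert(conforms_to_tree_interface<T>::value,
                  "T must conform to our defined TreeInterface");
    (void)tree;  // traversal elided in this sketch
}

// Sample types (hypothetical) to exercise the trait.
struct GoodTree {
    using Node = int;
    Node addNode() { return 0; }
    Node getParent(Node n) const { return n; }
};
struct NoNodeTypedef {
    int addNode() { return 0; }
};
struct WrongReturnType {
    using Node = int;
    void addNode() {}
    Node getParent(Node n) const { return n; }
};
```

With this form, a non-conforming type produces the single message from the static_assert in depthFirstSearch, and the trait can also be used on its own, for example to choose between overloads.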
Anyway, that's how I'd approach the problem. Notice that my entire post was motivated by the desire to get rid of those inefficient and confusing virtuals — someone else might latch onto a different aspect of the problem and give a totally different answer.