How to derive from a base tree class - c++

I have a tree class that includes data members and member functions that operate on the data of children:
class Tree {
// variables, member functions here
Tree *parent;
std::vector<Tree*> children;
public:
Tree(Tree* parent, int par0 /*, other constructor parameters*/) {
//....
this->parent = parent;
for (int n = 0; n < par0; n++)
children.push_back(new Tree(this, /*other arguments*/));
//...
}
void method() {
for (auto node: children)
node->method();
if (children.size() == 0) {
// Code for leaf nodes
} else {
// Code for internal nodes
}
}
};
The Tree constructor creates the tree structure, allocating and initializing every node according to the arguments.
I would like to add new data and functions to the class, resulting in the new class ExtTree, which has access to all data and functions of Tree, and shares as much code with it as possible. However, in ExtTree both the parents and the children should be (ExtTree*) rather than (Tree*). How can I reorganize this code so that ExtTree would just add its own new data, and fall back on Tree for the old methods?
A related question was asked 4 years ago here, but I could not work out the solution based on the answers in that; in particular, how ExtTree would call the base constructor, or how it can access Tree::method().

You will get the best, most type-safe result by converting the whole thing to a template.
It is possible to come up with a non-template based solution, but I don't think it's going to be optimal. I am suggesting a template-based solution but, if for whatever reason, a template is not going to cut it, you might get acceptable results with the following approach:
A) Define a virtual method: ExtTree *get_extree(). Tree::get_extree() either returns nullptr or throws an exception (up to you, whichever works best for your application). ExtTree::get_extree() returns this.
B) Define another virtual method called create_node(). Tree::create_node() executes new Tree( /* forwarded parameters */), and ExtTree::create_node() executes new ExtTree( /* forwarded parameters */).
C) Replace all of your existing new Tree calls with a call to create_node(), instead.
D) Any other common code lying around that needs to work with both Trees and ExtTrees will use get_extree() to figure out what it's working with.
This will, more or less, get you where you want to go. Conceptually, A) is little different from just using dynamic_cast, and there's nothing wrong with using dynamic_cast in lieu of get_extree(). But sometimes you come across someone with an allergy to dynamic_cast, and this would be one way to avoid it.
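Steps A) to C) could be sketched roughly as follows. This is only an illustration under some assumptions: the build() member, its fanout/depth parameters, count() and the extra member are hypothetical and not part of the question's code. Children are created in build() rather than in the constructor, because a virtual call made from inside a constructor does not yet dispatch to the derived class, so create_node() would always produce plain Tree nodes there.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class ExtTree;

class Tree {
public:
    explicit Tree(Tree* parent = nullptr) : parent_(parent) {}
    virtual ~Tree() = default;

    // Step A: a plain Tree reports that it is not an ExtTree.
    virtual ExtTree* get_extree() { return nullptr; }

    // Hypothetical helper: builds the subtree after construction, so that
    // the virtual create_node() dispatches to the most derived class.
    void build(int fanout, int depth) {
        if (depth == 0) return;
        for (int n = 0; n < fanout; ++n) {
            children_.push_back(create_node(this));  // Step C
            children_.back()->build(fanout, depth - 1);
        }
    }

    // Shared code: counts this node plus all of its descendants.
    std::size_t count() const {
        std::size_t total = 1;
        for (const auto& child : children_) total += child->count();
        return total;
    }

    const std::vector<std::unique_ptr<Tree>>& children() const { return children_; }

protected:
    // Step B: the factory that derived classes override.
    virtual std::unique_ptr<Tree> create_node(Tree* parent) {
        return std::make_unique<Tree>(parent);
    }

private:
    Tree* parent_;
    std::vector<std::unique_ptr<Tree>> children_;
};

class ExtTree : public Tree {
public:
    explicit ExtTree(Tree* parent = nullptr) : Tree(parent) {}
    ExtTree* get_extree() override { return this; }
    int extra = 0;  // hypothetical new data member

protected:
    std::unique_ptr<Tree> create_node(Tree* parent) override {
        return std::make_unique<ExtTree>(parent);
    }
};
```

With this arrangement, calling build() on an ExtTree root yields a tree whose nodes are all ExtTree objects, while count() and any other code in Tree stays shared. The pointers are still typed Tree*, so ExtTree-specific code has to go through get_extree() (step D).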
I don't think there's really a perfect, 100% clean, non-template based solution here. It might be possible to get a 100% type-safe solution here by creating a much larger pile of virtual methods, and implementing pretty much everything as a virtual method.
But, I think it's going to be a lot of work. This is really what templates are for. Use them.

Related

Is there a way to overload classes in a way similar to function overloading?

We can overload functions by giving them a different number of parameters. For example, functions someFunc() and someFunc(int i) can do completely different things.
Is it possible to achieve the same effect on classes? For example, having one class name but creating one class if a function is not called and a different class if that function is called. For example, if I have a dataStorage class, I want the internal implementation to be a list if only add is called, but want it to be a heap if both add and pop are called.
I am trying to implement this in C++, but I am curious if this is even possible. Examples in other languages would also help. Thanks!
The type of an object must be completely known at the point of definition. The type cannot depend on what is done with the object later.
For the dataStorage example, you could define dataStorage as an abstract class. For example:
struct dataStorage {
virtual ~dataStorage() = default;
virtual void add(dataType data) = 0;
// And anything else necessarily common to all implementations.
};
There could be a "default" implementation that uses a list.
struct dataList : public dataStorage {
void add(dataType data) override;
// And whatever else is needed.
};
There could be another implementation that uses a heap.
struct dataHeap : public dataStorage {
void add(dataType data) override;
void pop(); // Maybe return `dataType`, if desired
// And whatever else is needed.
};
Functions that need only to add data would work on references to dataStorage. Functions that need to pop data would work on references to dataHeap. When you define an object, you would choose dataList if the compiler allows it, dataHeap otherwise. (The compiler would not allow passing a dataList object to a function that requires a dataHeap&.) This is similar to what you asked for, except it does require manual intervention. On the bright side, you can use the compiler to tell you which decision to make.
A downside of this approach is that changes can get messy. There is additional maintenance and runtime overhead compared to simply always using a heap (one class, no inheritance). You should do some performance measurements to ensure that the cost is worth it. Sometimes simplicity is the best design, even if it is not optimal in all cases.
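To make the compile-time check concrete, here is a minimal sketch of the three classes with bodies filled in. The choice of int for dataType, the std::list and std::priority_queue members, and the helper functions fill() and takeLargest() are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <queue>
#include <type_traits>

using dataType = int;  // assumption: the answer leaves dataType unspecified

struct dataStorage {
    virtual ~dataStorage() = default;
    virtual void add(dataType data) = 0;
};

struct dataList : public dataStorage {
    void add(dataType data) override { items.push_back(data); }
    std::list<dataType> items;
};

struct dataHeap : public dataStorage {
    void add(dataType data) override { heap.push(data); }
    // Removes and returns the largest stored element.
    dataType pop() {
        dataType top = heap.top();
        heap.pop();
        return top;
    }
    std::priority_queue<dataType> heap;
};

// Needs only add(), so it accepts any dataStorage.
void fill(dataStorage& storage) {
    storage.add(3);
    storage.add(1);
    storage.add(2);
}

// Needs pop(), so it insists on a dataHeap.
dataType takeLargest(dataHeap& heap) { return heap.pop(); }

// The compiler enforces the distinction described above.
static_assert(std::is_convertible<dataHeap&, dataStorage&>::value,
              "a dataHeap can be used wherever a dataStorage& is expected");
static_assert(!std::is_convertible<dataList&, dataHeap&>::value,
              "a dataList cannot be passed where a dataHeap& is required");
```

Writing takeLargest(someList) with a dataList argument fails to compile, which is the signal to switch that object's type to dataHeap.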

Is dynamic_casting through inheritance hierarchy bad practice?

I have got the following data structure:
class A; class B; class C; // forward declarations
class Element {
public:
std::string getType();
std::string getId();
virtual std::vector<Element*> getChildren();
};
class A : public Element {
public:
void addA(const A *a);
void addB(const B *b);
void addC(const C *c);
std::vector<Element*> getChildren();
};
class B : public Element {
public:
void addB(const B *b);
void addC(const C *c);
std::vector<Element*> getChildren();
};
class C : public Element {
int someActualValue;
};
/* The classes also have some kind of container to store the pointers and
* child elements. But let's keep the code short. */
The data structure is used to produce a directed acyclic graph. The C class acts as a "leaf" containing actual data for algebra-tasks. A and B hold other information, like names, types, rules, my favourite color and the weather forecast.
I want to program a feature, where a window pops up and you can navigate through an already existing structure. On the way i want to show the path the user took with some pretty flow chart, which is clickable to go back in the hierarchy. Based on the currently visited Graph-Node (which could be either A, B or C) some information has to be computed and displayed.
I thought i could just make a std::vector of type Element* and use the last item as the active element i work with. I thought that was a pretty nice approach, as it makes use of the inheritance that is already there and keeps the code i need quite small.
But i have a lot of situations like these:
Element* currentElement;
void addToCurrentElement(const C *c) {
if(A *a = dynamic_cast<A*>(currentElement)) {
//doSomething, if not, check if currentElement is actually a B
}
}
Or even worse:
vector<C*> filterForCs(A* parent) {
vector<Element*> eleVec = parent->getChildren();
vector<C*> retVec;
for(Element* e : eleVec) {
if (e->getType() == "class C") {
C *c = dynamic_cast<C*>(e);
retVec.push_back(c);
}
}
return retVec;
}
It definitely is object oriented. It definitely does use inheritance. But it feels like i just threw all the comfort OOP gives me overboard and decided to use raw pointers and bitshifts again. Googling the subject, i found a lot of people saying casting up/down is bad design or bad practice. I totally believe that this is true, but I want to know why exactly. I cannot change most of the code as it is part of a bigger project, but i want to know how to counter something like this situation when i design a program in the future.
My Questions:
Why is casting up/down considered bad design, besides the fact that it looks horrible?
Is a dynamic_cast slow?
Are there any rules of thumb how i can avoid a design like the one i explained above?
There are a lot of questions on dynamic_cast here on SO. I read only a few and also don't use that method often in my own code, so my answer reflects my opinion on this subject rather than my experience. Watch out.
(1.) Why is casting up/down considered bad design, besides the fact that it looks horrible?
(3.) Are there any rules of thumb how i can avoid a design like the one i explained above?
When reading the Stroustrup C++ FAQ, imo there is one central message: don't trust people who say never to use a certain tool. Rather, use the right tool for the task at hand.
Sometimes, however, two different tools can have a very similar purpose, and so it is here. You can basically recode any functionality that uses dynamic_cast through virtual functions.
So when is dynamic_cast the right tool? (see also What is the proper use case for dynamic_cast?)
One possible situation is when you have a base class which you can't extend, but nevertheless need to write overloaded-like code. With dynamic casting you can do that non-invasively.
Another one is where you want to keep an interface, i.e. a pure virtual base class, and don't want to implement the corresponding virtual function in any derived class.
Often, however, you would rather rely on virtual functions, if only for the reduced ugliness. Further, it's safer: a dynamic_cast can fail (a pointer cast yields a null pointer, a reference cast throws std::bad_cast), and an unchecked failure can terminate your program, whereas a virtual function call (usually) won't.
Moreover, when implemented in terms of pure virtual functions, you will not forget to update it in all required places when you add a new derived class. On the other hand, a dynamic_cast can easily be forgotten in the code.
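As a small illustration of how a dynamic_cast can fail, the sketch below shows both failure modes. The stripped-down Element, A and C types and the two helper functions are stand-ins invented for this example, not the question's real classes.

```cpp
#include <cassert>
#include <typeinfo>

// Minimal polymorphic stand-ins for the question's hierarchy.
struct Element { virtual ~Element() = default; };
struct A : Element {};
struct C : Element {};

// A failed pointer cast yields a null pointer, which must be checked.
bool pointerCastFails(Element* e) {
    return dynamic_cast<A*>(e) == nullptr;
}

// A failed reference cast throws std::bad_cast instead.
bool referenceCastThrows(Element& e) {
    try {
        (void)dynamic_cast<A&>(e);
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

Dereferencing the unchecked result of a failed pointer cast, or letting std::bad_cast escape uncaught, is what ends up terminating the program.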
Virtual function version of your example
Here is the example again:
Element* currentElement;
void addToCurrentElement(const C *c) {
if(A *a = dynamic_cast<A*>(currentElement)) {
//doSomething, if not, check if currentElement is actually a B
}
}
To rewrite it, add (possibly pure) virtual functions add(A const*), add(B const*) and add(C const*) to your base class, which you override in the derived classes.
struct A : public Element
{
void add(A const* a) override { /* do something for A */ }
void add(B const* b) override { /* do something for B */ }
void add(C const* c) override { /* do something for C */ }
};
//same for B, C, ...
and then call it in your function or possibly write a more concise function template
template<typename T>
void addToCurrentElement(T const* t)
{
currentElement->add(t);
}
I'd say this is the standard approach. As mentioned, the drawback could be that for pure virtual functions you require N*N overloads where maybe N might be enough (say, if only A::add requires a special treatment).
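For reference, a self-contained sketch of the virtual-function version might look like the following. Several details are assumptions made for illustration: the add() overloads return bool to report whether the child was accepted, the base class supplies rejecting defaults instead of pure virtual functions (so B only needs two overrides), and addToElement() takes the current element as a parameter rather than using a global.

```cpp
#include <cassert>
#include <vector>

struct A;
struct B;
struct C;

struct Element {
    virtual ~Element() = default;
    // Defaults: this element type does not accept such a child.
    virtual bool add(A const*) { return false; }
    virtual bool add(B const*) { return false; }
    virtual bool add(C const*) { return false; }
};

struct C : Element {
    int someActualValue = 0;
};

struct B : Element {
    using Element::add;  // keep the inherited overloads visible
    bool add(B const* b) override { bs.push_back(b); return true; }
    bool add(C const* c) override { cs.push_back(c); return true; }
    std::vector<B const*> bs;
    std::vector<C const*> cs;
};

struct A : Element {
    bool add(A const* a) override { as.push_back(a); return true; }
    bool add(B const* b) override { bs.push_back(b); return true; }
    bool add(C const* c) override { cs.push_back(c); return true; }
    std::vector<A const*> as;
    std::vector<B const*> bs;
    std::vector<C const*> cs;
};

// Overload resolution picks add(A/B/C const*) from the static type of t;
// the virtual call then picks the behaviour from the dynamic type of current.
template <typename T>
bool addToElement(Element& current, T const* t) {
    return current.add(t);
}
```

No cast appears anywhere: the static type of the argument selects the overload, and the dynamic type of the current element selects the implementation.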
Other alternatives might use RTTI, the CRTP pattern, type erasure, and possibly more.
(2.) Is a dynamic_cast slow?
When considering what the majority of answers throughout the net state, then yes, a dynamic cast seems to be slow, see here for example.
Yet, I don't have practical experience to support or disconfirm this statement.

Deriving from a base class whose instances reside in a fixed format (database, MMF)...how to be safe?

(Note: I'm looking for really any suggestions on the right search terms to read up on this category of issue. "Object-relational-mapping" occurred to me as a place where I could find some good prior art...but I haven't seen anything quite fitting this scenario just yet.)
I have a very generic class Node, which for the moment you can think of as being a bit like an element in a DOM tree. This is not precisely what's going on--they're graph database objects in a memory mapped file. But the analogy is fairly close for all practical purposes, so I'll stick to DOM terms for simplicity.
The "tag" embedded in the node implies a certain set of operations you should (ideally) be able to do with it. Right now I'm using derived classes to do this. So for instance, if you were trying to represent something like an HTML list:
<ul>
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
</ul>
The underlying tree would be seven nodes:
+--UL // Node #1
+--LI // Node #2
+--String(Coffee) // Node #3 (literal text)
+--LI // Node #4
+--String(Tea) // Node #5 (literal text)
+--LI // Node #6
+--String(Milk) // Node #7 (literal text)
Since getString() is already a primitive method on Nodes themselves, I'd probably only make class UnorderedListNode : public Node, class ListItemNode : public Node.
Continuing this hypothetical, let's imagine I wanted to help the programmer use less general functions when they know more about the Node "type"/tag they have in their hands. Perhaps I want to assist them with structural idioms on the tree, like adding a string item to an unordered list, or extracting things as a string. (This is just an analogy so don't take the routines too seriously.)
class UnorderedListNode : public Node {
private:
// Any data members someone put here would be a mistake!
public:
static boost::optional<UnorderedListNode&> maybeCastFromNode(Node& node) {
if (node.tagString() == "ul") {
return reinterpret_cast<UnorderedListNode&>(node);
}
return boost::none;
}
// a const helper method
vector<string> getListAsStrings() const {
vector<string> result;
for (Node const* childNode : children()) {
result.push_back(childNode->children()[0]->getText());
}
return result;
}
// helper method requiring mutable object
void addStringToList(std::string listItemString) {
unique_ptr<Node> liNode (new Node (Tag ("LI")));
unique_ptr<Node> textNode (new Node (listItemString));
liNode->addChild(std::move(textNode));
addChild(std::move(liNode));
}
};
Adding data members to these new derived classes is a bad idea. The only way to really persist any information is to use the foundational routines of Node (for instance, the addChild call above, or getText) to interact with the tree. Thus the real inheritance model--to the extent one exists--is outside of the C++ type system. What makes a <UL> node "maybeCast" into an UnorderedListNode has nothing to do with vtables/etc.
C++ inheritance looks right sometimes, but feels wrong usually. I feel like instead of inheritance I should have classes that exist independently of Node, and just collaborate with it somehow as "accessor helpers"...but I don't have a good grasp of what that would be like.
I am not sure I have understood completely what you intend to do but here are some suggestions you might find useful.
You are definitely on the right track with inheritance. All the UL nodes, LI nodes, ... etc. are Node-s. Perfect "is_a" relationship, you should derive these classes from the Node class.
let's imagine I wanted to help the programmer use less general functions when they know more about the Node "type"/tag they have in their hands
...and this is what virtual functions are for.
Now for the maybeCastFromNode method. That's downcasting. Why would you do that? Maybe for deserializing? If yes, then I'd recommend using dynamic_cast<UnorderedListNode *> . Although most likely you won't need RTTI at all if the inheritance tree and the virtual methods are well-designed.
C++ inheritance looks right sometimes, but feels wrong usually.
This might not always be C++'s fault :-)
"C++ inheritance looks right sometimes, but feels wrong usually."
Indeed, and this statement is worrisome:
What makes a <UL> node "maybeCast" into an UnorderedListNode has nothing to do with vtables/etc.
As is this code:
static boost::optional<UnorderedListNode&> maybeCastFromNode(Node& node) {
if (node.tagString() == "ul") {
return reinterpret_cast<UnorderedListNode&>(node);
}
return boost::none;
}
(1) type-punning
If the Node& being passed in was allocated through a mechanism that did not legally and properly construct an UnorderedListNode on the inheritance path, this is what is called type punning. It's almost always a bad idea. Even if the memory layout on most compilers appears to work when there are no virtual functions and derived classes add no data members, they are free to break it in almost all circumstances.
(2) strict-alias
Next there is the compiler's assumption that pointers to objects of fundamentally different types do not "alias" each other. This is the strict aliasing requirement. Although it can be disabled via non-standard extensions, that should only be applied in legacy situations...it hinders optimization.
It's not completely clear--from an academic standpoint--whether these two hindrances have workarounds permitted by the spec under special circumstances. Here's a question which investigates that, and is still an open discussion at time of writing:
Make interchangeable class types via pointer casting only, without having to allocate any new objects?
But to quote @MatthieuM.: "The closer you get to the edges of the specifications, the more likely you are to hit a compiler bug. So, as engineer, I advise to be pragmatic and avoid playing mind games with your compiler; whether you are right or wrong is irrelevant: when you get a crash in production code, you lose, not the compiler writers."
This is probably more the right track:
I feel like instead of inheritance I should have classes that exist independently of Node, and just collaborate with it somehow as "accessor helpers"...but I don't have a good grasp of what that would be like.
Using Design Pattern terms, this matches something like a Proxy. You would have a lightweight object that stores the pointer and is then passed around by value. In practice, it can be tricky to handle issues like what to do about the const-ness of the incoming pointers being wrapped!
Here's a sample of how it might be done relatively simply for this case. First, a definition for the Accessor base class:
template<class AccessorType> class Wrapper;
class Accessor {
private:
mutable Node * nodePtrDoNotUseDirectly;
template<class AccessorType> friend class Wrapper;
void setNodePtr(Node * newNodePtr) {
nodePtrDoNotUseDirectly = newNodePtr;
}
void setNodePtr(Node const * newNodePtr) const {
nodePtrDoNotUseDirectly = const_cast<Node *>(newNodePtr);
}
Node & getNode() { return *nodePtrDoNotUseDirectly; }
Node const & getNode() const { return *nodePtrDoNotUseDirectly; }
protected:
Accessor() {}
public:
// These functions should match Node's public interface
// Library maintainer must maintain these, but oh well
inline void addChild(unique_ptr<Node>&& child) {
getNode().addChild(std::move(child));
}
inline string getText() const { return getNode().getText(); }
// ...
};
Then, a partial template specialization for handling the case of wrapping a "const Accessor", which is how to signal that it will be receiving a const Node &:
template<class AccessorType>
class Wrapper<AccessorType const> {
protected:
AccessorType accessorDoNotUseDirectly;
private:
inline AccessorType const & getAccessor() const {
return accessorDoNotUseDirectly;
}
public:
Wrapper () = delete;
Wrapper (Node const & node) { getAccessor().setNodePtr(&node); }
AccessorType const * operator-> () const { return &getAccessor(); }
virtual ~Wrapper () { }
};
The Wrapper for the "mutable Accessor" case inherits from its own partial template specialization. This way the inheritance provides for the proper coercions and assignments...prohibiting the assignment of a const to a non-const, but working the other way around:
template<class AccessorType>
class Wrapper : public Wrapper<AccessorType const> {
private:
inline AccessorType & getAccessor() {
return Wrapper<AccessorType const>::accessorDoNotUseDirectly;
}
public:
Wrapper () = delete;
Wrapper (Node & node) : Wrapper<AccessorType const> (node) { }
AccessorType * operator-> () { return &Wrapper::getAccessor(); }
virtual ~Wrapper() { }
};
A compiling implementation with test code and with comments documenting the weird parts is in a Gist here.
Sources: @MatthieuM., @PaulGroke
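For comparison, a much simpler, non-template take on the same "accessor helper" idea is sketched below. It is only a rough illustration and assumes C++17 for std::optional; the Node class here is a hypothetical in-memory stand-in for the real memory-mapped one, and UnorderedListView and maybeFrom are names invented for this example. It also sidesteps the const-correctness problem that the Wrapper templates above address.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Node class.
class Node {
public:
    explicit Node(std::string tag, std::string text = "")
        : tag_(std::move(tag)), text_(std::move(text)) {}
    const std::string& tagString() const { return tag_; }
    const std::string& getText() const { return text_; }
    void addChild(std::unique_ptr<Node> child) {
        children_.push_back(std::move(child));
    }
    const std::vector<std::unique_ptr<Node>>& children() const {
        return children_;
    }
private:
    std::string tag_;
    std::string text_;
    std::vector<std::unique_ptr<Node>> children_;
};

// A by-value view that only stores a pointer: no inheritance from Node,
// no reinterpret_cast, and no data members that would need persisting.
class UnorderedListView {
public:
    static std::optional<UnorderedListView> maybeFrom(Node& node) {
        if (node.tagString() == "ul") return UnorderedListView(node);
        return std::nullopt;
    }
    void addStringToList(const std::string& item) {
        auto li = std::make_unique<Node>("li");
        li->addChild(std::make_unique<Node>("#text", item));
        node_->addChild(std::move(li));
    }
    std::vector<std::string> getListAsStrings() const {
        std::vector<std::string> result;
        for (const auto& li : node_->children())
            result.push_back(li->children().at(0)->getText());
        return result;
    }
private:
    explicit UnorderedListView(Node& node) : node_(&node) {}
    Node* node_;
};
```

The view is created only when the tag matches, so the tag-based "type" check stays in one place and everything else goes through Node's public interface.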

Using generic ADTs

I have a design problem. I'm asked to plan a design for a certain problem, where I need a few lists, and also a queue (which I need to create by myself, STL isn't allowed). In order to make the implementation more efficient, I thought about creating a generic list as follows: Create a node which contains a pointer to 'Data', an empty class. Then, any class that I want to make a list or a queue of (is the last sentence grammatically correct?), I'll just make it a subclass of data. That's the only way to make a generic list (I think), as we are not allowed to use void*.
The problem begins when I want to use a certain method of a certain class in a certain list. I can't do that, since 'Data' doesn't know that function. Creating a virtual function in Data is counter-logical and ugly, and we're also not allowed to use any downcasting.
Is there a way to overcome the problem using generic ADTs? Or must I create specific lists?
Thank you very much!
edit: We are also not allowed to use templates.
About the list and the queue, maybe you can adopt the same approach taken by the STL: just create the list, and then stack, as an adaptor of the list in which you only push and pop from the end.
About those constraints, which seem draconian: don't you suppose that the objective is for you to use templates?
Instead of creating an empty class, which does not serve you at all if it contains no methods, use a template as in the following example:
template<typename T>
class List {
class Node {
public:
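Node(T* d) : data(d), sig(nullptr) // takes ownership of d
{ }
T * getData()
{ return data.get(); }
Node * getSig()
{ return sig; }
private:
std::unique_ptr<T> data; // needs <memory>; std::auto_ptr was removed in C++17
Node * sig; // pointer to the next node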
};
List()...
// Lots of more things...
};
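Since the outline above leaves most members out, here is one possible way to complete the idea as a small singly linked queue that uses neither STL containers nor smart pointers. The class name Queue and all member names are choices made for this sketch, and it assumes templates are actually permitted.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class Queue {
    struct Node {
        T value;
        Node* next;
    };
    Node* head_ = nullptr;  // front of the queue
    Node* tail_ = nullptr;  // back of the queue
    std::size_t size_ = 0;

public:
    Queue() = default;
    Queue(const Queue&) = delete;             // copying is left out for brevity
    Queue& operator=(const Queue&) = delete;
    ~Queue() {
        while (!empty()) pop();
    }

    bool empty() const { return head_ == nullptr; }
    std::size_t size() const { return size_; }

    // Appends a copy of value at the back.
    void push(const T& value) {
        Node* node = new Node{value, nullptr};
        if (tail_) tail_->next = node;
        else head_ = node;
        tail_ = node;
        ++size_;
    }

    // Precondition for front() and pop(): the queue is not empty.
    T& front() { return head_->value; }

    void pop() {
        Node* old = head_;
        head_ = head_->next;
        if (!head_) tail_ = nullptr;
        delete old;
        --size_;
    }
};
```

Because the element type is a template parameter, front() returns a T& and member functions of T can be called directly, with no common Data base class and no downcast.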
You can find more info here:
http://www.cplusplus.com/doc/tutorial/templates/
Hope this helps.

C++: Design and cost for heavy multiple inheritance hierarchies

I have a class hierarchy with the following three classes:
template<int pdim >
class Function
{
virtual double operator()( const Point<pdim>& x) const = 0;
};
Which is a function in pdim-dimensional space, returning doubles.
template<int pdim, int ldim >
class NodeFunction
{
virtual double operator()( const Node<pdim,ldim>& pnode, const Point<ldim>& xLoc) const = 0;
};
Which is a function from the ldim-dimensional local space of a node in pdim-dimensional space.
template<int pdim, int ldim, int meshdim >
class PNodeFunction
{
virtual double operator()( const PNode<pdim,ldim,meshdim>& pnode, const Point<ldim>& xLoc) const = 0;
};
Reason 1 for this design: a NodeFunction is more general than a Function. It can always map the local ldim-dimensional point to a pdim-dimensional point. E.g. an edge (Node with ldim=1) maps the interval [0,1] into pdim-dimensional physical space. That is why every Function is a NodeFunction. The NodeFunction is more general because it is allowed to query the Node for attributes.
Reason 2 for this design: a PNodeFunction is more general than a NodeFunction. Exactly one Node is associated with every PNode (not vice versa). That is why every NodeFunction is a PNodeFunction. The PNodeFunction is more general as it also has all the context of the PNode which is part of a Mesh (thus it knows all its parents, neighbours, ...).
Summary: Every Function<pdim> is a NodeFunction<pdim, ldim> for any parameter of ldim. Every NodeFunction<pdim, ldim> is a PNodeFunction<pdim, ldim, meshdim> for any parameter of meshdim.
Question: What is the best way to express this in C++, such that I can use Function in place of NodeFunction / PNodeFunction, such that the code is fast (it is a high performance computing code), and such that the code works for the parameter ranges below?
The template parameters are not completely independent but rather depend on each other:
- pdim=1,2,3 (main interest) but it is nice if it works also for values of pdim up to 7.
- 'ldim=0,1,...,pdim'
- 'meshdim=ldim,ldim+1,...,pdim'
To consider the performance, note that only a few functions are created in the program, but their operator() is called many times.
Variants
I thought about a few ways to implement this (I currently implemented Variant 1). I wrote them down here so that you can tell me about the advantages and disadvantages of these approaches.
Variant 1
Implement the inheritance described above, where A<dim> inherits from B<dim,dim2>, via a helper template Arec<dim,dim2>. In pseudocode this is
class A<dim> : public Arec<dim,dim>;
class Arec<dim,dim2> : public Arec<dim,dim2-1>, public B<dim,dim2>;
class Arec<dim,0> : public B<dim,0>;
This is applied both to inherit Function from NodeFunction and NodeFunction from PNodeFunction. As NodeFunction inherits roughly O(pdim^2) times from PNodeFunction how does this scale? Is this huge virtual table bad?
Note: In fact every Function should also inherit from VerboseObject, which allows me to print debugging information about the function to e.g. std::cout. I do this by virtually inheriting PNodeFunction from VerboseObject. How will this impact the performance? This should increase the time to construct a Function and to print the debug information, but not the time for operator(), right?
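Written out as real C++, the recursive inheritance of Variant 1 could look roughly like this. The names A, B and Arec follow the pseudocode; the secondDim() member is a hypothetical placeholder for the actual operator() interface.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for NodeFunction / PNodeFunction: one distinct base per dim2.
template <int dim, int dim2>
struct B {
    virtual ~B() = default;
    virtual int secondDim() const { return dim2; }
};

// Arec<dim, dim2> inherits from B<dim, k> for every k in 0..dim2.
template <int dim, int dim2>
struct Arec : Arec<dim, dim2 - 1>, B<dim, dim2> {};

// The recursion ends at dim2 == 0.
template <int dim>
struct Arec<dim, 0> : B<dim, 0> {};

template <int dim>
struct A : Arec<dim, dim> {};

// An A<3> can stand in for B<3,0> through B<3,3>, but not for B<3,4>.
static_assert(std::is_convertible<A<3>*, B<3, 0>*>::value, "lowest base");
static_assert(std::is_convertible<A<3>*, B<3, 3>*>::value, "highest base");
static_assert(!std::is_convertible<A<3>*, B<3, 4>*>::value, "out of range");
```

Each polymorphic base subobject typically carries its own vtable pointer, so the object grows with the number of bases, while a call through any single B<dim,dim2> pointer remains one ordinary virtual call.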
Variant 2
Don't express the inheritance in C++, i.e. A<dim> doesn't inherit from B<dim,dim2>, but rather there is a function to convert between the two
class AHolder<dim,dim2> : public B<dim, dim> {
}
std::shared_ptr< AHolder<dim,dim2> > interpretAasB( std::shared_ptr< AHolder<dim> >)
[...]
This has the disadvantage that I can no longer use Function<dim> in place of NodeFunction<dim> or PNodeFunction<dim>.
Variant 3
What is your preferred way to implement this?
I don't comprehend your problem very well; that might be because I lack specific knowledge of the problem domain.
Anyway it seems like you want to generate a hierarchy of classes, with Function (most derived class) at the bottom, and PNodeFunction at the top (least derived class).
For that I can only recommend Alexandrescu's Modern C++ design book, especially the chapter on hierarchy generators.
There is an open source library stemming from the book called Loki.
Here's the part that might interest you.
Going the generic meta-programming way might be the hardest, but I think it will result in ease of use once set up, and possibly increased performance (that is always to be verified by the profiler) compared to virtual inheritance.
In any case I strongly recommend not inheriting from the Verbose object for logging, but rather having a separate singleton logging class.
That way you don't need the extra space in the class hierarchy to store a logging object.
You could have only the least derived class inherit from the Verbose object but your function classes are not logging objects; they use a logging object (I may be a bit pedantic here). The other problem is if you inherit multiple times from that base class, you'll end up with multiple copies of the logging object and have to use virtual inheritance to solve it.
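The "uses a logging object" arrangement could be sketched like this. The Logger class, its in-memory buffer, the Point stand-in and the FirstCoordinate example function are all hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// A single logging object, kept outside the Function hierarchy.
class Logger {
public:
    static Logger& instance() {
        static Logger logger;  // created on first use
        return logger;
    }
    void log(const std::string& message) { buffer_ << message << '\n'; }
    std::string contents() const { return buffer_.str(); }

private:
    Logger() = default;
    std::ostringstream buffer_;  // stands in for std::cout or a file
};

// Hypothetical stand-in for the question's Point type.
template <int pdim>
struct Point {
    double x[pdim];
};

template <int pdim>
class Function {
public:
    virtual ~Function() = default;
    virtual double operator()(const Point<pdim>& x) const = 0;
};

// The function class uses the logger; it does not inherit from it.
template <int pdim>
class FirstCoordinate : public Function<pdim> {
public:
    double operator()(const Point<pdim>& p) const override { return p.x[0]; }
    void describe() const {
        Logger::instance().log("FirstCoordinate<" + std::to_string(pdim) + ">");
    }
};
```

The function objects then carry no logging state of their own, and the hot operator() path does not touch the logger at all.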