Is "node" an ADT? If so, what is its interface? - c++

Nodes are useful for implementing ADTs, but is "node" itself an ADT? How does one implement "node"? Wikipedia uses a plain old struct with no methods in its (brief) article on nodes. I googled node to try and find an exhaustive article on them, but mostly I found articles discussing more complex data types implemented with nodes.
Just what is a node? Should a node have methods for linking to other nodes, or should that be left to whatever owns the nodes? Should a node even be its own standalone class? Or is it enough to include it as an inner struct or inner class? Are they too general to even have this discussion?

A node is an incredibly generic term. Essentially, a node is a vertex in a graph - or a point in a network.
In relation to data structures, a node usually means a single basic unit of data which is (usually) connected to other units, forming a larger data structure. A simple data structure which demonstrates this is a linked list. A linked list is merely a chain of nodes, where each node is linked (via a pointer) to the following node. The end node has a null pointer.
Nodes can form more complex structures, such as a graph, where any single node may be connected to any number of other nodes, or a tree, where each node has zero or more child nodes. Note that any data structure consisting of one or more connected nodes is a graph. (A linked list and a tree are both also graphs.)
In terms of mapping the concept of a "node" to Object Oriented concepts like classes, in C++ it is usually customary to have a Data Structure class (sometimes known as a Container), which will internally do all the work on individual nodes. For example, you might have a class called LinkedList. The LinkedList class then would have an internally defined (nested) class representing an individual Node, such as LinkedList::Node.
In some cruder implementations you may also see a Node itself as the only way to access the data structure. You then have a set of functions which operate on nodes. However, this is more commonly seen in C programs. For example, you might have a struct LinkedListNode, which is then passed to functions like void LinkedListInsert(struct LinkedListNode* n, Object somethingToInsert);
In my opinion, the Object Oriented approach is superior, because it better hides details of implementation from the user.
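As a rough illustration of the container-owns-its-nodes approach described above, here is a minimal sketch. The names LinkedList, pushFront, front, empty and size are made up for this example and are not taken from any library; copying is disabled only to keep the sketch short.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical container: all names here are illustrative assumptions.
template <class T>
class LinkedList {
public:
    LinkedList() = default;
    LinkedList(const LinkedList&) = delete;             // copying omitted for brevity
    LinkedList& operator=(const LinkedList&) = delete;

    ~LinkedList() {
        // The container, not the node, is responsible for cleanup.
        while (head_ != nullptr) {
            Node* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    // Insert a value at the front; the caller never sees a Node.
    void pushFront(T value) {
        head_ = new Node{std::move(value), head_};
        ++size_;
    }

    const T& front() const { return head_->value; }     // precondition: !empty()
    bool empty() const { return head_ == nullptr; }
    std::size_t size() const { return size_; }

private:
    // The node is an implementation detail: plain data plus a link.
    struct Node {
        T value;
        Node* next;
    };

    Node* head_ = nullptr;
    std::size_t size_ = 0;
};
```

Because Node is private, user code can only go through the public interface, which is the kind of hiding the answer argues for.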

Generally you want to leave node operations to whatever ADT owns them. For example, a list should have the ability to traverse its own nodes. It doesn't need the node to have that ability.
Think of the node as a simple bit of data that the ADT holds.

In the strictest terms, any assemblage of one or more primitive types into some kind of bundle, usually with member functions to operate on the data, is an Abstract Data Type.
The grey area largely comes from which language you operate under. For example, in Python, some coders consider the list to be a primitive type, and thus not an ADT. But in C++, the STL List is definitely an ADT. Many would consider the STL string to be an ADT, but in C# it's definitely a primitive.
To answer your question more directly: Any time you are defining a data structure, be it struct or class, with or without methods, it is necessarily an ADT because you are abstracting primitive data types into some kind of construct for which you have another purpose.

An ADT isn't a real type. That's why it's called an ADT. Is 'node' an ADT? Not really, IMO. It can be a part of one, such as a linked list ADT. Is 'this node I just created to contain thingys' an ADT? Absolutely not! It's, at best, an example of an implementation of an ADT.
There's really only one case in which ADT's can be shown expressed as code, and that's as templated classes. For example, std::list from the C++ STL is an actual ADT and not just an example of an instance of one. On the other hand, std::list<thingy> is an example of an instance of an ADT.
Some might say that a list that can contain anything that obeys some interface is also an ADT. I would mildly disagree with them. It's an example of an implementation of an ADT which can contain a wide variety of objects that all have to obey a specific interface.
A similar argument could be made about the requirements of the std::list's "Concepts". For instance that type T must be copyable. I would counter that by saying that these are simply requirements of the ADT itself while the previous version actually requires a specific identity. Concepts are higher level than interfaces.
Really, an ADT is quite similar to a "pattern", except that with ADTs we're talking about algorithms, big O, etc., whereas with patterns we're talking about abstraction, reuse, etc. In other words, patterns are a way to build something whose implementations solve a particular type of problem and can be extended/reused. An ADT is a way to build an object that can be manipulated through algorithms but isn't exactly extensible.

Nodes are a detail of implementing the higher-level class. Nodes don't exist or operate on their own; they only exist because they need lifetimes and memory management separate from those of the owning class (say, a linked list). As such, they don't really define themselves as their own type, but happily exist with no encapsulation from the owning class, as long as their existence is effectively encapsulated from the user. Nodes typically also don't display polymorphism or other OO behaviours.
Generally speaking, if the node doesn't feature in the public or protected interface of the class, then don't bother, just make them structs.

In the context of ADT a node is the data you wish to store in the data structure, plus some plumbing metadata necessary for the data structure to maintain its integrity. No, a node is not an ADT. A good design of an ADT library will avoid inheritance here because there is really no need for it.
I suggest you read the code of std::map in your compiler's standard C++ library to see how it's done properly. Granted, you will probably not see an ADT tree but a Red-Black tree, but the node struct should be the same. In particular, you will likely see a lightweight struct that remains private to the data structure and consists of little other than data.

You're mixing in three mostly orthogonal concepts in your question: C++, nodes, ADTs.
I don't think it's useful to try to sort out what can be said in general about the intersection of those concepts.
However, things can be said about e.g. singly linked list nodes in C++.
#include <iostream>
template< class Payload >
struct Node
{
    Node* next;
    Payload value;
    Node(): next( 0 ) {}
    Node( Payload const& v ): next( 0 ), value( v ) {}
    void linkInFrom( Node*& aNextPointer )
    {
        next = aNextPointer;
        aNextPointer = this;
    }
    static Node* unlinked( Node*& aNextPointer)
    {
        Node* const result = aNextPointer;
        aNextPointer = result->next;
        return result;
    }
};
int main()
{
    using namespace std;
    typedef Node<int> IntNode;
    IntNode* pFirstNode = 0;
    (new IntNode( 1 ))->linkInFrom( pFirstNode );
    (new IntNode( 2 ))->linkInFrom( pFirstNode );
    (new IntNode( 3 ))->linkInFrom( pFirstNode );
    for( IntNode const* p = pFirstNode; p != 0; p = p->next )
    {
        cout << p->value << endl;
    }
    while( pFirstNode != 0 )
    {
        delete IntNode::unlinked( pFirstNode );
    }
}
I first wrote these operations in Pascal, very early eighties.
It continually surprises me how little known they are. :-)
Cheers & hth.,

Related

Working on Parser with Binary and unary node?

I know this is a bit of a weird title, but I hope you will understand what I am asking about. A few months back I worked on an interpreter program in Python, and that went well, but now I want to implement the same thing in C++. Doing so is causing me great problems because C++ is strictly typed.
Let's start from what I did in my Python program. First I created a Lexer that separates everything into tokens (key-value pairs), and I wrote a Parser that converts an arithmetic grammar into operation nodes: BinaryOpNode, UnaryOpNode, and NumberNode. For example, (-2+7)^3 is converted into an AST whose root is a Binary Node with another Binary Node as its left node, POW (power) as its operator, and a Number Node of 3 as its right node. The left node of the root is a Binary Node whose left node is a Unary Node (MINUS and a Number Node 2), whose operator is PLUS, and whose right node is a Number Node 7.
I did this by identifying expression, term and factor. I have written a Lexer in C++ but am having problems with the Parser. Please help me do the same in C++.
What I have done so far:
I tried something weird that kind of works. I created a class BinaryOpNode with two void* members for the right and left nodes, and an enum member for the operation between the right and left nodes. It also has two boolean members, one per node, which tell me what type the void* left and right nodes are: UnaryOpNode or BinaryOpNode (the default). This helps me cast each node to its respective type.
However, I am not satisfied with my results, as they look poorly optimized, and I also can't keep track of NumberNode this way.
Please Help me. THANKS IN ADVANCE
What you are looking for is polymorphism. That is, code that a programmer writes once and that does different things depending on the types of the things it operates on.
C++ supports a bewildering array of ways to do polymorphism.
The most supported kind is inheritance based virtual polymorphism. In this, you create a base class:
struct INode {
    virtual ~INode() {}
};
and add in common operations to it, making those common operations pure-virtual:
struct INode {
    virtual ~INode() {}
    virtual std::vector<INode*> GetChildren() const = 0;
};
This requires that you work with pointers instead of object instances.
In this system, if you know the type of an object, you can use dynamic_cast<RealType*>(iNodePointer) to get a pointer to the object as an instance of that type. It returns nullptr if the types don't match. This lets you access the methods you have in the descended type that aren't in the base interface.
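To make the inheritance-based option a bit more concrete, here is a minimal sketch for a tiny arithmetic tree. The node types (NumberNode, UnaryMinusNode, AddNode), the evaluate() operation and makeExampleTree() are all assumptions invented for this illustration; std::unique_ptr is used for ownership, which is a choice of this sketch rather than something the answer prescribes.

```cpp
#include <memory>
#include <utility>

// Hypothetical node hierarchy; none of these names come from the posts above.
struct INode {
    virtual ~INode() = default;
    virtual double evaluate() const = 0;   // common operation, pure virtual
};

struct NumberNode : INode {
    double value;
    explicit NumberNode(double v) : value(v) {}
    double evaluate() const override { return value; }
};

struct UnaryMinusNode : INode {
    std::unique_ptr<INode> operand;
    explicit UnaryMinusNode(std::unique_ptr<INode> o) : operand(std::move(o)) {}
    double evaluate() const override { return -operand->evaluate(); }
};

struct AddNode : INode {
    std::unique_ptr<INode> left;
    std::unique_ptr<INode> right;
    AddNode(std::unique_ptr<INode> l, std::unique_ptr<INode> r)
        : left(std::move(l)), right(std::move(r)) {}
    double evaluate() const override { return left->evaluate() + right->evaluate(); }
};

// Builds the tree for (-2 + 7); nodes are handled through base-class pointers.
inline std::unique_ptr<INode> makeExampleTree() {
    std::unique_ptr<INode> minusTwo =
        std::make_unique<UnaryMinusNode>(std::make_unique<NumberNode>(2.0));
    std::unique_ptr<INode> seven = std::make_unique<NumberNode>(7.0);
    return std::make_unique<AddNode>(std::move(minusTwo), std::move(seven));
}
```

Calling makeExampleTree()->evaluate() dispatches through the virtual function, and dynamic_cast<AddNode*>(tree.get()) can be used when code needs the concrete type.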
A second kind of polymorphism is std::variant based. This is a closed set of types, which parsers often have.
using AnyNode = std::variant<Node::BinaryOp, Node::UnaryOp, Node::Number>;
Here you use std::visit to operate on the concrete type instead of dynamic_cast, and your parse tree is value-based instead of pointer-based.
There is some pain when you want a node to have inside itself a vector of AnyNode.
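One possible way around that pain, sketched here under the assumption of C++17, is to let the recursive alternatives be std::unique_ptr to forward-declared structs, so the variant never contains itself by value. The types Number, UnaryMinus and Add, the Evaluator visitor, and the functions evaluate() and makeVariantExample() are hypothetical names for this example only.

```cpp
#include <memory>
#include <utility>
#include <variant>

// Hypothetical node types (requires C++17).
struct Number { double value; };
struct UnaryMinus;
struct Add;

// Recursive alternatives are held through unique_ptr so the types can be
// incomplete at this point.
using AnyNode = std::variant<Number,
                             std::unique_ptr<UnaryMinus>,
                             std::unique_ptr<Add>>;

struct UnaryMinus { AnyNode operand; };
struct Add { AnyNode left; AnyNode right; };

// Visitor with one overload per alternative; std::visit picks the right one.
struct Evaluator {
    double operator()(const Number& n) const { return n.value; }
    double operator()(const std::unique_ptr<UnaryMinus>& u) const;
    double operator()(const std::unique_ptr<Add>& a) const;
};

inline double evaluate(const AnyNode& node) {
    return std::visit(Evaluator{}, node);
}

inline double Evaluator::operator()(const std::unique_ptr<UnaryMinus>& u) const {
    return -evaluate(u->operand);
}

inline double Evaluator::operator()(const std::unique_ptr<Add>& a) const {
    return evaluate(a->left) + evaluate(a->right);
}

// Builds the tree for (-2 + 7).
inline AnyNode makeVariantExample() {
    auto minusTwo = std::make_unique<UnaryMinus>();
    minusTwo->operand = Number{2.0};
    auto sum = std::make_unique<Add>();
    sum->left = std::move(minusTwo);
    sum->right = Number{7.0};
    return AnyNode(std::move(sum));
}
```

The set of node kinds is closed here: adding a new kind means adding an alternative to AnyNode and an overload to Evaluator, and the compiler reports any visitor that forgets a case.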
A third way is std::function type-erasure style. Here you write your own polymorphic system that takes objects of arbitrary type and wraps their operations up in a value-semantics wrapper.
A fourth option is CRTP static polymorphism. This isn't suitable to build a dynamic parse tree, but it can be used to help implement some of the above.
A fifth option is aspect oriented std::function operation bundles.
A sixth option is manual function table tweaking, basically reimplementing the C++ vtable solution manually as if you are in C, but in C++. This can permit you to have features similar to other OO-languages.
A seventh option is to write up a signals-slots system and send messages to your objects.
There are almost certainly more.
The easiest solution is probably to first learn about inheritance and virtual functions in C++ (the first option above). I personally would probably write a parse tree using std::variant at this point, but you probably don't know enough C++ yet to practically do that.

How to create a linked list without dynamic memory allocation as template in c++

I started working through the book Hands-On System Programming in C++
and I tried to create the following linked list with a template without dynamic memory allocation. But every time I try to build the linked list I find no other way than having to assign memory with new - how else would I create a new node?
As I understand the author, there is a way to replace the need for creating a new node by using C++ templates, since allocating dynamic memory is considered slow.
And as far as I can tell, that doesn't mean using static memory allocation, an array, or macro programming at compile time, but keeping the same flexibility at runtime? Or is that a misunderstanding?
What am I missing? Thanks in advance for any hint on how I can create a linked list dynamically without dynamic memory allocation with C++ templates.
"There are several implementations of these types of linked lists (and other data structures) floating around on the internet, which provide a generic implementation of a linked list without the need for dynamically allocating data."
I didn't find any in C++ :(
and
"In the preceding example, not only are we able to create a linked list without macros or dynamic allocations (and all the problems that come with the use of void * pointers), but we are also able to encapsulate the functionality, providing a cleaner implementation and user API."
That is what I tried to do but every way I puzzle I have to allocate memory dynamically:
#include <iostream>
template<typename T>
class MyLinkedList
{
    struct node
    {
        T data;
        node* next = nullptr;
    };
private:
    node m_head;
public:
    void setData(T value)
    {
        if(m_head.next == nullptr){
            m_head.data = value;
        }
    }
    T getData()
    {
        return m_head.data;
    }
};
int main()
{
    MyLinkedList<int> list;
    list.setData(4);
    std::cout << list.getData() << std::endl;
    return 0;
}
The Whole text from this book about C++ Templates:
Hands-On System Programming in C++
Templates used in C++
Template programming is often an undervalued, misunderstood addition
to C++ that is not given enough credit. Most programmers need to look
no further than attempting to create a generic linked list to
understand why.
C++ templates provides you with the ability to define your code
without having to define type information ahead of time.
One way to create a linked list in C is to use pointers and dynamic
memory allocation, as seen in this simple example:
struct node {
void *data;
node next; };
void add_data(node *n, void *val);
In the preceding example, we store data in the linked list using void
*. An example of how to use this is as follows:
node head; add_data(&head, malloc(sizeof(int)));
*(int*)head.data = 42;
There are a few issues with this approach:
This type of linked list is clearly not type-safe. The use of the data and the data's allocation are completely unrelated, requiring the programmer using this linked list to manage all of this without error.
A dynamic memory allocation is needed for both the nodes and the data. As was discussed earlier, memory allocations are slow as they
require system calls.
In general, this code is hard to read and clunky.
Another way to create a generic linked list is to use macros. There
are several implementations of these types of linked lists (and other
data structures) floating around on the internet, which provide a
generic implementation of a linked list without the need for
dynamically allocating data. These macros provide the user with a way
to define the data type the linked list will manage at compile time.
The problem with these approaches, other than reliability, is these
implementations use macros to implement template programming in a way
that is far less elegant. In other words, the solution to adding
generic data structures to C is to use C's macro language to manually
implement template programming. The programmer would be better off
just using C++ templates.
In C++, a data structure like a linked list can be created without
having to declare the type the linked list is managing until it is
declared, as follows:
template<typename T> class mylinked_list {
struct node
{
T data;
node *next;
};
public:
...
private:
node m_head; };
In the preceding example, not only are we able to create a linked list
without macros or dynamic allocations (and all the problems that come
with the use of void * pointers), but we are also able to encapsulate
the functionality, providing a cleaner implementation and user API.
One complaint that is often made about template programming is the
amount of code it generates. Most code bloat from templates typically
originates as a programming error. For example, a programmer might not
realize that integers and unsigned integers are not the same types,
resulting in code bloat when templates are used (as a definition for
each type is created).
Even aside from that issue, the use of macros would produce the same
code bloat. There is no free lunch. If you want to avoid the use of
dynamic allocation and type casting while still providing generic
algorithms, you have to create an instance of your algorithm for each
type you plan to use. If reliability is your goal, allowing the
compiler to generate the code needed to ensure your program executes
properly outweighs the disadvantages.
"I find no other way than having to assign memory with new - how else would I create a new node?"
You can declare a variable. Or, you can create a dynamic object in non-dynamic memory using placement-new.
Here is a minimal example of a linked list using node variables:
template<class T>
struct node
{
    T data;
    node* next = nullptr;
};
// some function
node<int> n3{3, nullptr};
node<int> n2{2, &n3};
node<int> n1{1, &n2};
Reusing non-dynamic storage for dynamic objects is quite a bit more complicated. I recommend a structured approach of using a pre-existing implementation such as std::list with a custom allocator.
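For illustration only, the placement-new idea can be sketched with a fixed-capacity list whose nodes live in a buffer inside the list object. This is a simpler alternative to the std::list-with-custom-allocator approach recommended above, not an implementation of it, and it is not the book's code. StaticList, pushFront and forEach are hypothetical names, and the sketch never reuses slots, so it supports insertion only.

```cpp
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical fixed-capacity list: nodes are constructed with placement-new
// into storage that is part of the object, so no heap allocation happens.
template <class T, std::size_t Capacity>
class StaticList {
public:
    StaticList() = default;
    StaticList(const StaticList&) = delete;
    StaticList& operator=(const StaticList&) = delete;

    ~StaticList() {
        // Objects created with placement-new need an explicit destructor call.
        for (Node* n = head_; n != nullptr;) {
            Node* next = n->next;
            n->~Node();
            n = next;
        }
    }

    // Returns false when the fixed storage is exhausted.
    bool pushFront(T value) {
        if (used_ == Capacity) return false;
        void* slot = storage_ + used_ * sizeof(Node);
        head_ = ::new (slot) Node{std::move(value), head_};
        ++used_;
        return true;
    }

    // Calls f(data) for each element, front to back.
    template <class F>
    void forEach(F f) const {
        for (const Node* n = head_; n != nullptr; n = n->next) f(n->data);
    }

    std::size_t size() const { return used_; }

private:
    struct Node {
        T data;
        Node* next;
    };

    alignas(Node) unsigned char storage_[Capacity * sizeof(Node)];
    Node* head_ = nullptr;
    std::size_t used_ = 0;
};
```

The trade-off is that the capacity must be known at compile time, which is exactly the flexibility that dynamic allocation would otherwise provide.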

Recursive data-structures without the use of pointers

During my bachelor degree in CS I've come across the use of recursive data-structures a lot of times. In C++ I always ended up using pointers to make my data structures recursive, just like what I would do in C.
A simplified example could be the following:
struct Tree{
    int data;
    struct Tree *left, *right;
};
However, using pointers tends to be a risky job and involves a lot of hours debugging and testing the code. For these reasons I would like to know if there is any other efficient way of defining recursive data-structures in C++.
In other programming languages, like Rust, I've seen things like that:
struct Node {
children: Vec<Node>,
node_type: NodeType,
}
Is there a safer and more comfortable way of defining such recursive structures in C++? One possibility would be to use std::vector, but I am not aware of the performance of that method.
The reason pointers are used rather than values is because you would never be able to define your struct as its size would be infinitely recursive.
struct Tree{
    int data;
    struct Tree left, right;
};
Neglecting padding etc, you could approximate the size of Tree as
sizeof(Tree) == sizeof(int) + sizeof(Tree) + sizeof(Tree)
//              ^data         ^left          ^right
but since Tree has two members of type Tree, and those members themselves have two Tree members, and those have two Tree members... you can see where this is going.
The Rust example uses a vector of children - this can be empty as well.
In C++, the member variable can be an object, a pointer or a reference (omitted for simplicity).
Since a node object cannot be used directly (this would loop infinitely) and you do not wish to use a pointer, your options are:
use a vector as well (though for a binary tree this is not the most convenient type - you could however limit it in code to always two elements),
use a map (key could be an enum CHILD_LEFT, CHILD_RIGHT),
reconsider using pointers, or better yet: smart pointers (this looks like a good use case for regular unique_ptrs).
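As a rough sketch of the smart-pointer option, here is a binary search tree whose children are owned through std::unique_ptr, so there is no manual delete anywhere. The functions insert() and countNodes() are hypothetical helpers added for this example; std::make_unique requires C++14.

```cpp
#include <memory>

// Hypothetical binary search tree node using owning smart pointers.
struct Tree {
    int data;
    std::unique_ptr<Tree> left;
    std::unique_ptr<Tree> right;
    explicit Tree(int d) : data(d) {}
};

// Inserts a value into the subtree rooted at `root`, creating nodes as needed.
inline void insert(std::unique_ptr<Tree>& root, int value) {
    if (!root) {
        root = std::make_unique<Tree>(value);
    } else if (value < root->data) {
        insert(root->left, value);
    } else {
        insert(root->right, value);
    }
}

// Counts the nodes in the subtree; an empty pointer is an empty tree.
inline int countNodes(const std::unique_ptr<Tree>& root) {
    return root ? 1 + countNodes(root->left) + countNodes(root->right) : 0;
}
```

When the root pointer goes out of scope, every node below it is destroyed automatically. For very deep trees that recursive destruction can use a lot of stack, which is worth keeping in mind.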

Issue while designing B+Tree template class in C++

I'm trying to write a generic C++ implementation of B+Tree. My problem comes from the fact that there are two kinds of nodes in a B+Tree; the internal nodes, which contain keys and pointers to child nodes, and leaf nodes, which contain keys and values, and pointers in an internal node can either point to other internal nodes, or leaf nodes.
I can't figure how to model such a relationship with templates (I don't want to use casts or virtual classes).
Hope there is a solution to my problem or a better way to implement B+Tree in C++.
The simplest way:
bool mIsInternalPointer;
union {
    InternalNode<T>* mInternalNode;
    LeafNode<T>* mLeafNode;
};
This can be somewhat simplified by using boost::variant :)
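A minimal sketch of the variant idea follows. It swaps in std::variant (C++17) for boost::variant, and the node layouts, the ChildPtr alias and countLeafKeys() are assumptions made up for this example. The child pointers are non-owning here to keep the sketch short.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

template <class K, class V> struct InternalNode;
template <class K, class V> struct LeafNode;

// A child is either an internal node or a leaf; the variant remembers which,
// replacing the hand-written bool + union pair.
template <class K, class V>
using ChildPtr = std::variant<InternalNode<K, V>*, LeafNode<K, V>*>;

template <class K, class V>
struct LeafNode {
    std::vector<K> keys;
    std::vector<V> values;
};

template <class K, class V>
struct InternalNode {
    std::vector<K> keys;
    std::vector<ChildPtr<K, V>> children;   // non-owning in this sketch
};

// Counts the keys stored in the leaves reachable from `child`.
template <class K, class V>
std::size_t countLeafKeys(const ChildPtr<K, V>& child) {
    if (auto leaf = std::get_if<LeafNode<K, V>*>(&child)) {
        return (*leaf)->keys.size();
    }
    std::size_t total = 0;
    for (const auto& c : std::get<InternalNode<K, V>*>(child)->children) {
        total += countLeafKeys<K, V>(c);
    }
    return total;
}
```

No virtual functions or casts are needed; std::get_if performs the "which kind of node is this?" check that mIsInternalPointer does by hand.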

Shared single variable for list?

This problem is a little difficult to describe, so bear with me if it isn't clear.
I want to implement a doubly-linked list with a single, universally accessible [to the items inside] Head, End and Iter pointers - this would greatly reduce memory overhead and processing/accessing times...
Static almost fulfills this role - except, it's shared by all objects of the same class - which is what I don't want [as I might have multiple doubly-linked lists - I need one per list, not one per class]. So what I need is something similar to static, except it's localised to different declarations.
Head/Node methods become complicated (notably as it uses templates) and I want to avoid this at all costs. Head just ends up having duplicate functions of Node [so Node is accessible], which seems a waste and added complexity just to have three local-universal variables.
What I'd like is something similar to this:
class Test
{
private:
    static Test *Head; //Single universal declaration!
    static Test *End;
    static Test *Iter;
    //etc etc
};
Except...
Test A; //Set of 'static' variables 'unique' to A
Test B; //Set of 'static' variables 'unique' to B
I am willing to entertain any and all solutions to the problem, but please avoid complicated solutions - this is meant as an improvement and needs to be quick and simple to implement.
Additional Information [as requested]:
There isn't a 'problem' per se [aside from avoiding overhead and design issues] - this is setting the framework/groundwork for several other classes/functions to build on. So the class needs to be able to handle multiple roles/variables/classes - for this, it has to be templated [although this isn't entirely relevant].
One [of many] of its main roles is storing individual characters [loaded from files] in separate Nodes. Given the size can vary, it has to be dynamic. However, as one of its roles involves loading from files, it can't be an array [as reading the file to work out the number of arguments, characters etc. causes hard-drive/access bottlenecks]. So...
...Singly-linked lists would allow a character to be [easily] added [to the list] on each pass that gets a character [and counted at the same time - solving two problems in one]. The problem is singly-linked lists are very hard to [safely] delete, and navigation is one way. Which is a problem as this hinders search functionality, and notably, the intended multipurpose role...
...So the conclusion is it has to be a doubly-linked list. I don't like the STL or standard lists as I have no idea of their efficiency or safety, or indeed, compatibility with additional features the class has to support. So it has to be a custom built D-L-List...
...However I previously (some time ago) implemented a Head/Node method - it worked. However it became complex and difficult to debug as Head and Node shared functions. This time around I just want a simple, single [Readable! It's going to be shared!] class that somehow sidesteps the almost 'bureaucratic' nature of C++. That means no Head/Iter/End copying overhead (and all functions/variables/debugging required for it) and no Head system with its duplication...
...Static is the closest I get. Perhaps there is a way that somehow, you have Class A that stores the three variables, and a Class B that stores the list - and both of them are aware of each other and are able to communicate via some method/function (no pointer storage!)...
...Something along those lines. I am pretty sure there is some hierarchy or sub-class or inheritance trick that would pull this off, and I need someone who knows the finer arts better than I do to refine my raw idea or something.
If static variables are not suitable, you have only one possibility - use instance variables.
If you want to share the variables between the items, put them in the list itself and maintain a pointer to the list in each item as follows:
class Item; // forward declaration so List can hold Item pointers
class List
{
    Item* head;
    Item* end;
    Item* iter;
};
class Item
{
    List* list;
};
Make a List class (as already shown by vitaut), but add a makeEntry() function in which a reference to the List class can be passed. If List becomes more complicated, I would isolate these members in a ListInfo class, so the nodes only have access to them.
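A minimal sketch of how the two suggestions might fit together is shown here. Each List instance owns its own head and end pointers, and makeEntry() hands every new Item a pointer back to the list that created it. The member names, the char payload and the accessor functions are assumptions made for this example; the iter pointer is left out for brevity.

```cpp
class List;   // forward declaration so Item can point back to its list

// Hypothetical item: stores its links, its value, and a pointer to its owner.
class Item {
public:
    Item(List* owner, char value) : list_(owner), value_(value) {}
    List* list() const { return list_; }
    char value() const { return value_; }
    Item* prev = nullptr;
    Item* next = nullptr;

private:
    List* list_;
    char value_;
};

class List {
public:
    List() = default;
    List(const List&) = delete;
    List& operator=(const List&) = delete;

    ~List() {
        while (head_ != nullptr) {
            Item* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    // Creates an item at the end of this list and gives it a pointer to the list.
    Item* makeEntry(char value) {
        Item* item = new Item(this, value);
        item->prev = end_;
        if (end_ != nullptr) {
            end_->next = item;
        } else {
            head_ = item;
        }
        end_ = item;
        return item;
    }

    Item* head() const { return head_; }
    Item* end() const { return end_; }

private:
    Item* head_ = nullptr;   // one set of these per List object, not per class
    Item* end_ = nullptr;
};
```

With this layout, two List objects keep fully independent head/end state, and any Item can reach its own list's state through item->list(), at the cost of one extra pointer per item.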