C++: pointers to data members or traits? - c++

I'm practicing with some binary tree algorithms in C++ and trying to write code that is as generic as possible. In particular, I would like my functions (algorithms) to be able to operate on any (to some extent, of course) tree-like data structure.
A tree node structure might be defined in different ways, like this, for instance:
struct binary_tree_node
{
int data;
struct binary_tree_node *left;
struct binary_tree_node *right;
};
Or like this:
struct binary_tree_node2
{
long key;
struct binary_tree_node2 *first_child;
struct binary_tree_node2 *second_child;
};
Or anyhow else, but pretty similar to that pattern.
So I would like my functions/algorithms to be able to work with any of these or similar data structures.
For example, here is how I define one simple function:
template <typename TreeNode, typename DataType = typename TreeNode::data_type>
TreeNode*
binary_tree_new_node(DataType value = DataType(),
DataType TreeNode::* data = &TreeNode::data,
TreeNode* TreeNode::* left = &TreeNode::left,
TreeNode* TreeNode::* right = &TreeNode::right)
{
TreeNode *newnode = new TreeNode();
newnode->*data = value;
newnode->*left = nullptr;
newnode->*right = nullptr;
return newnode;
}
Thus, it is possible to use the function with any suitable tree-node type of your choice. If the data members have different names (not data, left and right), then one can call the function and pass the pointers to the corresponding data members. This way, the function does not depend on (or at least can adjust itself to) how the data members of the input type are named.
It has worked pretty well so far, but as I implement more and more functions, I'm getting tired of these pointer-to-data-member parameters, which I have to list as optional parameters of every function. Is there a better way to handle this? Maybe some sort of traits? Or something else?
I would like to keep the requirements on the input type as few as possible. For example, the client program should not be forced to define anything more than the tree-node type itself, nor to use or derive from any provided types or templates. Of course, the client program may reuse some of the provided templates, like the one below, which I also define, but it should not be forced to.
template<typename T>
struct binary_tree_node
{
using data_type = T;
data_type data;
struct binary_tree_node *left;
struct binary_tree_node *right;
};
What are the available options here?
Does it make any sense at all? :)
Thanks in advance

The STL containers work the other way around: the author of the library supplies the node class, so the user of the template only has to supply the data. The other option, which you seem to be going for, is intrusive data structures (a good term to search for ideas). With those you have a few choices. First, require the data members to be named left, right, and data. Second, require access functions with specific names so the template can find them. Third, require functors to be passed in as template parameters so the template can use them to find the data it needs. Personally I find the STL way the least messy, followed by the functors approach.
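As a rough sketch of the second option (access functions with known names), the accessors could live in a traits class. Everything named here (tree_node_traits, binary_tree_set_left) is hypothetical and not part of any library. The primary template assumes members called data, left and right; a client whose node uses other names writes one specialization instead of passing member pointers to every call.

```cpp
#include <cassert>

// Hypothetical traits class: the primary template assumes members named
// data, left and right, so such node types need no extra code.
template <typename Node>
struct tree_node_traits {
    using data_type = decltype(Node::data);
    static data_type& data(Node& n) { return n.data; }
    static Node*& left(Node& n) { return n.left; }
    static Node*& right(Node& n) { return n.right; }
};

struct binary_tree_node2 {
    long key;
    binary_tree_node2* first_child;
    binary_tree_node2* second_child;
};

// Specialization that maps the differently named members of binary_tree_node2.
template <>
struct tree_node_traits<binary_tree_node2> {
    using data_type = long;
    static long& data(binary_tree_node2& n) { return n.key; }
    static binary_tree_node2*& left(binary_tree_node2& n) { return n.first_child; }
    static binary_tree_node2*& right(binary_tree_node2& n) { return n.second_child; }
};

// The algorithms only talk to the traits, never to member names directly.
template <typename Node, typename Traits = tree_node_traits<Node>>
Node* binary_tree_new_node(
    typename Traits::data_type value = typename Traits::data_type())
{
    Node* node = new Node();
    Traits::data(*node) = value;
    Traits::left(*node) = nullptr;
    Traits::right(*node) = nullptr;
    return node;
}

template <typename Node, typename Traits = tree_node_traits<Node>>
void binary_tree_set_left(Node* parent, Node* child)
{
    assert(parent != nullptr);
    Traits::left(*parent) = child;
}
```

With this, binary_tree_new_node<binary_tree_node2>(7L) fills key, first_child and second_child without the caller naming any member. The node type itself stays untouched; the cost is one specialization per unusually named node type.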

To make your algorithm as generic as possible, I recommend using functors. Your binary_tree_new_node would look like this:
template<typename G, typename L, typename R, typename AT, typename AD, typename D>
auto binary_tree_new_node(
G generator,
L left,
R right,
AT assign_tree,
AD assign_data,
D data) ->decltype( generator() )
{
auto tree = generator();
auto& l = left(tree);
auto& r = right(tree);
assign_data(tree, data);
assign_tree(l, nullptr);
assign_tree(r, nullptr);
return tree;
}
For your binary_tree_node, the functors would look like:
// WARNING!!! Very dangerous code!!!
binary_tree_node* generator()
{
return new binary_tree_node;
}
binary_tree_node*& left(binary_tree_node* tree)
{
return tree->left;
}
binary_tree_node*& right(binary_tree_node* tree)
{
return tree->right;
}
void assign_data(binary_tree_node* node, int data)
{
node->data = data;
}
void assign_tree(binary_tree_node*& node, binary_tree_node* data)
{
node = data;
}
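The answer does not show a call site, so here is a hedged sketch of one. It repeats the function template from above and passes lambdas for the question's binary_tree_node2; new_node2 is a made-up convenience wrapper, and C++11 lambdas are an assumption about the toolchain.

```cpp
#include <cassert>

struct binary_tree_node2 {
    long key;
    binary_tree_node2* first_child;
    binary_tree_node2* second_child;
};

// Same shape as the function template in the answer above.
template<typename G, typename L, typename R, typename AT, typename AD, typename D>
auto binary_tree_new_node(G generator, L left, R right,
                          AT assign_tree, AD assign_data, D data)
    -> decltype(generator())
{
    auto tree = generator();
    auto& l = left(tree);
    auto& r = right(tree);
    assign_data(tree, data);
    assign_tree(l, nullptr);
    assign_tree(r, nullptr);
    return tree;
}

// Hypothetical wrapper: binds the functors once for this node type so that
// callers do not have to repeat them.
binary_tree_node2* new_node2(long key)
{
    using node = binary_tree_node2;
    node* result = binary_tree_new_node(
        [] { return new node; },                            // generator
        [](node* t) -> node*& { return t->first_child; },   // left
        [](node* t) -> node*& { return t->second_child; },  // right
        [](node*& slot, node* value) { slot = value; },     // assign_tree
        [](node* t, long value) { t->key = value; },        // assign_data
        key);
    assert(result->first_child == nullptr && result->second_child == nullptr);
    return result;
}
```

The explicit -> node*& return types matter: without them the lambdas would return copies of the child pointers, and assign_tree would modify temporaries instead of the node.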

Related

Name of this Data Structure?

So I recently came across a data structure roughly like this:
template<class T>
struct Node {
    size_t m_next;
    size_t m_prev;
    T m_key;
};
template<class T, size_t N>
struct DS {
    Node<T> m_elements[N];
    size_t m_head;
    size_t m_tail;
};
I simplified a bit, just to keep this brief: I don't do error handling when DS gets too full. Normally N is large enough that this isn't a concern.
One note is T must have some way of representing "no value"; why this is needed can be seen below. (I'll refer to this value as TOMBSTONE below.)
The API for this data structure is roughly the same as for a linked list, but it performs much better because everything fits nicely in the cache.
The actual implementation is different from a linked list in that it doesn't need to allocate any new memory for new nodes. For example, pushing to the back of DS is roughly like this:
template<class T, size_t N>
void DS<T, N>::push_back(T t) {
    size_t attempt = 0;
    size_t i = hash(t, attempt++);
    while (true) {
        // A slot is free when its key holds the TOMBSTONE value.
        if (m_elements[i].m_key == TOMBSTONE) {
            m_elements[m_tail].m_next = i;
            m_elements[i] = Node<T>{N, m_tail, t};
            m_tail = i;
            break;
        }
        i = hash(t, attempt++);
    }
}
where hash(T t, size_t attempt) finds places to try to insert new elements. (This is so there's nice spread, rather than clumping everything at the start.)
I hesitate to call this a linked list because of the vast performance and implementation differences from a normal linked list. I also want to point out that this question is not about when to use what data-structures, or if the above data-structure is good/fast/safe/whatever. This data-structure works quite well for us in the very specific situation we use it in.
Is there any name for this particular implementation/data-structure?
It is a linked list. It's mentioned on Wikipedia as "Linked list using arrays of nodes".
It's a doubly linked list, implemented with a C-style array.
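For comparison, a minimal sketch of a doubly linked list whose links are array indices rather than pointers. It deliberately simplifies the structure from the question: new nodes take the next unused slot instead of a hashed one, there are no tombstones, and nothing can be removed. ArrayList, npos, front and for_each are illustrative names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <class T, std::size_t N>
class ArrayList {
public:
    static constexpr std::size_t npos = N;  // sentinel index meaning "no node"

    // Appends a value; returns false when the backing array is full.
    bool push_back(const T& value) {
        if (used_ == N) return false;
        const std::size_t i = used_++;          // simplest choice: next unused slot
        nodes_[i] = Node{npos, tail_, value};   // next = none, prev = old tail
        if (tail_ != npos) {
            nodes_[tail_].next = i;
        } else {
            head_ = i;                          // first element of the list
        }
        tail_ = i;
        return true;
    }

    const T& front() const {
        assert(head_ != npos);
        return nodes_[head_].key;
    }

    // Visits the keys in list order by following the index links.
    template <class F>
    void for_each(F f) const {
        for (std::size_t i = head_; i != npos; i = nodes_[i].next) {
            f(nodes_[i].key);
        }
    }

private:
    struct Node {
        std::size_t next;
        std::size_t prev;
        T key;
    };
    std::array<Node, N> nodes_{};
    std::size_t head_ = npos;
    std::size_t tail_ = npos;
    std::size_t used_ = 0;
};
```

Traversal follows next indices exactly as a pointer-based list follows next pointers, which is why the structure still counts as a linked list; only the storage for the nodes differs.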

Returning two strings in C++

I am solving exercises for a C++ exam I have soon. Consider the following exercise:
A travel agency uses lists to manage its trips. For each trip the agency registers its point of departure, point of arrival, distance and time/duration.
1) Define the necessary structures to represent a list of trips
2) Write a function that, given an integer i, returns the point of departure and point of arrival of the trip in position i
Defining the structure is easy:
struct list{
char departure[100];
char arrival[100];
double distance;
double time;
list* next = NULL;
};
My problem is the function. The actual work, to find the ith trip is easy. But how can I return the two char arrays/strings departure and arrival? If this were a question in my exam, I would have solved it like this:
typedef list* list_ptr;
list_ptr get_trip(list_ptr head, const int i){
if(i<0 || head==NULL){
return NULL;
}
for(int k = 0; k<i;k++){
head = head->next;
if(head==NULL){
return NULL;
}
}
return head;
}
I am returning a pointer to the list element. One then has to print departure and arrival. I could easily return just the departure or just the arrival by using a function with return type char*. How can I properly return 2 strings?
I know that there are ways of doing this using std::tuple, but I cannot use it as we haven't covered it in the lecture (we have only covered the really basic stuff, up to classes).
Am I right that returning both strings is not possible without using additional libraries?
Cheers
OK, to start with, your list type has some problems. Don't use char[] in C++ unless you really, really have to (note: if you think you have to, you're probably wrong). C++ provides a standard library that is wondrous in its applications (well, compared to C), and you should use it. In particular, I'm talking about std::string. You're probably OK using double for distance and duration, although a lack of units means that you're going to have a bad time.
Let's try this:
struct Trip {
std::string departure;
std::string arrival;
double distance_km;
double duration_hours;
};
Now you can either use std::vector, std::list, std::forward_list, or roll your own list. Let's assume the last.
class TripList {
public:
TripList() = default;
// Linear in i.
Trip& operator[](std::size_t i);
const Trip& operator[](std::size_t i) const;
void append_trip(Trip trip);
void remove_trip(std::size_t i);
private:
struct Node {
Trip t;
std::unique_ptr<Node> next;
};
std::unique_ptr<Node> head;
Node* tail = nullptr; // for efficient appending
};
I'll leave implementation of this to you. Note that list and trip are separate concepts, so we're writing separate types to handle them.
Now you can write a simple function:
std::pair<std::string, std::string> GetDepartureAndArrival(const TripList& list, std::size_t index) {
const auto& trip = list[index];
return {trip.departure, trip.arrival};
}
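If std::pair and std::tuple are both off the table, the two values can still come back through a small struct of your own, filled via an output parameter. A minimal sketch, assuming the list struct from the question (with the default initializer on next dropped so the struct stays a plain aggregate in C++11); Endpoints and get_endpoints are made-up names.

```cpp
#include <cassert>
#include <string>

struct list {
    char departure[100];
    char arrival[100];
    double distance;
    double time;
    list* next;
};

// Hypothetical result type: just the two strings the exercise asks for.
struct Endpoints {
    std::string departure;
    std::string arrival;
};

// Writes the endpoints of the trip at position i into *out.
// Returns false (leaving *out untouched) when there is no such trip.
bool get_endpoints(const list* head, int i, Endpoints* out)
{
    assert(out != nullptr);
    if (i < 0) return false;
    for (int k = 0; k < i && head != nullptr; ++k) {
        head = head->next;
    }
    if (head == nullptr) return false;
    out->departure = head->departure;  // copies the char array into a std::string
    out->arrival = head->arrival;
    return true;
}
```

Returning the struct by value would work just as well; the bool return is only there so the caller can tell "no such trip" apart from a trip with empty names.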

How to Make a C++ Wrapper for a C Linked List

I've implemented a linked list in C with many functions to help ease its manipulation.
I don't want to port this functionality to C++ so I'm trying to create a simple Wrapper Class that calls the original functions internally, and manipulates the C linked list internally as well.
For most of the functionality, the wrapper code works well. There is one problem, however. The C linked list structure has pointers to the next and the previous C linked list structures, and I want to be able to get the C++ equivalent class pointers instead.
How can I do that?
E.g.: There is a C function that gets the linked list in the chain at an index. The original function would do something like this:
struct _linkedlist *LinkedList_get(struct _linkedlist * list, const unsigned long index)
{ /* Gets the index'th linked list in the chain as a pointer */
if ((list) == NULL) return NULL;
if (index >= LinkedList_get_depth(list))
return NULL;
for(unsigned int i = 0; i < index; list = list->next, ++i);
return list;
}
The function clearly returns a pointer to the linked list C struct. What I want to do is get a pointer to the C++ linked list wrapper object.
The whole purpose of this is that I can make an object oriented wrapper (the C++ interface) around a purely functional interface (the C interface) without altering the original source (the C version).
You've mentioned in comments that your C linked list stores an arbitrary value type (as void*). Therefore, it should be fairly trivial for the C++ wrapper to store extra information in that value type. This extra information could be the pointer to the corresponding C++ wrapper.
You haven't shown your code, so I will just show the idea in a generic fashion:
// This is the original C interface
struct C_Node;
void* c_getValue(struct C_Node *node);
struct C_Node* c_insertAfter(struct C_Node *node, void *value);
// This is the C++ wrapper
template <class T>
class Node
{
C_Node *cNode;
typedef std::pair<T, Node*> ProxyValueType;
explicit Node(C_Node *cNode) : cNode(cNode)
{
static_cast<ProxyValueType*>(c_getValue(cNode))->second = this;
}
public:
T& getValue() const
{ return static_cast<ProxyValueType*>(c_getValue(cNode))->first; }
Node* insertAfter(T value)
{
ProxyValueType *proxy = new ProxyValueType(value, nullptr);
C_Node *newNode = c_insertAfter(cNode, proxy);
return new Node(newNode);
}
};
Of course, the above as written is bad C++, as it uses owning raw pointers etc. Treat it as a demonstration of the idea, not as pastable code.

Linked List using Void* pointers

I want to create a generic linked list in C/C++ (without using templates of C++).
I have written following simple program and it works fine as of now -
typedef struct node
{
void *data;
node *next;
}node;
int main()
{
node *head = new node();
int *intdata = new int();
double *doubledata = new double();
char *str = "a";
*doubledata = 44.55;
*intdata = 10;
head->data = intdata;
node *node2 = new node();
node2->data = doubledata;
head->next = node2;
node *node3 = new node();
node3->data = str;
node3->next = NULL;
node2->next = node3;
node *temp = head;
if(temp != NULL)
{
cout<<*(int *)(temp->data)<<"\t";
temp = temp->next;
}
if(temp != NULL)
{
cout<<*(double *)(temp->data)<<"\t";
temp = temp->next;
}
if(temp != NULL)
{
cout<<*(char *)(temp->data)<<"\t";
temp = temp->next;
}
return 0;
}
My question is -
I need to know the data type of the data I am printing in the code above.
For example - first node is int so i wrote -
*(int *)(temp->data)
second is double and so on...
Instead, is there any generic way of simply displaying the data without worrying about the data type?
I know you can achieve this with templates, but what if I have to do this in C only ?
Thanks,
Kedar
The whole point of a generic list is that you can store anything in it. But you have to be realistic... You still need to know what you are putting in it. So if you are going to put mixed types in the list, then you should look at using a Variant pattern. That is, a type that provides multiple types. Here's a simple variant:
typedef struct Variant
{
enum VariantType
{
t_string,
t_int,
t_double
} type;
union VariantData
{
char* strVal;
int intVal;
double doubleVal;
} data;
} Variant;
You can then tell yourself, "I'm storing pointers to Variants in my void* list." This is how you would do it in C. I assume when you say "C/C++" you mean that you're trying to write C code but are using a C++ compiler. Don't forget that C and C++ are two different languages that have some overlap. Try not to put them together in one word as if they're one language.
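To show how the variant removes the guessing, here is a small sketch that walks a void* list and dispatches on the type tag. It assumes every data pointer in the list really points at a Variant. The names to_text and list_to_text are illustrative, strVal is made const char* so that string literals can be stored, and the code is written as C++ (the same idea carries over to C with printf).

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Variant {
    enum VariantType { t_string, t_int, t_double } type;
    union VariantData {
        const char* strVal;
        int intVal;
        double doubleVal;
    } data;
};

struct node {
    void* data;  // assumed to always point at a Variant
    node* next;
};

// Formats one variant by reading only the union member named by its tag.
std::string to_text(const Variant& v)
{
    std::ostringstream out;
    switch (v.type) {
        case Variant::t_string: out << v.data.strVal;    break;
        case Variant::t_int:    out << v.data.intVal;    break;
        case Variant::t_double: out << v.data.doubleVal; break;
    }
    return out.str();
}

// Joins all elements with tabs, without the caller knowing any element type.
std::string list_to_text(const node* head)
{
    std::string text;
    for (const node* n = head; n != nullptr; n = n->next) {
        assert(n->data != nullptr);
        if (!text.empty()) text += '\t';
        text += to_text(*static_cast<const Variant*>(n->data));
    }
    return text;
}
```

The cast from void* is still there, but it happens in exactly one place and always to the same type; the per-element decision moves into the switch on the tag.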
In C, the only way to achieve generics is using a void*, as you are already doing. Unfortunately, this means that there is no easy way to retrieve the type of an element of your linked list. You simply need to know them.
The way of interpreting data in memory is completely different for different data type.
Say a 32 bit memory block has some data. It will show different values when you typecast it as int or float as both are stored with different protocols. When saving some data in memory pointed by variable of type void*, it does not know how to interpret the data in its memory block. So you need to typecast it to specify the type in which you want to read the data.
This is a little bit like sticking all the cutlery in a drawer, but instead of putting knives in one slot, forks in another slot, spoons in a third slot, and teaspoons in the little slot in the middle, you just chuck them all in wherever they happen to land, and then wonder why, when you stick your hand in and pick something up, you can't know what you are going to get.
The WHOLE POINT of C++ is that it allows you to declare templates and classes that "do things with arbitrary content". Since the above code uses new, it won't compile as C. So there's no point in making it hold a non-descriptive pointer (or even storing the data as a pointer in the first place).
template<typename T> struct node
{
T data;
node<T> *next;
node() : next(0) {};
};
Unfortunately, it still gets messier if you want to store a set of data that is different types within the same list. If you want to do that, you will need something in the node itself that indicates what it is you have stored.
I have done that in lists a few times since I started working (and probably a couple of times before I got a job) with computers in 1985. Many more times, I've done some sort of "I'll store arbitrary data" in something like a std::map, where a name is connected to some "content". Every time I've used this sort of feature, it's because I'm writing something similar to a programming language (e.g. a configuration script, Basic interpreter, Lisp interpreter, etc.), using it to store "variables" that can have different types (int, double, string) or similar. I have seen similar things in other places; for example, OpenGL has some places where the data returned is of different types depending on what you ask for, and the internal storage has to "know" what the type is.
But 99% of all linked lists, binary trees, hash-tables, etc, that I have worked on contain one thing and one thing only. Storing "arbitrary" things in a single list is usually not that useful.
The answer below targets C++, not C. C++ allows for what you want, just not in the way that you want to do it. The way I would solve your problem is with the built-in functionality of the virtual keyword.
Here's a stand-alone code sample that prints out different values no matter the actual derived type:
#include <iostream>
#include <list>
class Base
{
public:
virtual ~Base() {} // lets derived objects be deleted through a Base*
virtual void Print() = 0;
};
class Derived1 : public Base
{
public:
virtual void Print()
{
std::cout << 1 << std::endl; // Integer
}
};
class Derived2 : public Base
{
public:
virtual void Print()
{
std::cout << 2.345 << std::endl; // Double
}
};
class Derived3 : public Base
{
public:
virtual void Print()
{
std::cout << "String" << std::endl; // String
}
};
int main(void)
{
// Make a "generic list" by storing pointers to a base interface
std::list<Base*> GenericList;
GenericList.push_back(new Derived1());
GenericList.push_back(new Derived2());
GenericList.push_back(new Derived3());
std::list<Base*>::iterator Iter = GenericList.begin();
while(Iter != GenericList.end())
{
(*Iter)->Print();
++Iter;
}
// Don't forget to delete the pointers allocated with new above. Omitted in example
return 0;
}
Also notice that this way you don't need to implement your own linked list. The standard list works just fine here. However, if you still want to use your own list, instead of storing a void *data;, store a Base *data;. Of course, this could be templated, but then you'd just end up with the standard again.
Read up on polymorphism to learn more.
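The manual delete that the example leaves out can be avoided altogether. A sketch of the same idea, assuming C++11 and using std::unique_ptr so the list owns its elements; IntItem, DoubleItem, StringItem and PrintAll are made-up names, and Print takes a stream here so that the result is easy to check.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

class Base {
public:
    virtual ~Base() {}  // required so deleting through Base* is well defined
    virtual void Print(std::ostream& out) const = 0;
};

class IntItem : public Base {
public:
    explicit IntItem(int v) : value_(v) {}
    void Print(std::ostream& out) const override { out << value_ << '\n'; }
private:
    int value_;
};

class DoubleItem : public Base {
public:
    explicit DoubleItem(double v) : value_(v) {}
    void Print(std::ostream& out) const override { out << value_ << '\n'; }
private:
    double value_;
};

class StringItem : public Base {
public:
    explicit StringItem(const std::string& v) : value_(v) {}
    void Print(std::ostream& out) const override { out << value_ << '\n'; }
private:
    std::string value_;
};

typedef std::list<std::unique_ptr<Base> > GenericList;

// Prints every element through the virtual call and returns the text.
std::string PrintAll(const GenericList& items)
{
    std::ostringstream out;
    for (const auto& item : items) {
        assert(item != nullptr);
        item->Print(out);
    }
    return out.str();
}
```

Elements are added with items.push_back(std::unique_ptr<Base>(new IntItem(1))); when the list goes out of scope, each unique_ptr deletes its object, so no cleanup loop is needed.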

C++ Losing Template Data

I don't consider myself all that knowledgeable in C++, but I'm having a hard time with this concept. I have a class that holds some template datatype and a double. I want the m_data variable to be generic, but right now I'm only testing with an unsigned int. When I call the function SetData() with, say, a pointer to an unsigned int, I lose the info the pointer is pointing to. This happens when I go out of scope, so I felt I needed to do a deep copy of it...
I tried many different constructors and assignment operators but I still lose the info... I feel I'm missing something obvious about templates here. If anyone could point me in the right direction as to why the data is being lost, I would be very grateful.
Small bit of code:
template<typename T>
class PointNode {
public:
PointNode(double p){ m_point = p;}
~PointNode();
void SetData(T * data);
T * GetData() const;
private:
double m_point;
T *m_data;
};
template<typename T>
void PointNode<T>::SetData(T * data)
{
m_data = data;
}
template<typename T>
T * PointNode<T>::GetData() const
{
return m_data;
}
OK some more info. This class is being stored in a map that is a member of another class. Heres a bit of it.
template<typename T>
class AuMathPointTreeT
{
public:
//Member Variables
double m_dTolerance;
unsigned int m_cPoint;
map<VectorKey, PointNode<T> > m_tree; /*map posing as a tree */
typename map<VectorKey, PointNode<T> >::iterator iter; /* iterator */
pair< typename map<VectorKey, PointNode<T> >::iterator, bool > return_val;
/* Tree methods */
//constructor
AuMathPointTreeT(double tol);
...
};
In another program I'm using this class, creating node and setting the template data like so
if (node = pnttree.AddPoint(point) )
{
unsigned int * data = new unsigned int();
*data = pntCount;
node->SetData(data);
++pntCount;
}
UPDATE: OK, I discovered the culprit and would like suggestions on how to approach it. When I insert a node into the map class, a few functions are called in the process and I'm losing the original pointer to the newly allocated node class object. Here is what I'm doing.
template<typename T>
PointNode<T> * AuMathPointTreeT<T>::
AddPoint(double point)
{
PointNode<T> * prNode = MakeNode(point);
m_cPoint++;
return prNode;
}
template<typename T>
PointNode<T> * AuMathPointTreeT<T>::
MakeNode(double point)
{
PointNode<T> * prNode = new PointNode<T>;
//set the contents for the node just performs a few calcs on the values
prNode->SetNode(point, m_dTolerance);
//Create the key class using the
VectorKey key(point, m_dTolerance);
//Store the key,node as a pair for easy access
return_val = m_tree.insert( pair<VectorKey, PointNode<T> >(key, *prNode) );
if (return_val.second == false)
prNode = NULL;
unsigned int * test = new unsigned int;
*test = 55;
prNode->SetData(test); //if call this here its no longer the right pointer
return prNode;
}
So after looking at this... I really still want to return a pointer and use it. But maybe the iterator being held by return_val? I'm open to suggestions on all aspects too. Sorry this question has been a mess :\
I don't think this has anything to do with the use of templates. Once a local variable goes out of scope, its location on the stack could be over-written by other data.
If you expect the template class instance to out-live the local variable whose address is passed to SetData, you should consider allocating the data on the heap not the stack. Either way, I'd suggest replacing the raw m_data pointer with an appropriate smart pointer. For example, the use of shared_ptr<> in the template class and its client code should reduce the amount of data copying while at the same time ensuring the data remains valid regardless of whether or not the original data variable is in scope.
If you want a deep copy, store a T rather than a T*. You could instead allocate a copy dynamically and keep a T*, but that is overkill and gives a similar result.
If you really want nodes that hold pointers, make that choice at the point of use by instantiating the node with a pointer type.
Example:
int number = 5;
Node<int*> oneNode(&number); // number will die at end of scope
Node<int> anotherNode(number); //anotherNode can be used without risk
Your constructor assigns to m_point in its body:
PointNode(double p){ m_point = p;}
If m_point is declared const, this will not compile; it has to be initialized in the member initializer list instead:
PointNode(double p) : m_point(p) {}
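Regarding the MakeNode in the question's update: m_tree.insert(pair(key, *prNode)) stores a copy of *prNode in the map, so prNode keeps pointing at the separately allocated original, and anything set through it never reaches the stored node. One possible fix is to return the address of the element inside the map, which std::map keeps stable until that element is erased. The sketch below also stores T by value, as suggested above. PointTree and Find are hypothetical names, and a plain double stands in for VectorKey.

```cpp
#include <cassert>
#include <map>
#include <utility>

template <typename T>
class PointNode {
public:
    explicit PointNode(double p) : m_point(p), m_data() {}
    void SetData(const T& data) { m_data = data; }  // deep copy: T is stored by value
    const T& GetData() const { return m_data; }
private:
    double m_point;
    T m_data;
};

template <typename T>
class PointTree {
    typedef std::map<double, PointNode<T> > Map;  // double stands in for VectorKey
public:
    // Returns a pointer to the node stored inside the map,
    // or nullptr when a node with this key already exists.
    PointNode<T>* AddPoint(double point) {
        std::pair<typename Map::iterator, bool> result =
            m_tree.insert(std::make_pair(point, PointNode<T>(point)));
        assert(result.first != m_tree.end());  // insert always yields a valid iterator
        return result.second ? &result.first->second : nullptr;
    }

    const PointNode<T>* Find(double point) const {
        typename Map::const_iterator it = m_tree.find(point);
        return it == m_tree.end() ? nullptr : &it->second;
    }

private:
    Map m_tree;
};
```

A caller can then write node->SetData(pntCount) on the returned pointer and later read the same value back through Find, because both refer to the one object owned by the map. No new is involved, so there is nothing to leak.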