I have a graph implemented using a struct Node and a struct Edge where:
Each Edge has a start and an end Node
Each Node maintains a list of Edge objects which start from or end at it
The following is one possible implementation:
struct Node;
struct Edge {
Node *st;
Node *en;
int some_data;
};
const int MAX_EDGES = 100;
struct Node {
Edge *edges[MAX_EDGES];
int some_data;
};
While the above structs can represent the graph I have in mind, I would like to do it the "Modern C++" way while satisfying the following requirements:
Avoid pointers
Use an std::vector for Node::edges
Be able to store Node and Edge objects in standard C++ containers
How is this done in Modern C++? Can all of 1-3 be achieved?
Avoid pointers
You can use std::shared_ptr and std::weak_ptr for this. Just decide whether you want nodes to own edges, or vice versa. The non-owning type should use weak_ptr (to avoid cycles).
Unless your graph is acyclic you might still need to be careful about ownership cycles.
std::unique_ptr is not an option, because there is not a one-to-one relationship between nodes and edges, so there cannot be a unique owner of any given object.
Use an std::vector for Node::edges
No problem. Make it a std::vector<std::weak_ptr<Edge>> or std::vector<std::shared_ptr<Edge>> (depending whether edges own nodes or vice versa)
Be able to store Node and Edge objects in standard C++ containers
No problem, just ensure your type can be safely moved/copied without leaking or corrupting memory, i.e. has correct copy/move constructors and assignment operators. That will happen automatically if you use smart pointers and std::vector as suggested above.
Modern C++ eschews the assignment of dynamic memory to a raw pointer. This is because it is all to easy to forget to delete said pointer. Having said that there is nothing wrong with the use of raw pointers as reference to an object provided you can guarantee that the object's lifetime will be greater than the use of said pointer.
The rules generally are:
Use std::unique_ptr if an object has single owner.
Use raw pointers to reference objects created in 1. provided you can guarantee that the object's lifetime will be greater than the use of your reference.
Use std::shared_ptr for reference counted objects
Use std::weak_ptr to refer to a reference counted object when you do not want to increase the refernce count.
So in your case, if the Edge owns the Nodes then use std::unique_ptr, if not, the keep the raw pointers.
In your Node class, if the Node owns the Edges use a std::vector<Edge> otherwise use a std::vector<Edge*> although it might be more efficient to link the your Edges together in their own intrusive linked list.
Having done some work on complex graphs, it might be allocate all your Nodes and Edgees in a vector outside your graph and then only refer to them internally using raw pointers inside the graph. Remember memory allocation is slow so the less you do the faster your algorithm will be.
By using std::shared_ptr or std::unique_ptr
I don't think vector is a right choice here since a graph usually is not linear (usually speaking, also ,in most cases you can't linearize it like you can with a heap)
there is no standard 'general-use' container , but you can use templates here for generity
for example, your Element class can look like this:
template <class T>
struct Elem {
std::shared_ptr<Node> st , en;
T some_data;
};
speaking of modern C++ , I don't think struct is encouraged here , you ahould encapsulate you data
Related
This is a more general question that I'm trying to resolve for C++ best practices. Suppose I want to create objects which store references to each other, like a graph. All objects are owned by the same object, like a Graph object to all the Nodes, which is to say the ownership is fixed.
Here's my idea: a class Graph has a std::vector of Nodes, each Node has a std::vector of Nodes representing its list of connections. I'm wondering how best to implement this in terms of smart pointers? To my understanding, ownership is unique so the Graph vector should be std::vector<std::unique_ptr<Node>> nodes and I can populate that as needed. But the connections vector, how can I get each node to store references to its connections? These would only be read-only references, and maybe it would be better to name all the nodes and only store the names, or to store connections in the Graph. But is there a good way of storing references to the connection nodes as if they were const pointers?
Note: this is really about ownership and smart pointers, not about data structures, the graph example is just an example.
When discussing "Best Practices", it's important to consider what your quality-attributes and needs are for the code.
There is no "right" or "wrong" answer in the example of code such as a Graph; there are varying degrees that solve different problems in different ways -- and it depends strongly on the way its intended to be used.
By-far the simplest way to solve such a problem is for the main container (Graph) to have strong ownership in the with unique_ptr, and to only view the lifetime in the internal elements (Node) with a raw pointer, e.g.:
class Graph
{
...
private:
std::vector<std::unique_ptr<Node>> m_nodes;
};
class Node
{
...
private:
std::vector<const Node*> m_connected_nodes;
};
This would work well, since Node cannot mutate its connected nodes, and since Graph assumes that Node will never outlive it.
However, this approach does not work if you ever want Node to outlive Graph, or if you want Node to be used across multiple Graph objects. If it lives between different Graphs, then you may run the risk of a Node referring to a dangling pointer -- and this would be bad.
If this is the case, you might need to consider a different ownership pattern, such as shared_ptr and weak_ptr ownership:
class Graph
{
...
private:
std::vector<std::shared_ptr<Node>> m_nodes;
};
class Node
{
...
private:
std::vector<std::weak_ptr<const Node*>> m_connected_nodes;
};
In this case, Nodes only weakly know other Node objects, whereas Graph is the strong owner of them. This prevents the dangling issue, but incurs additional overhead now for the shared_ptr's control node, and for having to check for whether it's alive before accessing weak_ptr nodes.
So the correct answer is: It depends. If you can get away with the former approach, that's probably the cleanest; you always have 1 owner, and thus the logic is simple and easy to follow.
I'm wondering how best to implement this in terms of smart pointers?
By not using them. Use a vector of nodes for the graph: std::vector<Node>. This is a reasonable default choice until you have a good reason to do otherwise.
But is there a good way of storing references to the connection nodes as if they were const pointers?
Yes. Const pointers are a good way of storing as if they were const pointers. (And by "const pointer", I presume we are actually talking about pointer to const).
A reference wrapper is another choice. Although it has the advantage of not having representation for null, it does have the downside of clumsy syntax.
i've made a dynamic graph structure where both nodes and arcs are classes (i mean arcs are an actual instance in memory, they are not implied by an adjacency list of nodes to nodes).
Each node has a list of pointers to the arcs it's connected to.
Each arc has 2 pointers to the 2 nodes it's connecting.
Deleting a node calls delete for each of its arcs.
Each arc delete removes its pointer from the arcs lists in the 2 nodes it connects.
Simplified:
~node()
{
while(arcs_list.size())
{
delete arcs_list[arcs_list.size()-1];
}
}
~arc()
{
node_from.remove_arc(this);
node_to.remove_arc(this);
}
If i want to start using smart pointers here, how do i proceed?
Does each arc own 2 nodes, or do 2 nodes share an individual arc's ownership?
I was thinking about a shared_ptr, but shared pointer would only delete the arc when both nodes are deleted. If i delete only one node i would still have to explicitly delete all its arcs if i used shared_ptr. And that totally defeats the point of not using raw pointers in the first place.
Nodes can exist alone; each arc is owned by two nodes and it can only exist as long as these two nodes both exist.
Is there some other kind of smart pointer i should use to handle this?
Or is raw pointer just the plain simple way to go?
Does each arc own 2 nodes, or do 2 nodes share an individual arc's ownership?
You answered this question yourself:
Nodes can exist alone; each arc is owned by two nodes and it can only exist as long as these two nodes both exist.
When object A owns object B, then object A can exist after destroying B, but destroying A implies destroying B. Applied to your case, the two nodes share ownership of the arc.
Is there some other kind of smart pointer i should use to handle this? Or is raw pointer just the plain simple way to go?
Ah, yes. That is the real question. There is no pre-made smart pointer for this situation. However, I would not go with raw pointers in your node and/or arc classes. That would mean those classes would need to implement memory management on top of their primary purpose. (Much better to let each class do one thing well, then try to do many things and fail.) I see a few viable options.
1: Write your own smart pointer
Write a class that can encapsulate the necessary destruction logic. The node and/or arc classes would use your new class instead of standard smart pointers (and instead of raw pointers). Take some time to make sure your design decisions are solid. I'm guessing your new class would want a functional/callable of some sort to tell it how to remove itself from the lists it is in. Or maybe shift some data (like the pointers to the nodes) from the arc class to the new class.
I haven't worked out the details, but this would be a reasonable approach since the situation does not fit any of the standard smart pointers. The key point is to not put this logic directly in your node and arc classes.
2: Flag invalid arcs
If your program can stand not immediately releasing memory, you may be able to take a different approach to resolving an arc deletion. Instead of immediately removing an arc from its nodes' lists, simply flag the arc as no longer valid. When a node needs to access its arcs, it (or better yet, its list) would check each arc it accesses – if the arc is invalid, it can be removed from the list at that time. Once the node has been removed from both lists, the normal shared_ptr functionality will kick in to delete the arc object.
The usefulness of this approach decreases the less frequently a node iterates over its arcs. So there is a judgement call to be made.
How would an arc be flagged invalid? The naive approach would be to give it a boolean flag. Set the flag to false in the constructors, and to true when the arc should be considered deleted. Effective, but does require a new field. Can this be done without bloating the arc class? Well, presumably, each arc needs pointers to its nodes. Since the arc does not own its nodes, these are probably weak pointers. So one way to define an arc being invalid is to check if either weak pointer is expired(). (Note that the weak pointers could be manually reset() when the arc is being deleted directly, not via a node's deletion. So an expired weak pointer need not mean the associated node is gone, only that the arc no longer points to it.)
In the case where the arc class is sizeable, you might want to discard most of its memory immediately, leaving just a stub behind. You could add a level of indirection to accomplish this. Essentially, the nodes would share a pointer to a unique pointer, and the unique pointer would point to what you currently call your arc class. When the arc is deleted, the unique pointer is reset(), freeing most of the arc's memory. An arc is invalid when this unique pointer is null. (It looks like Davis Herring's answer is another way to get this effect with less memory overhead, if you can accept an object storing a shared_ptr to itself.)
3: Use Boost.Bimap
If you can use Boost, they have a container that looks like it would solve your problem: Boost.Bimap. But, you ask, didn't I already discount using an adjacency list? Yes, but this Bimap is more than just a way to associate nodes to each other. This container supports having additional information associated with each relation. That is, each relation in the Bimap would represent an arc and it would have an associated object with the arc's information. Seems to fit your situation well, and you would be able to let someone else worry about memory management (always a nice thing, provided you can trust that someone's abilities).
Since nodes can exist alone, they are owned by the graph (which might or might not be a single object), not the arcs (even as shared ownership). The ownership of an arc by its nodes is, as you observed, dual to the usual shared_ptr situation of either owner being sufficient to keep the object alive. You can nonetheless use shared_ptr and weak_ptr here (along with raw, non-owning pointers to the nodes):
struct Node;
struct Arc {
Node *a,*b;
private:
std::shared_ptr<Arc> skyhook{this};
public:
void free() {skyhook.reset();}
};
struct Node {
std::vector<std::weak_ptr<Arc>> arcs;
~Node() {
for(const auto &w : arcs)
if(const auto a=w.lock()) a->free();
}
};
Obviously other Node operations have to check for empty weak pointers and perhaps clean them out periodically.
Note that exception safety (including vs. bad_alloc in constructing the shared_ptr) requires more care in constructing an Arc.
I'm facing a design issue in my program.
I have to manage Nodes object which are part of a root ChainDescriptor.
Basically it looks like the following:
class ChainDescriptor
{
public:
~ChainDescriptor()
{
//delete the nodes in nodes...
}
void addNode(Node *);
Node * getNode();
const std::list<Node *>& getNodes() const;
std::list<Node *> m_nodes;
};
class Node
{
public:
Node(Node *parent);
void addChild(Node *node);
Node * getChild(const std::string& nodeName);
private:
Node * m_parent;
std::list<Node*> m_childs;
};
The ChainDescriptor class owns all the nodes and is responsible of deleting them.
But these classes need now to be used in another program, a GUI with undo/redo capabilities, with the problematic of the "ownership".
Before modifying the existing code in depth, I'm considering the different solutions:
using shared_ptr and respective list<shared_ptr<...> >
using weak_ptr and respective list<weak_ptr<...> >
In the example above, I don't really know where to use shared_ptr and weak_ptr properly.
Any suggestion?
You can use shared_ptr for m_childs and weak_ptr for m_parent.
However, it might be still reasonable to retain the raw pointer to the parent Node and don't use any weak pointers at all. The safeguarding mechanism behind this is the invariant that non-null parent always exists.
Another option is using shared_ptr in ChainDescriptor only and retaining all raw pointers in Node. This approach avoids weak pointers and has a clean ownership policy (parent nodes own their children).
Weak pointers will help you to manage the memory automatically, but the backside of this are fuzzy ownership logic and performance penalties.
shared_ptr is owning smart pointer and weak_ptr is referencing smart pointer.
So in your situation I think the ChainDescriptor should use shared_ptr (it owns the nodes) and Node should use weak_ptr for m_parent (it only references it) and shared_ptr for m_childs (it owns them).
The usual implementation would be for each node to have strong reference to its child (i.e. keeps them alive), and each child to have a weak reference back to the parent.
The reason for this is to avoid circular references. If only strong references were used, then you'd have a situation where the parent refcount never drops to zero (because the child has a reference), and the child refcount never drops to zero (because the parent has a reference).
I think your ChainDescriptor class is okay to use strong references here though.
Trying to just replace raw pointers with some sort of smart
pointer will in general not work. Smart pointers have
different semantics than weak pointers, and usually, these
special semantics need to be taken into account at a higher
level. The "cleanest" solution here is to add support for copy
in ChainDescriptor, implementing a deep copy. (I'm supposing
here that you can clone Node, and that all of the Node are
always owned by a ChainDescriptor.) Also, for undo, you may
need a deep copy anyway; you don't want modifications in the
active instance to modify the data saved for an undo.
Having said that, your nodes seem to be used to form a tree. In
this case, std::shared_ptr will work, as long as 1) all Node
are always "owned" by either a ChainDescriptor or a parent
Node, and 2) the structure really is a forest, or at least
a collection of DAG (and, of course, you aren't making changes
in any of the saved instances). If the structure is such that
cycles may occur, then you cannot use shared_ptr at this
level. You might be able to abstract the list of nodes and the
trees into a separate implementation class, and have
ChainDescriptor keep a shared_ptr to this.
(FWIW: I used a reference counted pointer for the nodes in
a parse tree I wrote many years ago, and different instances
could share sub-trees. But I designed it from the
start to use reference counted pointers. And because of how the
tree was constructed, I was guaranteed that there could be no
cycles.)
While implementing a FIFO I have used the following structure:
struct Node
{
T info_;
Node* link_;
Node(T info, Node* link=0): info_(info), link_(link)
{}
};
I think this a well known trick for lots of STL containers (for example for List). Is this a good practice? What it means for compiler when you say that Node has a member with a type of it's pointer? Is this a kind of infinite loop?
And finally, if this is a bad practice, how I could implement a better FIFO.
EDIT: People, this is all about implemenation. I am enough familiar with STL library, and know a plenty of containers from several libraries. Just I want to discuss with people who can gave a good implementation or a good advice.
Is this a good practice?
I don't see anything in particular wrong with it.
What it means for compiler when you say that Node has a member with a type of it's pointer?
There's nothing wrong with a class storing a pointer to an object of the same class.
And finally, if this is a bad practice, how I could implement a better FIFO.
I'd use std::queue ;)
Obviously you are using linked-list as the underlying implementation of your queue. There's nothing particularly bad about that.
Just FYI though, that in terms of implementation, std::queue itself is using std::deque as its underlying implementation. std::deque is a more sophisticated data structure that consists of blocks of dynamic arrays that are cleverly managed.
It ends up being better than linked list because:
With linked-list, each insertion means you have to do an expensive dynamic memory allocation. With dynamic arrays, you don't. You only allocate memory when the buffer has to grow.
Array elements are contiguous and that means elements access can be cached easily in hardware.
Pointers to objects of type that is being declared is fine in both C and C++. This is based on the fact that pointers are objects of fixed size (say, always 32-bit integers on 32-bit platform) so you don't need the full size of the pointed-to type to be known.
In fact, you don't even need a full type declaration to declare a pointer. A forward declaration would suffice:
class A; // forward declared type
struct B
{
A* pa; //< pointer to A - perfectly legal
};
Of course, you need a full declaration in scope at the point where you actually access members:
#include <A.hpp> // bring in full declaration of class A
...
B b;
b.pa = &a; // address of some instance of A
...
b.pa->func(); // invoke A's member function - this needs full declaration
For FIFO look into std::queue. Both std::list, std::deque, and std::vector could be used for that purpose, but also provide other facilities.
You can use the existing FIFO, std::queue.
This is one good way of implementing a node. The node pointer is used to create the link to the next node in the container. You're right though, it can be used to create a loop. If the last node in the container references the first, iterating that container would loop through all of the nodes.
For example, if the container is a FIFO queue the pointer would reference the next node in the queue. That is, the value of link_ would be the address of another instance of class Node.
If the value type T implemented an expensive copy constructor, a more efficient Node class would be
struct Node
{
T * info_;
Node* link_;
Node(T * info, Node* link=0): info_(info), link_(link)
{}
};
Note that info_ is now a pointer to an instance of T. The idea behind using a pointer is that assigning a pointer is less expensive than copying complex objects.
I am seeking to improve my C++ skills by writing a sample software renderer. It takes objects consisting of points in a 3d space and maps them to a 2d viewport and draws circles of varying size for each point in view. Which is better:
class World{
vector<ObjectBaseClass> object_list;
public:
void generate(){
object_list.clear();
object_list.push_back(DerivedClass1());
object_list.push_back(DerivedClass2());
or...
class World{
vector<ObjectBaseClass*> object_list;
public:
void generate(){
object_list.clear();
object_list.push_back(new DerivedClass1());
object_list.push_back(new DerivedClass2());
?? Would be using pointers in the 2nd example to create new objects defeat the point of using vectors, because vectors automatically call the DerivedClass destructors in the first example but not in the 2nd? Are pointers to new objects necessary when using vectors because they handle memory management themselves as long as you use their access methods? Now let's say I have another method in world:
void drawfrom(Viewport& view){
for (unsigned int i=0;i<object_list.size();++i){
object_list.at(i).draw(view);
}
}
When called this will run the draw method for every object in the world list. Let's say I want derived classes to be able to have their own versions of draw(). Would the list need to be of pointers then in order to use the method selector (->) ?
Since you are explicitly stating you want to improve your C++, I am going to recommend you start using Boost. This can help you with your problem in three different ways:
Using a shared_ptr
Using a shared_ptr could declare your vector like this:
std::vector< boost::shared_ptr< ObjectBase > > object_list;
And use it like this:
typedef std::vector< boost::shared_ptr< ObjectBase > >::iterator ObjectIterator;
for ( ObjectIterator it = object_list.begin(); it != object_list.end(); it++ )
(*it)->draw(view);
This would give you polymorphism and would be used just like it was a normal vector of pointers, but the shared_ptr would do the memory-management for you, destroying the object when the last shared_ptr referencing it is destroyed.
Note about C++11: In C++11 shared_ptr became part of the standard as std::shared_ptr, so Boost is no longer required for this approach. However, unless you really need shared ownership, it is recommended you use std::unique_ptr, which was newly introduced in C++11.
Using a ptr_vector
Using a ptr_vector you would do it like this:
boost::ptr_vector< ObjectBase > object_list;
And use it like this:
typedef boost::ptr_vector< ObjectBase >::iterator ObjectIterator;
for ( ObjectIterator it = object_list.begin(); it != object_list.end(); it++ )
(*it)->draw(view);
This would again be used like a normal vector of pointers, but this time the ptr_vector manages the lifetime of your objects. The difference to the first approach is, that here your objects get destroyed when the vector gets destroyed, whereas above they may live longer than the container, if other shared_ptrs referencing them exist.
Using a reference_wrapper
Using a reference_wrapper you would declare it like this:
std::vector< boost::reference_wrapper< ObjectBase > > object_list;
And then use it like this:
typedef std::vector< boost::reference_wrapper< ObjectBase > >::iterator
ObjectIterator;
for ( ObjectIterator it = object_list.begin(); it != object_list.end(); it++ )
it->draw(view);
Notice that you do not have to dereference the iterator first as in the above approaches. This does however only work if the lifetime of your objects is managed elsewhere and is guaranteed to be longer than that of the vector.
Note about C++11: reference_wrapper has also been standardized in C++11 and is now usable as std::reference_wrapper without Boost.
As pointed out in Maciej Hs answer, your first approach results in object slicing. In general you may want to look into iterators when using containers.
You wont get what You want with this code
class World{
vector<ObjectBaseClass> object_list;
public:
void generate(){
object_list.clear();
object_list.push_back(DerivedClass1());
object_list.push_back(DerivedClass2());
What is going to happen is called object slicing. You will get a vector of ObjectBaseClass.
To make polymorphism work You have to use some kind of pointers. There are probably some smart pointers or references in boost or other libraries that can be used and make the code much safer than the second proposed solution.
As for your first question, it is generally preferred to use automatically allocated objects rather than dynamically allocated objects (in other words, not to store pointers) so long as for the type in question, copy-construction and assignment is possible and not prohibitively expensive.
If the objects can't be copied or assigned, then you can't put them directly into a std::vector anyway, and so the question is moot. If the copying and/or assignment operations are expensive (e.g. the object stores a large amount of data), then you might want to store pointers for efficiency reasons. Otherwise, it is generally better not to store pointers for exactly the reason that you mentioned (automatic deallocation)
As for your second question, yes, that is another valid reason to store pointers. Dynamic dispatch (virtual method calls) work only on pointers and references (and you can't store references in a std::vector). If you need to store objects of multiple polymorphic types in the same vector, you must store pointers in order to avoid slicing.
Well, it depends on what you are trying to do with your vector.
If you don't use pointers, then it is a copy of the object you pass in that gets put on the vector. If it is a simple object, and/or you don't want to bother with keeping track of the storage for them, this may be exactly what you want. If it is something complex, or very time-consuming to construct and destruct, you might prefer to do that work only once each and pass pointers into the vector.