C++ Vector of Object , Memory Usage on Empty Object Creation - c++

C++ Vector - part of it to point to same address
Hi , my subject might be confusing.
Here it goes.
I got a vector
struct fTable{
int index;
int key;
};
struct node{
int nodeid;
vector<string> data;
vector<fTable> fdata;
};
vector<node> myNode;
as at some function...
void chord::someFunc(int nodeid)
{
node myTempNode;
vector<string> data1;
vector<fTable> fdata1;
myTempNode.nodeid = nodeid;
myTempNode.data = data1;
myTempNode.fdata = fdata1;
myNode.push_back(myTempNode);
myTempNode.clear();
}
I will be creating 10000 objects, and at this point I only have the value for nodeid.
For data and fdata I am assigning an empty string vector and an empty fTable vector, and I wonder what happens if I create 10000 objects this way.
Am I creating 10000 empty string vectors and 10000 empty fTable vectors?
Is there a way to make all these objects point to the same (empty) string vector and the same (empty) fTable vector, so that I can save some memory? I will or might create 10000 nodes or so, and memory consumption is a concern to me.
Thanks for all help.

No, since the vectors are empty, they don't consume much space and no string or fTable objects are created.
Given your limited C++ knowledge, I would stay clear of pointers and stick to values.
You don't need to do any of the (immediately) following, the constructor of node takes care of that. This simply overwrites empty vectors with empty vectors.
node myTempNode;
vector<string> data1;
vector<fTable> fdata1;
myTempNode.data = data1;
myTempNode.fdata = fdata1;
If you give your node a constructor like this:
struct node{
node(int id) : nodeid(id) {} // a constructor has no return type
int nodeid;
vector<string> data;
vector<fTable> fdata;
};
then you only need to write:
myNode.push_back( node(nodeid) );

Creating a vector does not necessarily allocate storage for its elements: that storage is allocated when needed. A vector with no data will therefore most likely take just sizeof(std::vector<...>) bytes (if its reserved capacity is 0), while a vector with data will actually take sizeof(std::vector<...>) + n * sizeof(element), where n is the number of reserved items in the vector (its capacity). The size of a vector is 28 bytes on my implementation.
1st method: vectors as fields. The advantage of having vector fields is that they are not dynamically allocated, which saves you from a bunch of manual new/delete calls: it is safer.
2nd method: you can also use pointers as you said:
struct node
{
int nodeid;
vector<string>* data; // pointer
vector<fTable>* fdata; // pointer
};
You can set them to 0 (null), saving the size of a vector minus the size of a pointer, per node. When you need a node to have a vector, simply new a vector and set the appropriate pointer. However, for every node that does get a vector, this method takes more space than the previous one, because the node then pays for the pointer in addition to the vector. You will also have to manage the delete (it can be done in the node destructor, but that may be less efficient than deallocating vectors before node destruction).
Conclusion: I suggest you estimate the total size occupied by your data (e.g. 10000 * ...) and see whether you need a specific model (i.e., measure first). Personally, I advise you to take the first method (no pointers).
I also recommend that you give node a constructor (or two), for cleaner code.

Yes, use a vector of pointers then, i.e.
struct node {
node(int nid) : nodeid(nid), data(0), fdata(0) { }
int nodeid;
vector<string *> data;
vector<fTable *> fdata;
};
But beware of memory management: now when a node is destroyed, the strings and fTables pointed to by the elements of data and fdata are not deleted. If these objects should be owned by a node once assigned, add a destructor:
struct node {
node(int nid) : nodeid(nid), data(0), fdata(0) { }
~node() {
for (auto i = data.begin(); i != data.end(); ++i)
delete *i;
for (auto i = fdata.begin(); i != fdata.end(); ++i)
delete *i;
}
int nodeid;
vector<string *> data;
vector<fTable *> fdata;
};

Related

C++ avoid dynamic memory allocation

Imagine I have some Node struct that contains pointers to the left and right children and some data:
struct Node {
int data;
Node *left;
Node *right;
};
Now I want to do some state space search, and naturally I want to construct the graph as I go. So I will have a kind of loop that will have to create Nodes and keep them around. Something like:
Node *curNode = ... ; // starting node
while (!done) {
// ...
curNode->left = new Node();
curNode->right = new Node();
// ..
// Go left (for example)
curNode = curNode->left;
}
The problem is that I have to dynamically allocate node on each iteration, which is slow. So the question is: how can I have pointers to some memory but not by allocating it one by one?
The first solution I thought of is to have a std::vector<Node> that will contain all the allocated nodes. The problem is that when we push_back elements, all references might be invalidated, so all my left/right pointers will be garbage.
The second solution is to allocate a big chunk of memory upfront, and then we just grab the next available pointer when we want to create a Node. To avoid references invalidation, we just have to create a linked list of big chunks of memory when we exceed the capacity of the current chunk so every given pointer stays valid. I think that std::deque behaves like this, but it's not explicitly created for this.
Another solution would be to store vector indices instead of pointers but this is not a solution because a Node doesn't want to be associated with any container, it wants the pointer directly.
So what is a good solution here, one that avoids having to allocate new nodes on each iteration?
You can use std::deque<Node>: it does the memory management for you, creating elements in groups and not invalidating pointers to existing elements as long as you only insert or remove at the ends. If you want more precise control over how many elements are in a group, you can quite simply create something like this:
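#include <array>
#include <cstddef>
#include <list>

class NodePool {
    static constexpr std::size_t blockSize = 512; // nodes per block; must be static to be used as a template argument
    using Block = std::array<Node,blockSize>;
    using Pool = std::list<Block>;
    std::size_t allocated = blockSize; // slots used in the last block; starts "full" so the first call adds a block
    Pool pool;
public:
    Node *allocate()
    {
        if( allocated == blockSize ) {
            pool.emplace_back(); // std::list never relocates existing blocks
            allocated = 0;
        }
        return &( pool.back()[ allocated++ ] );
    }
};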
I did not try to compile it, but it should be enough to express the idea. By changing blockSize you can fine-tune the performance of your program. You should be aware, though, that Node objects will be fully constructed in groups (unlike how std::deque would do it). As far as I am aware, there is no standard-conformant way to create raw memory for Node objects.

How to create an array of pointers to structure using new?

I am trying to create a graph using linked list styled nodes where each node is a structure containing a key and an address to the next node, but I want to join multiple nodes to one node so I tried creating an array of pointers to structure and initialize them using new dynamically but it throws an error saying that it "cannot convert node*** to node** in assignment".
I have tried using struct node* next[] but it didn't work well. What am I missing here? Should I just use a vector of pointers instead of an array?
struct node
{
int key;
struct node** next;
};
int main()
{
struct node A;
A.key = 12;
A.next = new node**[2];
return 0;
}
Should I just use a vector of pointers instead of an array?
This is often an ideal solution. This would fix the memory leak that your program has (or would have if it compiled in the first place). An example:
struct node
{
int key;
std::vector<node*> next;
};
// usage
A.next.resize(2);
Vector does have space overhead, which can be a problem with big graphs because the total overhead increases linearly in relation to number of nodes. If vector is not appropriate for you, an alternative is std::unique_ptr, which does not have any overhead compared to a bare pointer:
struct node
{
int key;
std::unique_ptr<node*[]> next; // owns an array of node* (pointers), matching new node*[2]
};
// usage
A.next.reset(new node*[2]);
new node**[2];
What am I missing here?
You're attempting to create an array of node** when you need an array of node*.
Should I just use a vector of pointers instead of an array?
YES!
After including the vector library, then in your structure, you would have a member like this:
std::vector<node*> next;
This is the C++ approach; using raw pointers is the C approach.
For completeness, though, with raw pointers you would write:
A.next = new node*[2];
which means an array of two pointers.

How can I correctly push back a series of objects in a vector in C++?

The scope of the program is to create a Container object which stores in a vector Class objects. Then I want to print, starting from a precise Class object of the vector all its predecessors.
class Class{
public:
Class(){
for (int i = 0; i < 10; ++i) {
Class c;
c.setName(i);
if (i > 0) {
c.setNext(_vec,i-1);
}
_vec.push_back(c);
}
}
};
~Class();
void setName(const int& n);
void setNext( vector<Class>& vec, const int& pos);
Class* getNext();
string getName();
void printAllNext(){ //print all next Class objects including himself
cout << _name <<endl;
if (_next != nullptr) {
(*_next).printAllNext();
}
}
private:
Class* _next;
string _name;
};
class Container{
public:
Container(){
for (int i = 0; i < 10; ++i) {
Class c;
c.setName(i);
if (i > 0) {
c.setNext(_vec,i-1);
}
_vec.push_back(c);
};
~Container();
void printFromVec(const int& n){//print all objects of _vec starting from n;
_vec[n].printAllNext();
};
private:
vector<Class> _vec;
};
int main() {
Container c;
c.printFromVec(5);
}
The problem is that all _next pointers of Class objects are undefined or random.
I think the problem is with this part of code:
class Container{
public:
Container(){
for (int i = 0; i < 10; ++i) {
Class c;
c.setName(i);
if (i > 0) {
c.setNext(_vec,i-1);
}
_vec.push_back(c);
};
Debugging I noticed that pointers of already created objects change their values.
What is the problem? How can I make it work?
Although there really is an error in the code (likely a wrong copy-paste), the actual problem is the following: std::vector internally maintains a dynamically allocated array of objects. It starts with a certain initial capacity. When you push to the vector, it fills entries of that array. When all entries are filled and you push more elements, the vector allocates a bigger chunk of memory and moves or copies (whichever your element type supports) the objects to the new memory location. That is why the addresses of the objects change.
Now some words on what to do.
Solution 1. Use std::list instead of std::vector. std::list is a doubly linked list: an element, once added, lives inside its own list node and never changes its address, because there is no reallocation.
Solution 2. Use a vector of shared pointers. In this case you need to allocate each object dynamically and put its address into a shared pointer object; you can do both at once with std::make_shared(). Then you push the shared pointer to the vector and store a std::weak_ptr as the pointer to the previous/next one.
Solution 3. If you know the maximum number of elements the vector may ever have, you can leave everything as is but do one extra thing before the very first push: call reserve() on the vector with the maximum number of elements as the argument. The vector will allocate an array of that size and keep it until it is filled and more space is needed. Since you reserved the maximum size you ever expect to have, reallocation should never happen, and so the addresses of the objects will remain the same.
Choose whichever solution you think fits most for your needs.
@ivan.ukr offered a number of solutions for keeping the pointers stable. However, I believe that is the wrong problem to solve.
Why do we need stable pointers? So that Class objects can point to the previous object in a container.
Why do we need the pointers to previous? So we can iterate backwards.
That’s the real problem: iterating backwards from a point in the container. The _next pointer is an incomplete solution to the real problem which is iteration.
If you want to iterate a vector, use iterators. You can read about them on the cppreference page for std::vector. I don’t want to write the code for you but I’ll give you some hints.
To get an iterator referring to the ith element, use auto iter = _vec.begin() + i;.
To print the object that this iterator refers to, use iter->print() (you’ll have to rename printAllNext to print and have it just print this object).
To move an iterator backwards, use --iter.
To check if an iterator refers to the first element, use iter == _vec.begin().
You could improve this further by using reverse iterators but I’ll leave that up to you.

Create Dynamically Allocated Array with Pointers to Structs C++

So I currently have a simple struct (linkedlist) that I will be using in a HashMap:
struct Node {
std::string key, value;
Node* head;
}
I'm currently trying to dynamically allocate an array with pointers to each struct. This is what I have right now ...
Node* nodes = new Node[100];
I understand this allocates an array of 100 nodes in memory (which I will have to delete later on); however, upon iteration to try to traverse these nodes (which I am implementing as a linked list)...
for (int x = 0; x < 100; x++) {
Node current = nodes[x]; // Problem is I wanted an array to node pointers. This is not a pointer.
while (current != nullptr) { // this isn't even legal since current is not a pointer.
// DO STUFF HERE
current = current.next; // This is not a pointer access to a method. I'm looking to access next with current->next;
}
}
Hopefully I was clear enough. Can someone explain how to allocate a dynamic array of pointers to structs? So far I'm able to dynamically allocate an array of structs, just not an array of pointers to structs.
There are two approaches. Either you allocate an array of structures and introduce one more pointer that points to the element of the array that plays the role of the head.
For example
Node *head = nodes;
(in this case head points to nodes[0])
When the list is no longer needed, you have to delete it using the operator
delete [] nodes;
Or you can indeed allocate an array of pointers to the structure like this
Node **nodes = new Node *[100];
But in this case each element of the array in turn should be a pointer to a dynamically allocated object.
To delete the list, you first have to delete each object pointed to by the elements of the array, for example in a loop
for ( int i = 0; i < 100; i++ ) delete nodes[i];
and then delete the array itself
delete [] nodes;
It is a good idea to zero-initialize each element of the array when the array is allocated, for example
Node **nodes = new Node *[100]();
I suggest this structure:
class myList {
struct Node {
string value;
Node* next;
};
/*Public methods .. Add/Set/Get/Next/isEmpty.. etc ... */
Node* head, *tail;
};
in main:
myList* lis = new myList[number];
Then you have number lists, and you do all the work in the class through its methods and operators. For example, if you want the next node, just call lis[0].getNext();
If you want to skip the current node, call lis[0].Next(); and so on.
That is how to do it; what you are trying to do looks like a C program!

Private array of adjacent node addresses in C++

////EDIT #2: Deleted all the previous info and just post the working code now. Previous question became too lengthy:
#include <iostream>
#include <vector>
using namespace std;
template<class T>
class Node{
T data;
vector<Node<T>*> adjacent;
friend class Graph;
public:
int n;
Node(T initData) : data(initData), n(0){}
void addAdjacent(Node<T>& other){
adjacent.push_back(&other);
n++;
}
T getData(){
return data;
}
Node<T>* getEdge(int edgeNum){
return adjacent[edgeNum];
}
};
template<class T>
class GraphCl{
int n;
vector<Node<T>*> nodes;
T input;
public:
GraphCl(int size): n(size){
for (int i=0;i<n;i++){
cout << "Enter data for node " << i << ": ";
cin >> input;
nodes.push_back(new Node<T>(input)) ;
}
}
void addEdge(int baseNode, int edgeNode){
nodes[baseNode]->addAdjacent(*nodes[edgeNode]);
}
void printGraph(){
for (int i=0;i<n;i++){
Node<T> *base = nodes[i];
cout << "Data of node " << i <<": "<< base->getData() <<endl;
for (int j=0;j<base->n;j++){
cout << "Edge #"<< j+1 << " of node " << i << ": " << base->getEdge(j) <<endl;
}
}
}
};
int main(){
GraphCl<int> *myGraph = new GraphCl<int>(5);
myGraph->addEdge(0,1);
myGraph->addEdge(0,2);
myGraph->addEdge(0,3);
myGraph->addEdge(0,4);
myGraph->addEdge(3,1);
myGraph->addEdge(3,0);
myGraph->printGraph();
return 0;
}
Output:
Enter data for node 0: -34
Enter data for node 1: 12
Enter data for node 2: 56
Enter data for node 3: 3
Enter data for node 4: 23
Data of node 0: -34
Edge #1 of node 0: 0x7fbeebd00040
Edge #2 of node 0: 0x7fbeebd00080
Edge #3 of node 0: 0x7fbeebe00000
Edge #4 of node 0: 0x7fbeebd000d0
Data of node 1: 12
Data of node 2: 56
Data of node 3: 3
Edge #1 of node 3: 0x7fbeebd00040
Edge #2 of node 3: 0x7fbeebd00000
Data of node 4: 23
As you can see this simple implementation is working. I decided to just cut out all the complicated stuff and keep it simple with dynamically changing vectors. Obviously less efficient but I can work from here on. Since I am new with C++ the previous implementation just got my head spinning 360 degrees thinking about where all the pointers to pointers went, without even thinking about memory allocation. The above code basically is a directed graph that is very sensitive to input errors, so I got to work on it still.
Thanks for all the help!
Accessibility
Regarding the accessibility of the array to the Graph, the closest thing to the current implementation is to declare Graph as a friend of Node. Simply add:
friend Graph;
To the end of the Node class declaration.
That said, making a class as a friend is sometimes a sign that the API you defined isn't exactly right if classes need to know too much about each others' implementation details. You can alternatively provide an interface for Node such as:
void AddAdjacent(Node* other);
Managing Adjacent Nodes
If you want your adjacent pointer array to be growable, then you are basically re-creating std::vector, so I would suggest using std::vector<Node*>. Initializing a vector with the default (empty) constructor would take care of it, and a nodes[baseNode]->adjacent.push_back(...) would be all you need in addEdge.
If memory is not a consideration and you have a maximal number of nodes in the graph, you can instantiate a constant-sized array.
If you really don't want to use std::vector, but you actually want a growable array of pointers, then you'll have to manage your own malloc and free calls. I'll write something up to that effect, but my advice is to just go ahead with vector.
In case you are curious, the array approach would look something like:
template<class T>
class Node : public Graph{
    Node **adjacent; // pointer to an array of POINTERS TO adjacent Nodes
    int n;
    size_t capacity;
    T data;
    friend Graph;
public:
    // needs <cstdlib> for malloc/free and <cstring> for memcpy
    Node(T initData) : n(0), capacity(8), data(initData) {
        adjacent = static_cast<Node**>(malloc(capacity * sizeof(Node*)));
    }
    ~Node() {
        free(adjacent);
    }
    void Grow() {
        size_t new_cap = capacity * 2;
        Node **copy = static_cast<Node**>(malloc(new_cap * sizeof(Node*)));
        // copy and adjacent do not overlap, so memcpy is fine; the size is in bytes
        memcpy(copy, adjacent, capacity * sizeof(Node*));
        free(adjacent);
        adjacent = copy;
        capacity = new_cap;
    }
};
And the insertion:
Node<T>& base = nodes[baseNode];
Node<T>* edge = &(nodes[edgeNode]);
if (base.capacity == base.n) base.Grow();
base.adjacent[base.n++] = edge;
Answering the updated question
There are a few issues with putting Nodes directly in a std::vector in your case.
Using a std::vector is great for many things, but if you do, you should make sure not to take pointers to the vector's elements. Remember, pointers refer to exact addresses in memory where an object is stored. A vector is a growable container of elements. To store elements contiguously, the vector allocates a bunch of memory, puts objects there, and if it has to grow, it will allocate more memory and move the objects around. It is essentially doing something similar to what you are doing in your Node and Grow (except that, in its case, it's explicitly destroying the objects before freeing the old memory).
Notice that your Grow function allocates new memory and copies the pointers. Similarly, vectors can allocate new memory and copy the data over. This means that holding pointers to data in a vector is bad. The only guarantee a vector gives you is that its data will continue to be accessible using array-style indexing, find, iteration, etc., not that the data will exist in the same memory location forever.
Explaining the exact bug you are seeing
The vector is invoking a copy constructor. The default copy constructor copies every field one-by-one. This is not what you want in the case of Node, because then you have two Nodes that think they own the same Node** adjacent memory location. When the first node (the old copy) is destroyed, it frees its adjacent array (which is the same array the new copy points to). When the new copy is destroyed, it attempts to free that same memory location, but it has already been freed. You also have the problem that any attempt to access that memory through the new copy after the first node has been destroyed will get you into trouble.
Why was this bug showing up when you were only adding nodes?
When a vector grows past its current capacity, it needs to resize. In most implementations, the process is roughly:
1. Allocate a bunch more memory (usually twice the old capacity)
2. Invoke the copy constructor to copy elements from the old location to the new location
3. Destroy the elements in the old location (say, by explicitly calling the destructor)
4. Insert the new element in the new location
Your bug is showing up because of steps 2 and 3, basically.
Fixing this particular bug
For your case, the default copy constructor is no good because copying a node should mean a deep copy of all of the data. A regular copy in C++ will copy all of the data on the class or struct itself. If the data is a pointer, then the pointer is copied, not the thing it's pointing to.
Override the copy constructor and assignment operator:
Node(const Node<T>& other) : n(other.n), capacity(other.capacity), data(other.data) {
    adjacent = static_cast<Node**>(malloc(capacity * sizeof(Node*)));
    memcpy(adjacent, other.adjacent, capacity * sizeof(Node*));
}
Node<T>& operator= (const Node<T>& other) {
    if (this != &other) { // guard against self-assignment
        Node **copy = static_cast<Node**>(malloc(other.capacity * sizeof(Node*)));
        memcpy(copy, other.adjacent, other.capacity * sizeof(Node*));
        free(adjacent); // release the old array so it does not leak
        adjacent = copy;
        data = other.data;
        capacity = other.capacity;
        n = other.n;
    }
    return *this; // the original omitted the return statement
}
A Bigger Problem
A bigger problem with your code is the combination of a std::vector and pointers to its elements. Choose one of:
Use a fixed-size array (which is stable in memory), and point to these objects
Forget about pointers altogether, and make your adjacency list a list of indices into the vector (it's less performant, as you need to go through the vector each time, but that likely won't be your bottleneck for now)