What's the difference between these two classes? - c++

Below, I'm not declaring my_ints as a pointer. I don't know where the memory will be allocated. Please educate me here!
#include <iostream>
#include <vector>
class FieldStorage
{
private:
std::vector<int> my_ints;
public:
FieldStorage()
{
my_ints.push_back(1);
my_ints.push_back(2);
}
void displayAll()
{
for (int i = 0; i < my_ints.size(); i++)
{
std::cout << my_ints[i] << std::endl;
}
}
};
And in here, I'm declaring the field my_ints as a pointer:
#include <iostream>
#include <vector>
class FieldStorage
{
private:
std::vector<int> *my_ints;
public:
FieldStorage()
{
my_ints = new std::vector<int>();
my_ints->push_back(1);
my_ints->push_back(2);
}
void displayAll()
{
for (int i = 0; i < my_ints->size(); i++)
{
std::cout << (*my_ints)[i] << std::endl;
}
}
~FieldStorage()
{
delete my_ints;
}
};
main() function to test:
int main()
{
FieldStorage obj;
obj.displayAll();
return 0;
}
Both of them produces the same result. What's the difference?

In terms of memory management, these two classes are virtually identical. Several other responders have suggested that there is a difference between the two in that one is allocating storage on the stack and other on the heap, but that's not necessarily true, and even in the cases where it is true, it's terribly misleading. In reality, all that's different is where the metadata for the vector is allocated; the actual underlying storage in the vector is allocated from the heap regardless.
It's a little bit tricky to see this because you're using std::vector, so the specific implementation details are hidden. But basically, std::vector is implemented like this:
template <class T>
class vector {
public:
vector() : mCapacity(0), mSize(0), mData(0) { }
~vector() { if (mData) delete[] mData; }
...
protected:
int mCapacity;
int mSize;
T *mData;
};
As you can see, the vector class itself only has a few members -- capacity, size and a pointer to a dynamically allocated block of memory that will store the actual contents of the vector.
In your example, the only difference is where the storage for those few fields comes from. In the first example, the storage is allocated from whatever storage you use for your containing class -- if it is heap allocated, so too will be those few bits of the vector. If your container is stack allocated, so too will be those few bits of the vector.
In the second example, those bits of the vector are always heap allocated.
In both examples, the actual meat of the vector -- the contents of it -- are allocated from the heap, and you cannot change that.
Everybody else has pointed out already that you have a memory leak in your second example, and that is also true. Make sure to delete the vector in the destructor of your container class.

You have to release ( to prevent memory leak ) memory allocated for vector in the second case in the FieldStorage destructor.
FieldStorage::~FieldStorage()
{
delete my_ints;
}

As Mykola Golubyev pointed out, you need to delete the vector in the second case.
The first will possibly build faster code, as the optimiser knows the full size of FieldStorage including the vector, and could allocate enough memory in one allocation for both.
Your second implementation requires two separate allocations to construct the object.

I think you are really looking for the difference between the Stack and the Heap.
The first one is allocated on the stack while the second one is allocated on the heap.

In the first example, the object is allocated on the stack.
The the second example, the object is allocated in the heap, and a pointer to that memory is stored on the stack.

the difference is that the second allocates the vector dynamically. there are few differences:
you have to release memory occupied by the vector object (the vector object itself, not the object kept in the vector because it is handled correctly by the vector). you should use some smart pointer to keep the vector or make (for example in the destructor):
delete my_ints;
the first one is probably more efficient because the object is allocated on the stack.
access to the vector's method have different syntax :)

The first version of FieldStorage contains a vector. The size of the FieldStorage class includes enough bytes to hold a vector. When you construct a FieldStorage, the vector is constructed right before the body of FieldStorage's constructor is executed. When a FieldStorage is destructed, so is the vector.
This does not necessarily allocate the vector on the stack; if you heap-allocate a FieldStorage object, then the space for the vector comes from that heap allocation, not the stack. If you define a global FieldStorage object, then the space for the vector comes from neither the stack nor the heap, but rather from space designated for global objects (e.g. the .data or .bss section on some platforms).
Note that the vector performs heap allocations to hold the actual data, so it is likely to only contain a few pointers, or a pointer and couple of lengths, but it may contain whatever your compiler's STL implementation needs it to.
The second version of FieldStorage contains a pointer to a vector. The size of the FieldStorage class includes room for a pointer to a vector, not an actual vector. You are allocating storage for the vector using new in the body of FieldStorage's constructor, and you leaking that storage when FieldStorage is destructed, because you didn't define a destructor that deletes the vector.

First way is the prefereable one to use, you don't need pointer on vector and have forgot to delete it.

In the first case, the std::vector is being put directly into your class (and it is handling any memory allocation and deallocation it needs to grow and shrink the vector and free the memory when your object is destroyed; it is abstracting the memory management from you so that you don't have to deal with it). In the second case, you are explicitly allocating the storage for the std::vector in the heap somewhere, and forgetting to ever delete it, so your program will leak memory.

The size of the object is going to be different. In the second case the Vector<>* only takes up the size of a pointer (4bytes on 32bit machines). In the first case, your object will be larger.

One practical difference is that in your second example, you never release either the memory for the vector, or its contents (because you don't delete it, and therefore call its destructor).
But the first example will automatically destroy the vector (and free its contents) upon destruction of your FieldStorage object.

Related

Stack vs Heap - Should objects inside *vector be declared as pointers?

If I use this line
std:vector<MyObject>* vec = new std::vector<MyObject>(100);
If I create MyObject's in the stack and add them to the vector, they remain in the stack, right ?
MyObject obj1;
vec->push_back(obj1);
So if it go to the stack, than MyObject's added to the vector will be gone after method ends ? What will I have inside the vector than? Garbage?
Should I use this instead ?:
std:vector<MyObject*>* vec = new std::vector<MyObject*>(100);
And if so, what about objects and primitives inside each MyObject ?
Should they also be dynamically created ?
Thank You
The std:vector as any other Standard Library container copies elements into itself, so it owns them. Thus, if you have a dynamically allocated std::vector the elements that you .push_back() will be copied into the memory managed by the std::vector, thus they will be copied onto the heap.
As a side note, in some cases std::vector may move elements if it is safe to do so, but the effect is the same - in the end, all the elements are under std::vector's jurisdiction.
A std::vector<MyObject> looks something like this (in reality it's much more complex):
struct Vector {
MyObject* data;
int size;
}
As you can see, the data is not directly inside the vector object. The vector always allocates the memory for the data on the heap. Here's what happens when:
you call .push_back: The vector copies the object into its own data block (which is on the heap and owned by the vector)
you copy the vector: The copied vector allocates new memory and copies all data from the existing vector into it
As you can see, the vector owns his data. That means, if you push_back a object into it, it doesn't matter where it came from because it gets copied.
If you have a std::vector<MyObject>* you have a pointer to a vector. This vector also owns his data, but you only have a pointer to it. That means, you have to delete it exactly once, otherwise you'll get a memory leak or a crash. Passing around pointers to a vector is OK, but need one class or function that "owns" it. This class/function has to guarantee that the vector still exists when the pointer to it is used.
The third case is a std::vector<MyObject*>. As every vector, this one also owns his data. But this time, the data is only a pointer. So the vector only owns the pointer, but not the objects to which the pointers are pointing. If you do something like this:
std::vector<MyObject*> getObjects() {
MyObject obj1("foo");
MyObject obj2("bar");
std::vector<MyObject*> vec;
vec.push_back(&obj1);
vec.push_back(&obj2);
return vec;
}
The returned vector only contains garbage because you only saved the address to a object on the stack to it. These objects are destroyed when the function returns, but the pointers in the vector are still pointing to that memory block on the stack.
Additionally, keep in mind that this
std:vector<MyObject*>* vec = new std::vector<MyObject*>(100);
Doesn't store heap allocated objects to the vector. It just stores pointers of type MyObject. If you want to do something like that remember to create the objects first, before you use them. You can do that with something like the following:
for (int i = 0; i < vec->size(); i++) {
vec->at(i) = new MyObject();
}

Free storage for pointers stored in a heap allocated vector

I have a unique_ptr pointing to a heap allocated vector, that itself stores pointers to dynamically allocated objects. The problem is I think the unique_ptr frees the dynamically allocated vector itself, but this vector's destructor doesn't free the dynamically allocated objects. So, how to solve this problem without changing the implementation of the program? I know if I used a shared_ptr I can use a custom destructor to free up storage, but I have to stick with unique_ptr for specific reasons.
So, can the vector have a custom destructor like shared_ptr has, that can free the stored objects?
std::unique_ptr<vector<Test*>>
You have a misunderstanding about how types work in C++. Every type has a fixed size. You can get this size with sizeof. For example, on a typical platform, a vector<Test*> has a size of 24 bytes.
When you allocate a vector on the stack, you are only allocating those 24 bytes on the stack. When you add objects to the vector, the vector will allocate memory from the heap to store those objects.
like I said I have to stick with this implementation since it save more stack memory than storing vector on the stack.
It won't. So you don't have to stick with this implementation. Switch to a sensible one such as std::vector<std::unique_ptr<Test>>.
And I dont want objects to be copied or moved to prevent moving objects and keeping hollow ones that take more space
A sensible implementation (such as std::vector<std::unique_ptr<Test>>) won't move the underlying objects. Move semantics for std::vector are already optimal on any sensible platform.
You are expecting moving a std::unique_ptr to be significantly different from moving a std::vector. Your reasoning for that assumption is faulty and, even if it was correct, this would still be a pointless increase in complexity.
In fact, why not just use vector<Test>?
Check this out:
#include <iostream>
#include <vector>
int main()
{
std::vector<int> a;
for (int i = 0; i < 10; ++i)
a.push_back(i);
std::cout << &a[0] << std::endl;
std::vector<int> b = std::move(a);
std::cout << &b[0] << std::endl;
}
Output:
0x563caf0f1f20
0x563caf0f1f20
Notice that moving the vector didn't even move the objects in the vector. It just transferred ownership of the objects from one vector to the other.
Stop trying to optimize things that are already optimized.

std::vector internals

How is std::vector implemented, using what data structure? When I write
void f(int n) {
std::vector<int> v(n);
...
}
Is the vector v allocated on stack?
The vector object will be allocated on the stack and will internally contain a pointer to beginning of the elements on the heap.
Elements on the heap give the vector class an ability to grow and shrink on demand.
While having the vector on the stack gives the object the benefit of being destructed when going out of scope.
In regards to your [] question, the vector class overloads the [] operator. I would say internally it's basically doing something like this when you do array[1]:
return *(_Myfirst+ (n * elementSize))
Where the vector keeps track of the start of it's internal heap with _Myfirst.
When your vector starts to fill up, it will allocate more memory for you. A common practice is to double the amount of memory needed each time.
Note that vector inherits from _Vector_val, which contains the following members:
pointer _Myfirst; // pointer to beginning of array
pointer _Mylast; // pointer to current end of sequence
pointer _Myend; // pointer to end of array
_Alty _Alval; // allocator object for values
Your v is allocated in automatic memory. (commonly known as the stack, yes)
The implementation details are not specified, but most commonly it's implemented using a dynamic array which is resized if you attempt to add more elements than the previous allocation can hold.
The standard only specifies the interface (which methods it should have) and execution time boundaries.
Since vector is a template, the implementation is visible, so locate your <vector> file and start inspecting.
void f(int n) {
std::vector<int> v(n);
...
}
The vector above has automatic storage duration and is allocated on the stack. However, the data array that the vector manages is allocated on the heap.
The internals of the vector are implementation specific, but a typical implementation will contain 3 pointers; one each to keep track of start of array, size of vector and capacity of vector.

C++ delete vector, objects, free memory

I am totally confused with regards to deleting things in C++. If I declare an array of objects and if I use the clear() member function. Can I be sure that the memory was released?
For example :
tempObject obj1;
tempObject obj2;
vector<tempObject> tempVector;
tempVector.pushback(obj1);
tempVector.pushback(obj2);
Can I safely call clear to free up all the memory? Or do I need to iterate through to delete one by one?
tempVector.clear();
If this scenario is changed to a pointer of objects, will the answer be the same as above?
vector<tempObject> *tempVector;
//push objects....
tempVector->clear();
You can call clear, and that will destroy all the objects, but that will not free the memory. Looping through the individual elements will not help either (what action would you even propose to take on the objects?) What you can do is this:
vector<tempObject>().swap(tempVector);
That will create an empty vector with no memory allocated and swap it with tempVector, effectively deallocating the memory.
C++11 also has the function shrink_to_fit, which you could call after the call to clear(), and it would theoretically shrink the capacity to fit the size (which is now 0). This is however, a non-binding request, and your implementation is free to ignore it.
There are two separate things here:
object lifetime
storage duration
For example:
{
vector<MyObject> v;
// do some stuff, push some objects onto v
v.clear(); // 1
// maybe do some more stuff
} // 2
At 1, you clear v: this destroys all the objects it was storing. Each gets its destructor called, if your wrote one, and anything owned by that MyObject is now released.
However, vector v has the right to keep the raw storage around in case you want it later.
If you decide to push some more things into it between 1 and 2, this saves time as it can reuse the old memory.
At 2, the vector v goes out of scope: any objects you pushed into it since 1 will be destroyed (as if you'd explicitly called clear again), but now the underlying storage is also released (v won't be around to reuse it any more).
If I change the example so v becomes a pointer to a dynamically-allocated vector, you need to explicitly delete it, as the pointer going out of scope at 2 doesn't do that for you. It's better to use something like std::unique_ptr in that case, but if you don't and v is leaked, the storage it allocated will be leaked as well. As above, you need to make sure v is deleted, and calling clear isn't sufficient.
vector::clear() does not free memory allocated by the vector to store objects; it calls destructors for the objects it holds.
For example, if the vector uses an array as a backing store and currently contains 10 elements, then calling clear() will call the destructor of each object in the array, but the backing array will not be deallocated, so there is still sizeof(T) * 10 bytes allocated to the vector (at least). size() will be 0, but size() returns the number of elements in the vector, not necessarily the size of the backing store.
As for your second question, anything you allocate with new you must deallocate with delete. You typically do not maintain a pointer to a vector for this reason. There is rarely (if ever) a good reason to do this and you prevent the vector from being cleaned up when it leaves scope. However, calling clear() will still act the same way regardless of how it was allocated.
if I use the clear() member function. Can I be sure that the memory was released?
No, the clear() member function destroys every object contained in the vector, but it leaves the capacity of the vector unchanged. It affects the vector's size, but not the capacity.
If you want to change the capacity of a vector, you can use the clear-and-minimize idiom, i.e., create a (temporary) empty vector and then swap both vectors.
You can easily see how each approach affects capacity. Consider the following function template that calls the clear() member function on the passed vector:
template<typename T>
auto clear(std::vector<T>& vec) {
vec.clear();
return vec.capacity();
}
Now, consider the function template empty_swap() that swaps the passed vector with an empty one:
template<typename T>
auto empty_swap(std::vector<T>& vec) {
std::vector<T>().swap(vec);
return vec.capacity();
}
Both function templates return the capacity of the vector at the moment of returning, then:
std::vector<double> v(1000), u(1000);
std::cout << clear(v) << '\n';
std::cout << empty_swap(u) << '\n';
outputs:
1000
0
Move semantics allows for a straightforward way to release memory, by simply applying the assignment (=) operator from an empty rvalue:
std::vector<int> vec(100, 0);
std::cout << vec.capacity(); // 100
vec = vector<int>(); // Same as "vector<int>().swap(vec)";
std::cout << vec.capacity(); // 0
It is as much efficient as the "swap()"-based method described in other answers (indeed, both are conceptually doing the same thing). When it comes to readability, however, the assignment version makes a better job.
You can free memory used by vector by this way:
//Removes all elements in vector
v.clear()
//Frees the memory which is not used by the vector
v.shrink_to_fit();
If you need to use the vector over and over again and your current code declares it repeatedly within your loop or on every function call, it is likely that you will run out of memory. I suggest that you declare it outside, pass them as pointers in your functions and use:
my_arr.resize()
This way, you keep using the same memory sequence for your vectors instead of requesting for new sequences every time.
Hope this helped.
Note: resizing it to different sizes may add random values. Pass an integer such as 0 to initialise them, if required.

Does a vector have to store the size twice?

This is a rather academic question, I realise it matters little regarding optimization, but it is just out of interest.
From what I understand, when you call new[size], additional space is allocated to store the size of the allocated array. This is so when delete [] is called, it is known how much space can be freed.
What I've done is written how I think a vector would roughly be implemented:
#include <cstddef>
template <class T>
class Vector
{
public:
struct VectorStorage
{
std::size_t size;
T data[];
};
Vector(std::size_t size) : storage(new VectorStorage[size])
{
storage->size = size;
}
std::size_t size()
{
return storage->size;
}
~Vector()
{
delete[] storage;
}
private:
VectorStorage* storage;
};
As far as I can tell, size is stored twice. Once in the VectorStorage object directly (as it needs to be so the size() function can work) but again in a hidden way by the compiler, so delete[] can work.
It seems like size is stored twice. Is this the case and unavoidable, or is there a way to ensure that the size is only stored once?
std::vector does not allocate memory; std::allocator, or whatever allocator you give the vector, is what allocates the memory. The allocator interface is given the number of items to allocate/deallocate, so it is not required to actually store that.
Vectors don't usually store the size. The usual implementation keeps a pointer past the end of the last element and one past the end of the allocated memory, since one can reserve space without actually storing elements in it. However, there is no standard way to access the size stored by new (which may not be stored to begin with), so some duplication is usually needed.
Yes. But that's an implementation detail of the heap allocator that the compiler knows nothing about. It is almost certainly not the same as the capacity of the vector since the vector is only interested in the number of elements, not the number of bytes. And a heap block tends to have extra overhead for its own purposes.
You have size in the wrong place -- you're storing it with every element (by virtue of an array of VectorStorage), which also contains an array (per instance of VectorStorage) of T. Do you really mean to have an array inside an array like that? You're never cleaning up your array of T in your destructor either.