What triggered this question is some code along the line of:
std::vector<int> x(500);
std::vector<int> y;
std::swap(x,y);
And I was wondering if swapping the two requires twice the amount of memory that x needs.
On cppreference I found for std::vector::swap (which is the method that the last line effectively calls):
Exchanges the contents of the container with those of other. Does not invoke any move, copy, or swap operations on individual elements.
All iterators and references remain valid. The past-the-end iterator is invalidated.
And now I am more confused than before. What does std::vector::swap actually do when it does not move, copy or swap elements?
How is it possible that iterators stay valid?
Does that mean that something like this is valid code?
std::vector<int> x(500);
std::vector<int> y;
auto it = x.begin();
std::swap(x,y);
std::sort(it , y.end()); // iterators from different containers !!
vector internally stores (at least) a pointer to the actual storage for the elements, a size and a capacity.† std::swap just swaps the pointers, size and capacity (and ancillary data if any) around; no doubling of memory or copies of the elements are made because the pointer in x becomes the pointer in y and vice-versa, without any new memory being allocated.
The iterators for vector are generally lightweight wrappers around pointers to the underlying allocated memory (that's why capacity changes generally invalidate iterators), so iterators produced for x before the swap seamlessly continue to refer to y after the swap; your example use of sort is legal, and sorts y.
If you wanted to swap the elements themselves without swapping storage (a much more expensive operation, but one that leaves preexisting iterators for x referring to x), you could use std::swap_ranges, but that's a relatively uncommon use case. The memory usage for that would depend on the implementation of swap for the underlying object; by default, it would often involve a temporary, but only for one of the objects being swapped at a time, not the whole of one vector.
† Per the comments, it could equivalently use pointers to the end of the used space and the end of the capacity, but either approach is logically equivalent, and just microoptimizes in favor of slightly different expected use cases; storing all pointers optimizes for use of iterators (a reasonable choice), while storing size_type optimizes for .size()/.capacity() calls.
I will write a toy vector.
// uses <algorithm> (std::max), <cstddef> (std::size_t) and <utility> (std::swap)
struct toy_vector {
int * buffer = 0;
std::size_t valid_count = 0;
std::size_t buffer_size = 0;
int* begin() { return buffer; }
int* end() { return buffer+valid_count; }
std::size_t capacity() const { return buffer_size; }
std::size_t size() const { return valid_count; }
void swap( toy_vector& other ) {
std::swap( buffer, other.buffer );
std::swap( valid_count, other.valid_count );
std::swap( buffer_size, other.buffer_size );
}
That is basically it. I'll implement a few methods so you see we have enough tools to work with:
int& operator[](std::size_t i) { return buffer[i]; }
void reserve(std::size_t capacity) {
if (capacity <= buffer_size)
return;
toy_vector tmp;
tmp.buffer = new int[capacity];
for (std::size_t i = 0; i < valid_count; ++i)
tmp.buffer[i] = std::move(buffer[i]);
tmp.valid_count = valid_count;
tmp.buffer_size = capacity;
swap( tmp );
}
void push_back(int x) {
if (valid_count+1 > buffer_size)
reserve( (std::max)(buffer_size*3/2, buffer_size+1) );
buffer[valid_count] = x;
++valid_count;
}
// real dtor is more complex.
~toy_vector() { delete[] buffer; }
};
Actual vectors have to deal with exception safety, object lifetime and allocators (I'm using ints, so I don't care whether I properly construct/destroy them), and might store 3 pointers instead of a pointer and 2 sizes. But swapping those 3 pointers is just as easy as swapping the pointer and 2 sizes.
Iterators in real vectors tend not to be raw pointers (but they can be). As you can see above, the raw pointer iterators to the vector a become raw pointer iterators into vector b when you do an a.swap(b); non-raw pointer vector iterators are basically fancy wrapped pointers, and have to follow the same semantics.
The C++ standard does not explicitly mandate an implementation that looks like this, but it was based off an implementation that looks like this, and it implicitly requires one that is almost identical to this (I'm sure someone could come up with a clever standard compliant vector that doesn't look like this; but every vector in every standard library I have seen has looked like this.)
Related
Let's say we have a vector of double (values) and a vector containing objects that store pointers to the elements of the vector of double (vars):
class Var
{
public:
explicit Var(double* v) : _value(v) {};
~Var() {};
const double get() const { return *_value; };
void set(const double v) { *_value = v; };
private:
double* _value;
};
struct Vars
{
Vars()
{
//values.reserve(1000'000);
}
vector<double> values{};
vector<Var> vars{};
void push(const double v)
{
values.push_back(v);
vars.emplace_back(&values.back());
}
};
(objects after adding are never deleted)
It is known that when a vector reallocates objects, all pointers break.
I can pre-call reserve(), but the problem is that I don't know how many objects will be stored, maybe 5, or maybe 500'000 and shrink_to_fit() will break pointers anyway.
I can use a deque in this case, but I'm wondering if I can prevent the vector from reallocating memory when I call shrink_to_fit (), or some other way?
The key is
objects after adding are never deleted
Rather than store pointers/iterators/references to elements of the vector, store a pointer¹ to the vector and an index.
class Var
{
public:
explicit Var(std::vector<double> * vec, size_t index)
: vec(vec), index(index) {};
double get() const { return vec->at(index); };
void set(double v) { vec->at(index) = v; };
private:
std::vector<double> * vec;
size_t index;
};
struct Vars
{
vector<double> values;
vector<Var> vars;
void push(double v)
{
vars.emplace_back(&values, values.size());
values.push_back(v);
}
};
¹ pointer and not reference so we are a SemiRegular type (and can easily define == to be Regular)
std::vector does not allow you to return part of its buffer to the heap while keeping the rest; there is no API in the std::allocator "concept" to do that.
std::vector guarantees contiguous storage, so it cannot be allocated in pieces.
If you don't need contiguous storage, use either std::deque or a hand-rolled duplicate (deque leaves some performance-critical parameters free for implementations to choose and doesn't expose the ability to configure them to end-users; so if your deque isn't appropriate for your use case, you may have to roll your own).
std::deque is a dynamic array of fixed sized buffers, plus a small amount of end-management state. It allows for stable storage with a dynamic sized container and O(1) random access, and the container overhead is a small percentage of the amount stored (the scaling overhead is a pointer per block, and block size is many times larger than a pointer on every implementation I have seen).
It has the disadvantage compared to vector that the storage isn't contiguous, and that iterator/[] lookup has an extra pointer indirect, but it would be a strange contiguous storage container that fit the rest of your requirements.
Can I forbid vector to reallocate objects?
You can avoid reallocation by not performing operations which cause the vector to (potentially) reallocate.
Other than making the vector const, there is no way to prevent those operations from being performed. One solution is to make the vector a private member (and never return a pointer / reference to non-const pointing to that member), in which case you only need to be concerned that your member functions do not perform those operations.
the problem is that I don't know how many objects will be stored, maybe 5, or maybe 500'000
If you don't need the pointers in between insertions, you can perform all insertions in one go (regardless of how many), then get the pointers afterwards, and then no longer modify the vector.
In the general case where the above is not an option, if you need the pointers to remain valid, vector is not an appropriate choice of data structure. A list or deque would work, depending on how you use the container.
Regarding the example: Having a vector of objects, and vector of (wrappers containing) pointers to those objects seems to be rather pointless.
I wonder if the range constructor of std::vector does copy the data, or does it just reference it?
Have a look at this example:
vector<int> getVector() {
int arr[10];
for(int i=0; i<10; ++i) arr[i] = i;
return vector<int>(arr, arr+10);
}
Would this cause a bug (due to handing out a reference to the stack which is destroyed later) or is it fine, since it copies the data in the constructor?
Edit #1
For clarification: I'm looking for a more or less official resource that points out, which of the following pseudo code implementations of the constructor are valid. I know the signature of the constructor is different... but, you should get the idea.
Version A (just uses the given data internally)
template<typename T>
class vector {
private:
T* data;
int size;
public:
vector<T>(T* start, T* end) {
data = start;
size = (end - start);
}
};
Version B (explicitly copies the data)
template<typename T>
class vector {
private:
T* data;
int size;
public:
vector<T>(T* start, T* end) {
for(T* it = start; it < end; ++it) push_back(*it);
}
};
When in doubt, check the reference. The answer can be derived from Complexity section, although I'd agree there is no explicit confirmation:
Complexity: Makes only N calls to the copy constructor of T (where N
is the distance between first and last) and no reallocations if
iterators first and last are of forward, bidirectional, or random
access categories. It makes order N calls to the copy constructor of T
and order logN reallocations if they are just input iterators.
Like all constructors of std::vector<int>, this copies the integers. The same holds for methods like push_back and insert
This is why std::vector actually has two template arguments. The second one is defaulted to std::allocator; it's the allocator used to allocate memory for the 10 integers (and perhaps a few more so that the vector can grow - see capacity)
[Edit]
The actual code is most like Version B, but probably similar to
template<typename T>
class vector {
private:
T* _Data = nullptr;
size_t _Capacity = 0;
size_t _Used = 0;
public:
vector<T>(T* start, T* end) {
_Used = (end - start);
reserve(_Used); // Sets _Data, _Capacity
std::uninitialized_copy(start, end, _Data);
}
};
The C++ standard library is specified in a somewhat strange way.
It is specified saying what each method requires and what each method guarantees. It is not specified as in "vector is a container of values that it owns", even though that is the real underlying abstraction here.
Formally, what you are doing is safe not because "the vector copies", but because none of the preconditions of any of the methods of std vector are violated in the copy of the std vector your function returns.
Similarly, the values are set to be certain ones because of the postconditions of the constructor, and then the pre and post conditions of the copy constructor and/or C++17 prvalue "elision" rules.
But trying to reason about C++ code in this way is madness.
A std::vector semantically is a regular type with value semantics that owns its own elements. Regular types can be copied, and the copies behave sane even if the original object is destroyed.
Unless you make a std::vector<std::reference_wrapper<int>> you are safe, and you are unsafe for the reference wrapper because you stored elements which are not regular value types.
The vector cannot be defined as a vector of references, as for example std::vector<int &>. So the code is valid: the vector does not contain references to elements of the array. It creates new elements of type int (the template argument of the vector), copied from the array.
Today, I was attempting to extract a subset of N elements from a vector of size M, where N < M. I realised that I did not need to create a new copy, only needed to modify the original, and moreover, could take simply the first N elements.
After doing a few brief searches, there were many answers, the most attractive one being resize() which appears to truncate the vector down to length, and deal neatly with the memory issues of erasing the other elements.
However, before I came across vector.resize(), I was trying to point the vector.end() to the N+1'th position. I knew this wouldn't work, but I wanted to try it regardless. This would leave the other elements past the N'th position "stranded", and I believe (correct me if i'm wrong) this would be an example of a memory leak.
On looking at the iterator validity on http://www.cplusplus.com/reference/vector/vector/resize/,
we see that if it shrinks, vector.end() stays the same. If it expands, vector.end() will move (albeit irrelevant to our case).
This leads me to question, what is the underlying mechanic of vector.end()? Where does it lie in memory? It can be found incrementing an iterator pointing to the last element in the vector, eg auto iter = &vector.back(), iter++, but in memory, is this what happens?
I can believe that at all times, what follows vector.begin() should be the first element, but on resize, it appears that vector.end() can lie elsewhere other than past the last element in the vector.
For some reason, I can't seem to find the answer, but it sounds like a very basic computer science course would contain this information. I suppose it is stl specific, as there are probably many implementations of a vector / list that all differ...
Sorry for the long post about a simple question!
you asked about "the underlying mechanic of vector.end()". Well here is (a snippet of) an oversimplified vector that is easy to digest:
template <class T>
class Simplified_vector
{
public:
using iterator = T*;
using const_iterator = const T*;
private:
T* buffer_;
std::size_t size_;
std::size_t capacity_;
public:
auto push_back(const T& val) -> void
{
if (size_ + 1 > capacity_)
{
// buffer increase logic
//
// this usually means allocation a new larger buffer
// followed by coping/moving elements from the old to the new buffer
// deleting the old buffer
// and make `buffer_` point to the new buffer
// (along with modifying `capacity_` to reflect the new buffer size)
//
// strong exception guarantee makes things a bit more complicated,
// but this is the gist of it
}
buffer_[size_] = val;
++size_;
}
auto begin() const -> const_iterator
{
return buffer_;
}
auto begin() -> iterator
{
return buffer_;
}
auto end() const -> const_iterator
{
return buffer_ + size_;
}
auto end() -> iterator
{
return buffer_ + size_;
}
};
Also see this question Can std::vector<T>::iterator simply be T*? for why T* is a perfectly valid iterator for std::vector<T>
Now with this implementation in mind, let's address a few of the misconceptions in your questions:
I was trying to point the vector.end() to the N+1'th position.
This is not possible. The end iterator is not something that is stored directly in the class. As you can see, it is computed as the beginning of the buffer plus the size (number of elements) of the container, and you cannot manipulate it directly. The internal workings of the class make sure end() returns an iterator pointing 1 past the last element in the buffer. You cannot change this. What you can do is insert/remove elements from the container, and end() will reflect these changes.
and I believe (correct me if i'm wrong) this would be an example of a
memory leak.
You are wrong. Even if you could somehow make end() point somewhere other than where it is supposed to point, that wouldn't be a memory leak. A memory leak would be if you lost every reference to the dynamically allocated internal buffer.
The "end" of any contiguous container (like a vector or an array) is always one element beyond the last element of the container.
So for an array (or vector) of X elements the "end" is index X (remember that since indexes are zero-based the last index is X - 1).
This is very well illustrated in e.g. this vector::end reference.
If you shrink your vector, the last index will of course also change, meaning that the "end" will change as well. If the end-iterator does not change, then it means you have saved it from before you shrank the vector, which will change the size and invalidate all iterators beyond the last element in the vector, including the end iterator.
If you change the size of a vector, by adding new elements or by removing elements, then you must re-fetch the end iterator. The existing iterator objects you have will not automatically be updated.
Usually the end isn't stored in an implementation of vector. A vector stores:
A pointer to the first element. If you call begin(), this is what you get back.
The size of the memory block that's been managed. If you call capacity() you get back the number of elements that can fit in this allocated memory.
The number of elements that are in use. These are elements that have been constructed and are in the first part of the memory block. The rest of the memory is unused, but is available for new elements. If the entire capacity gets filled, to add more elements the vector will allocate a larger block of memory and copy all the elements into that, and deallocate the original block.
When you call end() this returns begin() + size(). So yes, end() is a pointer that points to one beyond the last element.
So the end() isn't a thing that you can move. You can only change it by adding or removing elements.
If you want to extract a number of elements 'N' you can do so by reading those from begin() to begin() + 'N'.
for( auto it = vec.begin(); it != vec.begin() + n; ++it )
{
// do something with the element (*it) here.
}
Many stl algorithms take a pair of iterators for the begin and end of a range of elements you want to work with. In your case, you can use vec.begin() and vec.begin() + n as the begin and end of the range you're interested in.
If you want to throw away the elements after n, you can do vec.resize(n). Then the vector will destruct elements you don't need. It might not change the size of the memory block the vector manages, the vector might keep the memory around in case you add more elements again. That's an implementation detail of the vector class you're using.
There are some questions quite similar around here, but they couldn't help me get my mind around it.
Also, I'm giving a full example code, so it might be easier for others to understand.
I have made a vector container (couldn't use stl for memory reasons) that used to use only operator= for push_back*, and once I came accross placement new, I decided to introduce an additional "emplace_back" to it**.
*(T::operator= is expected to deal with memory management)
**(the name is taken from a similar function in std::vector that I've encountered later, the original name I gave it was a mess).
I read some stuff about the danger of using placement new over operator new[] but couldn't figure out if the following is ok or not, and if not, what's wrong with it, and what should I replace it with, so I'd appreciate your help.
This is of couse a simplified code, with no iterators, and no extended functionality, but it makes the point :
template <class T>
class myVector {
public :
myVector(int capacity_) {
_capacity = capacity_;
_data = new T[_capacity];
_size = 0;
}
~myVector() {
delete[] _data;
}
bool push_back(T const & t) {
if (_size >= _capacity) { return false; }
_data[_size++] = t;
return true;
}
template <class... Args>
bool emplace_back(Args const & ... args) {
if (_size >= _capacity) { return false; }
_data[_size].~T();
new (&_data[_size++]) T(args...);
return true;
}
T * erase (T * p) {
//assert(/*p is not aligned*/);
if (p < begin() || p >= end()) { return end(); }
if (p == &back()) { --_size; return end(); }
*p = back();
--_size;
return p;
}
// The usual stuff (and more)
int capacity() { return _capacity; }
int size() { return _size; }
T * begin() { return _data; }
T * end() { return _data + _size; }
T const * begin() const { return _data; }
T const * end() const { return _data + _size; }
T & front() { return *begin(); }
T & back() { return *(end() - 1); }
T const & front() const { return *begin(); }
T const & back() const { return *(end() - 1); }
T & operator[] (int i) { return _data[i]; }
T const & operator[] (int i) const { return _data[i]; }
private:
T * _data;
int _capacity;
int _size;
};
Thanks
I read some stuff about the danger of using placement new over
operator new[] but couldn't figure out if the following is ok or not,
and if not, what's wrong with it [...]
For operator new[] vs. placement new, it's only really bad (as in typically-crashy type of undefined behavior) if you mix the two strategies together.
The main choice you typically have to make is to use one or the other. If you use operator new[], then you construct all the elements for the entire capacity of the container in advance and overwrite them in methods like push_back. You don't destroy them on removal in methods like erase, just kind of keep them there and adjust the size, overwrite elements, and so forth. You both construct and allocate a multiple elements all in one go with operator new[], and destroy and deallocate them all in one go using operator delete[].
Why Placement New is Used For Standard Containers
First thing to understand if you want to start rolling your own vectors or other standard-compliant sequences (that aren't simply linked structures with one element per node) in a way that actually destroys elements when they are removed, constructs elements (not merely overwrite them) when added, is to separate the idea of allocating the memory for the container and constructing the elements for it in place. So quite to the contrary, in this case, placement new isn't bad. It's a fundamental necessity to achieve the general qualities of the standard containers. But we can't mix it with operator new[] and operator delete[] in this context.
For example, you might allocate the memory to hold 100 instances of T in reserve, but you don't want to default construct them as well. You want to construct them in methods like push_back, insert, resize, the fill ctor, range ctor, copy ctor, etc. -- methods that actually add elements and not merely the capacity to hold them. That's why we need placement new.
Otherwise we lose the generality of std::vector which avoids constructing elements that aren't there, can copy construct in push_backs rather than simply overwriting existing ones with operator=, etc.
So let's start with the constructor:
_data = new T[_capacity];
... this will invoke the default constructors for all the elements. We don't want that (neither the default ctor requirement nor this expense), as the whole point of using placement new is to construct elements in-place of allocated memory, and this would have already constructed all elements. Otherwise any use of placement new anywhere will try to construct an already-constructed element a second time, and will be UB.
Instead you want something like this:
_data = static_cast<T*>(malloc(_capacity * sizeof(T)));
This just gives us a raw chunk of bytes.
Second, for push_back, you're doing:
_data[_size++] = t;
That's trying to use the assignment operator, and, after our previous modification, on an uninitialized/invalid element which hasn't been constructed yet. So we want:
new(_data + _size) T(t);
++size;
... that makes it use the copy constructor. It makes it match up with what push_back is actually supposed to do: creating new elements in the sequence instead of simply overwriting existing ones.
Your erase method needs some work even at the basic logic level if you want to handle removals from the middle of the container. But just from the resource management standpoint, if you use placement new, you want to manually invoke destructors for removed elements. For example:
if (p == &back()) { --_size; return end(); }
... should be more like:
if (p == &back())
{
--_size;
(_data + _size)->~T();
return end();
}
Your emplace_back manually invokes a destructor but it shouldn't do this. emplace_back should only add, not remove (and destroy) existing elements. It should be quite similar to push_back but simply invoking the move ctor.
Your destructor does this:
~myVector() {
delete[] _data;
}
But again, that's UB when we take this approach. We want something more like:
~myVector() {
for (int j=0; j < _size; ++j)
(_data + j)->~T();
free(_data);
}
There's still a whole lot more to cover like exception-safety which is a whole different can of worms.
But this should get you started with respect to proper usage of placement new in a data structure against some memory allocator (malloc/free in this exemplary case).
Last but not least:
(couldn't use stl for memory reasons)
... this might be an unusual reason. Your implementation doesn't necessarily use any less memory than a vector with reserve called in advance to give it the appropriate capacity. You might shave off a few bytes per container (not per element) with the choice of 32-bit integrals and no need to store an allocator, but it's going to be a very small memory savings in exchange for a boatload of work.
This kind of thing can be a useful learning exercise though to help you build some data structures outside the standard in a more standard-compliant way (ex: unrolled lists which I find quite useful).
I ended up having to reinvent some vectors and vector-like containers for ABI reasons (we wanted a container we could pass through our API that was guaranteed to have the same ABI regardless of what compiler was used to build a plugin). Even then, I would have much preferred simply using std::vector.
Note that if you just want to take control of how vector allocates memory, you can do that by specifying your own allocator with a compliant interface. This might be useful, for example, if you want a vector which allocates 128-bit aligned memory for use with aligned move instructions using SIMD.
I need a container that implements the following API (and need not implement anything else):
class C<T> {
C();
T& operator[](int); // must have reasonably sane time constant
// expand the container by default constructing elements in place.
void resize(int); // only way anything is added.
void clear();
C<T>::iterator begin();
C<T>::iterator end();
}
and can be used on:
class I {
public:
I();
private: // copy and assignment explicitly disallowed
I(I&);
I& operator=(I&);
}
Does such a beast exist?
vector<T> doesn't do it (resize moves) and I'm not sure how fast deque<T> is.
I don't care about allocation
Several people have assumed that the reason I can't do copies is memory allocation issues. The reason for the constraints is that the element type explicitly disallows copying and I can't change that.
Looks like I've got my answer: STL doesn't have one. But now I'm wondering Why not?
I'm pretty sure that the answer here is a rather emphatic "No". By your definition, resize() should allocate new storage and initialize with the default constructor if I am reading this correctly. Then you would manipulate the objects by indexing into the collection and manipulating the reference instead of "inserting" into the collection. Otherwise, you need the copy constructor and assignment operator. All of the containers in the Standard Library have this requirement.
You might want to look into using something like boost::ptr_vector<T>. Since you are inserting pointers, you don't have to worry about copying. This would require that you dynamically allocate all of your objects though.
You could use a container of pointers, like std::vector<T*>, if the elements cannot be copied and their memory is managed manually elsewhere.
If the vector should own the elements, something like std::vector< std::shared_ptr<T> > could be more appropriate.
And there is also the Boost Pointer Container library, which provides containers for exception safe handling of pointers.
Use deque: performance is fine.
The standard says, "deque is the data structure of choice when most insertions and deletions take place at the beginning or at the end of the sequence" (23.1.1). In your case, all insertions and deletions take place at the end, satisfying the criterion for using deque.
http://www.gotw.ca/gotw/054.htm has some hints on how you might measure performance, although presumably you have a particular use-case in mind, so that's what you should be measuring.
Edit: OK, if your objection to deque is in fact not, "I'm not sure how fast deque is", but "the element type cannot be an element in a standard container", then we can rule out any standard container. No, such a beast does not exist. deque "never copies elements", but it does copy-construct them from other objects.
Next best thing is probably to create arrays of elements, default-constructed, and maintain a container of pointers to those elements. Something along these lines, although this can probably be tweaked considerably.
template <typename T>
struct C {
vector<shared_array<T> > blocks;
vector<T*> elements; // lazy, to avoid needing deque-style iterators through the blocks.
T &operator[](size_t idx) { return *elements[idx]; }
void resize(size_t n) {
if (n <= elements.size()) { /* exercise for the reader */ }
else {
boost::shared_array<T> newblock(new T[n - elements.size()]);
blocks.push_back(newblock);
size_t old = elements.size();
// currently we "leak" newblock on an exception: see below
elements.resize(n);
for (size_t i = old; i < n; ++i) {
elements[i] = &newblock[i - old];
}
}
}
void clear() {
blocks.clear();
elements.clear();
}
};
As you add more functions and operators, it will approach deque, but avoiding anything that requires copying of the type T.
Edit: come to think of it, my "exercise for the reader" can't be done quite correctly in cases where someone does resize(10); resize(20); resize(15);. You can't half-delete an array. So if you want to correctly reproduce container resize() semantics, destructing the excess elements immediately, then you will have to allocate the elements individually (or get acquainted with placement new):
template <typename T>
struct C {
deque<shared_ptr<T> > elements; // or boost::ptr_deque, or a vector.
T &operator[](size_t idx) { return *elements[idx]; }
void resize(size_t n) {
size_t oldsize = elements.size();
elements.resize(n);
if (n > oldsize) {
try {
for (size_t i = oldsize; i < n; ++i) {
elements[i] = shared_ptr<T>(new T());
}
} catch(...) {
// closest we can get to strong exception guarantee, since
// by definition we can't do anything copy-and-swap-like
elements.resize(oldsize);
throw;
}
}
}
void clear() {
elements.clear();
}
};
Nicer code, not so keen on the memory access patterns (but then, I'm not clear whether performance is a concern or not since you were worried about the speed of deque.)
As you've discovered, all of the standard containers are incompatible with your requirements. If we can make a couple of additional assumptions, it wouldn't be too hard to write your own container.
The container will always grow - resize will always be called with a greater number than previously, never lesser.
It is OK for resize to make the container larger than what was asked for; constructing some number of unused objects at the end of the container is acceptable.
Here's a start. I leave many of the details to you.
class C<T> {
C();
~C() { clear(); }
T& operator[](int i) // must have reasonably sane time constant
{
return blocks[i / block_size][i % block_size];
}
// expand the container by default constructing elements in place.
void resize(int n) // only way anything is added.
{
for (int i = (int)blocks.size(); i < (n + block_size - 1) / block_size; ++i)
{
blocks.push_back(new T[block_size]);
}
current_size = n;
}
void clear()
{
for (typename vector<T*>::iterator i = blocks.begin(); i != blocks.end(); ++i)
delete[] *i;
blocks.clear();
current_size = 0;
}
C<T>::iterator begin();
C<T>::iterator end();
private:
vector<T*> blocks;
int current_size;
static const int block_size = 1024; // choose a size appropriate to T
};
P.S. If anybody asks you why you want to do this, tell them you need an array of std::auto_ptr. That should be good for a laugh.
All the standard containers require copyable elements, at the very least because push_back and insert copy the element passed to them. I don't think you can get away with std::deque because even its resize method takes a parameter to be copied when filling in the new elements.
To use a completely non-copyable class in the standard containers, you would have to store pointers to those objects. That can sometimes be a burden but usage of shared_ptr or the various boost pointer containers can make it easier.
If you don't like any of those solutions then take a browse through the rest of boost. Maybe there's something else suitable in there. Perhaps intrusive containers?
Otherwise, if you don't think any of that suits your needs then you could always try to roll your own container that does what you want. (Or else do more searching to see if anyone else has ever made such a thing.)
You shouldn't pick a container based on how it handles memory. deque for example is a double-ended queue, so you should only use it when you need a double-ended queue.
Pretty much every container will allocate memory if you resize it! Of course, you could change the capacity up front by calling vector::reserve. The capacity is the number of physical elements in memory, the size is how many you are actively using.
Obviously, there will still be an allocation if you grow past your capacity.
Look at ::boost::array. It doesn't allow the container to be resized after creating it, but it doesn't copy anything ever.
Getting both resize and no copying is going to be a trick. I wouldn't trust a ::std::deque because I think maybe it can copy in some cases. If you really need resizing, I would code your own deque-like container. Because the only way you're going to get resizing and no copying is to have a page system like ::std::deque uses.
Also, having a page system necessarily means that element access (at / operator[]) isn't going to be quite as fast as it would be for ::std::vector and ::boost::array with their contiguous memory layout, even though it can still be fairly fast.