In theory, given:
std::vector<int> X(0);
Then X will allocate memory from the stack for itself, but is it guaranteed not to allocate heap memory?
In other words, since implementations will generally use a pointer for the vector, is this pointer always initially 0?
Note: this is not the same as Initial capacity of vector in C++, since that question asks about the capacity when no argument is passed to the constructor, not about guarantees on heap allocation when the capacity is 0. The fact that the capacity can be non-zero in that case illustrates the difference.
That constructor calls explicit vector( size_type count ) which does:
Constructs the container with count default-inserted instances of T.
No copies are made.
The only guarantee you get is that the vector will be empty and that its size() will be 0. Implementations are allowed to allocate whatever they want for bookkeeping on initialization.
So if your question is whether you can count on X taking up 0 bytes of free-store space, then the answer is no.
Related
According to https://en.cppreference.com/w/cpp/container/vector/data, the underlying pointer is not guaranteed to be nullptr if the size is 0
If size() is 0, data() may or may not return a null pointer.
but does that apply to a default-initialized std::vector with no elements, or does it simply mean that if all elements of the vector are erased, the underlying pointer may not become nullptr?
Consider the following line:
std::vector<int> fooArr;
int* fooArrPtr = fooArr.data();
is it guaranteed that fooArrPtr is equal to nullptr?
No.
The standard guarantees that a default-initialised std::vector has size() equal to zero, but that doesn't require that data() will return nullptr.
There is nothing in the standard that prevents (and "not preventing" is not equivalent to "requiring") a default-constructed vector from having zero .size() and non-zero .capacity(). In such a case, it would be feasible for .data() to return a pointer to the allocated memory (which would be non-null, but dereferencing it would still have undefined behaviour since .size() is zero [and allocated capacity is not required to be initialised]).
If you want to test if a vector has zero size, use .size() or .empty(). Don't call .data() and compare the result with nullptr.
When an empty vector is instantiated, typical implementations allocate no memory (though, as noted above, the standard does not guarantee this).
Clearing all the elements doesn't mean that the vector releases its internal memory; the allocation remains available for when new elements are inserted. The memory is released only when the destructor is called.
To make a long story short, you cannot use data() to check whether a vector is empty.
Furthermore, it is not even advisable to check whether a vector has ever held elements during its existence by testing whether data() returns nullptr. You cannot be sure that the internal implementation grants you this property.
According to cppref:
std::allocator<T>::allocate_at_least
Allocates count * sizeof(T) bytes of uninitialized storage, where
count is an unspecified integer value not less than n, by calling
::operator new (an additional std::align_val_t argument might be
provided), but it is unspecified when and how this function is called.
Then, this function creates an array of type T[count] in the storage
and starts its lifetime, but does not start lifetime of any of its
elements.
However, I think the already existing std::allocator<T>::allocate can do the same thing.
Why do we need std::allocator<T>::allocate_at_least in C++23?
allocate may allocate more elements than were requested but it has no way to return to its caller the actual allocated size.
This is the purpose of allocate_at_least. Its implementation might be the same as allocate, and it might allocate exactly the same number of elements; the difference is that it returns the number of elements actually allocated to the caller, which can then make use of those extra elements if necessary.
allocate_at_least does not do the same thing as allocate. Compare (allocate):
Allocates n * sizeof(T) bytes of uninitialized storage...
with (allocate_at_least):
Allocates count * sizeof(T) bytes of uninitialized storage, where count is an unspecified integer value not less than n...
Moreover, allocate returns:
Pointer to the first element of an array of n objects of type T...
While allocate_at_least returns:
std::allocation_result<T*>{p, count}, where p points to the first element of an array of count objects of type T...
The caller thus gets the information about the actually allocated size.
The motivation can be found in P0401R6; Section Motivation:
Consider code adding elements to vector:
std::vector<int> v = {1, 2, 3};
// Expected: v.capacity() == 3
// Add an additional element, triggering a reallocation.
v.push_back(4);
Many allocators only allocate in fixed-sized chunks of memory, rounding up requests. Our underlying heap allocator received a request for 12 bytes (3 * sizeof(int)) while constructing v. For several implementations, this request is turned into a 16 byte region.
This comes from notes of cppref:
allocate_at_least is mainly provided for contiguous containers, e.g. std::vector and std::basic_string, in order to reduce reallocation by making their capacity match the actually allocated size when possible.
The "unspecified when and how" wording makes it possible to combine or
optimize away heap allocations made by the standard library
containers, even though such optimizations are disallowed for direct
calls to ::operator new. For example, this is implemented by libc++.
After calling allocate_at_least and before construction of elements, pointer arithmetic of T* is well-defined within the allocated array, but the behavior is undefined if elements are accessed.
As we know, std::vector, when initialized like std::vector vect(n) or empty_vect.resize(n), not only allocates the required amount of memory but also initializes it with a default value (i.e. calls the default constructor). This leads to unnecessary initialization, especially if I have an array of integers and I'd like to fill it with specific values that cannot be provided via any vector constructor.
Capacity, on the other hand, allocates the memory in a call like empty_vect.reserve(n), but in this case the vector is still empty. So size() returns 0, empty() returns true, and at() throws an exception (operator[] on an out-of-range index is undefined behaviour, not an exception).
Now, please look into the code:
{ // My scope starts here...
std::vector<int> vect;
vect.reserve(n);
int *data = vect.data();
// Here I know the size 'n' and I also have data pointer so I can use it as a regular array.
// ...
} // Here ends my scope, so vector is destroyed, memory is released.
The question is if "so I can use it as array" is a safe assumption?
Arguments aside, I am just curious about the above question. Anyway, as for the motivation:
It allocates memory and automatically frees it on any return from function
The code does not perform unnecessary data initialization (which may affect performance in some cases)
No, you cannot use it.
The standard (current draft, equivalent wording in C++11) says in [vector.data]:
constexpr T* data() noexcept;
constexpr const T* data() const noexcept;
Returns: A pointer such that [data(), data() + size()) is a valid range.
For a non-empty vector, data() == addressof(front()).
You don't have any guarantee that you can access through the pointer beyond the vector's size. In particular, for an empty vector, the last sentence doesn't apply and so you cannot even be sure that you are getting a valid pointer to the underlying array.
There is currently no way to use std::vector with default-initialized elements.
As mentioned in the comments, you can use std::unique_ptr instead (requires #include <memory>):
auto data = std::unique_ptr<int[]>{new int[n]};
which will give you a std::unique_ptr<int[]> smart pointer to a dynamically sized array of ints, which will be destroyed automatically when the lifetime of data ends and which can transfer its ownership via move operations.
It can be dereferenced and indexed directly with the usual pointer syntax, but does not allow direct pointer arithmetic. A raw pointer can be obtained from it via data.get().
It does not offer you the std::vector interface, though. In particular it does not provide access to its allocation size and cannot be copied.
Note: I made a mistake in a previous version of this answer. I used std::make_unique<int[]> without realizing that it actually performs value-initialization (initialize to zero for ints). In C++20 there is std::make_unique_for_overwrite<int[]> (named make_unique_default_init in earlier drafts), which default-initializes (and therefore leaves ints with indeterminate values).
I have a std::vector on which I call reserve with a large value. Afterwards I retrieve data().
Since iterating over data then crashes, I am wondering whether this is even allowed. Is reserve forced to update data to the allocated memory range?
The guarantee of reserve is that subsequent insertions do not reallocate, and thus do not cause invalidation. That's it. There are no further guarantees.
Is reserve forced to update data to the allocated memory range?
No. The standard only guarantees that std::vector::data returns a pointer and [data(), data() + size()) is a valid range, the capacity is not concerned.
§23.3.11.4/1 vector data
[vector.data]:
Returns: A pointer such that [data(), data() + size()) is a valid
range. For a non-empty vector, data() == addressof(front()).
There is no requirement that data() returns a dereferenceable pointer for an empty (size() == 0) vector, even if it has nonzero capacity. It might return nullptr or some arbitrary value (the only requirement in this case is that it can be compared with itself and that 0 can be added to it without invoking UB).
I'd say the documentation is pretty clear on this topic: anything after data() + size() may be allocated but uninitialized memory; if you want this memory initialized as well, use vector::resize.
void reserve (size_type n);
Request a change in capacity
Requests that the vector capacity be at least enough to contain n elements.
If n is greater than the current vector capacity, the function causes
the container to reallocate its storage increasing its capacity to n
(or greater).
In all other cases, the function call does not cause a reallocation
and the vector capacity is not affected.
This function has no effect on the vector size and cannot alter its
elements.
I'm not sure why you would want to access anything after data() + size() after reserve() in the first place: the intended use of reserve() is to prevent unnecessary reallocations when you know or can estimate the expected size of your container, while avoiding the unnecessary initialization of memory, which may be either inefficient or impractical (e.g. when non-trivial data for initialization is not yet available). In this situation you can replace O(log N) reallocations and copies with only one, improving performance.
Is it allowed by the standard for a deque to allocate its memory in a sparse way?
My understanding is that most implementations of deque allocate memory internally in blocks of some size. I believe, although I don't know this for a fact, that implementations allocate at least enough blocks to store all the items for their current size. So if a block is 100 items and you do
std::deque<int> foo;
foo.resize( 1010 );
You will get at least 11 blocks allocated. However, given that in the above all 1010 ints are default-constructed, do you really need to allocate the blocks at the time of the resize call? Could you instead allocate a given block only when someone actually inserts an item? For instance, the implementation could mark a block as "all default constructed" and not allocate it until someone uses it.
I ask as I have a situation where I potentially want a deque with a very large size that might be quite sparse in terms of which elements I end up using. Obviously I could use other data structures like a map but I'm interested in what the rules are for deque.
A related question, given the signature of resize, void resize( size_type sz, T c = T() );, is whether the standard requires that the default constructor be called exactly sz times. If the answer is yes, then I guess you can't do a sparse allocation, at least for types with a non-trivial default constructor, although presumably it may still be possible for built-in types like int or double.
All elements in the deque must be correctly constructed. If you need a sparse implementation, I would suggest a deque (or a vector) of pointers, or a Maybe class (you really should have one in your toolbox anyway) which doesn't construct the type until it is valid. This is not the role of deque.
23.3.3.3 states that deque::resize will append sz - size() default-inserted elements (sz is the first argument of deque::resize).
Default-insertion (see 23.2.1/13) means that an element is initialized by the expression allocator_traits<Allocator>::construct(m, p) (where m is an allocator and p a pointer of the type that is to be constructed). So memory has to be available and the element will be default constructed (initialized?).
All-in-all: deque::resize cannot be lazy about constructing objects if it wants to be conforming. You can add lazy construction to a type easily by wrapping it in a boost::optional or any other Maybe container.
Answering your second question. Interestingly enough, there is a difference between C++03 and C++11.
Using C++03, your signature
void resize ( size_type sz, T c = T() );
will default construct the parameter (unless you supply another value), and then copy construct the new members from this value.
In C++11, this function has been replaced by two overloads
void resize(size_type sz);
void resize(size_type sz, const T& c);
where the first one default constructs (or value initializes) sz elements.