Initialization of vectors - c++

When I execute the following statement:
vector <int> v;
What exactly will the value of v be?
Will it just be a pointer that points to the start of a memory block? Will its value be NULL?
Some have pointed to a possible duplicate, but that question is more complicated than this one and focuses less on comparing 1) calling the default constructor of the vector class with 2) initializing an array, which I believe is a pointer to int.

Your syntax will call the constructor with no parameters, also known as the default constructor. According to the std::vector constructor documentation, you can see that it will create an empty vector.
Where the internal pointer points does not matter, since you must not dereference it while the container is empty. Please note that if you store the internal pointer, as returned by std::vector::data(), it may be invalidated any time you add an element to the vector (technically you can predict when it will change, namely when the new size exceeds capacity(), but it is good practice to assume the pointer always changes).
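A minimal sketch of both points, assuming a hypothetical helper named count_reallocations: it checks that a default-constructed vector is empty, then counts how often the capacity changes while elements are appended. Each capacity change is a reallocation, which is exactly when a previously saved data() pointer stops being valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many reallocations happen while
// appending `count` elements to a default-constructed vector.
std::size_t count_reallocations(std::size_t count) {
    std::vector<int> v;                      // default constructor: empty container
    assert(v.empty() && v.size() == 0);      // nothing to dereference yet

    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) { // the buffer was replaced:
            ++reallocations;                 // any saved data() pointer is now invalid
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

The exact number of reallocations depends on the implementation's growth policy, so code should re-read data() after any insertion rather than cache it.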

What exactly will the value of v be?
pointer? - no.
NULL - no.
nullptr - no.
v is an instance of class std::vector<T> (where T is int).
On Ubuntu Linux 64 bit, a "std::vector<T> tVec;" occupies 24 bytes regardless of sizeof(T) or the number of elements.
The guts of the object are not similar to an array of int, but the implementation does maintain an array of T, probably in dynamic memory.
For each compiler, the implementation may vary.
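As a rough illustration, the following sketch separates the size of the vector object from the storage it manages. The helper names handle_size and handle_size_ignores_element_count are made up for this example, and the 24-byte figure above is an implementation detail that the code deliberately does not assume.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: size of the vector object itself,
// not of the elements it manages.
template <typename T>
constexpr std::size_t handle_size() {
    return sizeof(std::vector<T>);
}

// sizeof is a compile-time property of the type, so the number of
// elements stored at run time cannot change it.
bool handle_size_ignores_element_count() {
    std::vector<int> small;
    std::vector<int> large(1000);
    return sizeof(small) == sizeof(large);
}
```

Printing handle_size<int>() and handle_size<double>() on a given toolchain shows what that particular implementation uses.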

vector <int> v;
What exactly will the value of v be?
That is the syntax of default initialisation. Therefore the object will be in a default initialised state. For class types such as std::vector, default initialisation calls the default constructor. An online reference describes the default constructor of vector thusly:
1) Default constructor. Constructs an empty container. If no allocator is supplied, allocator is obtained from a default-constructed instance.
Will it just be a pointer that points to the start of a memory block? Will its value be NULL?
A vector is not a pointer.
Among other members, a vector implementation does contain a pointer which may point to a buffer that the vector manages - you can get a copy of that pointer using the std::vector::data member function. The state of the internal pointer of a default initialised vector is unspecified. Since an empty vector does not need a buffer, that pointer may be null - but is not required to be.

Related

In C++ can I treat an array of single-member unions as an array of the element?

Suppose I am writing a fixed-size array class of runtime size, somewhat equivalent to Rust's Box<[T]> in order to save the space of tracking capacity when I know the array isn't going to change size after initialization.
In order to support types which do not have a default constructor, I want to be able to allow the user to supply a generator function that takes the index of the element and produces a T. In order to do this and decouple allocation and initialization, I follow the advice in CJ Johnson's CppCon 2019 talk "How to Hold a T" and initially create the array in terms of a single-member union:
template<typename T>
union MaybeUninit
{
MaybeUninit() {}
~MaybeUninit() {}
T val;
};
// ...
m_array = new MaybeUninit<T>[size];
// initialize all elements by setting val for each item
T* items = reinterpret_cast<T*>(m_array); // is this OK and dereferenceable?
My question is, once the generator is done and all the elements of m_array are initialized, am I allowed (according to the standard, regardless of whether a given compiler implementation permits it) to use reinterpret_cast<T*>(m_array) to treat the result as an array of the actual objects (the line marked "is this OK")? If not, is there any way to get from MaybeUninit<T>* to T* without copying?
In terms of which standard, I'm mainly interested in C++17 or later.
am I allowed (according to the standard, regardless of whether a given compiler implementation permits it) to use reinterpret_cast<T*>(m_array) to treat the result as an array of the actual objects (the line marked "is this OK")?
No, any pointer arithmetic on the resulting pointer will result in UB (for indices >1) or result in a one-past the end pointer that can't be dereferenced (for index 1). Only accessing the element at index 0 this way is allowed (but needs to still be constructed).
Pointer arithmetic is only allowed on pointers to the elements of an array. Your pointer does not point to an object that is an element of an array, so for the purpose of pointer arithmetic it is considered to belong to an array of length 1.
If not, is there any way to get from MaybeUninit* to T* without copying?
The pointer conversion is not an issue, but you can't index into the resulting pointer. The only way to avoid this is to have an actual array of T objects.
Note however that you don't need to construct every element in an array of T objects. For example:
std::allocator<T> alloc;
T* ptr = std::allocator_traits<decltype(alloc)>::allocate(alloc, size);
Now ptr is a pointer to an array of size objects of type T, but no actual objects of type T in it have their lifetime started. You can construct individual elements into it with placement-new (or std::construct_at or std::allocator_traits<decltype(alloc)>::construct) and destroy them with a destructor call (or std::destroy_at or std::allocator_traits<decltype(alloc)>::destroy). You need to do this with your union approach as well anyhow. This approach also allows you to easily exchange the allocator with a different one.
There will be no overhead for size or capacity management. All of that is now the responsibility of the user. Whether this is a good idea is a different question.
Instead of std::allocator or an alternative Allocator implementation you could also use other functions that allocate memory and are specified to implicitly create objects, e.g. operator new or std::malloc, etc.
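Put together, the allocator approach could look roughly like the sketch below. FixedArray, its members and the generator parameter are hypothetical names chosen for illustration; copying and moving are simply disabled to keep it short, and it is not meant as a complete container.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical fixed-size array whose elements come from a generator,
// so T does not need a default constructor.
template <typename T>
class FixedArray {
    using Traits = std::allocator_traits<std::allocator<T>>;

    std::allocator<T> m_alloc;
    std::size_t m_size;
    T* m_data;

    // Destroys the first `built` elements in reverse order, then frees the storage.
    void release(std::size_t built) {
        while (built > 0) Traits::destroy(m_alloc, m_data + --built);
        Traits::deallocate(m_alloc, m_data, m_size);
    }

public:
    template <typename Gen>
    FixedArray(std::size_t size, Gen gen)
        : m_size(size), m_data(Traits::allocate(m_alloc, size)) {
        std::size_t built = 0;
        try {
            // Start the lifetime of each element from gen(index).
            for (; built < size; ++built)
                Traits::construct(m_alloc, m_data + built, gen(built));
        } catch (...) {
            release(built);  // clean up the elements constructed so far
            throw;
        }
    }

    ~FixedArray() { release(m_size); }
    FixedArray(const FixedArray&) = delete;
    FixedArray& operator=(const FixedArray&) = delete;

    T& operator[](std::size_t i) { return m_data[i]; }
    std::size_t size() const { return m_size; }
};
```

Because allocate returns a T*, indexing it is ordinary pointer arithmetic within an array of T, with no reinterpret_cast involved.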

reserve() - data() trick on empty vector - is it correct?

As we know, std::vector when initialized like std::vector vect(n) or empty_vect.resize(n) not only allocates required amount of memory but also initializes it with default value (i.e. calls default constructor). This leads to unnecessary initialization especially if I have an array of integers and I'd like to fill it with some specific values that cannot be provided via any vector constructor.
reserve(), on the other hand, allocates the memory in a call like empty_vect.reserve(n), but the vector is still empty afterwards. So size() returns 0, empty() returns true, and operator[] must not be used (it has undefined behavior for any index; at() would throw).
Now, please look into the code:
{ // My scope starts here...
std::vector<int> vect;
vect.reserve(n);
int *data = vect.data();
// Here I know the size 'n' and I also have data pointer so I can use it as a regular array.
// ...
} // Here ends my scope, so vector is destroyed, memory is released.
The question is if "so I can use it as array" is a safe assumption?
The motivation does not matter much, I am just curious about the question above. Anyway, my reasons are:
It allocates memory and automatically frees it on any return from the function
The code does not perform unnecessary data initialization (which may affect performance in some cases)
No, you cannot use it.
The standard (current draft, equivalent wording in C++11) says in [vector.data]:
constexpr T* data() noexcept;
constexpr const T* data() const noexcept;
Returns: A pointer such that [data(), data() + size()) is a valid range.
For a non-empty vector, data() == addressof(front()).
You don't have any guarantee that you can access through the pointer beyond the vector's size. In particular, for an empty vector, the last sentence doesn't apply and so you cannot even be sure that you are getting a valid pointer to the underlying array.
There is currently no way to use std::vector with default-initialized elements.
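If the goal is only to avoid initializing elements twice, one option that stays within the vector's guarantees is to reserve and then append. This is a sketch with a hypothetical function name make_squares; each element is constructed exactly once, directly with its final value, and data() is then valid for the whole range [data(), data() + size()).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: fill a vector with computed values without a
// prior zero-initialization pass.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> vect;
    vect.reserve(n);                              // allocates, but size() stays 0
    for (std::size_t i = 0; i < n; ++i)
        vect.push_back(static_cast<int>(i * i));  // no reallocation: capacity is already n
    return vect;
}
```

This does not help when the elements have to be written in arbitrary order or from several threads, which is the case the alternatives below address.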
As mentioned in the comments, you can use std::unique_ptr instead (requires #include <memory>):
auto data = std::unique_ptr<int[]>{new int[n]};
which will give you a std::unique_ptr<int[]> smart pointer to a dynamically sized array of ints, which will be destroyed automatically when the lifetime of data ends and which can transfer its ownership via move operations.
It can be dereferenced and indexed directly with the usual pointer syntax, but does not allow direct pointer arithmetic. A raw pointer can be obtained from it via data.get().
It does not offer you the std::vector interface, though. In particular it does not provide access to its allocation size and cannot be copied.
Note: I made a mistake in a previous version of this answer. I used std::make_unique<int[]> without realizing that it actually also performs value-initialization (initialize to zero for ints). C++20 adds std::make_unique_for_overwrite<int[]> (originally proposed as std::make_unique_default_init), which default-initializes and therefore leaves the ints with indeterminate values.
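A short sketch of the std::unique_ptr<int[]> approach, using a hypothetical function make_filled. The array comes from new int[n], so its elements start out with indeterminate values and every element is written before it is read.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical example: allocate n ints without initializing them,
// then fill them with computed values.
std::unique_ptr<int[]> make_filled(std::size_t n) {
    std::unique_ptr<int[]> data(new int[n]);   // default-initialized: values indeterminate
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<int>(2 * i);     // write before any read
    return data;                               // ownership moves to the caller
}
```

The caller has to keep track of n separately, since the smart pointer does not store the array length.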

Can deque memory allocation be sparse?

Is it allowed by the standard for a deque to allocate its memory in a sparse way?
My understanding is that most implementations of deque allocate memory internally in blocks of some size. I believe, although I don't know this for a fact, that implementations allocate at least enough blocks to store all the items for their current size. So if a block is 100 items and you do
std::deque<int> foo;
foo.resize( 1010 );
You will get at least 11 blocks allocated. However, given that all 1010 ints above are default constructed, do you really need to allocate the blocks at the time of the resize call? Could you instead allocate a given block only when someone actually inserts an item in some way? For instance, the implementation could mark a block as "all default constructed" and not allocate it until someone uses it.
I ask as I have a situation where I potentially want a deque with a very large size that might be quite sparse in terms of which elements I end up using. Obviously I could use other data structures like a map but I'm interested in what the rules are for deque.
A related question given the signature of resize void resize ( size_type sz, T c = T() ); is whether the standard requires that the default constructor is called exactly sz times? If the answer is yes then I guess you can't do a sparse allocation at least for types that have a non trivial default constructor, although presumably it may still be possible for built in types like int or double.
All elements in the deque must be correctly constructed. If you need a sparse implementation, I would suggest a deque (or a vector) of pointers or of a Maybe class (you really should have one in your toolbox anyway) which doesn't construct the type until it is valid. This is not the role of deque.
23.3.3.3 states that deque::resize will append sz - size() default-inserted elements (sz is the first argument of deque::resize).
Default-insertion (see 23.2.1/13) means that an element is initialized by the expression allocator_traits<Allocator>::construct(m, p) (where m is an allocator and p a pointer of the type that is to be constructed). So memory has to be available and the element will be default constructed (initialized?).
All-in-all: deque::resize cannot be lazy about constructing objects if it wants to be conforming. You can add lazy construction to a type easily by wrapping it in a boost::optional or any other Maybe container.
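The wrapping idea can be sketched with std::optional, which assumes C++17 and is used here in place of boost::optional. Tracked and lazily_fill are hypothetical names; the counter only exists to show when the wrapped type is actually constructed.

```cpp
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>

// Hypothetical element type that counts its own constructions.
struct Tracked {
    static inline int constructed = 0;
    Tracked() { ++constructed; }
};

// Returns {constructions after resize, constructions after first use}.
std::pair<int, int> lazily_fill(std::size_t size, std::size_t used_index) {
    Tracked::constructed = 0;
    std::deque<std::optional<Tracked>> slots;
    slots.resize(size);                 // builds empty optionals, no Tracked objects
    const int after_resize = Tracked::constructed;
    slots[used_index].emplace();        // constructs exactly one Tracked in place
    return {after_resize, Tracked::constructed};
}
```

Note that this defers construction only: the deque still allocates storage for every optional, so the memory itself is not sparse.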
Answering your second question. Interestingly enough, there is a difference between C++03 and C++11.
Using C++03, your signature
void resize ( size_type sz, T c = T() );
will default construct the parameter (unless you supply another value), and then copy construct the new members from this value.
In C++11, this function has been replaced by two overloads
void resize(size_type sz);
void resize(size_type sz, const T& c);
where the first one default constructs (or value initializes) sz elements.
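The difference between the two C++11 overloads can be observed with a counting type. This is a sketch with hypothetical names (Counter, count_resize_default, count_resize_copy), assuming a C++11 or later standard library.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical element type that counts constructor calls.
struct Counter {
    static int defaults;
    static int copies;
    Counter() { ++defaults; }
    Counter(const Counter&) { ++copies; }
};
int Counter::defaults = 0;
int Counter::copies = 0;

// resize(n) on an empty deque: returns {default constructions, copy constructions}.
std::pair<int, int> count_resize_default(std::size_t n) {
    Counter::defaults = Counter::copies = 0;
    std::deque<Counter> d;
    d.resize(n);                        // default-inserts n elements
    return {Counter::defaults, Counter::copies};
}

// resize(n, value) on an empty deque: same pair of counts.
std::pair<int, int> count_resize_copy(std::size_t n) {
    Counter::defaults = Counter::copies = 0;
    Counter prototype;                  // the only default construction
    std::deque<Counter> d;
    d.resize(n, prototype);             // new elements are copies of prototype
    return {Counter::defaults, Counter::copies};
}
```

With the first overload every new element is constructed by the default constructor and no copies are made; with the second, the new elements are copy constructed from the supplied value.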

std::vector elements initializing

std::vector<int> v1(1000);
std::vector<std::vector<int>> v2(1000);
std::vector<std::vector<int>::const_iterator> v3(1000);
How are the elements of these 3 vectors initialized?
About int: I tested it and saw that all elements become 0. Is this standard? I believed that primitives remain undefined. I created a vector with 300000000 elements, gave them non-zero values, deleted it and recreated it, to rule out the OS clearing memory for data safety. The elements of the recreated vector were 0 too.
What about the iterators? Is there an initial value (0) from the default constructor, or does the initial value remain undefined? When I check this, the iterators point to 0, but this could be due to the OS.
When I created a special object to track constructors, I saw that the vector ran the default constructor for the first object and the copy constructor for all the others. Is this standard?
Is there a way to completely avoid initialization of elements? Or must I create my own vector? (Oh my God, I always say NOT ANOTHER VECTOR IMPLEMENTATION)
I ask because I use ultra huge sparse matrices with parallel processing, so I cannot use push_back() and of course I don't want useless initialization, when later I will change the value.
You are using this constructor (for std::vector<>):
explicit vector (size_type n, const T& value= T(), const Allocator& = Allocator());
Which has the following documentation:
Repetitive sequence constructor: Initializes the vector with its content set to a repetition, n times, of copies of value.
Since you do not specify the value, it takes the default value of the parameter, T(), which is int() in your case, so all elements will be 0.
They are value-initialized.
About int, I test it and I saw that all elements become 0. Is this standard? I believed that primitives remain undefined.
No, an uninitialized int has an indeterminate value. These are value-initialized, i.e.,
int i; // uninitialized, indeterminate value
int k = int(); // value-initialized, value == 0
In C++11 the specification for the constructor vector::vector(size_type n) says that n elements are default-inserted. This is being defined as an element initialized by the expression allocator_traits<Allocator>::construct(m, p) (where m is of the allocator type and p a pointer to the type stored in the container). For the default allocator this expression is ::new (static_cast<void*>(p)) T() (see 20.6.8.2). This value-initializes each element.
The elements of a vector are value-initialized, which in the case of POD types means zero-initialized. There's no way to avoid it with a standard vector and the default allocator.
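The answers above can be checked with a small sketch. The function names are hypothetical; the code only relies on the sized constructor value-initializing its elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Every int in a sized vector starts out as 0.
bool sized_vector_is_zeroed(std::size_t n) {
    std::vector<int> v1(n);
    return std::all_of(v1.begin(), v1.end(), [](int x) { return x == 0; });
}

// Every inner vector is default constructed, i.e. empty.
bool nested_vectors_are_empty(std::size_t n) {
    std::vector<std::vector<int>> v2(n);
    return std::all_of(v2.begin(), v2.end(),
                       [](const std::vector<int>& inner) { return inner.empty(); });
}

// The same rule for a single object: int() is value-initialization.
int value_initialized_int() {
    int k = int();
    return k;
}
```

The iterator case (v3) is left out on purpose: a value-initialized iterator has a well-defined state, but there is little that can portably be done with it beyond comparing it with another value-initialized iterator of the same type.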

What is the element value in an uninitialized vector?

If I create a vector like vector<myClass> v(10);
what is the default value of each element?
Also, what if it is a vector<myUnion> v(10) ?
The constructor of std::vector<> that you are using when you declare your vector as
vector<myClass> v(10);
actually has more than one parameter. It has three parameters: initial size (that you specified as 10), the initial value for new elements and the allocator value.
explicit vector(size_type n, const T& value = T(),
const Allocator& = Allocator());
The second and the third parameters have default arguments, which is why you are able to omit them in your declaration.
The default argument value for the new element is the default-constructed one, which in your case is myClass(). This value will be copied to all 10 new elements by means of their copy constructors.
What exactly myClass() means depends on your class. Only you know that.
P.S. Standard library implementations are allowed to use function overloading instead of default arguments when implementing the above interface. If some implementation decides to use function overloading, it might declare a constructor with just a single parameter (size) in std::vector. This does not affect the end result though: all vector elements should begin their lives as if they were value-initialized.
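To make "depends on your class" concrete, here is a sketch with two hypothetical element types: WithCtor has a user-provided default constructor, while Plain has none and is therefore zeroed by value-initialization.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class whose default constructor picks the initial value.
struct WithCtor {
    int x;
    WithCtor() : x(42) {}
};

// Hypothetical class without any constructor: value-initialization zeroes it.
struct Plain {
    int x;
};

// Checks that every element of a sized vector starts in the expected state.
bool all_elements_match() {
    std::vector<WithCtor> a(10);
    std::vector<Plain> b(10);
    for (std::size_t i = 0; i < 10; ++i) {
        if (a[i].x != 42 || b[i].x != 0) return false;
    }
    return a.size() == 10 && b.size() == 10;
}
```

If WithCtor's constructor left x unset, the elements would hold indeterminate values, because value-initialization defers entirely to a user-provided default constructor.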
vector<myClass> v;
is an empty vector, with size and capacity both 0.
The answer to your second question is similar; vector<myUnion> v(10) will create an array of 10 myUnions, each value-initialized. However, note that: 1) before C++11, unions can't have members with non-trivial constructors, copy constructors or destructors, as the compiler won't know which member to construct, copy or destroy (since C++11 such members are allowed, but you then have to write the union's special member functions yourself), and 2) as with classes and structs, if the type has a user-provided default constructor that does not set a member of built-in type such as int, that member is not initialized at all and its value is indeterminate; without a user-provided default constructor, value-initialization zero-initializes the object.