What is the maximum size of a vector? - c++

Why can't this code handle n=100000000 (8 zeros)?
The code will be terminated by this error:
terminate called after throwing an instance of 'std::bad_alloc'
So, the word "DONE" won't be printed out.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector <long long> v1;
long long n; cin>>n;
for(long long i=0;i<n;i++){
v1.push_back(i+1);
}
cout<<"DONE"<<endl;
}
This happens even though v1.max_size() reports 536870911.

What is the maximum size of a vector?
Depends on several factors. Here are some upper limits:
The size of the address space
The maximum representable value of std::vector::size_type
The theoretical upper limit given by std::vector::max_size
Available memory - unless system overcommits and you don't access the entire vector
Maximum available memory for a single process may be limited by the operating system.
Available contiguous address space which can be less than all of the free memory due to fragmentation.
The address space is rarely the limiting factor in a 64-bit process.
Note that the size of the element type affects the number of elements that fit in a given range of memory.
Most likely, the strictest limit in your case was the available memory. 536870911 elements of type long long (8 bytes each) is one long long short of 4 gigabytes.

From the cppreference documentation for max_size():
This value typically reflects the theoretical limit on the size of the container ...
At runtime, the size of the container may be limited to a value smaller than max_size() by the amount of RAM available.

std::vector contains a pointer to a single dynamically allocated contiguous array. As you call push_back() in your loop, that array has to grow whenever the vector's size() would exceed its capacity(). As the array grows large, it becomes more difficult to allocate a new, larger contiguous array to copy the old elements into, and during the copy both the old and the new array must exist at the same time. Eventually, the array is so large that a new copy simply can't be allocated anymore, and thus std::bad_alloc gets thrown.
To avoid all of those reallocations while looping, call the vector's reserve() method before entering the loop:
#include <iostream>
#include <vector>
using namespace std;
int main() {
vector <long long> v1;
long long n; cin>>n;
v1.reserve(n); // <-- ADD THIS!
for(long long i = 0; i < n; ++i){
v1.push_back(i+1);
}
cout<<"DONE"<<endl;
}
That way, the array is allocated only 1 time.

Related

Why vector has different capacity and other than the size? [duplicate]

The program below creates two vectors with the same contents, but reports different capacities for them in C++11 mode.
#include<iostream>
#include<vector>
using namespace std;
int main(){
vector<int>a ={1,2,3};
cout<<"vector a size :"<<a.size()<<endl;
cout<<"vector a capacity :"<<a.capacity()<<endl<<endl;
vector<int>b ;
b.push_back(1);
b.push_back(2);
b.push_back(3);
cout<<"vector b size :"<<b.size()<<endl;
cout<<"vector b capacity :"<<b.capacity()<<endl;
return 0;
}
OUTPUT
vector a size :3
vector a capacity :3
vector b size :3
vector b capacity :4
Why does this program give different values for the capacity of a and b when both hold the same number of values, and how is size different from capacity?
The reason lies in the growth algorithm of the vector.
When a vector is initialized from a list of values, it typically allocates no extra capacity: the capacity equals the size.
Each time the vector needs to grow, it allocates a new, larger array and copies its contents into it. The new capacity is a constant multiple of the current one (doubled in the implementation used here; the exact factor is implementation-defined).
This method makes the whole idea of size-changing array very efficient, since in amortized time (meaning the average time over N operations), we get O(1) insertion complexity.
You can see that after we add one more integer to the first vector, we get a capacity of 6. http://coliru.stacked-crooked.com/a/f084820652f025b8
By allocating more elements than needed, the vector does not need to reallocate memory when new elements are added to the vector. Also, when reducing the size, reallocation is not needed at all.
Reallocation of memory is a relatively expensive operation (creating new block, copying elements across, removing old block).
The trade-off is that the vector may have allocated more memory than it will need (e.g. if it allocates memory for elements that never get added/used). Practically, unless available memory is scarce, the cost of allocating a larger block (and reallocating less often) is less than the cost of reallocating every time.

vector's size in C++

#include <iostream>
#include <vector>
#include "mixmax.h" // provides a random number generator called mixmax
#include <algorithm>
#include <cmath>
using namespace std;
int main()
{
const unsigned long long int n=10000000;
vector < float > f(n);
vector < float > distance_1(n);
vector < float > distance_2(n);
rng_state_t s;
rng_state_t *x=&s;
seed_spbox(x,12345); // initialize (seed) the random number generator
for(int i=0;i<n;i++)
f[i]=int(n*get_next_float(x)); // get random numbers, like rand()
sort(f.begin(),f.end());
for(int i=0;i<n;i++)
{
distance_1[i]=f[i]-i;
distance_2[i]=(i+1)-f[i];
}
float discrep=max(*max_element(distance_1.begin(),distance_1.end()),*max_element(distance_2.begin(),distance_2.end()));
cout<<"discrep= "<<discrep<<endl;
cout<<"sqrt(n)*discrep= "<<discrep/sqrt(n)<<endl;
}
When I print f.max_size() (for the vector declared in the code above), it gives me this huge number: 4611686018427387903. But when I take n=10000000000, it does not work and gives this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped).
(I tried it in Visual Studio under windows.)
What's the problem ??? If vectors do not work for big sizes, can anyone tell me how can I use vectors or arrays with very big sizes ???
Quoting cplusplus.com,
std::vector::max_size
Returns the maximum number of elements that the
vector can hold.
This is the maximum potential size the container can reach due to
known system or library implementation limitations, but the container
is by no means guaranteed to be able to reach that size: it can still
fail to allocate storage at any point before that size is reached.
Hence, vector doesn't guarantee that it can hold max_size elements; it is just an implementation limitation.
Also, chris mentioned:
Well 10 GB * sizeof(float) * 3 is a ton of memory to allocate. I'm
going to guess your OS isn't letting you allocate it all for a good
reason.
The OP asks,
If vectors do not work for big sizes, can anyone tell me how can I use
vectors or arrays with very big sizes ???
Yes you can. Try Roomy or STXXL.
max_size() is different from size(), which in turn is different from capacity().
With n=10000000, the current size is 10000000, so the last element is distance_1[9999999].
What's the problem ???
Presumably, the allocation fails because your computer doesn't have 120GB of memory available. max_size tells you how many elements the implementation of vector can theoretically manage, given enough memory; it doesn't know how much memory will actually be available when you run the program.
If this vectors do not work for big sizes Can anyone tell me how can I use vectors or arrays with big, very big size ???
Increase the amount of RAM or swap space on your computer (and make sure the OS is 64-bit, though from the value of max_size() I guess it is). Or use something like STXXL to use files to back up huge data structures.

Can std::vector capacity/size/reserve be used to manually manage vector memory allocation?

I'm running very time sensitive code and would need a scheme to reserve more space for my vectors at a specific place in the code, where I can know (approximately) how many elements will be added, instead of having std do it for me when the vector is full.
I haven't found a way to test this to make sure there are no corner cases of std that I do not know of, therefore I'm wondering how the capacity of a vector affects the reallocation of memory. More specifically, would the code below make sure that automatic reallocation never occurs?
std::vector<unsigned int> data;
while (condition) {
// Reallocate here
// get_elements_required() gives an estimate that is guaranteed to be >= the actual number of elements.
unsigned int estimated_elements_required = get_elements_required(...);
if ((data.capacity() - data.size()) <= estimated_elements_required) {
data.reserve(std::min(data.capacity() * 2, data.max_size()));
}
...
// NEVER reallocate here, I would rather see the program crash actually...
for (unsigned int i = 0; i < get_elements_to_add(data); ++i) {
data.push_back(elements[i]);
}
}
estimated_elements_required in the code above is an estimate that is guaranteed to be equal to, or greater than, the actual number of elements that will be added. The code that actually adds elements performs operations based on the capacity of the vector itself, so changing the capacity halfway through will generate incorrect results.
Yes, this will work.
From the definition of reserve:
It is guaranteed that no reallocation takes place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity().

Control over std::vector reallocation

By reading the std::vector reference I understood that
calling insert when the maximum capacity is reached will cause the reallocation of the std::vector (causing iterator invalidation) because new memory is allocated for it with a bigger capacity. The goal is to keep the guarantee about contiguous data.
As long as I stick below the maximum capacity insert will not cause that (and iterators will be intact).
My question is the following:
When reserve is called automatically by insert, is there any way to control how much new memory must be reserved?
Suppose that I have a vector with an initial capacity of 100 and, when the maximum capacity is hit, I want to allocate an extra 20 bytes.
Is it possible to do that?
You can always track it yourself and call reserve before it would allocate, e.g.
static const int N = 20; // Amount to grow by
if (vec.capacity() == vec.size()) {
vec.reserve(vec.size() + N);
}
vec.insert(...);
You can wrap this in a function of your own and call that function instead of calling insert() directly.

size vs capacity of a vector?

I am a bit confused about these two; they look the same to me.
It is said that capacity and size may differ between compilers. How can they differ?
It's also said that the capacity changes if we run out of memory.
All of this is a bit unclear to me.
Can somebody give an explanation (if possible with an example, or a test I can run on a program to understand it)?
Size is not allowed to differ between multiple compilers. The size of a vector is the number of elements that it contains, which is directly controlled by how many elements you put into the vector.
Capacity is the amount of total space that the vector has. Under the hood, a vector just uses an array. The capacity of the vector is the size of that array. This is always equal to or larger than the size. The difference between them is the number of elements that you can add to the vector before the array under the hood needs to be reallocated.
You should almost never care about the capacity. It exists to let people with very specific performance and memory constraints do exactly what they want.
Size: the number of items currently in the vector
Capacity: how many items can be fit in the vector before it is "full". Once full, adding new items will result in a new, larger block of memory being allocated and the existing items being copied to it
Let's say you have a bucket. At most, this bucket can hold 5 gallons of water, so its capacity is 5 gallons. It may have any amount of water between 0 and 5, inclusive. The amount of water currently in the bucket is, in vector terms, its size. So if this bucket is half filled, it has a size of 2.5 gallons.
If you try to add more water to a bucket and it would overflow, you need to find a bigger bucket. So you get a bucket with a larger capacity and dump the old bucket's contents into the new one, then add the new water.
Capacity: Maximum amount of stuff the Vector/bucket can hold.
Size: Amount of stuff currently in the Vector/bucket.
Size is number of elements present in a vector
Capacity is the number of elements for which the vector currently has space allocated.
Let's understand it with a very simple example:
#include <iostream>
#include <vector>
using namespace std;
int main(){
vector<int > vec;
vec.push_back(1);
vec.push_back(1);
vec.push_back(1);
cout<<"size of vector"<<vec.size()<<endl;
cout<<"capacity of vector"<<vec.capacity()<<endl;
return 0;
}
currently size is 3 and
capacity is 4.
Now if we push back one more element,
#include <iostream>
#include <vector>
using namespace std;
int main(){
vector<int> vec;
vec.push_back(1);
vec.push_back(1);
vec.push_back(1);
vec.push_back(1);
cout<<"size of vector"<<vec.size()<<endl;
cout<<"capacity of vector"<<vec.capacity()<<endl;
return 0;
}
now
size is: 4
capacity is 4
Now if we try to insert one more element into the vector, the size will become 5 but the capacity will become 8.
This does not depend on the element type. It comes from the growth policy of the implementation used here, which doubles the capacity each time the vector is full, so the capacities go 1, 2, 4, 8 and so on. Other implementations may use a different growth factor.
The same pattern continues. For example, if we insert a 9th element, the size of the vector will be 9 and the capacity will be 16.
size() tells you how many elements you currently have. capacity() tells you how large the size can get before the vector needs to reallocate memory for itself.
Capacity is always greater than or equal to size. You cannot index beyond element # size()-1.
The size is the number of elements in the vector. The capacity is the maximum number of elements the vector can currently hold.
The vector size is the total number of elements of a vector and it is always the same for all compilers. Vectors are re-sizeable.
The capacity is the maximum number of elements the vector can currently hold. It may differ for different compilers.
Capacity changes if it needs to, or you can set an initial capacity and it will not resize until that capacity is reached. It is automatically expanded.
Capacity >= Size
One is more of an important interface and the other is more of an important implementation detail. You will mostly deal with size and not capacity. In other words:
Size is the number of items in the vector. If you want to iterate through the vector, you need to know its size.
Capacity is how many items can be fit in the vector before more memory must be allocated to it. Once the capacity limit is reached, more memory is allocated to the vector.
An analogy to size is the number of balls in a box whereas the capacity is the box size. When programming, you normally want to know how many balls are in the box. The vector implementation should handle the capacity for you (making a bigger box once it is full).