vector's size in C++

#include <iostream>
#include <vector>
#include "mixmax.h" // <- the MIXMAX random number generator lives here
#include <algorithm>
#include <cmath>
using namespace std;

int main()
{
    const unsigned long long int n = 10000000;
    vector<float> f(n);
    vector<float> distance_1(n);
    vector<float> distance_2(n);

    rng_state_t s;
    rng_state_t *x = &s;
    seed_spbox(x, 12345); // <- here we just declare and seed our random number generator

    for (unsigned long long i = 0; i < n; i++)
        f[i] = int(n * get_next_float(x)); // <- here we just get random numbers, like rand()

    sort(f.begin(), f.end());

    for (unsigned long long i = 0; i < n; i++)
    {
        distance_1[i] = f[i] - i;
        distance_2[i] = (i + 1) - f[i];
    }

    float discrep = max(*max_element(distance_1.begin(), distance_1.end()),
                        *max_element(distance_2.begin(), distance_2.end()));
    cout << "discrep= " << discrep << endl;
    cout << "sqrt(n)*discrep= " << discrep / sqrt(n) << endl;
}
When I print f.max_size() (for the vector f declared above in the code), it gives me this huge number 4611686018427387903, but when I take n=10000000000, it does not work; it gives this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped).
(I tried it in Visual Studio under Windows.)
What's the problem? If vectors do not work for big sizes, can anyone tell me how I can use vectors or arrays with very big sizes?

Quoting cplusplus.com,
std::vector::max_size
Returns the maximum number of elements that the
vector can hold.
This is the maximum potential size the container can reach due to
known system or library implementation limitations, but the container
is by no means guaranteed to be able to reach that size: it can still
fail to allocate storage at any point before that size is reached.
Hence, vector doesn't guarantee that it can hold max_size() elements; it is just an implementation limitation.
Also, chris mentioned:
Well 10 GB * sizeof(float) * 3 is a ton of memory to allocate. I'm
going to guess your OS isn't letting you allocate it all for a good
reason.
The OP asks,
If vectors do not work for big sizes, can anyone tell me how can I use
vectors or arrays with very big sizes ???
Yes you can. Try Roomy or STXXL.

max_size() is different from size(), and both are different from capacity().
The current capacity is n=10000000, so the last element is distance_1[9999999].

What's the problem ???
Presumably, the allocation fails because your computer doesn't have 120GB of memory available. max_size tells you how many elements the implementation of vector can theoretically manage, given enough memory; it doesn't know how much memory will actually be available when you run the program.
If vectors do not work for big sizes, can anyone tell me how I can use vectors or arrays with big, very big sizes?
Increase the amount of RAM or swap space on your computer (and make sure the OS is 64-bit, though from the value of max_size() I guess it is). Or use something like STXXL to use files to back up huge data structures.

Related

How does the compiler decide the capacity of a vector?

What is the capacity() of an std::vector which is created using the default constructor? I know that the size() is zero. Can we state that a default constructed vector does not perform a heap memory allocation?
This way it would be possible to create an array with an arbitrary reserve using a single allocation, like std::vector<int> iv; iv.reserve(2345);. Let's say that, for some reason, I do not want the size() to start at 2345.
For example, on Linux (g++ 4.4.5, kernel 2.6.32 amd64)
#include <iostream>
#include <vector>

int main()
{
    using namespace std;
    cout << vector<int>().capacity() << "," << vector<int>(10).capacity() << endl;
    return 0;
}
printed 0,10. Is it a rule, or is it STL vendor dependent?
The standard doesn't specify what the initial capacity of a container should be, so you're relying on the implementation. A common implementation will start the capacity at zero, but there's no guarantee. On the other hand, there's no way to better your strategy of std::vector<int> iv; iv.reserve(2345);, so stick with it.
Storage implementations of std::vector vary significantly, but all the ones I've come across start from 0.
The following code:
#include <iostream>
#include <vector>

int main()
{
    using namespace std;
    vector<int> normal;
    cout << normal.capacity() << endl;
    for (unsigned int loop = 0; loop != 10; ++loop)
    {
        normal.push_back(1);
        cout << normal.capacity() << endl;
    }
    cin.get();
    return 0;
}
Gives the following output:
0
1
2
4
4
8
8
8
8
16
16
under GCC 5.1, GCC 11.2, and Clang 12.0.1, and:
0
1
2
3
4
6
6
9
9
9
13
under MSVC 2013.
As far as I understand the standard (though I cannot actually name a reference), container instantiation and memory allocation have intentionally been decoupled, for good reason. Therefore you have distinct, separate calls:
constructor to create the container itself
reserve() to preallocate a suitably large memory block to accommodate at least(!) a given number of objects
And this makes a lot of sense. The only reason for reserve() to exist is to give you the opportunity to code around possibly expensive reallocations when growing the vector. To be useful, you have to know the number of objects to store, or at least be able to make an educated guess. If this is not given, you had better stay away from reserve(), as you will just trade reallocation for wasted memory.
So putting it all together:
The standard intentionally does not specify a constructor that allows you to preallocate a memory block for a specific number of objects (which would be at least more desirable than allocating an implementation-specific, fixed "something" under the hood).
Allocation shouldn't be implicit. So, to preallocate a block you need to make a separate call to reserve(), and this need not be at the same place as construction (it could, and often should, come later, once you know the required size to accommodate).
Thus, if a vector always preallocated a memory block of implementation-defined size, this would foil the intended job of reserve(), wouldn't it?
What would be the advantage of preallocating a block if the STL naturally cannot know the intended purpose and expected size of a vector? It would be rather nonsensical, if not counter-productive.
The proper solution instead is to allocate an implementation-specific block with the first push_back() - if not already explicitly allocated before by reserve().
In case of a necessary reallocation, the increase in block size is implementation-specific as well. The vector implementations I know of start with an exponential increase in size but will cap the growth rate at a certain maximum to avoid wasting huge amounts of memory, or even blowing it.
All this comes to full operation and advantage only if not disturbed by an allocating constructor. You have reasonable defaults for common scenarios that can be overridden on demand by reserve() (and shrink_to_fit()). So, even if the standard does not explicitly say so, I'm quite sure that assuming a newly constructed vector does not preallocate is a pretty safe bet for all current implementations.
As a slight addition to the other answers, I found that when running under debug conditions with Visual Studio a default constructed vector will still allocate on the heap even though the capacity starts at zero.
Specifically if _ITERATOR_DEBUG_LEVEL != 0 then vector will allocate some space to help with iterator checking.
https://learn.microsoft.com/en-gb/cpp/standard-library/iterator-debug-level
I just found this slightly annoying since I was using a custom allocator at the time and was not expecting the extra allocation.
This is an old question, and all answers here have rightly explained the standard's point of view and the way you can get an initial capacity in a portable manner by using std::vector::reserve.
However, I'll explain why it doesn't make sense for any STL implementation to allocate memory upon construction of an std::vector<T> object.
std::vector<T> of incomplete types
Prior to C++17, it was undefined behavior to construct a std::vector<T> if the definition of T was still unknown at the point of instantiation. However, that constraint was relaxed in C++17.
In order to efficiently allocate memory for an object, you need to know its size. From C++17 onwards, your clients may have cases where your std::vector<T> does not know the size of T. Does it make sense to have memory allocation characteristics dependent on type completeness?
Unwanted Memory allocations
There are many, many, many times you'll need to model a graph in software. (A tree is a graph.) You are most likely going to model it like:
class Node {
    // ....
    std::vector<Node> children; // or std::vector< *some pointer type* > children;
    // ....
};
Now think for a moment and imagine if you had lots of terminal nodes. You would be very pissed if your STL implementation allocates extra memory simply in anticipation of having objects in children.
This is just one example, feel free to think of more...
The standard doesn't specify an initial value for capacity, but the STL container automatically grows to accommodate as much data as you put in, provided you don't exceed the maximum size (use the max_size member function to find it).
For vector and string, growth is handled by reallocating whenever more space is needed. Suppose you'd like to create a vector holding the values 1-1000. Without using reserve, the code will typically result in between 2 and 18 reallocations during the following loop:
vector<int> v;
for ( int i = 1; i <= 1000; i++) v.push_back(i);
Modifying the code to use reserve might result in 0 allocations during the loop:
vector<int> v;
v.reserve(1000);
for ( int i = 1; i <= 1000; i++) v.push_back(i);
Roughly speaking, vector and string capacities grow by a factor of between 1.5 and 2 each time.

What is the maximum size of a vector?

Why can't this code handle n=100000000 (8 zeros)?
The code will be terminated by this error:
terminate called after throwing an instance of 'std::bad_alloc'
So, the word "DONE" won't be printed out.
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<long long> v1;
    long long n; cin >> n;
    for (long long i = 0; i < n; i++) {
        v1.push_back(i + 1);
    }
    cout << "DONE" << endl;
}
Although the maximum size of v1 is 536870911.
What is the maximum size of a vector?
Depends on several factors. Here are some upper limits:
The size of the address space
The maximum representable value of std::vector::size_type
The theoretical upper limit given by std::vector::max_size
Available memory - unless the system overcommits and you never touch the entire vector
Maximum available memory for a single process may be limited by the operating system.
Available contiguous address space which can be less than all of the free memory due to fragmentation.
The address space isn't an issue in a 64-bit world.
Note that the size of the element type affects the number of elements that fit in a given range of memory.
The strictest limit in your case was most likely the available memory: 536870911 is one long long short of 4 gigabytes.
From cppref max_size() docs:
This value typically reflects the theoretical limit on the size of the container ...
At runtime, the size of the container may be limited to a value smaller than max_size() by the amount of RAM available.
std::vector contains a pointer to a single dynamically allocated contiguous array. As you call push_back() in your loop, that array has to grow over time whenever the vector's size() exceeds its capacity(). As the array grows large, it becomes harder to allocate a new, larger contiguous array to copy the old elements into. Eventually, the array is so large that a new copy simply can't be allocated anymore, and thus std::bad_alloc gets thrown.
To avoid all of those reallocations while looping, call the vector's reserve() method before entering the loop:
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<long long> v1;
    long long n; cin >> n;
    v1.reserve(n); // <-- ADD THIS!
    for (long long i = 0; i < n; ++i) {
        v1.push_back(i + 1);
    }
    cout << "DONE" << endl;
}
That way, the array is allocated only 1 time.

Any way that we can prevent the breakdown between reference and object when resize a vector?

I was debugging an issue and realized that when a vector resizes, a reference to one of its elements no longer works. To illustrate this point, below is a minimal example. The output is 0 instead of 1. Is there any way to prevent this from happening, other than reserving a large space for x?
#include <iostream>
#include <vector>
using namespace std;

vector<int> x{};

int main(){
    x.reserve(1);
    x.push_back(0);
    int & y = x[0];
    x.resize(10);
    y = 1;
    cout << x[0] << endl;
    return 0;
}
This is called invalidation, and the only way you can prevent it is to make sure that the vector's capacity does not change.
x.reserve(10);
x.push_back(0);
int &y = x[0];
x.resize(10);
The only way I can think of is to use std::deque instead of std::vector.
The reason for suggesting std::deque is this (from cppreference):
The storage of a deque is automatically expanded and contracted as
needed. Expansion of a deque is cheaper than the expansion of a
std::vector because it does not involve copying of the existing
elements to a new memory location.
That line about not copying is really the answer to your question. It means that the objects remain where you placed them (in memory) as long as the deque is alive.
However, on the very next line it says:
On the other hand, deques typically have large minimal memory cost; a
deque holding just one element has to allocate its full internal array
(e.g. 8 times the object size on 64-bit libstdc++; 16 times the object
size or 4096 bytes, whichever is larger, on 64-bit libc++).
It's now up to you to decide which is better - a higher initial memory cost, or changing your program's logic so that it doesn't need to reference the items in the vector like that. You might also want to consider std::set or std::unordered_set for quickly finding an object within the container.
There are several choices:
Don't use a vector.
Don't keep a reference.
Create a "smart reference" class that tracks the vector and the index, so it will obtain the appropriate object even if the vector moves (a sketch follows this list).
You can also create a vector of std::shared_ptr<> and keep the shared pointers instead of iterators or references into the vector.

Avoid the reallocation of a vector when its dimension has to be incremented

I have a
vector< pair<vector<double> , int>> samples;
This vector will contain a number of elements. For efficiency reasons I initialize it this way:
vector< pair<vector<double> , int>> samples(1000000);
I know the size in advance (at run time, not at compile time); I get it from another container. The problem is that I sometimes have to decrease the size of the vector by one element. That case actually isn't a problem, because resizing to a smaller size than the initial one does no reallocation. I can do
samples.resize(999999);
The problem is that in some cases, rather than decreasing the size by one element, I have to increase it by one element. If I do
samples.resize(1000001);
there is the risk of a reallocation, which I want to avoid for efficiency reasons.
I ask if a possible solution to my problem is to do this:
vector<pair<vector<double>, int>> samples;
samples.reserve(1000001);
samples.resize(1000000);
// ... elaboration that fills samples ...
samples.resize(1000001); // here I don't want reallocation
or are there better solutions?
Thanks in advance!
(I'm using C++11 compiler)
I just wrote a sample program to demonstrate that resize is not going to reallocate space if the capacity of the vector is sufficient:
#include <iostream>
#include <vector>
#include <utility>
#include <cassert>
using namespace std;

int main()
{
    vector<pair<vector<double>, int>> samples;
    samples.reserve(10001);
    auto data = samples.data();
    assert(10001 == samples.capacity());
    samples.resize(10000);
    assert(10001 == samples.capacity());
    assert(data == samples.data());
    samples.resize(10001); // here I don't want reallocation
    assert(10001 == samples.capacity());
    assert(data == samples.data());
}
This demo is based on the assumption that std::vector guarantees contiguous memory, so if the data pointer does not change, then no reallocation took place. This is also evident from the capacity() result remaining 10001 after every call to resize().
cppreference on vectors:
The storage of the vector is handled automatically, being expanded and contracted as needed. Vectors usually occupy more space than static arrays, because more memory is allocated to handle future growth. This way a vector does not need to reallocate each time an element is inserted, but only when the additional memory is exhausted. The total amount of allocated memory can be queried using capacity() function.
cppreference on reserve:
Correctly using reserve() can prevent unnecessary reallocations, but inappropriate uses of reserve() (for instance, calling it before every push_back() call) may actually increase the number of reallocations (by causing the capacity to grow linearly rather than exponentially) and result in increased computational complexity and decreased performance.
cppreference also states about resize:
Complexity
Linear in the difference between the current size and count. Additional complexity possible due to reallocation if capacity is less than count
I ask if a possible solution to my problem is to do this:
samples.reserve(1000001);
samples.resize(1000000);
Yes, this is the solution.
or if there are better solutions?
Not that I know of.
As I recall, when you resize to less than the capacity, there will not be a reallocation.
So your code will work without reallocation.
cppreference.com
Vector capacity is never reduced when resizing to smaller size because that would invalidate all iterators, rather than only the ones that would be invalidated by the equivalent sequence of pop_back() calls.

Pointer to an array get size C++

int * a;
a = new int[10];
cout << sizeof(a)/sizeof(int);
If I were using a normal array the answer would be 10; alas, the lucky number printed was 1, because sizeof(a) is 4 and sizeof(int) is 4 too. How do I overcome this? In my case, keeping the size in memory is a complicated option. How do I get the size in code?
My best guess would be to iterate through the array and search for its end, and the end is 0, right? Any suggestions?
--edit
Well, what I fear about vectors is that they will reallocate while pushing back; well, you got the point, I can just allocate the memory. However, I can't change the structure, the whole code is relevant. Thanks for the answers, I see there's no way around it, so I'll just look for a way to store the size in memory.
What I asked was not what kind of structure to use.
Simple.
Use std::vector<int> or std::array<int, N> (where N is a compile-time constant).
If you know the size of your array at compile time, and it doesn't need to grow at runtime, then use std::array. Otherwise, use std::vector.
These are called sequence-container classes which define a member function called size() which returns the number of elements in the container. You can use that whenever you need to know the size. :-)
Read the documentation:
std::array with example
std::vector with example
When you use std::vector, you should consider using reserve() if you've some vague idea of the number of elements the container is going to hold. That will give you performance benefit.
If you worry about performance of std::vector vs raw-arrays, then read the accepted answer here:
Is std::vector so much slower than plain arrays?
It explains why the code in the question is slow, which has nothing to do with std::vector itself, rather its incorrect usage.
If you cannot use either of them, and are forced to use int*, then I would suggest these two alternatives. Choose whatever suits your need.
struct array
{
    int   *elements; // elements
    size_t size;     // number of elements
};
That is self-explanatory.
The second one is this: allocate memory for one more element and store the size in the first element as:
int N = howManyElements();
int *array = new int[N+1]; // allocate memory for size storage also!
array[0] = N;              // store N in the first element!
// your code: iterate i=1 to i<=N
// must delete it once done
delete [] array;
sizeof(a) is going to be the size of the pointer, not the size of the allocated array.
There is no way to get the size of the array after you've allocated it. The sizeof operator has to be able to be evaluated at compile time.
How would the compiler know how big the array was in this function?
void foo(int size)
{
    int * a;
    a = new int[size];
    cout << sizeof(a)/sizeof(int);
    delete[] a;
}
It couldn't. So it's not possible for the sizeof operator to return the size of an allocated array. And, in fact, there is no reliable way to get the size of an array you've allocated with new. Let me repeat this there is no reliable way to get the size of an array you've allocated with new. You have to store the size someplace.
Luckily, this problem has already been solved for you, and it's guaranteed to be there in any implementation of C++. If you want a nice array that stores the size along with the array, use ::std::vector. Particularly if you're using new to allocate your array.
#include <iostream>
#include <vector>

void foo(int size)
{
    ::std::vector<int> a(size);
    ::std::cout << a.size();
}
There you go. Notice how you no longer have to remember to delete it. As a further note, using ::std::vector in this way has no performance penalty over using new in the way you were using it.
If you are unable to use std::vector or std::array, as you have stated, then your only remaining option is to keep track of the size of the array yourself.
I still suspect that your reasons for avoiding std::vector are misguided. Even for performance monitoring software, intelligent uses of vector are reasonable. If you are concerned about resizing you can preallocate the vector to be reasonably large.