Consider the following code, which allocates a 2d std::vector<std::vector<int> > of size 5 x 3 and prints the memory addresses of each element:
#include <iostream>
#include <vector>
int main() {
int n = 5, m = 3;
std::vector<std::vector<int> > vec(n, std::vector<int>(m));
for (int i = 0; i < n; ++i) {
for (int j = 0; j < m; ++j) {
std::cout << &(vec[i][j]) << " ";
}
std::cout << "\n";
}
}
Output:
0x71ecc0 0x71ecc4 0x71ecc8
0x71ece0 0x71ece4 0x71ece8
0x71ed00 0x71ed04 0x71ed08
0x71ed20 0x71ed24 0x71ed28
0x71ed40 0x71ed44 0x71ed48
Of course, within a fixed row the elements are contiguous in memory, but the rows are not. In particular, each row starts 32 bytes past the start of the previous row, but since each row is only 12 bytes, this leaves a gap of 20 bytes. For example, since I thought vectors allocate contiguous memory, I would have expected the first address of the second row to be 0x71eccc. Why is this not the case, and how does vector decide how big a gap to leave?
The overhead size of a vector is not 0. You have 24 bytes between the last element of one inner vector and the first element of the next. Try this:
std::cout << sizeof(std::vector<int>) << std::endl;
You will find the output to be 24 (likely, for your implementation of std::vector, compiler, etc.).
You can imagine the vector layout like this: the outer vector's contiguous block holds N inner vector objects (typically a pointer/size/capacity triple each), and every inner vector's pointer refers to its own, separately allocated block of ints.
If you want your elements to actually be contiguous then you need to do:
Use a std::array<std::array<int, M>, N> for no overhead (C++11 only). Note that this cannot be resized.
Use a single std::vector<int> and the formula row * numCols + col to access the element at (row, col).
Agree with Mr. Fox's results and solutions1. Disagree with the logic used to get there.
In order to be resizable, a std::vector contains a dynamically allocated block of contiguous memory, so the outer vector has a pointer to a block of storage that contains N vectors that are contiguous. However, each of these inner vectors contains a pointer to its own block of storage. The likelihood (without some really groovy custom allocator voodoo) of all N blocks of storage being contiguous is astonishingly small. The overhead of the N vectors is contiguous; the data almost certainly is not, but could be.
1 Mostly. The std::arrays in a std::vector will be contiguous, but nothing prevents the implementor of std::array from adding state information that stops the arrays inside the std::arrays from being contiguous. An implementor who wrote std::array that way is either an occupant of an odd headspace or is solving a very interesting problem.
I read in a C++ book that:
"Even though C++ enables us to model multidimensional arrays, the memory where the array is contained is one-dimensional. So the compiler maps the multidimensional array into the memory space, which expands in only one direction."
And I thought: if I want to convert an n-dimensional array into a 1-dimensional array (in ascending order), why can't I just take the n-d array's memory (which is already stored in 1-d format) and put it into a new 1-d array?
(Usually I use a for statement to convert an n-d array into a 1-d array.)
Is it possible? If so, I would be thankful for a simple example.
A pointer to the first item in the array is also a pointer to the complete array considered as a one-dimensional array. The guarantee that the items are stored contiguously in memory comes from the sizeof requirements, that the size of an array of n items is n times the size of the item, also when the item is itself an array.
Yes, it is possible to convert an n-dimensional array to a 1-d array.
Let me show this with a 2-d array a of n rows and m columns, copied into a 1-d array b:
for (i = 0; i < n; i++)
{
    for (j = 0; j < m; j++)
    {
        b[i * m + j] = a[i][j]; // the stride is m (columns per row), not n
    }
}
For example, a float is 4 bytes. Does this mean that a vector containing 10 floats is exactly 40 bytes?
One may interpret the "size" of a vector in different ways:
vector::size() gives the number of elements currently stored in the vector, regardless of the memory allocated for them. So myVector.size() in your case would be 10.
vector::capacity() gives you the number of elements for which the vector has allocated memory so far. Note that a vector, when continuously appending elements, allocates memory in "chunks", i.e. it might reserve space for 50 new elements at once in order to avoid repeated memory allocations. So capacity() >= size() is always true.
sizeof(myVector) would give you the size of the data structure vector needs for managing a dynamically growing series of elements. The elements themselves live in dynamically allocated memory, which is not reflected by sizeof, so sizeof is of little use in most cases.
See the following code:
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<float> fv;
    for (int i = 0; i < 10; i++) {
        fv.push_back(1.0);
    }
    cout << "size(): " << fv.size() << endl;
    cout << "capacity(): " << fv.capacity() << endl;
    cout << "sizeof(fv): " << sizeof(fv) << endl;
    return 0;
}
Output:
size(): 10
capacity(): 16
sizeof(fv): 24
Hope it helps.
A std::vector has 3 different sizes. There is the size of the vector object itself, which you get with sizeof(std::vector<some_type>). This size isn't really useful for anything, though. Typically it will be the size of three pointers, as that is how vector is typically implemented.
The second size is what is returned from the size() member function. The value returned from this is the number of elements in the vector. This is the "size" most people use when talking about the size of the vector.
The last size a vector has is the total number of elements it has allocated currently. That is obtained from using the capacity member function.
So a vector holding 10 floats needs to use at least 10 floats' worth of memory, but it could be using more, as capacity() is allowed to be greater than size(). The size of the vector object itself (sizeof(name_of_vector)), however, will be some fixed value that never changes no matter how many elements you add to the vector.
There are two ways (that I know of) to define a std::vector:
std::vector<int> vectorOne;
and
std::vector<int> vectorTwo(300);
So if I default-construct the first one and then fill it with 300 ints, it has to reallocate memory as it grows to store them. That would mean the data would not, for example, occupy addresses 0x0 through 0x300; there could be gaps, because the block has to be reallocated along the way. The second vector, however, already has that memory reserved, so there would be no space in between.
Does this affect performance at all, and how could I measure it?
std::vector is guaranteed to always store its data in a contiguous block of memory. That means that when you add items, it has to try to grow its block of memory. If something else occupies the memory following the vector, it needs to find a free block of the right size somewhere else and copy all the old data plus the new data to it.
This is a fairly expensive operation in terms of time, so it tries to mitigate by allocating a slightly larger block than what you need. This allows you to add several items before the whole reallocate-and-move-operation takes place.
Vector has two properties: size and capacity. The former is how many elements it actually holds, the latter is how many places are reserved in total. For example, if you have a vector with size() == 10 and capacity() == 18, it means you can add 8 more elements before it needs to reallocate.
How and when the capacity increases exactly, is up to the implementer of your STL version. You can test what happens on your computer with the following test:
#include <iostream>
#include <vector>
int main() {
using std::cout;
using std::vector;
// Create a vector with 10 (value-initialised) elements
vector<int> v(10);
std::cout << "v has size " << v.size() << " and capacity " << v.capacity() << "\n";
// Now add 90 values, and print the size and capacity after each insert
for(int i = 11; i <= 100; ++i)
{
v.push_back(i);
std::cout << "v has size " << v.size() << " and capacity " << v.capacity()
<< ". Memory range: " << &v.front() << " -- " << &v.back() << "\n";
}
return 0;
}
I ran it on IDEOne and got the following output:
v has size 10 and capacity 10
v has size 11 and capacity 20. Memory range: 0x9899a40 -- 0x9899a68
v has size 12 and capacity 20. Memory range: 0x9899a40 -- 0x9899a6c
v has size 13 and capacity 20. Memory range: 0x9899a40 -- 0x9899a70
...
v has size 20 and capacity 20. Memory range: 0x9899a40 -- 0x9899a8c
v has size 21 and capacity 40. Memory range: 0x9899a98 -- 0x9899ae8
...
v has size 40 and capacity 40. Memory range: 0x9899a98 -- 0x9899b34
v has size 41 and capacity 80. Memory range: 0x9899b40 -- 0x9899be0
You see the capacity increase and re-allocations happening right there, and you also see that this particular compiler chooses to double the capacity every time you hit the limit.
On some systems the algorithm will be more subtle, growing faster as you insert more items (so if your vector is small, you waste little space, but if it notices you insert a lot of items into it, it allocates more to avoid having to increase the capacity too often).
PS: Note the difference between setting the size and the capacity of a vector.
vector<int> v(10);
will create a vector with capacity at least 10, and size() == 10. If you print the contents of v, you will see that it contains
0 0 0 0 0 0 0 0 0 0
i.e. 10 integers with their default values. The next element you push into it may (and likely will) cause a re-allocation. On the other hand,
vector<int> v;
v.reserve(10);
will create an empty vector, but with its initial capacity set to 10 rather than the default (typically 0). Beware that vector<int> v(); would declare a function, not a vector (the "most vexing parse"). You can be certain that the first 10 elements you push into it will not cause an allocation (and the eleventh probably will, but not necessarily, as reserve may actually set the capacity to more than what you requested).
You should use reserve() method:
std::vector<int> vec;
vec.reserve(300);
assert(vec.size() == 0); // But memory is allocated
This solves the problem.
In your example it affects the performance greatly. You can expect that when you overflow the vector, it roughly doubles the allocated memory. So if you push_back() into the vector N times (and you haven't called reserve()), you can expect O(log N) reallocations. Because the capacity grows geometrically, the total number of elements copied across all those reallocations is O(N), which is why push_back is amortised O(1); the exact growth factor, however, is not specified by the C++ standard.
The differences can be dramatic, because if data is not adjacent in memory it may have to be fetched from main memory, which can be roughly 200 times slower than an L1 cache fetch. This will not happen within a vector, because the data in a vector is required to be contiguous.
see https://www.youtube.com/watch?v=YQs6IC-vgmo
Use std::vector::reserve when you can, to avoid reallocation events. The C++ <chrono> header has good time utilities to measure the elapsed time in high-resolution ticks.
Here is simple C++ code that compares iterating a 2D array in row-major versus column-major order.
#include <iostream>
#include <ctime>
using namespace std;
const int d = 10000;
int** A = new int* [d];
int main(int argc, const char * argv[]) {
for(int i = 0; i < d; ++i)
A[i] = new int [d];
clock_t ColMajor = clock();
for(int b = 0; b < d; ++b)
for(int a = 0; a < d; ++a)
A[a][b]++;
double col = static_cast<double>(clock() - ColMajor) / CLOCKS_PER_SEC;
clock_t RowMajor = clock();
for(int a = 0; a < d; ++a)
for(int b = 0; b < d; ++b)
A[a][b]++;
double row = static_cast<double>(clock() - RowMajor) / CLOCKS_PER_SEC;
cout << "Row Major : " << row;
cout << "\nColumn Major : " << col;
return 0;
}
Result for different values of d:
d = 10^3 :
Row Major : 0.002431
Column Major : 0.017186
d = 10^4 :
Row Major : 0.237995
Column Major : 2.04471
d = 10^5
Row Major : 53.9561
Column Major : 444.339
Now the question is why row major is faster than column major?
It obviously depends on the machine you're on but very generally speaking:
Your computer stores parts of your program's memory in a cache that has a much smaller latency than main memory (even when compensating for cache hit time).
C arrays are stored contiguously in row-major order. This means that if you ask for element x, then element x+1 is stored in main memory at a location directly following where x is stored.
It's typical for your computer cache to "pre-emptively" fill cache with memory addresses that haven't been used yet, but that are locally close to memory that your program has used already. Think of your computer as saying: "well, you wanted memory at address X so I am going to assume that you will shortly want memory at X+1, therefore I will pre-emptively grab that for you and place it in your cache".
When you enumerate your array in row-major order, you're enumerating it in the order it's stored contiguously in memory, and your machine has already taken the liberty of pre-loading those addresses into cache for you because it guessed that you wanted them. Therefore you achieve a higher rate of cache hits. When you enumerate an array in a non-contiguous manner, your machine likely won't predict the memory access pattern you're applying, so it won't be able to pre-emptively pull memory addresses into cache for you, and you'll get fewer cache hits, so main memory will have to be accessed more frequently, which is slower than your cache.
Also, this might be better suited for https://cs.stackexchange.com/ because the way your system cache behaves is implemented in hardware, and spatial locality questions seem better suited there.
Your array is actually a ragged array (an array of pointers to separately allocated rows), so row-major layout isn't the whole story.
You see better performance iterating columns within each row because each row's memory is laid out linearly, sequential reads are easy for the cache prefetcher to predict, and you amortise the pointer dereference into the second dimension, since it only needs to be done once per row.
When you iterate rows within each column, you incur a pointer dereference into the second dimension on every iteration. So by iterating that way you're adding a pointer dereference per element; aside from its intrinsic cost, it's bad for cache prediction.
If you want a true two-dimensional array, laid out in memory using row-major ordering, you would want...
int A[1000][1000];
This lays out the memory contiguously in row-major order, instead of one array of pointers to arrays (which are not laid out contiguously). Iterating over this array using row-major would still perform faster than iterating column-major because of spatial locality and cache prediction.
The short answer is CPU caches.
Scott Meyers explains it very clearly in his talk on CPU caches.
Okay so I have;
int grid_x = 5;
int * grid;
grid = new int[grid_x];
*grid = 34;
cout << grid[0];
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Line 4 fills the first element, how do I fill the rest?
Without line 4, line 5 reads "-842150451".
I don't understand what is going on, I'm trying to create a 2 dimensional array using x and y values specified by the user, and then fill each element one by one with numeric values also specified by the user. My above code was an attempt to try it out with a 1 dimensional array first.
The default C++ way of creating a dynamic(ally resizable) array of int is:
std::vector<int> grid;
Don't play around with unsafe pointers and manual dynamic allocation when the standard library already encapsulates this for you.
To create a vector of 5 elements, do this:
std::vector<int> grid(5);
You can then access its individual elements using []:
grid[0] = 34;
grid[1] = 42;
You can add new elements to the back:
// grid.size() is 5
grid.push_back(-42);
// grid.size() now returns 6
Consult reference docs to see all operations available on std::vector.
Should line 3 create an array with 5 elements?
Yes. It won't initialise them though, which is why you see a weird value.
Or fill the first element with the number 5?
new int(grid_x), with round brackets, would create a single object, not an array, and specify the initial value.
Before C++11, there was no way to allocate an array with new and initialise its elements with non-zero values (C++11 added braced initialisers such as new int[3]{1, 2, 3}). You'll have to assign the values after allocation.
Line 4 fills the first element, how do I fill the rest?
You can use the subscript operator [] to access elements:
grid[0] = 34; // Equivalent to: *(grid) = 34
grid[1] = 42; // Equivalent to: *(grid+1) = 42
// ...
grid[4] = 77; // That's the last one: 5 elements from 0 to 4.
However, you usually don't want to juggle raw pointers like this; the burden of having to delete[] the array when you've finished with it can be difficult to fulfill. Instead, use the standard library. Here's one way to make a two-dimensional grid:
#include <vector>
std::vector<std::vector<int>> grid(grid_x, std::vector<int>(grid_y));
grid[x][y] = 42; // for any x is between 0 and grid_x-1, y between 0 and grid_y-1
Or it might be more efficient to use a single contiguous array; you'll need your own little functions to access that as a two-dimensional grid. Something like this might be a good starting point:
template <typename T>
class Grid {
public:
Grid(size_t x, size_t y) : size_x(x), size_y(y), v(x*y) {}
T & operator()(size_t x, size_t y) {return v[y*size_x + x];}
T const & operator()(size_t x, size_t y) const {return v[y*size_x + x];}
private:
size_t size_x, size_y;
std::vector<T> v;
};
Grid<int> grid(grid_x, grid_y);
grid(x,y) = 42;
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Create an array with 5 elements.
Line 4 fills the first element, how do I fill the rest?
grid[n] = x;
Where n is the index of the element you want to set and x is the value.
Line 3 allocates memory for 5 integers side by side in memory so that they can be accessed and modified with the bracket operator.
The bracket operator x[y] is exactly equivalent to *(x+y), so you could change line 4 to grid[0] = 34; to make it more readable (this is also why grid[2] does the same thing as 2[grid]!).
An array is simply a contiguous block of memory. Therefore it has a starting address.
int * grid;
Is the C representation of the address of an integer, you can read the * as 'pointer'. Since your array is an array of integers, the address of the first element in the array is effectively the same as the address of the array. Hence line 3
grid = new int[grid_x];
allocates enough memory (on the heap) to hold the array and places its address in the grid variable. At this point the content of that memory is whatever it was when the physical silicon was last used. Reading from uninitialised memory will result in unpredictable values, hence your observation that leaving out line 4 results in strange output.
Remember that * pointer? On line four you can read it as 'the content of the pointer' and therefore
*grid = 34;
means set the content of the memory pointed to by grid to the value 34. But line 3 gave grid the address of the first element of the array. So line 4 sets the first element of the array to be 34.
In C, arrays use a zero-based index, which means that the first element of the array is number 0 and the last is number-of-elements-in-the-array - 1. So one way of filling the array is to index each element in turn to set a value to it.
for(int index = 0; index < grid_x; index++)
{
grid[index] = 34;
}
Alternatively, you could continue to use a pointer to do the same job.
for (int* pointerToElement = grid; 0 < grid_x; grid_x--)
{
    // save 34 to the address held by the pointer
    // and post-increment the pointer to the next element
    // (note: this loop consumes grid_x)
    *pointerToElement++ = 34;
}
Have fun with arrays and pointers, they consistently provide a huge range of opportunities to spend sleepless hours wondering why your code doesn't work, PC reboots, router catches fire, etc, etc.
int grid_x = 5;
int * grid;
grid = new int[grid_x];
*grid = 34;
cout << grid[0];
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Definitely the former. With the new[] operator you are allocating memory for an array of grid_x ints.
Line 4 fills the first element, how do I fill the rest?
Use operator [], e.g.:
for (int i = 0; i < grid_x; i++) { // reset the buffer
    grid[i] = 0;
}
Without line 4, line 5 reads "-842150451".
You are just reading uninitialized memory, it could be any value.
I don't understand what is going on, I'm trying to create a 2 dimensional array using x and y values specified by the user, and then fill each element one by one with numeric values also specified by the user. My above code was an attempt to try it out with a 1 dimensional array first.
Other users have explained how to use vectors. If the size of the array is set only once, I usually prefer boost::scoped_array, which takes care of deleting when the variable goes out of scope.
For a two-dimensional array whose size is not known at compile time, you need something a little bit trickier, like a scoped_array of scoped_arrays. Creating it will necessarily require a for loop, though.
using boost::scoped_array;
int grid_x;
int grid_y;
///Reading values from user...
scoped_array<scoped_array<int> > grid(new scoped_array<int> [grid_x]);
for (int i = 0; i < grid_x; i++)
grid[i] = scoped_array<int>(new int[grid_y] );
You will be able then to access your grid elements as
grid[x][y];
Note:
It would also work with scoped_array taken out of the game,
typedef int* p_int_t;
p_int_t* grid = new p_int_t [grid_x];
for (int i = 0; i < grid_x; i++)
grid[i] = new int[grid_y];
but then at the end of the array's life you would have to take care of deleting ALL the sub-arrays yourself, as well as the outer array.