size of 2 dimensional vector - c++

I have been trying to figure out the size of a two-dimensional vector and am not able to work it out entirely.
The test program that I have written is as below.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
    vector<int> one(1);
    vector< vector<int> > two(1, vector<int>(1));
    return 0;
}
The memory allocation I see when I check with valgrind is confusing me. After executing the first statement in the main block, I get the below output.
==19882== still reachable: 4 (+4) bytes in 1 (+1) blocks
So far so good. But after running the next statement I get the below log.
==19882== still reachable: 32 (+28) bytes in 3 (+2) blocks
Now this is confusing; I don't know how to account for the extra 28 bytes allocated.
If I change the second line as below
vector < vector<int> > two(1, vector <int>(0));
I get the below log
==19882== still reachable: 32 (+24) bytes in 3 (+1) blocks
Kindly help in understanding how the memory is allocated.

tl;dr
The first case just shows the allocation for the (int) storage managed by the vector. The second shows both the inner vector's int storage, and the storage for the inner vector object itself.
So it's telling you this
vector<int> one(1);
allocates one block of 4 bytes.
It doesn't tell you about the automatic storage for the vector object itself, only the dynamic storage for the single integer: assuming sizeof(int)==4, this seems pretty reasonable.
Next it tells you this:
vector < vector<int> > two(1, vector <int>(1));
allocates two more blocks, 28 bytes in total.
Now, one of those blocks will contain the dynamic storage for the vector<int> - remember the previous instance was an automatic local - and the other block will contain the dynamic storage for the nested vector's single integer.
We can assume the second (single integer) allocation is a single block of 4 bytes, as it was last time. So, the dynamically-allocated vector<int> itself is taking 24 bytes in another single block.
Is 24 bytes a reasonable size for a std::vector instance? That could easily be
template <typename T> class vector {
    T*     begin;
    size_t used;
    size_t allocated;
};
on a platform with 64-bit pointers and size_t. This assumes a stateless allocator, which is probably right.
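If you want to check the 24-byte figure on your own toolchain, a quick sizeof probe is enough (the exact number is implementation-defined, but 24 is typical for 64-bit libstdc++ and libc++):
#include <iostream>
#include <vector>

int main()
{
    // On a typical 64-bit build this prints 24 for both: three pointer-sized
    // members and no allocator state stored in the vector object itself.
    std::cout << "sizeof(std::vector<int>)              = "
              << sizeof(std::vector<int>) << '\n';
    std::cout << "sizeof(std::vector<std::vector<int>>) = "
              << sizeof(std::vector<std::vector<int> >) << '\n';
    return 0;
}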

What is the maximum size of a vector?

Why can't this code handle n=100000000 (8 zeros)?
The code will be terminated by this error:
terminate called after throwing an instance of 'std::bad_alloc'
So, the word "DONE" won't be printed out.
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<long long> v1;
    long long n; cin >> n;
    for(long long i = 0; i < n; i++){
        v1.push_back(i+1);
    }
    cout << "DONE" << endl;
}
Although the maximum size of v1 is 536870911.
What is the maximum size of a vector?
Depends on several factors. Here are some upper limits:
The size of the address space
The maximum representable value of std::vector::size_type
The theoretical upper limit given by std::vector::max_size
Available memory - unless the system overcommits and you never touch the entire vector
Maximum memory available to a single process, which may be limited by the operating system
Available contiguous address space, which can be less than all of the free memory due to fragmentation
The address space isn't an issue in the 64-bit world.
Note that the size of the element type affects the number of elements that fit in a given range of memory.
The strictest limit in your case was most likely the available memory: 536870911 long long elements is one element short of 4 gigabytes.
From cppref max_size() docs:
This value typically reflects the theoretical limit on the size of the container ...
At runtime, the size of the container may be limited to a value smaller than max_size() by the amount of RAM available.
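You can query two of those limits directly. A small probe like this (output differs between platforms and standard libraries) makes the gap between the theoretical ceiling and what you can actually allocate concrete:
#include <iostream>
#include <limits>
#include <vector>

int main()
{
    std::vector<long long> v;
    // max_size() is a theoretical ceiling derived from size_type and the
    // element size; real allocations usually fail with std::bad_alloc
    // long before you get anywhere near it.
    std::cout << "v.max_size()        = " << v.max_size() << '\n';
    std::cout << "size_type max value = "
              << std::numeric_limits<std::vector<long long>::size_type>::max() << '\n';
    return 0;
}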
std::vector contains a pointer to a single dynamically allocated contiguous array. As you call push_back() in your loop, that array has to grow whenever the vector's size() exceeds its capacity(). As the array grows large, it becomes harder to allocate a new, larger contiguous array to copy the old elements into. Eventually the array is so large that a new copy simply can't be allocated anymore, and std::bad_alloc gets thrown.
To avoid all of those reallocations while looping, call the vector's reserve() method before entering the loop:
#include <iostream>
#include <vector>
using namespace std;

int main() {
    vector<long long> v1;
    long long n; cin >> n;
    v1.reserve(n); // <-- ADD THIS!
    for(long long i = 0; i < n; ++i){
        v1.push_back(i+1);
    }
    cout << "DONE" << endl;
}
That way, the array is allocated only 1 time.
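If you want to watch the reallocation behaviour that leads to this, a small probe like the following prints every capacity jump while pushing back (the growth factor is implementation-defined, commonly 1.5x or 2x):
#include <iostream>
#include <vector>

int main()
{
    std::vector<long long> v;
    auto last_cap = v.capacity();
    for (long long i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_cap) {
            last_cap = v.capacity();
            // Each jump means a new, larger contiguous block was allocated
            // and the old elements were copied/moved into it.
            std::cout << "size " << v.size() << " -> capacity " << last_cap << '\n';
        }
    }
    return 0;
}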

multi dimensional array allocation in chunks

I was poking around with multidimensional arrays today, and I came across a blog post which distinguishes between rectangular arrays and jagged arrays; usually I would do this for both jagged and rectangular arrays:
Object** obj = new Object*[5];
for (int i = 0; i < 5; ++i)
{
    obj[i] = new Object[10];
}
but in that blog it was said that if I know the 2D array is rectangular, then I'm better off allocating the entire thing as a 1D array and using computed indexing to access the elements, something like this:
Object* obj = new Object[rows * cols];
obj[x * cols + y];
// which would have been obj[x][y] in the previous implementation
I have a vague sense that allocating one contiguous memory chunk would be good, but I don't really understand how big a difference it makes. Can somebody explain?
First, and less important: when you allocate and free your object you only need to do a single allocation/deallocation.
More important: when you use the array you basically get to trade a multiplication against a memory access. On modern computers, memory access is much much much slower than arithmetic.
That's a bit of a lie, because much of the slowness of memory accesses gets hidden by caches -- regions of memory that are being accessed frequently get stored in fast memory inside, or very near to, the CPU and can be accessed faster. But these caches are of limited size, so (1) if your array isn't being used all the time then the row pointers may not be in the cache and (2) if it is being used all the time then they may be taking up space that could otherwise be used by something else.
Exactly how it works out will depend on the details of your code, though. In many cases it will make no discernible difference one way or the other to the speed of your program. You could try it both ways and benchmark.
[EDITED to add, after being reminded of it by Peter Schneider's comment:] Also, if you allocate each row separately they may end up scattered across memory, which can make your caches a bit less effective -- data gets pulled into cache in chunks, and if you often go from the end of one row to the start of the next, you only benefit from that when the rows are contiguous. But this is a subtle one: in some cases having your rows equally spaced in memory may actually make the cache perform worse, if you allocate several rows in succession they may well end up (almost) next to one another in memory anyway, and in any case it probably doesn't matter much unless your rows are quite short.
Allocating a 2D array as one big chunk permits the compiler to generate more efficient code than doing it in multiple chunks: the one-chunk approach needs only a single pointer dereference per access, versus two for the pointer-to-pointer approach. BTW, declaring the 2D array like this:
Object obj[rows][cols];
obj[x][y];
is equivalent to:
Object* obj = new Object[rows * cols];
obj[x * cols + y];
in terms of speed. But the first one is not dynamic (you need to specify the values of "rows" and "cols" at compile time).
By having one large contiguous chunk of memory, you may get improved performance because there is more chance that memory accesses are already in the cache. This idea is called cache locality. We say the large array has better cache locality. Modern processors have several levels of cache. The lowest level is generally the smallest and the fastest.
It still pays to access the array in meaningful ways. For example, if data is stored in row-major order and you access it in column-major order, you are scattering your memory accesses. At certain sizes, this access pattern will negate the advantages of caching.
Having good cache performance is far preferable to any concerns you may have about multiplying values for indexing.
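If you want to see the effect rather than take it on faith, here is a rough self-contained sketch (timings depend heavily on the machine, the matrix size, and optimization flags) that sums the same flat rows*cols array in both orders:
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    const std::size_t rows = 4096, cols = 4096;
    std::vector<int> a(rows * cols, 1);

    auto time_sum = [&](bool by_rows) {
        const auto t0 = std::chrono::steady_clock::now();
        long long sum = 0;
        if (by_rows) {
            // Sequential walk: consecutive elements, cache- and prefetch-friendly.
            for (std::size_t r = 0; r < rows; ++r)
                for (std::size_t c = 0; c < cols; ++c)
                    sum += a[r * cols + c];
        } else {
            // Strided walk: jumps `cols` ints per step, touching a new cache
            // line on almost every access once the matrix outgrows the caches.
            for (std::size_t c = 0; c < cols; ++c)
                for (std::size_t r = 0; r < rows; ++r)
                    sum += a[r * cols + c];
        }
        const auto t1 = std::chrono::steady_clock::now();
        std::cout << (by_rows ? "row-major:    " : "column-major: ")
                  << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count()
                  << " ms (sum = " << sum << ")\n";
    };

    time_sum(true);
    time_sum(false);
    return 0;
}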
If one of the dimensions of your array is a compile time constant you can allocate a "truly 2-dimensional array" in one chunk dynamically as well and then index it the usual way. Like all dynamic allocations of arrays, new returns a pointer to the element type. In this case of a 2-dimensional array the elements are in turn arrays -- 1-dimensional arrays. The syntax of the resulting element pointer is a bit cumbersome, mostly because the dereferencing operator*() has a lower precedence than the indexing operator[](). One possible allocation statement could be int (*arr7x11)[11] = new int[7][11];.
Below is a complete example. As you see, the innermost index in the allocation can be a run-time value; it determines the number of elements in the allocated array. The other indices determine the element type (and hence element size as well as overall size) of the dynamically allocated array, which of course must be known to perform the allocation. As discussed above, the elements are themselves arrays, here 1-dimensional arrays of 11 ints.
#include <cstdio>
using namespace std;

int main(int argc, char **argv)
{
    constexpr int cols = 11;
    int rows = 7;

    // overwrite with cmd line arg if present.
    // if sscanf fails, default is retained.
    if(argc >= 2) { sscanf(argv[1], "%d", &rows); }

    // The actual allocation of "rows" elements of
    // type "array of 'cols' ints". Note the brackets
    // around *arr7x11 in order to force operator
    // evaluation order. arr7x11 is a pointer to array,
    // not an array of pointers.
    int (*arr7x11)[cols] = new int[rows][cols];

    for(int row = 0; row < rows; row++)
    {
        for(int col = 0; col < cols; col++)
        {
            arr7x11[row][col] = (row+1)*1000 + col+1;
        }
    }

    for(int row = 0; row < rows; row++)
    {
        for(int col = 0; col < cols; col++)
        {
            printf("%6d", arr7x11[row][col]);
        }
        putchar('\n');
    }

    delete[] arr7x11;
    return 0;
}
A sample session:
g++ -std=c++14 -Wall -o 2darrdecl 2darrdecl.cpp && ./2darrdecl 3
1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011
2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011
3001 3002 3003 3004 3005 3006 3007 3008 3009 3010 3011

hashtable needs lots of memory

I've declared and defined the following HashTable class. Note that I needed a hashtable of hashtables, so my HashEntry struct contains a HashTable pointer. The public part is not a big deal; it has the traditional hash table functions, so I removed them for simplicity.
enum Status{ ACTIVE, DELETED, EMPTY };
enum Type{ DNS_ENTRY, URL_ENTRY };

class HashTable{
private:
    struct HashEntry{
        std::string key;
        Status current_status;
        std::string ip;
        int access_count;
        Type entry_type;
        HashTable *table;
        HashEntry(
            const std::string &k = std::string(),
            Status s = EMPTY,
            const std::string &u = std::string(),
            const int &a = int(),
            Type e = DNS_ENTRY,
            HashTable *t = NULL
        ): key(k), current_status(s), ip(u), access_count(a), entry_type(e), table(t){}
    };
    std::vector<HashEntry> array;
    int currentSize;
public:
    HashTable(int size = 1181, int csz = 0): array(size), currentSize(csz){}
};
I am using quadratic probing and I double the size of the vector in my rehash function when I hit array.size()/2. The following list is used when a larger table size is needed.
int a[16] = {49663, 99907, 181031, 360461,...}
My problem is that this class consumes so much memory. I've just profiled it with massif and found out that it needs 33 MB (33 million bytes!) of memory for 125,000 insertions. To be clear, here are the actual numbers:
1 insertion       -> 47352 bytes
8 insertions      -> 48376 bytes
512 insertions    -> 76.27 KB
1000 insertions   -> 2 MB  (array size increased to 49663 here)
27000 insertions  -> 8 MB  (array size increased to 99907 here)
64000 insertions  -> 16 MB (array size increased to 181031 here)
125000 insertions -> 33 MB (array size increased to 360461 here)
These may be unnecessary but I just wanted to show you how memory usage changes with the input. As you can see, when rehashing is done, memory usage doubles. For example, our initial array size was 1181. And we have just seen that 125000 elements -> 33MB.
To debug the problem, I changed the initial size to 360461. Now 127,000 insertions need no rehashing, and I see that 20 MB of memory is used with this initial value. That is still huge, but I think it suggests there is a problem with rehashing. The following is my rehash function.
void HashTable::rehash(){
    std::vector<HashEntry> oldArray = array;
    array.resize(nextprime(array.size()));
    for(int j = 0; j < array.size(); j++){
        array[j].current_status = EMPTY;
    }
    for(int i = 0; i < oldArray.size(); i++){
        if(oldArray[i].current_status == ACTIVE){
            insert(oldArray[i].key);
            int pos = findPos(oldArray[i].key);
            array[pos] = oldArray[i];
        }
    }
}

int nextprime(int arraysize){
    int a[16] = {49663, 99907, 181031, 360461, 720703, 1400863, 2800519,
                 5600533, 11200031, 22000787, 44000027};
    int i = 0;
    while(arraysize >= a[i]){ i++; }
    return a[i];
}
This is the insert function used in rehashing and everywhere else.
bool HashTable::insert(const std::string &k){
    int currentPos = findPos(k);
    if(isActive(currentPos)){
        return false;
    }
    array[currentPos] = HashEntry(k, ACTIVE);
    if(++currentSize > array.size() / 2){
        rehash();
    }
    return true;
}
What am I doing wrong here? Even if rehashing is partly to blame, with no rehashing it is still 20 MB, and I believe 20 MB is way too much for 100k items. This hash table is supposed to hold around 8 million elements.
The fact that 360,461 HashEntry objects take 20 MB is hardly surprising. Did you try looking at sizeof(HashEntry)?
Each HashEntry includes two std::strings, a pointer, and three int's. As the old joke has it, it's not easy to answer the question "How long is a string?", in this case because there are a large variety of string implementations and optimizations, so you might find that sizeof(std::string) is anywhere between 4 and 32 bytes. (It would only be 4 bytes on a 32-bit architecture.) In practice, a string requires three pointers and the string itself unless it happens to be empty. If sizeof(std::string) is the same as sizeof(void*), then you've probably got a not-too-recent GNU standard library, in which the std::string is an opaque pointer to a block containing two pointers, a reference count, and the string itself. If sizeof(std::string) is 32 bytes, then you might have a recent GNU standard library implementation in which there is a bit of extra space in the string structure for the short-string optimization. See the answer to Why does libc++'s implementation of std::string take up 3x memory as libstdc++? for some measurements. Let's just say 32 bytes per string, and ignore the details; it won't be off by much.
So two strings (32 bytes each) plus a pointer (8 bytes) plus three ints (another 12 bytes) and four bytes of padding because one of the ints is between two 8-byte aligned objects, and that's a total of 88 bytes per HashEntry. And if you have 360,461 hash entries, that would be 31,720,568 bytes, about 30 MB. The fact that you're "only" using 20MB is probably because you're using the old GNU standard library, which optimizes empty strings to a single pointer, and the majority of your strings are empty strings because half the slots have never been used.
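If you want to check that arithmetic on your own toolchain, a stand-in struct with the same field layout is enough (illustrative names, plain ints standing in for the enums; the numbers depend on your standard library and ABI):
#include <iostream>
#include <string>

// Mirror of the HashEntry layout: two strings, three int-sized fields,
// and a pointer. On a 64-bit libstdc++ build this typically prints 88.
struct Probe {
    std::string key;
    int         current_status;
    std::string ip;
    int         access_count;
    int         entry_type;
    void*       table;
};

int main()
{
    std::cout << "sizeof(std::string) = " << sizeof(std::string) << '\n';
    std::cout << "sizeof(Probe)       = " << sizeof(Probe) << '\n';
    return 0;
}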
Now, let's take a look at rehash. Reduced to its essentials:
void rehash() {
    std::vector<HashEntry> tmp = array;        /* Copy the entire array */
    array.resize(new_size());                  /* Internally does another copy */
    for (auto const& entry : tmp)
        if (entry.used()) array.insert(entry); /* Yet another copy */
}
At peak, we had two copies of the smaller array as well as the new big array. Even if the new array is only 20 MB, it's not surprising that peak memory usage was almost twice that. (Indeed, this is again surprisingly small, not surprisingly big. Possibly it was not actually necessary to change the address of the new vector because it was at the end of the current allocated memory space, which could just be extended.)
Note that we did two copies of all that data, and array.resize() potentially did another one. Let's see if we can do better:
void rehash() {
    std::vector<HashEntry> tmp(new_size());    /* Make an array of default objects */
    for (auto const& entry : array)
        if (entry.used()) tmp.insert(entry);   /* Copy into the new array */
    std::swap(tmp, array);                     /* Not a copy, just swap three pointers */
}
This way, we only do one copy. Instead of a (possible) internal copy by resize, we do a bulk construction of the new elements, which should be similar. (It's just zeroing out the memory.)
Also, in the new version we only copy the actual strings once each, instead of twice each, which is the fiddliest part of the copy and thus probably quite a large saving.
Proper string management could reduce that overhead further. rehash doesn't actually need to copy the strings, since they are not changed. So we could keep the strings elsewhere, say in a vector of strings, and just use the index into the vector in the HashEntry. Since you are not expecting to hold billions of strings, only millions, the index could be a four-byte int. By also shuffling the HashEntry fields around and reducing the enums to a byte instead of four bytes (in C++11, you can specify the underlying integer type of an enum), the HashEntry could be reduced to 24 bytes, and there wouldn't be a need to leave space for as many string descriptors.
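To make that concrete, here is a hypothetical sketch of the compacted entry (field names are illustrative, not from the asker's code); the strings live once in pools owned by the table, and each entry just stores 4-byte indices plus byte-sized enums:
#include <cstdint>
#include <string>
#include <vector>

class HashTable; // forward declaration, as in the original design

enum class Status : std::uint8_t { ACTIVE, DELETED, EMPTY };
enum class Type   : std::uint8_t { DNS_ENTRY, URL_ENTRY };

struct CompactHashEntry {
    HashTable*   table;          // 8 bytes on a 64-bit platform
    std::int32_t key_index;      // index into a shared std::vector<std::string> of keys
    std::int32_t ip_index;       // index into a shared std::vector<std::string> of IPs
    std::int32_t access_count;   // 4 bytes
    Status       current_status; // 1 byte
    Type         entry_type;     // 1 byte
    // + 2 bytes of padding -> typically sizeof(CompactHashEntry) == 24
};

// The shared string pools would live in the table itself, e.g.:
//   std::vector<std::string> keys, ips;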
Since you are using open addressing, half your hash slots have to be empty. Since HashEntry is quite large, storing a full HashEntry in each empty slot is terribly wasteful.
You should store your HashEntry structs somewhere else and put HashEntry* in your hash table, or switch to chaining with a much denser load factor. Either one will reduce this waste.
Also, if you're going to move HashEntry objects around, swap instead of copying, or use move semantics so you don't have to copy so many strings. Be sure to clear out the strings in any entries you're no longer using.
Also, even though you say you need HashTables of HashTables, you don't really explain why. It's usually more efficient to use one hash table with efficiently represented compound keys if small hash tables are not memory-efficient.
I have changed my structure a little bit just as you all suggested, but there is this one thing that nobody has noticed.
When rehashing/resizing is done, my rehash function calls insert. In that insert function I increment currentSize, which holds how many elements the hash table has. So each time a resize happened, currentSize doubled when it should have stayed the same. I removed that line and wrote proper rehashing code, and now I think I'm okay.
I am using two different structs now, and the program consumes 1.6 GB of memory for 8 million elements, which is what is expected given the multi-byte strings and integers. That number was around 7-8 GB before.

3d stl vector memory size issue

std::vector< std::vector< std::vector<int> > > sp(1, std::vector< std::vector<int> >(1,std::vector<int>(1)));
What should be the memory allocated for this 3d vector?
Massif shows 84 bytes, but shouldn't it be close to the size of an int (4 bytes)?
When you use the STL you have to consider that your data structures are not composed of the data alone but also of metadata. They are objects, not raw memory regions.
For each vector object, you have several attributes. Look at:
http://en.cppreference.com/w/cpp/container/vector
Normally a single std::vector is implemented using 3 pointers
a pointer to the begin of the allocated area
a pointer to the end of valid data
a pointer to the end of the allocated area (reserved space)
thus on a 64-bit platform it's at least 3x8 = 24 bytes in addition to actual content, of course.
A 3D vector holding one integer would therefore occupy at least 24x3 + sizeof(int) = 76 bytes, assuming 4-byte integers. With 8-byte integers it would be 80 bytes, not counting any extra alignment or bookkeeping added by the heap allocator.
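One way to reproduce that arithmetic on your own platform is to tally every request that goes through the global operator new, which is what std::allocator ultimately calls for the vectors' dynamic storage. This is a blunt instrument: it also counts the temporaries created while building the nested vector, so the total can come out somewhat higher than what stays live afterwards.
#include <cstdlib>
#include <iostream>
#include <new>
#include <vector>

// Running total of bytes requested from the global heap.
static std::size_t total_requested = 0;

void* operator new(std::size_t n)
{
    total_requested += n;
    if (void* p = std::malloc(n)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }

int main()
{
    total_requested = 0; // ignore anything allocated during static initialization

    std::vector< std::vector< std::vector<int> > > sp(
        1, std::vector< std::vector<int> >(1, std::vector<int>(1)));

    const std::size_t bytes = total_requested;
    std::cout << "heap bytes requested while building sp: " << bytes << '\n';
    std::cout << "sizeof(sp) itself (automatic storage):  " << sizeof(sp) << '\n';
    return 0;
}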
By hand calculation, 84 would correspond to each vector holding 7 ints' worth of space at the start (7*sizeof(int)*3 = 84), though vectors don't normally preallocate like that.
Why should it be close to 4 bytes? std::vector is a class with more than just the element data! If you are short on memory you should probably not use std::vector, and instead use a plain array or your own implementation of an ArrayList that stays closer to the raw array size.

Nested STL vector using way too much memory

I have an STL vector My_Partition_Vector of Partition objects, defined as
struct Partition // the event log data structure
{
    int key;
    std::vector<std::vector<char> > partitions;
    float modularity;
};
The actual nested structure of Partition.partitions varies from object to object, but the total number of chars stored in Partition.partitions is always 16.
I assumed therefore that the total size of the object should be more or less 24 bytes (16 + 4 + 4). However, for every 100,000 items I add to My_Partition_Vector, memory consumption (found using ps -aux) increases by around 20 MB, indicating around 209 bytes for each Partition object.
This is nearly a 9-fold increase!? Where is all this extra memory usage coming from? Some kind of padding in the STL vector, or the struct? How can I resolve this (and stop it reaching into swap)?
For one thing, std::vector models a dynamic array, so if you know you'll always have 16 chars in partitions, using std::vector is overkill. Use a good old C-style array/matrix, boost::array, or boost::multi_array.
To reduce the number of reallocations needed when inserting/adding elements (a consequence of its contiguous memory layout), std::vector is allowed to preallocate memory for a certain number of elements upfront (its capacity() member function will tell you how much).
While I think he may be overstating the situation just a tad, I'm in general agreement with DeadMG's conclusion that what you're doing is asking for trouble.
Although I'm generally the one looking at (whatever mess somebody has made) and saying "don't do that, just use a vector", this case might well be an exception. You're creating a huge number of objects that should be tiny. Unfortunately, a vector typically looks something like this:
template <class T>
class vector {
    T*     data;
    size_t allocated;
    size_t valid;
public:
    // ...
};
On a typical 32-bit machine, that's twelve bytes already. Since you're using a vector<vector<char> >, you're going to have 12 bytes for the outer vector, plus twelve more for each vector it holds. Then, when you actually store any data in your vectors, each of those needs to allocate a block of memory from the free store. Depending on how your free store is implemented, you'll typically have a minimum block size -- frequently 32 or even 64 bytes. Worse, the heap typically has some overhead of its own, so it'll add some more memory onto each block, for its own book-keeping (e.g., it might use a linked list of blocks, adding another pointer worth of data to each allocation).
Just for grins, let's assume you average four vectors of four bytes apiece, and that your heap manager has a 32-byte minimum block size and one extra pointer (or int) for its bookkeeping (giving a real minimum of 36 bytes per block). Multiplying that out, I get 204 bytes apiece -- close enough to your 209 to believe that's reasonably close to what you're dealing with.
The question at that point is how to deal with the problem. One possibility is to try to work behind the scenes. All the containers in the standard library use allocators to get their memory. While the default allocator gets memory directly from the free store, you can substitute a different one if you choose. If you do some looking around, you can find any number of alternative allocators, many/most of which are designed to help with exactly the situation you're in -- reducing wasted memory when allocating lots of small objects. A couple to look at would be the Boost Pool Allocator and the Loki small object allocator.
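If you want to experiment with the allocator route, a minimal sketch using Boost.Pool might look like this (assuming Boost is available; whether it actually reduces the footprint depends on the pool's chunk sizing, so measure rather than assume):
#include <vector>
#include <boost/pool/pool_alloc.hpp>

// The inner vectors draw their storage from a shared pool instead of hitting
// the general-purpose heap for every tiny 4- or 16-byte block.
typedef std::vector<char, boost::pool_allocator<char> > pooled_chars;

struct Partition
{
    int key;
    std::vector<pooled_chars> partitions; // outer vector left on the default allocator
    float modularity;
};

int main()
{
    std::vector<Partition> My_Partition_Vector(100000);
    for (Partition& p : My_Partition_Vector)
        p.partitions.assign(4, pooled_chars(4, 'x')); // 4x4 = 16 chars each
    return 0;
}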
Another possibility (that can be combined with the first) would be to quit using a vector<vector<char> > at all, and replace it with something like:
char partitions[16];
struct parts {
    int part0 : 4;
    int part1 : 4;
    int part2 : 4;
    int part3 : 4;
    int part4 : 4;
    int part5 : 4;
    int part6 : 4;
    int part7 : 4;
};
For the moment, I'm assuming a maximum of 8 partitions -- if it could be 16, you can add more to parts. This should probably reduce memory usage quite a bit more, but (as-is) will affect your other code. You could also wrap this up into a small class of its own that provides 2D-style addressing to minimize impact on the rest of your code.
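Just to make the "wrap it in a small class" idea concrete, here is a rough sketch (the names, the nibble packing into 8 bytes, and the 2D accessor are all mine, not taken from the question) of what such a wrapper could look like:
#include <cstdint>

// 16 four-bit partition values packed into 8 bytes, with a 2D-style accessor.
class PackedPartitions {
    std::uint8_t bytes[8] = {};        // 16 nibbles of storage
public:
    int get(int index) const {         // index in [0, 16)
        const std::uint8_t b = bytes[index / 2];
        return (index % 2) ? (b >> 4) : (b & 0x0F);
    }
    void set(int index, int value) {   // value in [0, 16)
        std::uint8_t& b = bytes[index / 2];
        if (index % 2) b = static_cast<std::uint8_t>((b & 0x0F) | ((value & 0x0F) << 4));
        else           b = static_cast<std::uint8_t>((b & 0xF0) | (value & 0x0F));
    }
    int at(int row, int col, int cols) const { return get(row * cols + col); }
};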
If you store a near-constant number of objects, then I suggest using a 2-dimensional array.
One possible reason for extra memory consumption is debug data: STL implementations can store a lot of extra bookkeeping in debug builds. Never profile an application with debug flags on.
...This is a bit of a side conversation, but boost::multi_array was suggested as an alternative to the OP's use of nested vectors. My finding was that multi_array was using a similar amount of memory when applied to the OP's operating parameters.
I derived this code from the example at Boost.MultiArray. On my machine, this showed multi_array using about 10x more memory than ideally required assuming that the 16 bytes are arranged in a simple rectangular geometry.
To evaluate the memory usage, I checked the system monitor while the program was running and I compiled with
( export CXXFLAGS="-Wall -DNDEBUG -O3" ; make main && ./main )
Here's the code...
#include <iostream>
#include <vector>
#include "boost/multi_array.hpp"
#include <tr1/array>
#include <cassert>

#define USE_CUSTOM_ARRAY 0 // set to 1 to compare memory usage of my custom array vs. boost::multi_array

using std::cerr;
using std::vector;

#if USE_CUSTOM_ARRAY
template< typename T, int YSIZE, int XSIZE >
class array_2D
{
    std::tr1::array<T,YSIZE*XSIZE> data;
public:
    T & operator () ( int y, int x ) { return data[y*XSIZE+x]; } // preferred accessor (avoid pointers)
    T * operator [] ( int index ) { return &data[index*XSIZE]; } // alternative accessor (mimics boost::multi_array syntax)
};
#endif

int main ()
{
    int COUNT = 1024*1024;

#if USE_CUSTOM_ARRAY
    vector< array_2D<char,4,4> > A( COUNT );
    typedef int index;
#else
    typedef boost::multi_array<char,2> array_type;
    typedef array_type::index index;
    vector<array_type> A( COUNT, array_type(boost::extents[4][4]) );
#endif

    // Assign values to the elements
    int values = 0;
    for ( int n=0; n<COUNT; n++ )
        for(index i = 0; i != 4; ++i)
            for(index j = 0; j != 4; ++j)
                A[n][i][j] = values++;

    // Verify values
    int verify = 0;
    for ( int n=0; n<COUNT; n++ )
        for(index i = 0; i != 4; ++i)
            for(index j = 0; j != 4; ++j)
            {
                assert( A[n][i][j] == (char)((verify++)&0xFF) );
#if USE_CUSTOM_ARRAY
                assert( A[n][i][j] == A[n](i,j) ); // testing accessors
#endif
            }

    cerr <<"spinning...\n";
    while ( 1 ) {} // wait here (so you can check memory usage in the system monitor)

    return 0;
}
On my system, sizeof(vector) is 24. This probably corresponds to 3 8-byte members: capacity, size, and pointer. Additionally, you need to consider the actual allocations which would be between 1 and 16 bytes (plus allocation overhead) for the inner vector and between 24 and 384 bytes for the outer vector ( sizeof(vector) * partitions.capacity() ).
I wrote a program to sum this up...
for ( int Y=1; Y<=16; Y++ )
{
    const int X = 16/Y;
    if ( X*Y != 16 ) continue; // ignore imperfect geometries

    Partition a;
    a.partitions = vector< vector<char> >( Y, vector<char>(X) );

    int sum = sizeof(a);                                   // main structure
    sum += sizeof(vector<char>) * a.partitions.capacity(); // outer vector
    for ( int i=0; i<(int)a.partitions.size(); i++ )
        sum += sizeof(char) * a.partitions[i].capacity();  // inner vector

    cerr <<"X="<<X<<", Y="<<Y<<", size = "<<sum<<"\n";
}
The results show how much memory (not including allocation overhead) is needed for each simple geometry...
X=16, Y=1, size = 80
X=8, Y=2, size = 104
X=4, Y=4, size = 152
X=2, Y=8, size = 248
X=1, Y=16, size = 440
Look at the how the "sum" is calculated to see what all of the components are.
The results posted are based on my 64-bit architecture. If you have a 32-bit architecture the sizes would be almost half as much -- but still a lot more than what you had expected.
In conclusion, std::vector<> is not very space efficient for doing a whole bunch of very small allocations. If your application is required to be efficient, then you should use a different container.
My approach to solving this would probably be to allocate the 16 chars with
std::tr1::array<char,16>
and wrap that with a custom class that maps 2D coordinates onto the array allocation.
Below is a very crude way of doing this, just as an example to get you started. You would have to change this to meet your specific needs -- especially the ability to specify the geometry dynamically.
template< typename T, int YSIZE, int XSIZE >
class array_2D
{
    std::tr1::array<T,YSIZE*XSIZE> data;
public:
    T & operator () ( int y, int x ) { return data[y*XSIZE+x]; } // preferred accessor (avoid pointers)
    T * operator [] ( int index ) { return &data[index*XSIZE]; } // alternative accessor (mimics boost::multi_array syntax)
};
For 16 bytes of payload, this is a complete and total waste: you're storing a hell of a lot of metadata about very small objects. A vector of vectors is the wrong solution to use. You should log sizeof(vector) - it's not insignificant, as it performs a substantial function. On my compiler, sizeof(vector) is 20. So each Partition is 4 + 4 + 16 + 20 + 20 * number of inner partitions, plus memory overheads like the vectors not being the perfect size.
You're only storing 16 bytes of data, and wasting ridiculous amounts of memory allocating them in the most segregated, highest overhead way you could possibly think of. The vector doesn't use a lot of memory - you have a terrible design.