What is QList's maximum size? - c++

I have a QList of pointers to my objects and have found that it silently throws an error when it reaches the 268,435,455th item, which is exactly 28 bits. I would have expected it to have at least a 31bit maximum size (minus one bit because size() returns a signed integer), or a 63bit maximum size on my 64bit computer, but this doesn't appear to be the case. I have confirmed this in a minimal example by executing QList<void*> mylist; mylist.append(0); in a counting loop.
To restate the question: what is the actual maximum size of QList? If it's not 2^31 - 1, why not? Is there a workaround?
I'm running a Windows 64bit build of Qt 4.8.5 for MSVC2010.
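Roughly (from memory, not the exact code), the counting loop looked like this:
#include <QList>
#include <QDebug>

int main()
{
    QList<void*> mylist;
    for (qint64 i = 0; ; ++i) {
        mylist.append(0);          // fails once the list needs to grow past ~2^28 items
        if (i % 10000000 == 0)
            qDebug() << i;         // progress marker so the failure point is visible
    }
}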

While the other answers make a useful attempt at explaining the problem, none of them actually answers the question itself. Thanks to everyone for helping me track down the issue.
As Ali Mofrad mentioned, the error thrown is a std::bad_alloc error when the QList fails to allocate additional space in my QList::append(MyObject*) call. Here's where that happens in the Qt source code:
qlist.cpp: line 62:
static int grow(int size) //size = 268435456
{
    //this is the problem line
    volatile int x = qAllocMore(size * sizeof(void *), QListData::DataHeaderSize) / sizeof(void *);
    return x; //x = -2147483648
}
qlist.cpp: line 231:
void **QListData::append(int n) //n = 1
{
    Q_ASSERT(d->ref == 1);
    int e = d->end;
    if (e + n > d->alloc) {
        int b = d->begin;
        if (b - n >= 2 * d->alloc / 3) {
            //...
        } else {
            realloc(grow(d->alloc + n)); //<-- grow() is called here
        }
    }
    d->end = e + n;
    return d->array + e;
}
In grow(), the newly requested size (268,435,456 elements) is multiplied by sizeof(void*) (8) to compute the size in bytes of the new block of memory needed to accommodate the growing QList. The problem is that 268,435,456 * 8 = 2,147,483,648, which does not fit in a signed 32-bit int: it wraps around to -2,147,483,648, and that is what grow() returns on my machine. Therefore, when std::realloc() is called in QListData::realloc(int), we're trying to grow to a negative size.
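To see the arithmetic concretely, here is a small standalone sketch (plain C++, not Qt code) of the overflow grow() runs into; the wrap-around value assumes the usual two's-complement behaviour:
#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
    const std::int64_t count = 268435456;      // 2^28, the element count grow() is asked for
    const std::int64_t bytes = count * 8;      // times sizeof(void*) on a 64-bit build

    std::cout << "bytes needed : " << bytes << '\n';                            // 2147483648
    std::cout << "INT_MAX      : " << std::numeric_limits<int>::max() << '\n';  // 2147483647
    // Qt 4 does this arithmetic in int, so the result wraps to a negative size.
    std::cout << "as 32-bit int: " << static_cast<std::int32_t>(bytes) << '\n'; // -2147483648
    return 0;
}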
The workaround here, as ddriver suggested, is to use QList::reserve() to pre-allocate the space, preventing my QList from ever having to grow.
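A minimal sketch of that workaround (it needs several gigabytes of free RAM to actually run to completion):
#include <QList>

int main()
{
    QList<void*> list;
    // Pre-allocating means append() never has to call grow(), so the list can
    // exceed 2^28 items, up to the int-based limit of 2^31 - 1 (memory permitting).
    list.reserve(300000000);                  // > 2^28
    for (int i = 0; i < 300000000; ++i)
        list.append(0);
    return 0;
}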
In short, the maximum size for QList is 2^28-1 items unless you pre-allocate, in which case the maximum size truly is 2^31-1 as expected.
Update (Jan 2020): This appears to have changed in Qt 5.5, such that 2^28-1 is now the maximum size allowed for QList and QVector, regardless of whether or not you reserve in advance. A shame.

The theoretical maximum positive number stored in an int is 2^31 - 1. The size of a pointer is 4 bytes (on a 32-bit machine), so the maximum possible number of pointers in one block is 2^29 - 1. Appending data to the container increases heap fragmentation, so it is possible that you can only allocate about half of the theoretically available memory. Try using reserve() or resize() instead.
Moreover, Win32 imposes limits on memory allocation, so an application compiled without special options cannot allocate more than that limit (1 GB or 2 GB).
Are you sure you really need such huge containers? Might it be better to optimize the application?

QList stores its elements in a void * array.
Hence, a list with 2^28 items, each of which is a void *, will be 2^30 bytes long on a 32-bit machine, and 2^31 bytes on a 64-bit machine. I doubt you can request such a big chunk of contiguous memory.
And why allocate such a huge list anyhow? Are you sure you really need it?
The idea behind being backed by an array of void * elements is that several operations on the list can be moved into non-templated code, reducing the amount of generated code.
QList stores items straight in the void * array if the type is small enough (i.e. sizeof(T) <= sizeof(void*)), and if the type can be moved in memory via memmove. Otherwise, each item will be allocated on the heap via new, and the array will store the pointers to those items. A set of type traits is used to figure out how to handle each type, see Q_DECLARE_TYPEINFO.
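For instance, a small hypothetical value type can be declared movable so that QList keeps it directly in the array; Point2D below is made up for illustration:
#include <QList>

// A small, trivially copyable value type (hypothetical example).
struct Point2D
{
    short x;
    short y;
};

// Tell Qt that Point2D can be relocated with memcpy. Because
// sizeof(Point2D) <= sizeof(void*), QList<Point2D> then stores the values
// directly in its void* array instead of heap-allocating each element.
Q_DECLARE_TYPEINFO(Point2D, Q_MOVABLE_TYPE);

int main()
{
    QList<Point2D> points;
    Point2D p = { 1, 2 };
    points.append(p);
    return points.size() == 1 ? 0 : 1;
}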
While in theory this approach may sound attractive, in practice:
For all primitive types smaller than void * (char; int and float on 64 bit; etc.) you waste from 50 to 75% of the allocated space in the array
For all movable types bigger than void * (double on 32-bit, QVariant, ...), you pay a heap allocation for each item in the list (plus the array itself)
QList code is generally less optimized than QVector's
Compilers these days do a pretty good job at merging template instantiations, hence the original reason for this design gets lost.
Today it's a much better idea to stick with QVector. Unfortunately, the Qt APIs expose QList everywhere and can't be changed (and we would need C++11 to define QList as a template alias for QVector...)

I tested this on 32-bit Ubuntu with 4 GB RAM using Qt 4.8.6. The maximum size for me was 268,435,450.
I tested this on 32-bit Windows 7 with 4 GB RAM using Qt 4.8.4. The maximum size for me was 134,217,722.
This error happened: 'std::bad_alloc'
#include <QCoreApplication>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QList<bool> li;
    for(int i = 0; ; i++)
    {
        li.append(true);
        if(i > 268435449)
            qDebug() << i;
    }
    return a.exec();
}
Output is:
268435450
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Related

Why does std::set container use much more memory than the size of its data?

For example, suppose we have 10^7 32-bit integers. The memory needed to store these integers in an array is 32 * 10^7 / 8 bytes = 40 MB. However, inserting the same 10^7 integers into a set takes more than 300 MB of memory. Code:
#include <iostream>
#include <set>

int main(int argc, const char * argv[]) {
    std::set<int> aa;
    for (int i = 0; i < 10000000; i++)
        aa.insert(i);
    return 0;
}
Other containers such as map and unordered_set take even more memory in similar tests. I know that set is implemented as a red-black tree, but the data structure itself does not explain the high memory usage.
I am wondering what the reason is for this 5-8x memory usage compared to the raw data, and what workarounds or alternatives exist for a more memory-efficient set.
Let's examine the std::set implementation in GCC (which is not much different in other compilers). On GCC, std::set is implemented as a red-black tree. Each node has pointers to its parent, left, and right nodes, plus a color enumerator (_S_red and _S_black). This means that besides the int (which is probably 4 bytes) there are 3 pointers (8 * 3 = 24 bytes on a 64-bit system) and one enumerator (since it comes before the pointers in _Rb_tree_node_base, it is padded to an 8-byte boundary, so it effectively takes an extra 8 bytes).
So far I have counted 24 + 8 + 4 = 36 bytes for each integer in the set. But since the node has to be aligned to 8 bytes, it has to be padded so that its size is divisible by 8. Which means each node takes 40 bytes (10 times bigger than int).
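A rough model of that layout (this mirrors the description above, not the actual libstdc++ headers) reproduces the same number:
#include <cstdio>

enum RbColor { kRed, kBlack };

// An approximation of what each std::set<int> node carries internally.
struct RbNodeModel
{
    RbColor      color;   // enumerator, padded out to 8 bytes on a 64-bit system
    RbNodeModel* parent;
    RbNodeModel* left;
    RbNodeModel* right;
    int          value;   // the stored int, plus tail padding to 8-byte alignment
};

int main()
{
    // Typically prints 40 on a 64-bit system:
    // 8 (color + padding) + 3 * 8 (pointers) + 4 (int) + 4 (tail padding).
    std::printf("per-node size: %zu bytes\n", sizeof(RbNodeModel));
    return 0;
}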
But this is not all. Each such node is allocated by std::allocator. This allocator uses new to allocate each node. Since delete can't know how much memory to free, each node also has some heap-related meta-data. The meta-data should at least contain the size of the allocated block, which usually takes 8 bytes (in theory, it is possible to use some kind of Huffman coding and store only 1 byte in most cases, but I never saw anybody do that).
Considering everything, the total for each int node is 48 bytes. This is 12 times more than int. Every int in the set takes 12 times more than it would have taken in an array or a vector.
Your numbers suggest that you are on a 32-bit system, since your data takes only 300 MB. On a 32-bit system, pointers take 4 bytes, which makes it 3 * 4 + 4 = 16 bytes of red-black-tree-related data per node, plus 4 for the int and 4 for the heap meta-data. That totals 24 bytes for each int instead of 4, about 6 times larger than a vector for a big set. The measured numbers actually suggest that the heap meta-data takes 8 bytes rather than 4 (maybe due to an alignment constraint).
So on your system, instead of 40MB (had it been std::vector), it is expected to take 280MB.
If you want to save some peanuts, you can use a non-standard allocator for your sets. You can avoid the metadata overhead by using boost's Segregated storage node allocators. But that is not such a big win in terms of memory. It could boost your performance, though, since the allocators are simpler than the code in new and delete.
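A sketch of what that swap might look like, assuming Boost.Pool is available (the header and allocator name follow Boost's documentation; only the per-node heap meta-data goes away, the node layout itself is unchanged):
#include <set>
#include <boost/pool/pool_alloc.hpp>

int main()
{
    // Same test as above, but nodes come from a segregated-storage pool
    // instead of individual new/delete calls.
    std::set<int, std::less<int>, boost::fast_pool_allocator<int> > aa;
    for (int i = 0; i < 10000000; i++)
        aa.insert(i);
    return 0;
}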

Cannot resize vector to 1765880295

I want to allocate a vector of size 1765880295, so I used resize(1765880295), but the program stops running. The accompanying symptom is that Code::Blocks stops responding.
What is wrong?
Although max_size() gives 4294967295, which is greater than 1765880295, the problem is the same even without resizing the vector.
Depending on what is stored in the vector (for example, a 32-bit pointer or uint32), the size of the vector (number of elements * size of each element) can exceed the maximum addressable space on a 32-bit system.
The max_size is dependent on the implementation (some may have 1073741823 as their max_size). But even if your implementation supports a bigger number, the program will fail if there is not enough memory.
For example: if you have a vector<int> and sizeof(int) == 4 bytes, we do the math:
1765880295 * 4 bytes = 7063521180 bytes ≈ 6.578 gibibytes
So you would require around 6.6GiB of free memory to allocate that enormous vector.
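A small sketch of checking those numbers up front and catching the failure instead of letting the program appear to hang (on a 64-bit machine with enough memory the resize may simply succeed):
#include <exception>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    const unsigned long long n = 1765880295ULL;

    std::cout << "max_size()   : " << v.max_size() << '\n';
    std::cout << "bytes needed : " << n * sizeof(int) << '\n';   // ~7.06e9 bytes, about 6.6 GiB

    try {
        v.resize(n);   // throws std::length_error if n > max_size(),
                       // or std::bad_alloc if the memory cannot be obtained
    } catch (const std::exception& e) {
        std::cout << "resize failed: " << e.what() << '\n';
    }
    return 0;
}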

Getting User store segfault error

I am receiving the error "User store segfault # 0x000000007feff598" for a large convolution operation.
I have defined the resultant array as
int t3_isize = 0;
int t3_irowcount = 0;
t3_irowcount=atoi(argv[2]);
t3_isize = atoi(argv[3]);
int iarray_size = t3_isize*t3_irowcount;
uint64_t t_result[iarray_size];
I noticed that if the array size is greater than 2^16 - 1, the operation doesn't fail, but for the array size 2^16 or higher, I get the segfault error.
Any idea why this is happening? And how can i rectify this?
“I noticed that if the array size is greater than 2^16 - 1, the operation doesn't fail, but for the array size 2^16 or higher, I get the segfault error”
↑ Seems a bit self-contradictory.
But probably you're just allocating too large an array on the stack. By using dynamic memory allocation (e.g., just switch to std::vector) you avoid that problem. For example:
std::vector<uint64_t> t_result(iarray_size);
In passing, I would ditch the Hungarian-notation-like prefixes. For example, t_ reads as if it denotes a type. The time for Hungarian notation was the late 1980s, and its purpose was to support Microsoft's Programmer's Workbench, a product that has long since been discontinued.
You're probably declaring too large an array for the stack. 2^16 elements of 8 bytes each is quite a lot (512 KB).
If you just need static allocation, move the array to file scope.
Otherwise, consider using std::vector, which will allocate storage from the heap and manage it for you.
Using malloc() solved the issue.
uint64_t* t_result = (uint64_t*) malloc(sizeof(uint64_t)*iarray_size); // heap allocation instead of a stack array; free(t_result) when done

Size of std::array, std::vector and raw array

Let's say we have:
std::array <int,5> STDarr;
std::vector <int> VEC(5);
int RAWarr[5];
I tried to get their sizes as:
std::cout << sizeof(STDarr) + sizeof(int) * STDarr.max_size() << std::endl;
std::cout << sizeof(VEC) + sizeof(int) * VEC.capacity() << std::endl;
std::cout << sizeof(RAWarr) << std::endl;
The outputs are:
40
20
40
Are these calculations correct? Considering I don't have enough memory for std::vector and no way of escaping dynamic allocation, what should I use? If I knew that std::array results in a lower memory requirement, I could change the program to use a static array.
These numbers are wrong. Moreover, I don't think they represent what you think they represent, either. Let me explain.
First the part about them being wrong. You, unfortunately, don't show the value of sizeof(int) so we must derive it. On the system you are using the size of an int can be computed as
size_t sizeof_int = sizeof(RAWarr) / 5; // => sizeof(int) == 8
because this is essentially the definition of sizeof(T): it is the number of bytes between the start of two adjacent objects of type T in an array. This happens to be inconsistent with the number printed for STDarr: the class template std::array<T, n> is specified to have an array of n objects of type T embedded into it. Moreover, std::array<T, n>::max_size() is a constant expression yielding n. That is, we have:
40 // is identical to
sizeof(STDarr) + sizeof(int) * STDarr.max_size() // is bigger or equal to
sizeof(RAWarr) + sizeof_int * 5 // is identical to
40 + 40 // is identical to
80
That is, 40 >= 80, which is a contradiction.
Similarly, the second computation is also inconsistent with the third: the std::vector<int> holds at least 5 elements and its capacity() has to be at least as big as its size(). Moreover, the std::vector<int> object's own size is non-zero. That is, the following always has to be true:
sizeof(RAWarr) < sizeof(VEC) + sizeof(int) * VEC.capacity()
Anyway, all this is pretty much irrelevant to what your actual question seems to be: What is the overhead of representing n objects of type T using a built-in array of T, an std::array<T, n>, and an std::vector<T>? The answer to this question is:
A built-in array T[n] uses sizeof(T) * n.
An std::array<T, n> uses the same size as a T[n].
A std::vector<T>(n) needs some control data (the size, the capacity, and possibly an allocator) plus at least n * sizeof(T) bytes to represent its actual data. It may also choose a capacity() that is bigger than n (see the sketch right after this list).
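A small sketch of that comparison (heap meta-data and alignment padding, discussed next, are not included):
#include <array>
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    const std::size_t n = 5;

    int raw[n];
    std::array<int, n> arr;
    std::vector<int> vec(n);

    std::cout << "raw array                 : " << sizeof(raw) << " bytes\n";  // n * sizeof(int)
    std::cout << "std::array                : " << sizeof(arr) << " bytes\n";  // same as the raw array
    std::cout << "std::vector object + data : "
              << sizeof(vec) + vec.capacity() * sizeof(int) << " bytes\n";     // control data + elements
    return 0;
}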
In addition to these numbers, actually using any of these data structures may require additional memory:
All objects are aligned at an appropriate address. For this there may be additional bytes in front of the object.
When the object is allocated on the heap, the memory management system may include a couple of bytes in addition to the memory made available. This may be just a word with the size, but it may be whatever the allocation mechanism fancies. Also, this memory may live somewhere other than the allocated memory, e.g. in a hash table somewhere.
OK, I hope this provided some insight. However, here comes the important message: if std::vector<T> isn't capable of holding the amount of data you have there are two situations:
You have extremely low memory and most of this discussion is futile because you need entirely different approaches to cope with the few bytes you have. This would be the case if you are working on extremely resource constrained embedded systems.
You have too much data, and using T[n] or std::array<T, n> won't be of much help because the overhead we are talking about is typically less than 32 bytes.
Maybe you can describe what you are actually trying to do and why std::vector<T> is not an option.

Memory usage of a class - converting double to float didn't reduce memory usage as expected

I am instantiating millions of objects of the following type
template<class T>
struct node
{
    //some functions
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;
    node* m_parent_1;
    node* m_parent_2;
    node* m_child;
};
The purpose of the template is to let the user choose float or double precision, the idea being that node<float> will occupy less memory (RAM).
However, when I switch from double to float the memory footprint of my program does not decrease as I expect it to. I have two questions:
Is it possible that the compiler/operating system is reserving more space than required for my floats (or even storing them as doubles)? If so, how do I stop this from happening? I'm using Linux on a 64-bit machine with g++.
Is there a tool that lets me determine the amount of memory used by all the different classes (i.e. some sort of memory profiling), to make sure that the memory isn't being gobbled up somewhere else that I haven't thought of?
If you are compiling for 64-bit, then each pointer will be 64-bits in size. This also means that they may need to be aligned to 64-bits. So if you store 3 floats, it may have to insert 4 bytes of padding. So instead of saving 12 bytes, you only save 8. The padding will still be there whether the pointers are at the beginning of the struct or the end. This is necessary in order to put consecutive structs in arrays to continue to maintain alignment.
Also, your structure is primarily composed of 3 pointers. The 8 bytes you save take you from a 48-byte object to a 40 byte object. That's not exactly a massive decrease. Again, if you're compiling for 64-bit.
If you're compiling for 32-bit, then you're saving 12 bytes from a 36-byte structure, which is better percentage-wise. Potentially more if doubles have to be aligned to 8 bytes.
The other answers are correct about the source of the discrepancy. However, pointers (and other types) on x86/x86-64 are not required to be aligned. It is just that performance is better when they are, which is why GCC keeps them aligned by default.
But GCC provides a "packed" attribute to let you exert control over this:
#include <iostream>

template<class T>
struct node
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;
    node* m_parent_1;
    node* m_parent_2;
    node* m_child;
};

template<class T>
struct node2
{
private:
    T m_data_1;
    T m_data_2;
    T m_data_3;
    node2* m_parent_1;
    node2* m_parent_2;
    node2* m_child;
} __attribute__((packed));

int main(int argc, char *argv[])
{
    std::cout << "sizeof(node<double>) == " << sizeof(node<double>) << std::endl;
    std::cout << "sizeof(node<float>) == " << sizeof(node<float>) << std::endl;
    std::cout << "sizeof(node2<float>) == " << sizeof(node2<float>) << std::endl;
    return 0;
}
On my system (x86-64, g++ 4.5.2), this program outputs:
sizeof(node<double>) == 48
sizeof(node<float>) == 40
sizeof(node2<float>) == 36
Of course, the "attribute" mechanism and the "packed" attribute itself are GCC-specific.
In addition to the valid points that Nicol makes:
When you call new/malloc, it doesn't necessarily correspond one-to-one with a call to the OS to allocate memory. This is because, in order to reduce the number of expensive system calls, the heap manager may allocate more than is requested and then "suballocate" chunks of that when you call new/malloc. Also, memory can typically only be allocated 4 KB at a time (this is the minimum page size). Essentially, there may be chunks of memory allocated that are not currently actively used, in order to speed up future allocations.
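As a glibc-specific illustration of that suballocation (malloc_usable_size is a GNU extension, so this sketch is Linux-only):
#include <malloc.h>
#include <cstdio>
#include <cstdlib>

int main()
{
    void* p = std::malloc(100);
    // glibc reports the usable size of the block, which may exceed what was requested.
    std::printf("requested 100 bytes, usable %zu bytes\n", malloc_usable_size(p));
    std::free(p);
    return 0;
}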
To answer your questions directly:
1) Yes, the runtime will very likely allocate more memory than you asked for, but this memory is not wasted; it will be used for future new/malloc calls, although it will still show up in "task manager" or whatever tool you use. No, it will not promote floats to doubles. The more allocations you make, the less likely this edge condition is to be the cause of the size difference, and the items in Nicol's answer will dominate. For a smaller number of allocations, this item is likely to dominate (where "large" and "small" depend entirely on your OS and kernel).
2) The Windows Task Manager will give you the total memory allocated. Something like WinDbg will actually give you the virtual memory range chunks (usually allocated in a tree) that were allocated by the runtime. For Linux, I expect this data to be available in one of the files in the /proc directory associated with your process.
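For instance, a minimal sketch of reading those counters from within a Linux process (VmSize and VmRSS are standard fields of /proc/self/status):
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    // /proc/self/status reports the calling process's own memory counters:
    // VmSize = total virtual address space reserved, VmRSS = resident in RAM.
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmSize") == 0 || line.compare(0, 5, "VmRSS") == 0)
            std::cout << line << '\n';
    }
    return 0;
}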