Should a big QByteArray be heap allocated? - c++

On Windows on a Qt C++ application, I'm reading image data from an instrument. The instrument driver gives me a pointer to the data and basic image information:
width 1400
height 1200
size 3360000
depth 14 bits
In other words, each image is a bit over 3 MB.
I'm storing that in a QByteArray on the stack.
Is that 3MB too big for the stack? Should that be on the heap?

QByteArray does its own heap allocation (in the constructor or whenever it needs to grow). So you can declare a QByteArray of any size on the stack. When the variable goes out of scope, the destructor will release the memory.
The C++ standard library equivalent is std::vector<uint8_t> or std::vector<char>.
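For example, a minimal sketch (rawData and byteCount are stand-ins for whatever the instrument driver actually hands you):
#include <QByteArray>

void processFrame(const char *rawData, int byteCount)
{
    // The QByteArray object itself is a small stack variable (a few
    // pointer-sized members); the ~3 MB of pixel bytes it copies are
    // allocated on the heap by QByteArray's constructor.
    QByteArray image(rawData, byteCount);

    // ... work with image ...

}   // image goes out of scope here; its destructor frees the heap buffer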

Related

How to extend ESP32 heap size?

I'm writing code to play a GIF from an SD card on a TFT screen, so I create an array to hold the GIF file. (Using a NodeMCU-32S with 4MB.)
#include<TFT_eSPI.h>
#include<SPI.h>
#include<SD.h>              // needed for SD.open()
#include<AnimatedGIF.h>

TFT_eSPI tft;
AnimatedGIF gif;
uint8_t *gifArray;
int gifArrayLen;

void setup(){
    tft.init();
    tft.setRotation(2);
    tft.fillScreen(TFT_BLACK);
    File fgif=SD.open("/test.gif",FILE_READ);
    gifArrayLen=fgif.size();
    gifArray=(uint8_t *)malloc(gifArrayLen);   // returns NULL when the heap can't satisfy the request
    for(int i=0;i<gifArrayLen;i++) gifArray[i]=fgif.read();
    fgif.close();
}

void loop(){
    tft.startWrite();
    gif.open(gifArray,gifArrayLen*sizeof(uint8_t),GIFDraw);   // GIFDraw is the draw callback (defined elsewhere)
    while(gif.playFrame(true,NULL)) yield();
    gif.close();
    tft.endWrite();
}
But if the GIF file is larger than about 131 KB, it triggers a fatal error like this:
Guru Meditation Error: Core 1 panic'ed (StoreProhibited). Exception was unhandled.
The crash happens after the malloc, when I write a value into the array.
I found some forum posts saying it's because this exceeds the FreeRTOS heap size.
Can I extend the heap size or use another storage method to replace it?
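For context: malloc() returns NULL when the heap cannot satisfy the request, and on the ESP32 writing through that NULL pointer is what raises the StoreProhibited panic. A minimal guard, sketched under the assumption that Serial has been initialised:
gifArrayLen = fgif.size();
gifArray = (uint8_t *)malloc(gifArrayLen);
if (gifArray == NULL) {                       // allocation failed: file too big for the heap
    Serial.printf("malloc(%d) failed, free heap: %u bytes\n",
                  gifArrayLen, ESP.getFreeHeap());
    while (true) delay(1000);                 // halt here instead of crashing later
}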
The "4MB" in NodeMCU refers to the size of flash, the size of RAM on ESP32 is fixed at 512KB, roughly 200KB of which is used by IRAM cache/code sections, leaving around 320KB for program memory, half of which is available for dynamic allocation.
From the documentation, Heap Memory - Available Heap:
Due to a technical limitation, the maximum statically allocated DRAM usage is 160KB. The remaining 160KB (for a total of 320KB of DRAM) can only be allocated at runtime as heap.
There is a way to connect more RAM via SPI, but it's going to be very slow. You might want to look into a larger SoC with more RAM instead.
The heap is the pool of memory available to your program to allocate storage.
You're using an ESP32, which is a small CPU that has a small amount of RAM, roughly 400KB.
You cannot extend the heap on the ESP32. Unlike Linux which has virtual memory, there's nothing to extend it with.
However, you can add additional storage in the form of external PSRAM - additional RAM that is accessed via SPI. PSRAM will be slower than internal RAM and will have some restrictions on its use (for instance, call stacks cannot live in PSRAM and certain DMA buffers cannot be located there).
The ESP32's manufacturer, Espressif, has documented how to use PSRAM with the ESP32.
You'll generally (and maybe exclusively) find PSRAM already built onto ESP32 boards, so if the module you're using doesn't have it, you'll need to switch to an ESP32 module that does.
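If your board does have PSRAM, a minimal sketch of putting the big buffer there with the Arduino-ESP32 core might look like this (psramFound() and ps_malloc() come from the core; treat the details as something to verify against Espressif's documentation):
#include <Arduino.h>   // on ESP32 boards this pulls in psramFound()/ps_malloc()

uint8_t *gifArray = NULL;

void allocateGifBuffer(size_t gifArrayLen) {
    if (psramFound()) {
        // Place the large buffer in external PSRAM instead of internal DRAM.
        gifArray = (uint8_t *)ps_malloc(gifArrayLen);
    } else {
        gifArray = (uint8_t *)malloc(gifArrayLen);   // fall back to the internal heap
    }
}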
And as other people have pointed out, you can also artificially reduce the amount of memory available in the heap by allocating and freeing blocks in a pattern that leaves no large contiguous pieces available. This is called heap fragmentation and has been written about extensively here on Stack Overflow and on other web sites.
You can use IRAM to add more space to the heap. Usually the DRAM comes from SRAM1 (128 KB) and SRAM2 (200 KB), for a total of 328 KB, but because the ROM uses 8 KB for its functions in RAM, the total available DRAM is only 320 KB. IRAM (from SRAM0) is normally used for code, not data, but since not all of that space is taken by code, the remaining bytes are available for data. If you use the Arduino framework, there are usually about 72 KB available to use as heap this way. The catch is that accesses must be 32-bit aligned, such as arrays of int or pointers. See these links for further information.
https://demo-dijiudu.readthedocs.io/en/stable/api-reference/system/mem_alloc.html
https://kenny-peng.com/2021/08/23/esp32_iram.html
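A hedged sketch of asking for such 32-bit-accessible memory via ESP-IDF's capability-based allocator (heap_caps_malloc() with MALLOC_CAP_32BIT; the size and usage here are illustrative only):
#include <esp_heap_caps.h>

void tryIramHeap(void) {
    // Ask for memory that only needs to support 32-bit aligned access;
    // this allows the allocator to hand out the otherwise unused IRAM region.
    uint32_t *buf = (uint32_t *)heap_caps_malloc(64 * 1024, MALLOC_CAP_32BIT);
    if (buf != NULL) {
        buf[0] = 0x12345678;              // fine: aligned 32-bit store
        // ((uint8_t *)buf)[1] = 0x42;    // byte access may fault if the block is in IRAM
    }
}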

Stack size in QThread

What is the default maximum stack size when using QThread in Qt 5 and C++? I have a QVector in my thread and I am calling myvector.append(), and I'm interested in how big my vector can get. I found the uint QThread::stackSize() const method, which returns the stack size, but only if it was previously changed with setStackSize(). What is the default stack size?
QThread stack size can be read and set by these two calls:
uint QThread::stackSize() const
void QThread::setStackSize(uint stackSize)
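For example, a small sketch of raising the stack size for a worker thread (setStackSize() must be called before start(); the class and size here are just placeholders):
#include <QThread>

class Worker : public QThread
{
protected:
    void run() override
    {
        // ... build the QVector here; its elements live on the heap,
        // so the stack size mostly matters for deep call chains ...
    }
};

void startWorker()
{
    Worker *worker = new Worker;
    worker->setStackSize(8 * 1024 * 1024);   // request an 8 MB stack; must precede start()
    worker->start();
}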
QThread maps onto an OS-specific thread on each platform; on Linux, the default pthread stack size is 8 MB.
But it sounds like you are concerned that the QVector is growing on the stack; that is not happening. QVector stores its data on the heap.
From the source code of QVector:
void QVector<T>::append(const T &t)
{
    ...
    if (!isDetached() || isTooSmall) {
        ...
        reallocData(d->size, isTooSmall ? d->size + 1 : d->alloc, opt);
        ...
    }
All it does is allocate new space on the heap (pre-allocating a predefined number of elements) and make sure the data is stored in a contiguous memory block. It doesn't look like it concerns itself with page boundaries, etc.
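If what actually worries you is the repeated reallocation rather than stack usage, a small sketch of pre-allocating with reserve() (the element type and count here are just placeholders):
#include <QVector>

void fillVector()
{
    QVector<double> myvector;
    myvector.reserve(400000);          // one up-front heap allocation
    for (int i = 0; i < 400000; ++i)
        myvector.append(i * 0.5);      // no reallocData() calls until capacity is exceeded
}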
The stack size only plays a role if you are compiling a 32-bit application and you're allocating the storage for your buffer explicitly on the stack. Just using an automatic QVector or std::vector instance doesn't allocate any large buffers on the stack - you'd need to use a custom allocator for that.
IIRC in 64 bit applications, the memory is laid out such that you won't ever run out of stack space for any reasonable number of threads.

What is the maximum allowed size of an "unsigned char" array in Visual C++ 6.0?

For working with graphics, I need to have an array of unsigned char. It must be 3-dimensional, with the first dimension being size 4 (1st byte Blue, 2nd byte Green, 3rd byte Red, 4th byte Unused). A simple array for a 640x480 image is then declared like this:
unsigned char Pixels[4][640][480];
But the problem is, it always crashes the program immediately when it is run. It compiles fine. It links fine. It has no errors or warnings. But when it's run it immediately crashes. I had many other lines of code, but it is this one line that I found causes the immediate crash.

It's not like I don't have enough RAM to hold this data. It's only a tiny amount of data, just enough for a single 640x480 full-color image. But I've only seen such immediate crashes before when a program tries to read or write unallocated memory (for example using the CopyMemory API function, where the source or destination is either partially or entirely outside the memory space of already defined variables). But this isn't such a memory reading or writing operation; it is a memory allocating operation. That should NEVER fail, unless there's not enough RAM in the PC, and my PC certainly has enough RAM (no modern computer would NOT have enough RAM for this).

Can somebody tell me why it is messing up? Is this a well-known problem with VC++ 6.0?
If this is inside a function, then it will be allocated on the stack at runtime. It is more than a megabyte, so it might well be too big for the stack. You have two obvious options:
(i) make it static:
static unsigned char Pixels[4][640][480];
(ii) make it dynamic, i.e. allocate it from the heap (and don't forget to delete it when you have finished):
unsigned char (*Pixels)[640][480] = new unsigned char[4][640][480];
...
delete[] Pixels;
Option (i) is OK if the array will be needed for the lifetime of the application. Otherwise option (ii) is better.
Visual C++ by default gives programs 1MB of stack. The size of the array you are trying to allocate on the stack is 1200KB, which is going to bust your stack. You need to allocate your array on the heap. std::vector is your best bet for this.
using namespace std;
vector<vector<vector<unsigned char>>> A(4, vector<vector<unsigned char>>(640, vector<unsigned char>(480, 0)));
This looks a bit more confusing but will do what you want in terms of initialising the array and means you don't have to worry about memory leaks.
Alternatively, if this isn't an option, it is possible to increase the stack size by passing /STACK: followed by the desired stack size in bytes to the linker (e.g. /STACK:4194304 for a 4 MB stack).
Edit: in the interests of speed you may wish to use a single allocated block of memory instead:
std::unique_ptr<unsigned char [][640][480]> A(new unsigned char [4][640][480]);

Memory Demands: Heap vs Stack in C++

So I had a strange experience this evening.
I was working on a program in C++ that required some way of reading a long list of simple data objects from file and storing them in the main memory, approximately 400,000 entries. The object itself is something like:
class Entry
{
public:
    Entry(int x, int y, int type);
    Entry();
    ~Entry();
    // some other basic functions
private:
    int m_X, m_Y;
    int m_Type;
};
Simple, right? Well, since I needed to read them from file, I had some loop like
Entry** globalEntries;
globalEntries = new Entry*[totalEntries];
entries = new Entry[totalEntries];          // totalEntries read from file, about 400,000
for (int i=0;i<totalEntries;i++)
{
    globalEntries[i] = new Entry(.......);
}
That addition added about 25 to 35 megabytes to the program's memory usage when I tracked it in Task Manager. A simple change to stack allocation:
Entry* globalEntries;
globalEntries = new Entry[totalEntries];
for (int i=0;i<totalEntries;i++)
{
    globalEntries[i] = Entry(.......);
}
and suddenly it only required 3 megabytes. Why is that happening? I know pointer objects have a little bit of extra overhead to them (4 bytes for the pointer address), but it shouldn't be enough to make THAT much of a difference. Could it be because the program is allocating memory inefficiently, and ending up with chunks of unallocated memory in between allocated memory?
Your code is wrong, or I don't see how this worked. With new Entry[count] you create a new array of Entry (the type is Entry*), yet you assign it to an Entry**, so I presume you actually used new Entry*[count].
What you did next was to create another new Entry object on the heap and store it in the globalEntries array. So you need memory for 400,000 pointers plus 400,000 elements. 400,000 pointers take about 3 MiB of memory on a 64-bit machine. Additionally, you have 400,000 single Entry allocations, each of which requires sizeof(Entry) plus potentially some more memory (for the memory manager -- it might have to store the size of the allocation, the associated pool, alignment/padding, etc.). This additional book-keeping memory can quickly add up.
If you change your second example to:
Entry* globalEntries;
globalEntries = new Entry[count];
for (...) {
    globalEntries[i] = Entry(...);
}
memory usage should be equal to the stack approach.
Of course, ideally you'll use a std::vector<Entry>.
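For instance, a sketch only (the file-reading part is a placeholder, since the original loop body wasn't shown):
#include <vector>

void loadEntries(std::vector<Entry> &globalEntries, int totalEntries)
{
    globalEntries.reserve(totalEntries);        // one contiguous allocation up front
    for (int i = 0; i < totalEntries; ++i)
    {
        int x = 0, y = 0, type = 0;
        // ... read x, y and type from the file ...
        globalEntries.emplace_back(x, y, type); // constructs Entry in place, no per-element new
    }
}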
First of all, without specifying exactly which column you were watching, the number in Task Manager means nothing. On a modern operating system it's difficult even to define what you mean by "used memory" - are we talking about private pages? The working set? Only the stuff that stays in RAM? Does reserved but not committed memory count? Who pays for memory shared between processes? Are memory-mapped files included?
If you are watching some meaningful metric, it's impossible to see 3 MB of memory used - your object is at least 12 bytes (assuming 32-bit integers and no padding), so 400,000 elements will need about 4.58 MB. Also, I'd be surprised if it worked with stack allocation - the default stack size in VC++ is 1 MB, so you should already have had a stack overflow.
Anyhow, it is reasonable to expect a different memory usage:
the stack is (mostly) allocated right from the beginning, so that's memory you nominally consume even without really using it for anything (actually virtual memory and automatic stack expansion make this a bit more complicated, but it's "true enough");
the CRT heap is opaque to the task manager: all it sees is the memory given by the operating system to the process, not what the C heap "really" has in use; the heap grows (requesting memory from the OS) more than strictly necessary, to be ready for further memory requests - so what you see is how much memory it is ready to give away without further syscalls;
your "separate allocations" method has a significant overhead. The all-contiguous array you'd get with new Entry[size] costs size*sizeof(Entry) bytes, plus the heap bookkeeping data (typically a few integer-sized fields); the separate-allocations method costs at least size*sizeof(Entry) (the size of all the "bare elements") plus size*sizeof(Entry *) (the size of the pointer array) plus (size+1) times the cost of each allocation. If we assume a 32-bit architecture with a cost of 2 ints per allocation, you quickly see that this costs size*24+8 bytes of memory, instead of size*12+8 for the contiguous array in the heap;
the heap normally hands out blocks that aren't exactly the size you asked for, because it manages blocks of fixed sizes; so, if you allocate single objects like that, you are probably also paying for some extra padding - supposing it hands out 16-byte blocks, you are paying 4 bytes extra per element by allocating them separately; this moves our memory estimate to size*28+8, i.e. an overhead of 16 bytes for each 12-byte element.
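Plugging the question's 400,000 entries into those formulas (under the same 32-bit, 2-ints-per-allocation, 16-byte-block assumptions): the contiguous array costs roughly 400,000*12+8 ≈ 4.8 MB, while the pointer-per-element scheme costs roughly 400,000*28+8 ≈ 11.2 MB; debug-heap guard bytes and the allocator's growth slack can push the observed figure higher still, which helps explain the 25 to 35 MB the question reports for that version.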

Why is the heap after array allocation so large

I've got a very basic application that boils down to the following code:
char* gBigArray[200][200][200];

unsigned int Initialise(){
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=new char;
    return sizeof(gBigArray);
}
The function returns the expected value of 32000000 bytes, which is approximately 30MB, yet the Windows Task Manager (granted, it's not 100% accurate) gives a Memory (Private Working Set) value of around 157MB. I've loaded the application into SysInternals' VMMap and get the following values:
I'm unsure what Image means (listed under Type), although irrelevant of that its value is around what I'm expecting. What is really throwing things out for me is the Heap value, which is where the apparent enormous size is coming from.
What I don't understand is why this is? According to this answer if I've understood it correctly, gBigArray would be placed in the data or bss segment - however I'm guessing as each element is an uninitialised pointer it would be placed in the bss segment. Why then would the heap value be larger by a silly amount than what is required?
It doesn't sound silly if you know how memory allocators work. They keep track of the allocated blocks so there's a field storing the size and also a pointer to the next block, perhaps even some padding. Some compilers place guarding space around the allocated area in debug builds so if you write beyond or before the allocated area the program can detect it at runtime when you try to free the allocated space.
You are allocating one char at a time, and there is typically a space overhead per allocation.
Allocate the memory in one big chunk (or at least in a few chunks).
Do not forget that char* gBigArray[200][200][200]; allocates space for 200*200*200 = 8,000,000 pointers, each of word size. That is 32 MB on a 32-bit system.
Add another 8,000,000 chars to that for another 8 MB. But since you are allocating them one by one, the allocator probably can't hand them out at one byte per item, so each will likely take at least a word, resulting in another 32 MB (on a 32-bit system).
The rest is probably overhead, which is also significant because the C++ system must remember how many elements an array allocated with new contains for delete [].
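To put rough numbers on that: the 8,000,000 pointers account for about 32 MB (and they live in the static array, not the heap), and if each one-byte new ends up costing on the order of 16 bytes of heap once block rounding and bookkeeping are included (a plausible figure for a 32-bit allocator, not an exact one), that is another 8,000,000 * 16 ≈ 128 MB of heap, for a total of roughly 160 MB - in the same neighbourhood as the ~157 MB private working set reported in the question.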
Owww! My embedded systems stuff would roll over and die if faced with that code. Each allocation has quite a bit of extra info associated with it and is either padded to a fixed size or managed via a linked-list type structure. On my system, that one-char new would become a 64-byte allocation out of a small-object allocator, so that management takes O(1) time. But on other systems, this could easily fragment your memory horribly, make subsequent news and deletes run extremely slowly (O(n), where n is the number of things the allocator tracks), and in general bring doom upon an app over time, as each char would become at least a 32-byte allocation and be scattered into all sorts of cubby holes in memory, thus pushing your allocation heap out much further than you might expect.
Do a single large allocation and map your 3D array over it if you need to, with placement new or other pointer trickery.
Allocating 1 char at a time is probably more expensive: there are metadata headers per allocation, and 1 byte for a character is smaller than the header metadata itself, so you might actually save space by doing one large allocation (if possible). That way you avoid each individual allocation carrying its own metadata.
Perhaps this is an issue of memory stride? What size of gaps are between values?
30 MB is for the pointers. The rest is for the storage you allocated with the new calls that the pointers are pointing to. The allocator is allowed to hand out more than one byte per request for various reasons, such as aligning on word boundaries or leaving some room to grow in case you want it later. If you want 8 MB worth of characters, leave the * off your declaration of gBigArray.
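That is, something like the following (a sketch of the declaration that answer is suggesting):
// 200*200*200 = 8,000,000 chars, about 8 MB, placed in the program's
// data/bss segment with no per-element heap bookkeeping at all.
char gBigArray[200][200][200];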
Edited out of the above post into a community wiki post:
As the answers above say, the issue here is that I am creating a new char 200^3 times, and although each char is only 1 byte, there is overhead for every object on the heap. Creating a single char array backing all of the pointers knocks the memory down to a more believable level:
char* gBigArray[200][200][200];
char* gCharBlock=new char[200*200*200];

unsigned int Initialise(){
    unsigned int mIndex=0;
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=&gCharBlock[mIndex++];
    return sizeof(gBigArray);
}