Let's consider this C++ code as a rough example.
int *A = new int [5];
int *B = new int [5];
int *C = new int [5];
delete []A;
delete []C;
int *D = new int [10];
Obviously any machine can handle this case without any problems such as buffer overflow or memory leaks. However, let's imagine that the lengths are multiplied by one million or an even bigger number. As far as I know, the addresses (at least the virtual addresses) of all array elements are consecutive. So whenever I create an array, I can be sure it is a contiguous chunk in virtual memory, and I can use pointer arithmetic to access the n-th element if I have a pointer to the first one. My question is illustrated in the following image (registers representing the end of each array are ignored for the sake of simplicity).
After allocating A, B, C in the heap we free A and C and get two free memory chunks of length 5 (marked with green dots). What happens when I want to allocate an array of length 10? I think that there are 3 possible cases.
I will get a bad_alloc exception because there is no contiguous memory chunk of length 10.
The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.
The array D will be split into 2 parts and stored non-contiguously, causing non-constant access time for the n-th element of the array (if there are many more than 2 splits, it starts to resemble a linked list rather than an array).
Which one of these is most possible answer or is there another possible case I didn't take into account?
I will get a bad_alloc exception because there is no contiguous memory chunk of length 10.
This can happen.
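As a rough illustration (my sketch, not part of the original answer), this is what that case looks like to the program; whether and when the request actually fails depends entirely on the platform, the address-space width, and overcommit settings:

#include <cstddef>
#include <iostream>
#include <new>

int main() {
    // Ask for (nearly) the largest int array the size type can express;
    // no real machine can satisfy this, so new reports failure.
    const std::size_t n = static_cast<std::size_t>(-1) / sizeof(int);
    try {
        int *d = new int[n];
        delete[] d;
        std::cout << "allocation unexpectedly succeeded\n";
    } catch (const std::bad_alloc &e) {
        std::cerr << "new failed: " << e.what() << '\n';
    }
    return 0;
}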
The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.
This cannot happen. Moving an object to a different address is not possible in C++ because it would invalidate existing pointers.
The array D will be split into 2 parts and stored non-contiguously, causing non-constant access time for the n-th element of the array (if there are many more than 2 splits, it starts to resemble a linked list rather than an array).
This also cannot happen. In C++ array elements are stored contiguously, so that pointer arithmetic is possible.
But there are in fact more possibilities. To understand them, we must account for the fact that the memory can be virtual. This, among other things, means that available address space may be larger than the amount of physically present memory. A chunk of physical memory can be assigned any address from the available address space.
As an example, consider a machine with 8GB (2^33 bytes) of memory running a 64-bit OS on a 64-bit CPU. Addresses allocated to the program do not all have to be less than 8GB; it can receive a megabyte chunk of memory at address 0x00000000ffff0000 and another megabyte chunk at address 0x0000ffffffff0000. The total amount of memory allocated to the program cannot be more than 2^33 bytes, but each chunk can be located anywhere in the 2^64 space. (In reality this is a bit more complicated, but similar enough to what I describe.)
In your picture, you have 15 little squares that represent chunks of memory. Let's say it's physical memory. Virtual memory is 15,000 little squares, of which you can use any 15 at any given time.
So, considering this fact, the following scenarios are also possible.
A chunk of virtual address space is given to the program that is not backed by real physical memory. When and if the program attempts to access this space, the OS will try to allocate physical memory and map it to the corresponding address so that the program can continue. If this attempt fails, the program may be killed by the OS. The newly-free memory is now available to other programs that may want it.
The two short chunks of memory are mapped to new virtual addresses such that they form one long contiguous chunk in the virtual memory space. Remember that typically there are many more virtual memory addresses than there is physical memory, and it is normally easy to find an unassigned range. Typically this scenario is only realized when the memory chunks in question are large.
The problem that you are asking about is called heap-fragmentation, and it's a real, hard problem.
I will get a bad_alloc exception because there is no contiguous memory chunk of length 10.
This is the theory. But such a situation is really only possible within a 32-bit process; the 64-bit address space is vast.
That is, with a 64-bit process, it is more likely that heap fragmentation stops your new implementation from reusing some memory, which leads to an out-of-memory condition since it has to ask the kernel for new memory for the entire D array instead of only half of it. Also, such an OOM condition will more likely cause your process to get shot by the OOM killer at some point when you access a location in D, rather than new throwing an exception, because the kernel won't realize that it has overcommitted its memory until it's too late. For more information, google "memory overcommitment".
The program will automatically reallocate array B to the beginning of the heap and join together the rest of the unused memory.
No, it can't. You are in C++, and your runtime does not know where you have possibly stored pointers to B, so it would either run the danger of missing a pointer that needs to be modified, or run the danger of modifying something that's not a pointer to B but happens to have the same bit pattern.
The array D will be split into 2 parts and stored non-contiguously, causing non-constant access time for the n-th element of the array (if there are many more than 2 splits, it starts to resemble a linked list rather than an array).
This is also not possible because C++ guarantees contiguous storage of arrays (to allow array accesses to be implemented via pointer arithmetic).
Related
Runtime error: pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffff for frequency sort
In the first answer at that link, it says that appending a char to a string can cause a memory issue.
string s = "";
char c = 'a';
int max = INT_MAX;
for (int j = 0; j < max; j++)
    s = s + c;
The answer explains that "s = s + c in the above code copies the same string again and again, so it will cause a memory issue." But I don't understand why that code copies the same string again and again.
Could someone help me understand that part? :)
I don't understand why that code copies the same string again and again.
Okay, let's look at what happens each time the loop is iterated:
s = s + c;
There are three things the program has to do in order to execute that line of code:
Compute the temporary value s + c -- to do that, the program has to create a temporary, anonymous std::string object, and allocate for it (from the heap) an internal byte-buffer that is at least one byte larger than the number of chars currently in s (so that it can hold all of s's old contents, plus the additional char provided by c)
Set s equal to the temporary-string. In C++03 and earlier, this would be done by reallocating s's internal byte-buffer to be larger, then copying all of the bytes from the temporary-string into s's new/larger buffer. C++11 optimizes this a bit via the new move-assignment operator, so that all the bytes don't have to be copied; rather, s can simply take ownership of the temporary-string's byte-buffer.
Free the temporary string's resources, now that we're done using it. In practice, this takes the form of the std::string class's destructor calling delete[] on the old (no-longer-large-enough) byte-buffer.
Given that the above is going to be performed at least 2 billion times in a loop, it's already quite inefficient.
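For contrast, here is a sketch of the cheaper idiom (my addition, not from the quoted answer): appending with operator+= modifies s in place instead of building a temporary string on every iteration, and reserve() lets you pay for the buffer growth up front when the final size is known:

#include <cstddef>
#include <string>

int main() {
    std::string s;
    const char c = 'a';
    const std::size_t count = 1000000;   // a tamer count than INT_MAX, for illustration

    s.reserve(count);                    // one allocation up front
    for (std::size_t j = 0; j < count; ++j)
        s += c;                          // appends in place; amortized constant time per char
    return 0;
}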
However, what I think the answer you referred to was particularly concerned about was heap fragmentation. Keep in mind that heap allocation doesn't work by magic; when you (or the std::string class, or anyone) asks to allocate N bytes of memory from the heap, the heap implementation's job is to find N bytes of contiguous memory and return it. And since there is no provision in C++ for moving blocks of memory around (as doing so would invalidate any pointers that the program might have pointing into those blocks of memory), the heap can't create an N-byte contiguous-memory-chunk out of smaller chunks; instead, there has to be a range of contiguous-memory-space already available. For example, it does the heap no good to have a total of 1GB of memory available, if that 1GB of memory is made up of thousands of nonconsecutive 1KB chunks and the caller is asking for a 2KB allocation.
Therefore, the heap's job is to efficiently allocate chunks of memory of the sizes the program requests, and when they are freed, to try to glue them back together into larger chunks if it can, but it may not always be able to. Certain patterns of allocating and freeing memory may result in heap fragmentation, which is simply a large number of discontinuous memory allocations that render the small regions of free memory between them unusable for large allocations.
Whether or not this particular allocate/free pattern would cause that, I'm not sure; given that only one or two buffers are being allocated at a time, the heap may be able to reabsorb them back into adjacent free-memory chunks as they get freed again -- it probably depends on the particular heap algorithm the system is using, as well as on whether any other threads are allocating/freeing heap memory while this is going on. But I wouldn't be too surprised if there are systems out there where it would cause problems (particularly on 16-bit or 32-bit systems where virtual address space is limited, or embedded systems that don't use virtual memory)
I'm a student taking a class on Data Structures in C++ this semester and I came across something that I don't quite understand tonight. Say I were to create a pointer to an array on the heap:
int* arrayPtr = new int [4];
I can access this array using pointer syntax
int value = *(arrayPtr + index);
But if I were to add another value to the memory position immediately after the end of the space allocated for the array, I would then be able to access it
*(arrayPtr + 4) = 0;
int nextPos = *(arrayPtr + 4);
//the value of nextPos will be 0, or whatever value I previously filled that space with
The position in memory of *(arrayPtr + 4) is past the end of the space allocated for the array. But as far as I understand, the above still would not cause any problems. So aside from it being a requirement of C++, why even give arrays a specific size when declaring them?
When you go past the end of allocated memory, you are actually accessing memory of some other object (or memory that is free right now, but that could change later). So, it will cause you problems. Especially if you'll try to write something to it.
I can access this array using pointer syntax
int value = *(arrayPtr + index);
Yeah, but don't. Use arrayPtr[index]
The position in memory of *(arrayPtr + 4) is past the end of the space allocated for the array. But as far as I understand, the above still would not cause any problems.
You understand wrong. Oh so very wrong. You're invoking undefined behavior and undefined behavior is undefined. It may work for a week, then break one day next week and you'll be left wondering why. If you don't know the collection size in advance use something dynamic like a vector instead of an array.
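A minimal sketch of that suggestion (hypothetical values, not from the original answer): a std::vector tracks its own size and grows on demand, so there is no fixed length to run past:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> values;            // no size commitment up front
    for (int i = 0; i < 5; ++i)
        values.push_back(i * 10);       // the vector reallocates itself as needed

    std::cout << "size: " << values.size()
              << ", last element: " << values.back() << '\n';
    return 0;
}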
Yes, in C/C++ you can access memory outside of the space you claim to have allocated. Sometimes. This is what is referred to as undefined behavior.
Basically, you have told the compiler and the memory management system that you want space to store four integers, and the memory management system allocated space for you to store four integers. It gave you a pointer to that space. In the memory manager's internal accounting, those bytes of ram are now occupied, until you call delete[] arrayPtr;.
However, the memory manager has not allocated that next byte for you. You don't have any way of knowing, in general, what that next byte is, or who it belongs to.
In a simple example program like your example, which just allocates a few bytes, and doesn't allocate anything else, chances are, that next byte belongs to your program, and isn't occupied. If that array is the only dynamically allocated memory in your program, then it's probably, maybe safe to run over the end.
But in a more complex program, with multiple dynamic memory allocations and deallocations, especially near the edges of memory pages, you really have no good way of knowing what any bytes outside of the memory you asked for contain. So when you write to bytes outside of the memory you asked for in new you could be writing to basically anything.
This is where undefined behavior comes in. Because you don't know what's in that space you wrote to, you don't know what will happen as a result. Here's some examples of things that could happen:
The memory was not allocated when you wrote to it. In that case, the data is fine, and nothing bad seems to happen. However, if a later memory allocation uses that space, anything you tried to put there will be lost.
The memory was allocated when you wrote to it. In that case, congratulations, you just overwrote some random bytes from some other data structure somewhere else in your program. Imagine replacing a variable somewhere in one of your objects with random data, and consider what that would mean for your program. Maybe a list somewhere else now has the wrong count. Maybe a string now has some random values for the first few characters, or is now empty because you replaced those characters with zeroes.
The array was allocated at the edge of a page, so the next bytes don't belong to your program. The address is outside your program's allocation. In this case, the OS detects you accessing random memory that isn't yours, and terminates your program immediately with SIGSEGV.
Basically, undefined behavior means that you are doing something illegal, but because C/C++ is designed to be fast, the language designers don't include an explicit check to make sure you don't break the rules, the way other languages (e.g. Java, C#) do. They just list the behavior of breaking the rules as undefined, and then the people who make the compilers can have the output be simpler, faster code, since no array bounds checks are made, and if you break the rules, it's your own problem.
So yes, this sometimes works, but don't ever rely on it.
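If you do want the explicit check that the answer says C++ omits, the standard containers let you opt in; a small sketch (my addition) using at(), which reports an out-of-range index instead of silently invoking undefined behavior the way operator[] or raw pointer arithmetic does:

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> arr(4, 0);         // four elements, valid indices 0..3
    try {
        arr.at(4) = 0;                  // one past the end: throws rather than corrupting memory
    } catch (const std::out_of_range &e) {
        std::cerr << "out of range: " << e.what() << '\n';
    }
    return 0;
}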
It would not cause any problems in a purely abstract setting, where you only worry about whether the logic of the algorithm is sound. In that case there's no reason to declare the size of an array at all. However, your computer exists in the physical world, and only has a limited amount of memory. When you're allocating memory, you're asking the operating system to let you use some of the computer's finite memory. If you go beyond that, the operating system should stop you, usually by killing your process/program.
Yes, you should write it as arrayPtr[index], because the position in memory of *(arrayPtr + 4) is past the end of the space which you have allocated for the array. It's a flaw in C++ that the array size can't be extended once allocated.
So I had a strange experience this evening.
I was working on a program in C++ that required some way of reading a long list of simple data objects from file and storing them in the main memory, approximately 400,000 entries. The object itself is something like:
class Entry
{
public:
    Entry(int x, int y, int type);
    Entry();
    ~Entry();
    // some other basic functions
private:
    int m_X, m_Y;
    int m_Type;
};
Simple, right? Well, since I needed to read them from file, I had some loop like
Entry** globalEntries;
globalEntries = new Entry[totalEntries]; // totalEntries read from file, about 400,000
for (int i = 0; i < totalEntries; i++)
{
    globalEntries[i] = new Entry(.......);
}
That addition to the program added about 25 to 35 megabytes to the program when I tracked it on the task manager. A simple change to stack allocation:
Entry globalEntries[400000];
for (int i = 0; i < totalEntries; i++)
{
    globalEntries[i] = Entry(.......);
}
and suddenly it only required 3 megabytes. Why is that happening? I know pointer objects have a little bit of extra overhead to them (4 bytes for the pointer address), but it shouldn't be enough to make THAT much of a difference. Could it be because the program is allocating memory inefficiently, and ending up with chunks of unallocated memory in between allocated memory?
Your code is wrong, or I don't see how this worked. With new Entry [count] you create a new array of Entry (type is Entry*), yet you assign it to Entry**, so I presume you used new Entry*[count].
What you did next was to create another new Entry object on the heap and store it in the globalEntries array. So you need memory for 400,000 pointers + 400,000 elements. 400,000 pointers take 3 MiB of memory on a 64-bit machine. Additionally, you have 400,000 single Entry allocations, which will all require sizeof(Entry) plus potentially some more memory (for the memory manager -- it might have to store the size of the allocation, the associated pool, alignment/padding, etc.). This additional book-keeping memory can quickly add up.
If you change your second example to:
Entry* globalEntries;
globalEntries = new Entry[count];
for (...) {
    globalEntries [i] = Entry (...);
}
memory usage should be equal to the stack approach.
Of course, ideally you'll use a std::vector<Entry>.
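A rough sketch of that suggestion (the constructor arguments are placeholders, since the question elides them): a single std::vector<Entry> keeps all the objects in one contiguous block, with one allocation instead of 400,001, and it cleans up after itself:

#include <vector>

// Assumes the Entry class from the question, with its constructors defined.
std::vector<Entry> readEntries(int totalEntries)
{
    std::vector<Entry> entries;
    entries.reserve(totalEntries);        // one allocation for all elements
    for (int i = 0; i < totalEntries; ++i)
        entries.emplace_back(0, 0, 0);    // placeholder x, y, type read from file
    return entries;
}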
First of all, without specifying which column exactly you were watching, the number in Task Manager means nothing. On a modern operating system it's difficult even to define what you mean by "used memory" - are we talking about private pages? The working set? Only the stuff that stays in RAM? Does reserved but not committed memory count? Who pays for memory shared between processes? Are memory-mapped files included?
If you are watching some meaningful metric, it's impossible to see 3 MB of memory used - your object is at least 12 bytes (assuming 32 bit integers and no padding), so 400000 elements will need about 4.58 MB. Also, I'd be surprised if it worked with stack allocation - the default stack size in VC++ is 1 MB, you should already have had a stack overflow.
Anyhow, it is reasonable to expect a different memory usage:
the stack is (mostly) allocated right from the beginning, so that's memory you nominally consume even without really using it for anything (actually virtual memory and automatic stack expansion make this a bit more complicated, but it's "true enough");
the CRT heap is opaque to the task manager: all it sees is the memory given by the operating system to the process, not what the C heap "really" has in use; the heap grows (requesting memory from the OS) more than strictly necessary in order to be ready for further memory requests - so what you see is how much memory it is ready to give away without further syscalls;
your "separate allocations" method has a significant overhead. The all-contiguous array you'd get with new Entry[size] costs size*sizeof(Entry) bytes, plus the heap bookkeeping data (typically a few integer-sized fields); the separated allocations method costs at least size*sizeof(Entry) (size of all the "bare elements") plus size*sizeof(Entry *) (size of the pointer array) plus size+1 multiplied by the cost of each allocation. If we assume a 32 bit architecture with a cost of 2 ints per allocation, you quickly see that this costs size*24+8 bytes of memory, instead of size*12+8 for the contiguous array in the heap;
the heap normally gives away blocks that aren't really the size you asked for, because it manages blocks of fixed sizes; so, if you allocate single objects like that, you are probably also paying for some extra padding - supposing it uses 16-byte blocks, you are paying 4 bytes extra per element by allocating them separately; this moves our memory estimate to size*28+8, i.e. an overhead of 16 bytes per each 12-byte element (the sketch below plugs the question's numbers into these formulas).
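Plugging the question's numbers into those estimates (same assumptions as above: 12-byte Entry, 32-bit pointers, two int-sized bookkeeping fields per allocation) makes the gap concrete; a throwaway sketch:

#include <cstdio>

int main() {
    const long long size = 400000;            // elements, as in the question
    const long long entryBytes = 12;          // three ints, no padding
    const long long ptrBytes = 4;             // 32-bit pointer
    const long long allocOverhead = 8;        // two int-sized fields per allocation

    const long long contiguous = size * entryBytes + allocOverhead;                          // size*12 + 8
    const long long separate = size * (entryBytes + ptrBytes) + (size + 1) * allocOverhead;  // size*24 + 8

    std::printf("contiguous array:      %lld bytes (~%.1f MB)\n", contiguous, contiguous / 1e6);
    std::printf("separate allocations:  %lld bytes (~%.1f MB)\n", separate, separate / 1e6);
    return 0;
}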
I've got a very basic application that boils down to the following code:
char* gBigArray[200][200][200];
unsigned int Initialise(){
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=new char;
    return sizeof(gBigArray);
}
The function returns the expected value of 32000000 bytes, which is approximately 30MB, yet the Windows Task Manager (and granted it's not 100% accurate) gives a Memory (Private Working Set) value of around 157MB. I've loaded the application into VMMap by SysInternals and looked at the values it reports.
I'm unsure what Image means (listed under Type), although regardless of that its value is around what I'm expecting. What is really throwing things out for me is the Heap value, which is where the apparent enormous size is coming from.
What I don't understand is why this is. According to this answer, if I've understood it correctly, gBigArray would be placed in the data or bss segment - however, I'm guessing that since each element is an uninitialised pointer, it would be placed in the bss segment. Why then is the Heap value larger by a silly amount than what is required?
It doesn't sound silly if you know how memory allocators work. They keep track of the allocated blocks so there's a field storing the size and also a pointer to the next block, perhaps even some padding. Some compilers place guarding space around the allocated area in debug builds so if you write beyond or before the allocated area the program can detect it at runtime when you try to free the allocated space.
You are allocating one char at a time, and there is typically a space overhead per allocation.
Allocate the memory in one big chunk (or at least in a few chunks) instead.
Do not forget that char* gBigArray[200][200][200]; allocates space for 200*200*200 = 8,000,000 pointers, each of word size. That is 32 MB on a 32-bit system.
Add another 8,000,000 chars to that for another 8 MB. Since you are allocating them one by one, the allocator probably can't hand them out at one byte per item, so they'll probably also take the word size per item, resulting in another 32 MB (32-bit system).
The rest is probably overhead, which is also significant because the C++ system must remember how many elements an array allocated with new contains for delete [].
Owww! My embedded systems stuff would roll over and die if faced with that code. Each allocation has quite a bit of extra info associated with it and either is spaced to a fixed size, or is managed via a linked list type object. On my system, that 1 char new would become a 64 byte allocation out of a small object allocator such that management would be in O(1) time. But in other systems, this could easily fragment your memory horribly, make subsequent new and deletes run extremely slowly O(n) where n is number of things it tracks, and in general bring doom upon an app over time as each char would become at least a 32 byte allocation and be placed in all sorts of cubby holes in memory, thus pushing your allocation heap out much further than you might expect.
Do a single large allocation and map your 3D array over it if you need to with a placement new or other pointer trickery.
Allocating 1 char at a time is probably more expensive. There are metadata headers per allocation, and 1 byte for a character is smaller than the header metadata itself, so you might actually save space by doing one large allocation (if possible); that way you avoid the overhead of each individual allocation having its own metadata.
Perhaps this is an issue of memory stride? How large are the gaps between values?
30 MB is for the pointers. The rest is for the storage you allocated with the new call that the pointers are pointing to. Compilers are allowed to allocate more than one byte for various reasons, like to align on word boundaries, or give some growing room in case you want it later. If you want 8 MB worth of characters, leave the * off your declaration for gBigArray.
Edited out of the above post into a community wiki post:
As the answers below say, the issue here is I am creating a new char 200^3 times, and although each char is only 1 byte, there is overhead for every object on the heap. It seems creating a char array for all chars knocks the memory down to a more believable level:
char* gBigArray[200][200][200];
char* gCharBlock=new char[200*200*200];
unsigned int Initialise(){
    unsigned int mIndex=0;
    for(int ta=0;ta<200;ta++)
        for(int tb=0;tb<200;tb++)
            for(int tc=0;tc<200;tc++)
                gBigArray[ta][tb][tc]=&gCharBlock[mIndex++];
    return sizeof(gBigArray);
}
As the title says, I want to know whether, in C++, the memory allocated by one new operation is consecutive...
BYTE* data = new BYTE[size];
In this code, whatever size is given, the returned memory region is consecutive. If the heap manager can't allocate a consecutive memory region of that size, it fails: an exception is thrown (or NULL is returned, in the case of malloc).
Programmers will always see the illusion of consecutive (and yes, infinite :-) memory in a process's address space. This is what virtual memory provides to programmers.
Note that programmers (except on a few embedded systems) always see virtual memory. However, virtually consecutive memory could be mapped (at the granularity of the 'page' size, which is typically 4KB) to physical memory in an arbitrary fashion. You can't see that mapping, and mostly you don't need to understand it (except for very specific page-level optimizations).
What about this?
BYTE* data1 = new BYTE[size1];
BYTE* data2 = new BYTE[size2];
Sure, you can't say anything about the relative addresses of data1 and data2. It's generally non-deterministic: it depends on the heap manager's policies (such as malloc's; often new is just a wrapper around malloc) and on the current heap status when the request was made.
The memory allocated in your process's address space will be contiguous.
How those bytes are mapped into physical memory is implementation-specific; if you allocate a very large block of memory, it is likely to be mapped to different parts of physical memory.
Edit: Since someone disagrees that the bytes are guaranteed to be contiguous, the standard says (3.7.3.1):
The allocation function attempts to allocate the requested amount of storage. If it is successful, it shall return the address of the start of a block of storage whose length in bytes shall be at least as large as the requested size.
Case 1:
Using "new" to allocate an array, as in
int* foo = new int[10];
In this case, each element of foo will be in contiguous virtual memory.
Case 2:
Using consecutive "new" operations non-atomically, as in
int* foo = new int;
int* bar = new int;
In this case, there is never a guarantee that the memory allocated between calls to "new" will be adjacent in virtual memory.
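A quick way to see case 2 for yourself (a sketch, not from the original answer; the printed addresses are whatever your allocator happens to hand out) is simply to print the two pointers:

#include <iostream>

int main() {
    int *foo = new int;
    int *bar = new int;

    // The two addresses are typically not adjacent, and the gap (if any)
    // is entirely up to the heap implementation.
    std::cout << "foo at " << static_cast<void*>(foo) << '\n'
              << "bar at " << static_cast<void*>(bar) << '\n';

    delete foo;
    delete bar;
    return 0;
}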
The virtual addresses of the allocated bytes will be contiguous. They will also be physically contiguous within the resident pages backing the address space of your process. The mapping of physical pages to regions of the process's virtual space is very OS- and platform-specific, but in general you cannot assume a physically contiguous range larger than a page, or one that is not aligned on a page boundary.
If by your question you mean "Will successive (in time) new() operations return adjacent chunks of memory, with no gaps in between?", this old programmer will suggest, very politely, that you should not rely on it.
The only reason that question would come up was if you intended to walk a pointer "out" of one data object and "into" the next one. This is a really bad idea, since you have no guarantee that the next object in the address space is of anything remotely resembling the same type as the previous one.
Yes.
Don't worry about the "virtual memory" issue: apart from the fact that there may be systems without virtual memory support at all, from your point of view you get a consecutive memory chunk. That's all.
Physical memory is not necessarily contiguous; it's the logical memory which is contiguous.