Why do successive calls to new[] not allocate contiguous memory? - c++

I am using Ubuntu 14.04 64-bit. Here is my C++ code to see how memory is used.
#include <iostream>
using namespace std;

int main() {
    int **ptr;
    ptr = new int* [2];
    cout << &ptr << " -> " << ptr << endl;
    for (int r = 1; r <= 2; r++) {
        ptr[r-1] = new int [2 * r];
        cout << &ptr[r-1] << " -> " << ptr[r-1] << endl;
        for (int c = 0; c < 2 * r; c++) {
            ptr[r-1][c] = r * c;
            cout << &ptr[r-1][c] << " -> " << ptr[r-1][c] << endl;
        }
    }
    return 0;
}
Here is my output:
0x7fff09faf018 -> 0x1195010
0x1195010 -> 0x1195030
0x1195030 -> 0
0x1195034 -> 1
0x1195018 -> 0x1195050
0x1195050 -> 0
0x1195054 -> 2
0x1195058 -> 4
0x119505c -> 6
I expected the OS would allocate memory contiguously, so ptr[0][0] would be at 0x1195020 instead of 0x1195030!? What does the OS use 0x1195020 - 0x119502F and 0x1195038 - 0x119504F for?

Because:
Some space at the beginning and end of each block of allocated memory is often used for bookkeeping. (In particular, many allocators find it useful to store the size of the preceding/following blocks, or pointers to them, around there.)
The memory allocator may "round up" the size of an allocated block to make things easier for it. For instance, an allocation of 7 bytes will likely be rounded up to 8 bytes, if not even 16 or 32. (A sketch showing this follows this list.)
Blocks of memory may already be available in noncontiguous locations. (Keep in mind that the C runtime may have been making some memory allocations of its own before main() even runs.)
The allocator may have a plan in mind for laying out memory which would be ruined by putting the next block at the "next" address. (It may, for instance, have reserved that memory for allocations of a particular size.)
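On glibc (the allocator used on Ubuntu) you can observe both the bookkeeping and the rounding directly with malloc_usable_size, which reports how large the block you actually received is. The following is a minimal, glibc-specific sketch (my illustration, not part of the original answer); the exact numbers depend on the allocator, but on x86-64 glibc even a 1-byte request typically lands in a 32-byte chunk with 24 usable bytes, which matches the 0x20 spacing between blocks in the output above.

#include <cstdlib>
#include <iostream>
#include <malloc.h>   // malloc_usable_size - glibc-specific

int main() {
    for (std::size_t request : {1, 7, 8, 13, 24, 100}) {
        void* p = std::malloc(request);
        // glibc reports the usable size of the chunk it actually handed out,
        // which includes whatever rounding the allocator applied.
        std::cout << "requested " << request << " bytes, got "
                  << malloc_usable_size(p) << " usable bytes at " << p << '\n';
        std::free(p);
    }
    return 0;
}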
Why should it? There are no guarantees. Allocated memory could end up anywhere. (Well, almost.) Don't make any assumptions, just let memory go wherever the allocator says it'll go, and you'll be fine.

Related

Beginner question memory allocation c++ [duplicate]

This question already has answers here:
What are the differences between virtual memory and physical memory?
I'm currently learning C++. During some heap-allocation exercises I tried to trigger a bad allocation. My physical memory is about 38 GB. Why is it possible to allocate such a large amount of memory? Is my basic byte calculation wrong? I don't get it. Can anyone give me a hint, please? Thanks.
#include <iostream>
#include <new>

int main(int argc, char **argv) {
    const size_t MAXLOOPS {1'000'000'000};
    const size_t NUMINTS {2'000'000'000};
    int* p_memory {nullptr};

    std::cout << "Starting program heap_overflow.cpp" << std::endl;
    std::cout << "Max Loops: " << MAXLOOPS << std::endl;
    std::cout << "Number of Int per allocation: " << NUMINTS << std::endl;

    for (size_t loop = 0; loop < MAXLOOPS; ++loop) {
        std::cout << "Trying to allocate new heap in loop " << loop
                  << ". current allocated mem = " << (NUMINTS * loop * sizeof(int))
                  << " Bytes." << std::endl;

        p_memory = new (std::nothrow) int[NUMINTS];

        if (nullptr != p_memory)
            std::cout << "Mem Allocation ok." << std::endl;
        else {
            std::cout << "Mem Allocation FAILED!." << std::endl;
            break;
        }
    }
    return 0;
}
Output:
...
Trying to allocate new heap in loop 17590. current allocated mem = 140720000000000 Bytes.
Mem Allocation ok.
Trying to allocate new heap in loop 17591. current allocated mem = 140728000000000 Bytes.
Mem Allocation FAILED!.
Many (but not all) virtual-memory-capable operating systems use a concept known as demand paging: when you allocate memory, the OS does the bookkeeping that allows you to use that memory, but it does not reserve actual pages of physical memory at that time.[1]
When you actually attempt to read or write to any byte within a page of that allocated memory, a page fault occurs. The fault handler detects that the page has been pre-allocated but not demand-paged in. It then reserves a page of physical memory, and sets up the PTE before returning control to the program.
If you attempt to write into the memory you allocate right after each allocation, you may find that you run out of physical memory much faster.
Notes:
[1] It is possible to have an OS implementation that supports virtual memory but immediately allocates physical memory to back virtual allocations; virtual memory is a necessary, but not sufficient, condition to replicate your experiment.
One comment mentions swapping to disk. This is likely a red herring - the pagefile size is typically comparable in size to memory, and the total allocation was around 140 TB which is much larger than individual disks. It's also ineffective to page-to-disk empty, untouched pages.
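To see the effect described in this answer, here is a small sketch (my example, assuming a Linux-like demand-paged system and reusing the question's NUMINTS): the nothrow allocation itself barely changes physical memory use, while the memset afterwards forces every page to be backed by physical memory. Run it with care; touching the whole ~8 GB block can push the system into swapping or trigger the OOM killer.

#include <cstring>
#include <iostream>
#include <new>

int main() {
    const std::size_t NUMINTS = 2'000'000'000;     // same block size as in the question

    int* p = new (std::nothrow) int[NUMINTS];      // cheap: only virtual address space is reserved
    if (p == nullptr) {
        std::cout << "Mem Allocation FAILED!." << std::endl;
        return 1;
    }
    std::cout << "Mem Allocation ok. Now touching the pages..." << std::endl;

    std::memset(p, 0, NUMINTS * sizeof(int));      // expensive: every page must now be committed
    std::cout << "All pages touched." << std::endl;

    delete[] p;
    return 0;
}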

Why does my struct take more memory than requested?

I am testing the following code with Visual Studio 2019 Diagnostic Tools.
It says that memory consumption is 55 KB instead of the 20 KB I previously calculated. As you can see, it is much more memory than I thought and I don't know why.
What I want to know is: what is happening or how could I calculate the correct memory consumption? (since I don't always have the "Diagnostic Tools" at hand.)
#include <iostream>

#define TEST_SIZE_ARR 1000

struct Node
{
    Node(int)
        : id(0),
          time(0),
          next(0),
          back(0)
    {}

    int id;
    int time;
    Node* next;
    Node* back;
};

int main()
{
    int counter = 0;

    std::cout << "= Node =" << std::endl;
    std::cout << "Array size: " << sizeof(Node*) << " * " << TEST_SIZE_ARR << " = " << sizeof(Node*) * TEST_SIZE_ARR << std::endl;
    std::cout << "Element size: " << sizeof(Node) << " * " << TEST_SIZE_ARR << " = " << sizeof(Node) * TEST_SIZE_ARR << std::endl;

    Node **dataArr = new Node*[TEST_SIZE_ARR];            //break point
    for (counter = 0; counter < TEST_SIZE_ARR; counter++) //break point
    {
        dataArr[counter] = new Node(counter);
    }
    counter++;                                            //break point

    return 0;
}
Console:
Array size: 4 * 1000 = 4000
Element size: 16 * 1000 = 16000
Diagnostic tool:
Array size: 3.94 KB
Element size: 50.78 KB
Your diagnostic tool is measuring an allocation overhead of 36 bytes per allocation.
50.78 KB is 52000 bytes, or 52 bytes per element allocation. Minus 16 is 36 bytes.
4000 bytes with 36 bytes overhead is 4036 bytes, which is 3.94 KB.
The heap has to track which blocks of memory are in use and which are not. Possibly your diagnostic tool has additional overhead and measures itself poorly; I don't know.
In your case, it appears to be adding an additional 36 bytes per value returned from new. Your system seems to be using 32-bit pointers (ick), so 36 bytes is enough room for 9 pointers. The allocator probably wants to store the size of each allocation in its block, which takes 4 bytes on a 32-bit system; that leaves 8 pointers.
What your heap is using those 8 pointers for, I don't know. Maybe a skip list, or a red-black tree, or even guard buffers around each allocation to detect memory corruption, because you profiled a debug build with a debug heap.
In general, small heap allocations are inefficient and a bad idea. It is one of the many reasons why block containers, like std::vector, are a good idea, and node containers are iffy.
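To make that last point concrete, here is a rough sketch (my illustration, not the answerer's code): putting all the nodes into one std::vector pays the allocator's per-block bookkeeping once, whereas one new per node pays it N times - roughly the 36 extra bytes per allocation the diagnostic tool measured above.

#include <iostream>
#include <vector>

struct Node {
    int id;
    int time;
    Node* next;
    Node* back;
};

int main() {
    const std::size_t N = 1000;

    // One allocation for all N nodes: the heap overhead is paid once,
    // and the nodes are contiguous (cache-friendly).
    std::vector<Node> block(N);

    // N separate allocations: each one carries the heap's per-block overhead.
    std::vector<Node*> individual;
    individual.reserve(N);
    for (std::size_t i = 0; i < N; ++i)
        individual.push_back(new Node{});

    std::cout << "payload per node: " << sizeof(Node) << " bytes, "
              << N << " nodes" << std::endl;

    for (Node* p : individual)
        delete p;
    return 0;
}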

Stack and heap address regions differ in Windows and Linux

I'm testing the address ranges of the heap and the stack in C++.
My code is:
#include <iostream>
using namespace std;

int g;
int uninitialized_g;

class Heap {
    int a;
    int b;
};

int main() {
    int stack_variable = 3;
    int stack_variable_1 = 3;
    g = 3;

    Heap * heap_class = new Heap;
    Heap * heap_class_1 = new Heap;

    cout << "Static initialized g's addr = " << &g << endl;
    cout << "Static un-initialized g's addr = " << &uninitialized_g << endl;
    cout << "Stack stack_variable's addr = " << &stack_variable << endl;
    cout << "Stack stack_variable1's addr = " << &stack_variable_1 << endl;
    cout << "Heap heap_class's addr = " << heap_class << endl;
    cout << "Heap heap_class1's addr = " << heap_class_1 << endl;

    delete heap_class;
    delete heap_class_1;
    return 0;
}
and in Windows (Eclipse with MinGW) the result is
Static initialized g's addr = 0x407020
Static un-initialized g's addr = 0x407024
Stack stack_variable's addr = 0x22fed4
Stack stack_variable1's addr = 0x22fed0
Heap heap_class's addr = 0x3214b0
Heap heap_class1's addr = 0x3214c0
and in Linux with g++ the result is
Static initialized g's addr = 0x601180
Static un-initialized g's addr = 0x601184
Stack stack_variable's addr = 0x7ffff5c8c2c8
Stack stack_variable1's addr = 0x7ffff5c8c2cc
Heap heap_class's addr = 0x1c7c010
Heap heap_class1's addr = 0x1c7c030
which makes sense to me.
So, the questions are:
In the Windows result, why is the heap sometimes allocated at a higher address than the stack?
In Linux, the heap addressing makes sense, but why do the stack addresses grow upward?
Thanks in advance.
Your program runs in an environment provided by the operating system, so there is more code in action than you probably expect.
1) Stack & Heap
The stack address of the first thread is chosen by the operating system. You can set some values in the PE32 executable header to request a specific value, but this is handled differently on Linux.
The C runtime library requests memory from the operating system, IIRC with the function sbrk. The operating system can provide that memory wherever it likes. Keep in mind that even though you have a linear address space, you don't have a contiguous memory layout; it looks more like Swiss cheese.
2) Addresses of local variables
This is unspecified behavior. The compiler is free to choose the order of the local variables in memory. Sometimes I have seen the order be alphabetical (just try a rename), or change with the optimization level. Just accept it.
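To watch point (1) happen on Linux, you can query the current program break with sbrk(0) before and after an allocation. This is a Linux-specific sketch (my illustration) and assumes the allocator serves the small request from the brk heap; it is free to use mmap instead, in which case the break will not move.

#include <iostream>
#include <unistd.h>   // sbrk (POSIX)

int main() {
    void* before = sbrk(0);            // current end of the heap segment
    int* p = new int[16];              // small allocation, usually taken from the brk heap
    void* after = sbrk(0);

    std::cout << "program break before: " << before << '\n'
              << "allocation at:        " << static_cast<void*>(p) << '\n'
              << "program break after:  " << after << '\n';

    delete[] p;
    return 0;
}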

Maximum memory that can be allocated dynamically and at compile time in c++

I am playing around to understand how much memory can be allocated. Initially I thought that the maximum memory which can be allocated is equal to Physical memory (RAM). I checked my RAM on Ubuntu 12.04 by running the command as shown below:
~$ free -b
total used free shared buffers cached
Mem: 3170848768 2526740480 644108288 0 265547776 1360060416
-/+ buffers/cache: 901132288 2269716480
Swap: 2428497920 0 2428497920
As shown above, the total physical memory is 3 GB (3170848768 bytes), of which only 644108288 bytes are free, so I assumed I could allocate at most this much memory. I tested it by writing the small program with only two lines below:
char * p1 = new char[644108290] ;
delete p1;
Since the code ran perfectly, it means the memory was allocated successfully. I also tried to allocate more memory than the available free physical memory, and it still did not throw any error. Then, per the question
maximum memory which malloc can allocate
I figured it must be using virtual memory, so I tested the code with the free swap size and it also worked.
char * p1 = new char[2428497920] ;
delete p1;
Then I tried to allocate the free swap plus free RAM bytes of memory:
char * p1 = new char[3072606208] ;
delete p1;
But this time the code failed, throwing a bad_alloc exception. Why didn't the code work this time?
Now I allocated the memory at compile time in a new program, as shown below:
char p[3072606208] ;
char p2[4072606208] ;
char p3[5072606208];
cout<<"Size of array p = " <<sizeof p <<endl;
cout<<"Size of array p2 = " <<sizeof p2<<endl;
cout<<"Size of array p2 = " <<sizeof p3;
The out put shows
Size of array p = 3072606208
Size of array p1 = 4072606208
Size of array p2 = 777638912
Could you please help me understand what is happening here? Why was the memory allowed to be allocated at compile time but not dynamically?
When allocated at compile time, how come p and p1 were able to reserve more memory than swap plus free RAM, whereas p2 failed?
How exactly does this work? Is this some undefined or OS-specific behaviour? Thanks for your help. I am using Ubuntu 12.04 and gcc 4.6.3.
Memory pages aren't actually mapped to your program until you use them. All malloc does is reserve a range of the virtual address space. No physical RAM is mapped to those virtual pages until you try to read or write them.
Even when you allocate global or stack ("automatic") memory, there's no mapping of physical pages until you touch them.
Finally, sizeof() is evaluated at compile time, when the compiler has no idea what the OS will do later. So it will just tell you the expected size of the object.
You'll find that things will behave very differently if you try to memset the memory to 0 in each of your cases. Also, you might want to try calloc, which zeroes its memory.
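A minimal sketch of those points (my example, not the answerer's): sizeof is a compile-time constant, usable even in a static_assert, and it is the memset that actually forces the OS to back the pages with physical memory.

#include <cstring>
#include <iostream>

int main() {
    static char big[1u << 20];                    // 1 MiB in the zero-initialized data segment
    static_assert(sizeof(big) == (1u << 20),
                  "sizeof is evaluated at compile time");

    std::cout << "declared, not necessarily touched yet" << std::endl;
    std::memset(big, 1, sizeof(big));             // first write: the OS must now back these pages
    std::cout << "touched " << sizeof(big) << " bytes" << std::endl;
    return 0;
}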
Interesting.... one thing to note: when you write
char p[1000];
you allocate (well, reserve) 1000 bytes on the stack.
When you write
char* p = malloc(100);
you allocate 100 bytes on the heap. Big difference. Now I don't know why the stack allocations are working - unless the value between the [] is being read as an int by the compiler and is thus wrapping around to allocate a much smaller block.
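That wrap-around guess is consistent with the output above: with a 32-bit size, 5,072,606,208 mod 2^32 = 5,072,606,208 - 4,294,967,296 = 777,638,912, which is exactly the size reported for the third array.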
Most OSes don't allocate physical memory up front anyway; they give you pages from a virtual address space which remain unused (and therefore unbacked) until you use them, at which point the memory-management unit of the CPU steps in to give you the memory you asked for. Try writing to the bytes you allocated and see what happens.
Also, on Windows at least, when you allocate a block of memory you can only reserve the largest contiguous block the OS has available - so as the memory gets fragmented by repeated allocations, the largest single block you can malloc shrinks. I don't know if Linux has this problem too.
There's a huge difference between these two programs:
program1.cpp:
#include <iostream>

int main () {
    char p1[3072606208];
    char p2[4072606208];
    char p3[5072606208];
    std::cout << "Size of array p1 = " << sizeof(p1) << std::endl;
    std::cout << "Size of array p2 = " << sizeof(p2) << std::endl;
    std::cout << "Size of array p3 = " << sizeof(p3) << std::endl;
}
program2.cpp:
#include <iostream>

char p1[3072606208];
char p2[4072606208];
char p3[5072606208];

int main () {
    std::cout << "Size of array p1 = " << sizeof(p1) << std::endl;
    std::cout << "Size of array p2 = " << sizeof(p2) << std::endl;
    std::cout << "Size of array p3 = " << sizeof(p3) << std::endl;
}
The first allocates memory on the stack; it's going to get a segmentation fault due to stack overflow. The second doesn't do much at all. That memory doesn't quite exist yet. It's in the form of data segments that aren't touched. Let's modify the second program so that the data are touched:
#include <iostream>

char p1[3072606208];
char p2[4072606208];
char p3[5072606208];

int main () {
    p1[3072606207] = 0;
    p2[4072606207] = 0;
    p3[5072606207] = 0;
    std::cout << "Size of array p1 = " << sizeof(p1) << std::endl;
    std::cout << "Size of array p2 = " << sizeof(p2) << std::endl;
    std::cout << "Size of array p3 = " << sizeof(p3) << std::endl;
}
This doesn't allocate memory for p1, p2, or p3 on the heap or the stack. That memory lives in data segments. It's a part of the application itself. There's one big problem with this: On my machine, this version won't even link.
The first thing to note is that in modern computers, processes do not get direct access to RAM (at the application level). Rather, the OS provides each process with a "virtual address space"; the OS intercepts accesses to virtual memory and reserves real memory as and when needed.
So when malloc or new says it has found enough memory for you, it just means that it has found enough memory for you in the virtual address space. You can check this by running the following program with the memset line present and with it commented out. (Careful: this program uses a busy loop.)
#include <iostream>
#include <new>
#include <string.h>
using namespace std;

int main(int argc, char** argv) {
    size_t bytes = 0x7FFFFFFF;
    size_t len = sizeof(char) * bytes;
    cout << "len = " << len << endl;

    char* arr = new char[len];
    cout << "done new char[len]" << endl;

    memset(arr, 0, len); // set all values in array to 0
    cout << "done setting values" << endl;

    while(1) {
        // stops program exiting immediately
        // press Ctrl-C to exit
    }
    return 0;
}
When memset is part of the program, you will notice that the memory used by your computer jumps massively; without it, you should barely notice any difference, if any. When memset is called, it accesses all the elements of the array, forcing the OS to make the space available in physical memory. Since the argument of new[] is a size_t, the largest request you can make is SIZE_MAX (2^32 - 1 with a 32-bit size_t), though even that isn't guaranteed to succeed (it certainly doesn't on my machine).
As for your stack allocations: David Hammem's answer says it better than I could. I am surprised you were able to compile those programs. Using the same setup as you (Ubuntu 12.04 and gcc 4.6) I get compile errors like:
test.cpp: In function ‘int main(int, char**)’:
test.cpp:14:6: error: size of variable ‘arr’ is too large
try the following code:
#include <cstdio>
#include <cstdint>
#include <new>

int main() {
    bool bExit = false;
    std::uint64_t iAlloc = 0;
    do {
        char *test = nullptr;
        try {
            test = new char[1]();   // deliberately never freed
            iAlloc++;
        } catch (std::bad_alloc&) {
            bExit = true;
        }
    } while (!bExit);

    std::printf("%llu allocations succeeded\n",
                static_cast<unsigned long long>(iAlloc));
    return 0;
}
In one run, don't open other programs; in the other run, load a few large files in an application that uses memory-mapped files.
This may help you to understand.

C++ free() changing other memory

I started noticing that sometimes when deallocating memory in some of my programs, they would inexplicably crash. I began narrowing down the culprit and have come up with an example that illustrates a case that I am having difficulty understanding:
#include <iostream>
#include <stdlib.h>
#include <string.h>
using namespace std;

int main() {
    char *tmp = (char*)malloc(16);
    char *tmp2 = (char*)malloc(16);

    long address = reinterpret_cast<long>(tmp);
    long address2 = reinterpret_cast<long>(tmp2);
    cout << "tmp = " << address << "\n";
    cout << "tmp2 = " << address2 << "\n";

    memset(tmp, 1, 16);
    memset(tmp2, 1, 16);

    char startBytes[4] = {0};
    char endBytes[4] = {0};

    memcpy(startBytes, tmp - 4, 4);
    memcpy(endBytes, tmp + 16, 4);

    cout << "Start: " << static_cast<int>(startBytes[0]) << " " << static_cast<int>(startBytes[1]) << " " << static_cast<int>(startBytes[2]) << " " << static_cast<int>(startBytes[3]) << "\n";
    cout << "End: " << static_cast<int>(endBytes[0]) << " " << static_cast<int>(endBytes[1]) << " " << static_cast<int>(endBytes[2]) << " " << static_cast<int>(endBytes[3]) << "\n";
    cout << "---------------\n";

    free(tmp);

    memcpy(startBytes, tmp - 4, 4);
    memcpy(endBytes, tmp + 16, 4);

    cout << "Start: " << static_cast<int>(startBytes[0]) << " " << static_cast<int>(startBytes[1]) << " " << static_cast<int>(startBytes[2]) << " " << static_cast<int>(startBytes[3]) << "\n";
    cout << "End: " << static_cast<int>(endBytes[0]) << " " << static_cast<int>(endBytes[1]) << " " << static_cast<int>(endBytes[2]) << " " << static_cast<int>(endBytes[3]) << "\n";

    free(tmp2);
    return 0;
}
Here is the output that I am seeing:
tmp = 8795380
tmp2 = 8795400
Start: 16 0 0 0
End: 16 0 0 0
---------------
Start: 17 0 0 0
End: 18 0 0 0
I am using Borland's free compiler. I am aware that the header bytes I am looking at are implementation specific, and that things like reinterpret_cast are bad practice. The question I am simply looking to answer is: why does the first byte of "End" change from 16 to 18?
The 4 bytes that are considered "end" are 16 bytes after tmp, which are 4 bytes before tmp2. They are tmp2's header - why does a call to free() on tmp affect this place in memory?
I have tried the same example using new [] and delete [] to create/delete tmp and tmp2 and the same results occur.
Any information or help in understanding why this particular place in memory is being affected would be much appreciated.
You will have to ask your libc implementation why it changes. In any case, why does it matter? This is a memory area that libc has not allocated to you, and may be using to maintain its own data structures or consistency checks, or may not be using at all.
Basically, you are looking at memory you didn't allocate. You can't make any suppositions about what happens to the memory outside what you requested (i.e. the 16 bytes you allocated). There is nothing abnormal going on.
The runtime and the compiler are free to do whatever they want with those bytes, so you should not use them in your programs. The runtime probably changes their values to keep track of its internal state.
Deallocating memory is very unlikely to crash a program. On the other hand, accessing memory you have deallocated, as in your sample, is a big programming mistake that is likely to do so.
A good way to avoid this is to set any pointer you free to NULL; doing so forces your program to crash when it accesses freed variables.
It's possible that the act of removing an allocated element from the heap modifies other heap nodes, or that the implementation reserves one or more bytes of headers for use as guard bytes from previous allocations.
The memory manager must remember, for example, the size of the memory block that was allocated with malloc. There are different ways to do this, but probably the simplest is to just allocate 4 bytes more than the size requested in the call and store the size value just before the pointer returned to the caller.
The implementation of free can then subtract 4 bytes from the passed pointer to get a pointer to where the size has been stored, and can then link the block (for example) into a list of free, reusable blocks of that size (maybe using those 4 bytes again to store the link to the next block).
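As a purely illustrative sketch of that scheme (a toy, not how Borland's or any real runtime actually lays out its heap), an allocator that keeps a 4-byte size header just in front of the pointer it hands out could look like this:

#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <iostream>

// Toy allocator: prepend a 4-byte size header to every block.
void* toy_malloc(std::uint32_t size) {
    std::uint8_t* raw = static_cast<std::uint8_t*>(std::malloc(size + 4));
    if (!raw) return nullptr;
    std::memcpy(raw, &size, 4);   // bookkeeping lives just *before* the user pointer
    return raw + 4;               // the caller only ever sees this address
}

void toy_free(void* p) {
    if (!p) return;
    std::uint8_t* raw = static_cast<std::uint8_t*>(p) - 4;   // step back to the header
    std::uint32_t size;
    std::memcpy(&size, raw, 4);
    // A real allocator would now link the block into a free list, possibly
    // rewriting the bytes around it - which is exactly the kind of change
    // the question observed next to tmp2.
    std::cout << "freeing a block of " << size << " bytes" << std::endl;
    std::free(raw);
}

int main() {
    char* p = static_cast<char*>(toy_malloc(16));
    toy_free(p);
    return 0;
}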
You are not supposed to change or even look at bytes before or after the area you have allocated. The result of accessing, even just for reading, memory that you didn't allocate is undefined behavior (and yes, a program really can crash or behave crazily just because it read memory that wasn't allocated).