Where does C++ create stack and heap in memory? - c++

I'm using Microsoft Visual Studio 2008
When I create a pointer to an object, it will receive a memory address which in my case is an 8 digit hexadecimal number. E.g.: 0x02e97fc0
With 8 hexadecimal digits a computer can address 4GB of memory. I've got 8GB of memory in my computer:
Does that mean that my IDE is not using more than 4GBs out of my memory?
Is the IDE able to address only the first 4GB of my memory or any 4GB out of the 8GBs not used?
The question is not only about the size of the memory used but also about its location. The latter isn't covered here: The maximum amount of memory any single process on Windows can address

Where does C++ create stack and heap in memory?
Well, C++ does not really handle memory itself; it asks the operating system to do so. When a binary (.exe, .dll, .so ...) is loaded into memory, it is the OS that allocates memory for the stack. When you dynamically allocate memory with new, you're ultimately asking the OS for some space in the heap.
1) Does that mean that my IDE is not using more than 4GBs out of my memory?
No, not really. In fact, modern OSes like Windows use what is called a virtual address space. The OS maps an apparently contiguous range of virtual addresses (say 0x1000 to 0xffff), private to your program, onto physical memory; you have absolutely no guarantee about where your objects really lie in memory. When an address is dereferenced, the OS and the CPU's memory management unit do some magic and let your program access the right physical address.
Having 32-bit addresses means a single instance of your program can't use more than 4GB of memory. Two instances of the same program can, since the OS can back the apparently identical range of virtual addresses (0x00000000 to 0xffffffff) with two different sets of physical memory. And Windows will allocate yet more overlapping address spaces for its own processes.
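To make the 32-bit arithmetic concrete, here is a minimal sketch. The function names address_space_bytes and pointer_bits are made up for this illustration; they are not part of any library.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Number of distinct byte addresses that an address of `bits` width can express.
// Only valid for widths below 64, so that the result fits in a 64-bit integer.
std::uint64_t address_space_bytes(unsigned bits) {
    assert(bits < 64);
    return std::uint64_t{1} << bits;
}

// Pointer width of the current build, e.g. 32 for a 32-bit x86 target.
unsigned pointer_bits() {
    return static_cast<unsigned>(sizeof(void*) * CHAR_BIT);
}
```

address_space_bytes(32) is 4294967296, i.e. 4 GiB, which is why an 8-hex-digit pointer tops out at 4GB of virtual addresses no matter how much RAM is installed.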
2) Is the IDE able to address only the first 4GB of my memory or any 4GB out of the 8GBs not used?
Any. Even non-contiguous memory, even disk memory ... no one can tell.
Found some Microsoft source in the comments about it: https://msdn.microsoft.com/en-us/library/aa366778.aspx

Related

Can I create an array exceeding RAM, if I have enough swap memory?

Let's say I have 8 Gigabytes of RAM and 16 Gigabytes of swap memory. Can I allocate a 20 Gigabyte array there in C? If yes, how is it possible? What would that memory layout look like?
[linux] Can I create an array exceeding RAM, if I have enough swap memory?
Yes, you can. Note that accessing swap is veerry slooww.
how is it possible
Allocate dynamic memory. The operating system handles the rest.
What would that memory layout look like?
On an amd64 system, you can have 256 TiB of address space. You can easily fit a contiguous block of 8 GiB in that space. The operating system divides the virtual memory into pages and copies the pages between physical memory and swap space as needed.
Modern operating systems use virtual memory. In Linux and most other OSes each process has its own address space, sized according to the abilities of the architecture. You can check the size of the virtual address space in /proc/cpuinfo. For example you may see:
address sizes : 43 bits physical, 48 bits virtual
This means that virtual addresses use 48 bits. Half of that range is reserved for the kernel, so you can only use 47 bits, or 128 TiB. Any memory you allocate will be placed somewhere in those 128 TiB of address space as if you actually had that much memory.
Linux uses demand paging and by default overcommits memory. When you say
char *mem = (char*)malloc(1'000'000'000'000);
what happens is that Linux picks a suitable address and just records that you have allocated 1'000'000'000'000 bytes (rounded up to the nearest page) of memory starting at that point. (It does some sanity checks that the amount isn't totally bonkers, depending on the amount of free physical memory, the amount of free swap, and the overcommit setting. By default you can allocate a lot more than you have memory and swap.)
Note that at this point no physical memory and no swap space is connected to your allocated block at all. This changes when you first write to the memory:
mem[4096] = 0;
At this point the program will page fault. Linux checks that the address is actually something your program is allowed to write to, finds a physical page and maps it at the page containing &mem[4096]. Then it lets the program retry the write and everything continues.
If Linux can't find a free physical page it will try to swap something out to make a physical page available for your program. If that also fails, the out-of-memory (OOM) killer steps in and your program will likely be killed.
As a result you can allocate basically unlimited amounts of memory as long as you never write to more than the physical memory and swap can support. On the other hand, if you initialize the memory (explicitly, or implicitly using calloc()), the system will quickly notice if you try to use more than is available.
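The difference between allocating and actually touching memory can be sketched roughly like this. allocate_and_touch is a hypothetical helper, and the page size is passed in as a parameter because it is an assumption here (4096 bytes is common on x86, but not guaranteed).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Allocate `bytes` with malloc, then write one byte every `page_size` bytes.
// The malloc call alone only reserves address space; under demand paging it is
// the first write into each page that makes the kernel back it with real memory.
// Returns the number of writes performed, or 0 if malloc refused the request.
std::size_t allocate_and_touch(std::size_t bytes, std::size_t page_size) {
    assert(page_size != 0);
    char* mem = static_cast<char*>(std::malloc(bytes));
    if (mem == nullptr) {
        return 0;
    }
    volatile char* p = mem;  // volatile so the compiler keeps the writes
    std::size_t touched = 0;
    for (std::size_t offset = 0; offset < bytes; offset += page_size) {
        p[offset] = 0;
        ++touched;
    }
    std::free(mem);
    return touched;
}
```

Calling it with a size well beyond RAM plus swap is where the behaviour described above kicks in, so try it with modest sizes first.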
You can, but not with a simple malloc. It's platform-dependent.
It requires an OS call to allocate swappable memory (VirtualAlloc on Windows, for example; on Linux it should be mmap and related functions).
Once that's done, the allocated memory is divided into pages, contiguous blocks of fixed size. You can lock a page so that it is loaded into RAM and you can read and modify it freely. For old dinosaurs like me, it's exactly how EMS memory worked under DOS... You address your swappable memory with a kind of segment:offset method: first, you divide your linear address by the page size to find which page is needed, then you use the remainder to get the offset within this page.
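The divide-and-remainder step described above could look something like this. PageAddress and split_address are names invented for the sketch, and the page size is whatever your platform uses.

```cpp
#include <cassert>
#include <cstdint>

struct PageAddress {
    std::uint64_t page;    // index of the page that holds the byte
    std::uint64_t offset;  // position of the byte inside that page
};

// Split a linear address into (page index, offset within page).
PageAddress split_address(std::uint64_t linear, std::uint64_t page_size) {
    assert(page_size != 0);
    return PageAddress{linear / page_size, linear % page_size};
}
```

With 4096-byte pages, linear address 10000 lands in page 2 at offset 1808, since 2 * 4096 = 8192 and 10000 - 8192 = 1808.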
Once unlocked, the page remains in memory until the OS needs memory: then an unlocked page will be flushed to disk, into swap, and discarded from RAM... until you lock (and load...) it again. But this operation may require freeing RAM, so another process may have its unlocked pages swapped out BEFORE your own page is loaded again. And this is damn SLOOOOOOW... Even on an SSD!
So, it's not always a good thing to use swap. A better way is to use memory-mapped files - perfect for reading very big files mostly sequentially, with few random accesses - if that suits your needs.

Best way to allocate large memory

In my Visual C++ app, I know the total number of objects (CMyObject) to be allocated is 16728064 and each object is 64 bytes, so the total memory to be allocated is about 1GB. The memory will be allocated at the beginning, used for the whole lifetime of the app, and released at the end.
In such a case, what is the best way to allocate the memory?
Currently I try to allocate the memory at the beginning, as follows:
CMyObject *p = new CMyObject[16728064];
// Perform tasks.
delete [] p;
But the allocation fails most of the time. Now I want to do as follows:
CMyObject *p[10];
p[0] = new CMyObject[1672806];
p[1] = new CMyObject[1672806];
…
// Perform tasks
delete [] p[0];
….
This seems to work for some time.
Therefore, should I split the allocation into pieces as small as possible? Or are there any good solutions for such a situation?
Thanks
In general, yes, you should split larger allocations into smaller fragments. Depending on your system, it may not have 1GB of contiguous free address space.
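One way to hide the splitting behind a single index is a small wrapper like the one below. This is only a sketch: ChunkedArray is a made-up class, it needs C++14 for std::make_unique, and it assumes the element type is default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Fixed-size array stored as several smaller heap blocks, so that no single
// allocation needs one huge contiguous range of virtual addresses.
template <typename T>
class ChunkedArray {
public:
    ChunkedArray(std::size_t count, std::size_t chunk_size)
        : count_(count), chunk_size_(chunk_size) {
        assert(chunk_size != 0);
        const std::size_t chunks = (count + chunk_size - 1) / chunk_size;
        chunks_.reserve(chunks);
        for (std::size_t i = 0; i < chunks; ++i) {
            // Elements are value-initialised; std::bad_alloc is thrown
            // if a chunk cannot be allocated.
            chunks_.push_back(std::make_unique<T[]>(chunk_size));
        }
    }

    // Map a flat index to (chunk, position inside chunk).
    T& operator[](std::size_t i) {
        assert(i < count_);
        return chunks_[i / chunk_size_][i % chunk_size_];
    }

    std::size_t size() const { return count_; }
    std::size_t chunk_count() const { return chunks_.size(); }

private:
    std::size_t count_;
    std::size_t chunk_size_;
    std::vector<std::unique_ptr<T[]>> chunks_;
};
```

For the numbers in the question, ChunkedArray<CMyObject> objects(16728064, 1672807) would give ten chunks. Note that ten chunks of 1672806, as in the question's snippet, only cover 16728060 objects, four short of the total. The last chunk here is allocated at full size, so a few elements at the end go unused.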
Assuming this is an x86 processor or something similar, only the virtual address space has to be contiguous. On x86, physical memory is composed of 4096-byte pages, and the physical pages do not have to be contiguous, only the mapped virtual address range.
When I run Windows XP 32 bit, on a system with 4GB, it shows 3.6 GB of physical memory available, and usually my test programs don't have a problem with allocating 1 GB, with failures to allocate memory occurring somewhere between 1.5GB and 2GB.
My guess is that the failure of large allocations, even when that much physical memory is available, has to do with the operating system rather than with a limitation of the processor's virtual-to-physical mapping.
What operating system are you using?

What is the max addressable memory space in a 32-bit C++ program?

In debug mode I saw that the pointers have addresses like 0x01210040,
but as I realized, 0x means hexadecimal, right? And there are 8 hex digits, i.e. in total there are 128 bits that are addressed?? So does that mean that for a 32-bit system the first two digits are always 0, and for a 64-bit system the first digit is 0?
Also, may I ask that, for a 32-bit program, would I be able to allocate as much as 3GB of memory as long as I remain in the heap and use only malloc()? Or is there some limitations the Windows system poses on a single thread? (the IDE I'm using is VS2012)
Since actually I was running a 32-bit program in a 64-bit system, but the program crashed with a memory leak when it only allocated about 1.5GB of memory...and I can't seem to figure out why.
(Oooops...sorry guys I think I made a simple mistake with the first question...indeed one hex digit is 4 bits, and 8 makes 32bits. However here is another question...how is address represented in a 64-bit program?)
For 32-bit Windows, the limit is actually 2GB usable per process, with virtual addresses from 0x00000000 (or simply 0x0) through 0x7FFFFFFF. The rest of the 4GB address space (0x80000000 through 0xFFFFFFFF) is reserved for use by Windows itself. Note that these have nothing to do with the actual physical memory addresses.
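As a quick sanity check on those ranges, here is a small sketch. The constant and function names are invented for the example, and it only models the default 2GB/2GB split just described.

```cpp
#include <cassert>
#include <cstdint>

// Default split on 32-bit Windows as described above: the low half of the
// 4GB virtual address space belongs to the process, the high half to the system.
constexpr std::uint32_t kUserSpaceEnd = 0x7FFFFFFFu;  // last user-mode address

// True if a 32-bit virtual address lies in the user-mode half.
// Equivalent to checking that the most significant bit is 0.
bool is_user_mode_address(std::uint32_t address) {
    return address <= kUserSpaceEnd;
}

// Size in bytes of the inclusive address range [first, last].
std::uint64_t range_bytes(std::uint32_t first, std::uint32_t last) {
    assert(first <= last);
    return std::uint64_t{last} - first + 1;
}
```

Both range_bytes(0x00000000, 0x7FFFFFFF) and range_bytes(0x80000000, 0xFFFFFFFF) come out as 2147483648 bytes, i.e. 2 GiB each, and an address like 0x01210040 from the question falls in the user-mode half.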
If your program is large-address aware, this limit is increased to 3GB on 32-bit systems (when the system is booted with 4-gigabyte tuning enabled) and to 4GB for 32-bit programs running on 64-bit Windows.
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366912(v=vs.85).aspx
And for the higher limits for large address space aware programs (IMAGE_FILE_LARGE_ADDRESS_AWARE), see here:
http://msdn.microsoft.com/en-us/library/aa366778.aspx
You might also want to take a look at the Virtual Memory article on Wikipedia to better understand how the mapping between virtual addresses and physical addresses works. The first MSDN link above also has a short explanation:
The virtual address space for a process is the set of virtual memory
addresses that it can use. The address space for each process is
private and cannot be accessed by other processes unless it is shared.
A virtual address does not represent the actual physical location of
an object in memory; instead, the system maintains a page table for
each process, which is an internal data structure used to translate
virtual addresses into their corresponding physical addresses. Each
time a thread references an address, the system translates the virtual
address to a physical address. The virtual address space for 32-bit
Windows is 4 gigabytes (GB) in size and divided into two partitions:
one for use by the process and the other reserved for use by the
system. For more information about the virtual address space in 64-bit
Windows, see Virtual Address Space in 64-bit Windows.
EDIT: As user3344003 points out, these values are not the amount of memory you can allocate using malloc or otherwise use for storing values, they just represent the size of the virtual address space.
There are a number of limits that would restrict the size of your malloc allocation.
1) The number of bits restricts the size of the address space. For 32 bits, that is 4GB.
2) Systems then subdivide that for the various processor modes. These days, usually 2GB goes to the user and 2GB to the kernel.
3) The address space may be limited by the size of the page tables.
4) The total virtual memory may be limited by the size of the page file.
5) Before you start malloc'ing, there is already stuff in the virtual address space (e.g., code, stack, reserved areas, data). Your malloc needs to return a contiguous block of memory. The largest theoretical block it could return has to fit within the unallocated areas of virtual memory.
6) Your memory management heap may restrict the size that can be allocated.
There are probably other limitations that I have omitted.
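Point 5 is the one that tends to bite first in a 32-bit process. A rough sketch of why: given the regions already in use, the largest single block you can get is the biggest gap between them, not the total free space. Region and largest_free_block are hypothetical names, and the sketch assumes the regions do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Region {
    std::uint64_t start;  // first address in use
    std::uint64_t size;   // length in bytes
};

// Largest contiguous unused range inside [0, space_size), given the regions
// already in use (assumed non-overlapping and inside the space).
std::uint64_t largest_free_block(std::vector<Region> used, std::uint64_t space_size) {
    std::sort(used.begin(), used.end(),
              [](const Region& a, const Region& b) { return a.start < b.start; });
    std::uint64_t best = 0;
    std::uint64_t cursor = 0;  // end of the last region seen so far
    for (const Region& r : used) {
        assert(r.start >= cursor && r.start + r.size <= space_size);
        best = std::max(best, r.start - cursor);
        cursor = r.start + r.size;
    }
    return std::max(best, space_size - cursor);
}
```

As a made-up example, take a 2GB user space with 1 MiB in use at 0x00400000 and 256 MiB in use at 0x40000000. About 1791 MiB is free in total, but the largest contiguous gap is only 1019 MiB, so a single 1 GiB malloc would fail even though most of the space is unused.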
-=-=-=-=-
If your program crashed after allocating 1.5GB through malloc, did you check the return value from malloc to see if it was not null?
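For reference, checking the result looks roughly like this. checked_alloc is just an illustrative wrapper, not something from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Allocate `bytes` with malloc and zero the block, or return nullptr when the
// request fails. The caller owns the block and must release it with std::free.
void* checked_alloc(std::size_t bytes) {
    assert(bytes != 0);
    void* block = std::malloc(bytes);
    if (block == nullptr) {
        return nullptr;  // report the failure instead of writing through null
    }
    std::memset(block, 0, bytes);
    return block;
}
```

The caller then tests the returned pointer before using it. Without such a check, a failed 1.5GB request shows up later as a crash at some unrelated-looking line.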
-=-=-=-=-=
The best way to allocate huge blocks of memory is through operating system services that map pages into the virtual address space, not through malloc.
In reference to the following article
For a 32-bit application launched in a 32-bit Windows, the total size of all the mentioned data types must not exceed 2 Gbytes.
The same 32-bit program launched in a 64-bit system can allocate about 4 Gbytes (actually about 3.5 Gbytes)
The practical limit you are looking at is around 1.7 GB due to the space occupied by Windows.
By any chance, how did you find out how much memory it had allocated when it crashed?

Can a pointer point to an address after 4GB?

If we compile and execute the code below:
int *p;
printf("%d\n", (int)sizeof(p));
it seems that the size of a pointer, whatever type it points to, is 4 bytes, which means 32 bits, so 2^32 addresses can be stored in a pointer. Since every address is associated with 1 byte, 2^32 bytes give 4 GB.
So, how can a pointer point to the address after 4 GB of memory? And how can a program use more than 4 GB of memory?
In principle, if you can't represent an address above 2^X-1 then you can't address more than 2^X bytes of memory.
This is true for x86, even though some workarounds (like PAE) have been implemented and used that allow more physical memory, with limits imposed by the fact that these are more hacks than real solutions to the problem.
With a 64 bit architecture the standard size of a pointer is doubled, so you don't have to worry anymore.
Mind that, in any case, virtual memory translates addresses from the process space to the physical space, so it's easy to see that the hardware could support more memory even if the maximum addressable memory from the process's point of view is still limited by the size of a pointer.
"How can a pointer point to the address after 4GB of memory?"
There is a difference between the physical memory available to the processor and the "virtual memory" seen by the process. A 32-bit process (which has pointers of size 4 bytes) is limited to 4GB; however, the processor maintains a mapping (controlled by the OS) that lets each process have its own memory space, up to 4GB each.
That way 8GB of memory could be used on a 32 bit system, if there were two processes each using 4GB.
To access >4GB of address space you can do one of the following:
Compile in x86_64 (64 bit) on a 64 bit OS. This is the easiest.
Use AWE memory. AWE allows mapping a window of memory which (usually) resides above 4GB. The window address can be mapped and remapped again and again. Was used in large database applications and RAM drives in the 32 bit era.
Note that a memory address where the MSB is 1 is reserved for the kernel. Under certain conditions Windows allows a process to use up to 3GB; the top 1GB is always for the kernel.
By default a 32 bit process has 2GB of user mode address space. It's possible to get 3GB via a special linker flag (in VS: /LARGEADDRESSAWARE).

virtual v. physical memory in assessing C/C++ memory leak

I have a C++ application that I am trying to iron the memory leaks out of and I realized I don't fully understand the difference between virtual and physical memory.
Results from top (so 16.8g = virtual, 111m = physical):
4406 um 20 0 16.8g 111m 4928 S 64.7 22.8 36:53.65 client
My process holds 500 connections, one for each user, and at these numbers it means there is about 30 MB of virtual overhead for each user. Without going into the details of my application, the only way this could sound remotely realistic, adding together all the vectors, structs, threads, functions on the stack, etc., is if I have no idea what virtual memory actually means. No -O optimization flags, btw.
So my questions are:
what operations in C++ would inflate virtual memory so much?
Is it a problem if my task is using gigs of virtual memory?
The stack and heap function variables, vectors, etc. - do those necessarily increase the use of physical memory?
Would removing a memory leak (via delete or free() or such) necessarily reduce both physical and virtual memory usage?
Virtual memory is what your program deals with. It consists of all of the addresses returned by malloc, new, et al. Each process has its own virtual-address space. Virtual address usage is theoretically limited by the address size of your program: 32-bit programs have 4GB of address space; 64-bit programs have vastly more. Practically speaking, the amount of virtual memory that a process can allocate is less than those limits.
Physical memory is the chips soldered to your motherboard, or installed in your memory slots. The amount of physical memory in use at any given time is limited to the amount of physical memory in your computer.
The virtual-memory subsystem maps virtual addresses that your program uses to physical addresses that the CPU sends to the RAM chips. At any particular moment, most of your allocated virtual addresses are unmapped; thus physical memory use is lower than virtual memory use. If you access a virtual address that is allocated but not mapped, the operating system invisibly allocates physical memory and maps it in. When you don't access a virtual address, the operating system might unmap the physical memory.
To take your questions in turn:
what operations in C++ would inflate virtual memory so much?
new, malloc, static allocation of large arrays. Generally anything that requires memory in your program.
Is it a problem if my task is using gigs of virtual memory?
It depends upon the usage pattern of your program. If you allocate vast tracts of memory that you never, ever touch, and if your program is a 64-bit program, it may be okay that you are using gigs of virtual memory.
Also, if your memory use grows without bound, you will eventually run out of some resource.
The stack and heap function variables, vectors, etc. - do those necessarily increase the use of physical memory?
Not necessarily, but likely. The act of touching a variable ensures that, at least momentarily, it (and all of the memory "near" it) is in physical memory. (Aside: containers like std::vector may be allocated on either stack or heap, but the contained objects are allocated on the heap.)
Would removing a memory leak (via delete or free() or such) necessarily reduce both physical and virtual memory usage?
Physical: probably. Virtual: yes.
Virtual memory is the address space used by the process. Each process has a full view of the 64-bit (or 32-bit, depending on the architecture) range of addresses a pointer can hold, but not every byte maps to something real. The operating system manages the table that maps virtual addresses to real physical memory pages, or to whatever that address really is (even though it looks like memory to your application). For instance, for your application an address may point to some function, but in reality it has not yet been loaded from disk, and when you call it, it generates a page fault, which the kernel handles by loading the appropriate section from the executable and mapping it into the address space of your application, so it can be executed.
From a Linux perspective (and I believe most modern OS's):
Allocating memory inflates virtual memory. Actually using the allocated memory inflates physical memory usage. Do it too much and it will be swapped to disk, and eventually your process will be killed.
mmapping files will increase only virtual memory usage; this includes the size of the executables: the larger they are, the more virtual memory is used.
The only problem with using up virtual memory is that you may deplete it. It is mainly an issue on 32-bit systems, where you only have 4GB of it (and 1GB is reserved for the kernel, so application data only has 3GB).
Function calls, which allocate variables on the stack, may increase physical memory usage, but you (usually) won't leak this memory.
Allocated heap variables take up virtual memory, but will only actually get physical memory if you read or write them.
Freeing or deleting variables does not necessarily reduce virtual/physical memory consumption; it depends on the allocator internals, but it usually does.
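If you want to watch both numbers from inside the process rather than from top, on Linux the VmSize (virtual) and VmRSS (resident, i.e. physical) fields of /proc/self/status report them in kB. Below is a minimal parsing sketch; status_field_kb is a made-up helper that works on text you have already read from that file.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return the value in kB of a field such as "VmSize" or "VmRSS" from text in
// the format of /proc/<pid>/status, or -1 if the field is not present.
long status_field_kb(const std::string& status_text, const std::string& field) {
    assert(!field.empty());
    std::istringstream lines(status_text);
    std::string line;
    const std::string prefix = field + ":";
    while (std::getline(lines, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;
            return rest ? kb : -1;
        }
    }
    return -1;
}
```

Read /proc/self/status into a string (for example with std::ifstream) and pass it in. Logging both fields before and after a suspected leak shows which of the two is actually growing.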
You can set the following environment variables to control how malloc allocates memory internally. Experimenting with them should help answer all four questions. For other options, please refer to:
http://man7.org/linux/man-pages/man3/mallopt.3.html
export MALLOC_MMAP_THRESHOLD_=8192
export MALLOC_ARENA_MAX=4