Over the past few days I have learned about memory overcommitment (when memory overcommit is activated, which is usually the default), which, as I understand it, basically means the following:
void* p = malloc(100);
The operating system gives you 100 contiguous (virtual) addresses taken from the (virtual) address space of your process, whose total range is OS-defined. Since that memory region has not been initialized yet, it doesn't count as occupied storage from a system-wide point of view, so it is a pure abstraction apart from consuming your virtual addresses.
memset(p, 0, 5);
That uses the first 5 bytes, so from the point of view of the OS, your process now occupies 5 extra bytes, and the system has 5 bytes less of free storage. You still have 95 bytes of uninitialized storage.
The system only crashes or starts killing processes when the combined occupied (initialized) storage of all processes is beyond what the OS can hold.
If my understanding is right in this regard, is there a way to "de-"initialize a region of memory when you are done with it, in order to increase the system-wide free space, without losing the address region requested by malloc or aligned_malloc (so you don't increase fragmentation over time)?
The purpose of this question is more theoretical than practical and not about actually "freeing memory", but about freeing memory while conserving already assigned virtual addresses.
Source about the difference between requesting virtual addresses and occupying storage: https://www.win.tue.nl/~aeb/linux/lk/lk-9.html#ss9.6
PS: Knowing the answer for Linux is enough to satisfy my curiosity.
No, there is no way.
On most systems, as soon as you allocate memory, it counts towards RAM or swap.
As your link shows, on Linux, you may need to access the memory once so that the memory actually gets allocated. But as soon as you do, the system must keep that memory available somewhere, in case you access it later.
The way to tell the system you are done with the memory is to actually free it.
I am observing the following behavior in my test program:
I am doing malloc() for 1 MB and then free() it after sleep(10). I am doing this five times. I am observing memory consumption in top while the program is running.
Once free()-d, I am expecting the program's virtual memory (VIRT) consumption to be down by 1 MB. But actually it isn't. It stays stable. What is the explanation for this behavior? Does malloc() do some reserve while allocating memory?
Once free()-d, I am expecting program's virtual memory (VIRT) consumption to be down by 1MB.
Well, this is not guaranteed by the C standard. It only says that once you free() the memory, you must not access it any more.
Whether the memory block is actually returned to the available memory pool or kept aside for future allocations is decided by the memory manager.
The C standard doesn't force the implementer of malloc and free to return the memory to the OS directly. So different C library implementations will behave differently. Some of them might give it back directly and some might not. In fact, the same implementation will also behave differently depending on the allocation sizes and patterns.
This behavior, of course, is for good reasons:
It is not always possible. OS-level memory allocations are usually done in whole pages (4KB, 4MB, or other sizes at once). If a small part of a page is still in use after another part has been freed, the page cannot be given back to the operating system until that part is also freed.
Efficiency. It is very likely that an application will ask for memory again, so why give it back to the OS only to ask for it again soon after? (Of course, there is probably a limit on the size of the memory kept.)
In most cases, you are not accountable for the memory you free if the implementation decided to keep it (assuming it is a good implementation). Sooner or later it will be reallocated or returned to the OS. Hence, optimizing for memory usage should be based on the amount you have malloc-ed and haven't free-d. The case where you have to worry about this is when your allocation patterns/sizes start causing memory fragmentation, which is a very big topic on its own.
If you are, however, on an embedded system where the amount of memory available is limited and you need more control over when/how memory is allocated and freed, then you need to ask for memory pages from the OS directly and manage them manually.
Edit: I did not explain why you are not accountable for memory you free.
The reason is that on a modern OS, allocated memory is virtual. This means that if you allocate 512MB on a 32-bit system or 10TB on a 64-bit system, the OS will not reserve any physical space for it as long as you don't read or write that memory. In fact, it will only reserve physical memory for the pages you touch from that big block, not for the entire block. And after "a while of not using that memory", its contents will be copied to disk and the underlying physical memory will be used for something else.
This is very dependent on the actual malloc implementation in use.
Under Linux, there is a threshold (MMAP_THRESHOLD) to decide where the memory for a given malloc() request comes from.
If the requested amount is below or equal to MMAP_THRESHOLD, the request is satisfied by taking it from the so-called "free list" if any suitable memory blocks have already been free()d. Otherwise, the "break line" of the program (i.e. the end of the data segment) is increased, and the memory made available to the program this way is used for the request.
On free(), the freed memory block is added to the free list. If there is enough free memory at the very end of the data segment, the break line (mentioned above) is moved again to shrink the data segment, returning the excess memory to the OS.
If the requested amount exceeds MMAP_THRESHOLD, a separate memory block is requested from the OS and returned to it again during free().
See also https://linux.die.net/man/3/malloc for details.
Let's say I have 8 Gigabytes of RAM and 16 Gigabytes of swap memory. Can I allocate a 20 Gigabyte array there in C? If yes, how is it possible? What would that memory layout look like?
[linux] Can I create an array exceeding RAM, if I have enough swap memory?
Yes, you can. Note that accessing swap is veerry slooww.
how is it possible
Allocate dynamic memory. The operating system handles the rest.
How would that memory layout look like?
On an amd64 system, you can have 256 TiB of address space. You can easily fit a contiguous block of 20 GiB in that space. The operating system divides the virtual memory into pages and copies the pages between physical memory and swap space as needed.
Modern operating systems use virtual memory. In Linux and most other OSes each process has its own address space, sized according to the abilities of the architecture. You can check the size of the virtual address space in /proc/cpuinfo. For example, you may see:
address sizes : 43 bits physical, 48 bits virtual
This means that virtual addresses use 48 bits. Half of that range is reserved for the kernel, so you can only use 47 bits, or 128 TiB. Any memory you allocate will be placed somewhere in those 128 TiB of address space as if you actually had that much memory.
Linux uses demand paging and by default overcommits memory. When you say
char *mem = (char *)malloc(1000000000000);
what happens is that Linux picks a suitable address and just records that you have allocated 1,000,000,000,000 bytes (rounded up to the nearest page) of memory starting at that point. (It does some sanity check that the amount isn't totally bonkers, depending on the amount of physical memory that is free, the amount of swap that is free and the overcommit setting. By default you can allocate a lot more than you have memory and swap.)
Note that at this point no physical memory and no swap space is connected to your allocated block at all. This changes when you first write to the memory:
mem[4096] = 0;
At this point the program will page fault. Linux checks that the address is actually something your program is allowed to write to, finds a physical page and maps it to &mem[4096]. Then it lets the program retry the write and everything continues.
If Linux can't find a physical page, it will try to swap something out to make a physical page available for your program. If that also fails, the kernel's out-of-memory (OOM) killer will pick a process, quite possibly yours, and kill it.
As a result you can allocate basically unlimited amounts of memory as long as you never write to more than the physical memory and swap can support. On the other hand, if you initialize the memory (explicitly or implicitly using calloc()) the system will quickly notice if you try to use more than is available.
You can, but not with a simple malloc. It's platform-dependent.
It requires an OS call to allocate swappable memory (it's VirtualAlloc on Windows, for example; on Linux it should be mmap and related functions).
Once it's done, the allocated memory is divided into pages, contiguous blocks of fixed size. You can lock a page, so that it is loaded in RAM and you can read and modify it freely. For old dinosaurs like me, it's exactly how EMS memory worked under DOS... You address your swappable memory with a kind of segment:offset method: first, you divide your linear address by the page size to find which page is needed, then you use the remainder to get the offset within this page.
Once unlocked, the page remains in memory until the OS needs memory: then, an unlocked page will be flushed to disk, in swap, and discarded from RAM... Until you lock (and load...) it again, but this operation may require freeing RAM, so another process may have its unlocked pages swapped out BEFORE your own page is loaded again. And this is damn SLOOOOOOW... Even on an SSD!
So, it's not always a good thing to use swap. A better way is to use memory-mapped files, which are perfect for reading very big files mostly sequentially with few random accesses, if that suits your needs.
int A[10000000]; // this gives a segmentation fault
int *A = (int*)malloc(10000000*sizeof(int)); // this runs without any seg fault
Now my question, just out of curiosity: if we are ultimately able to allocate more space for our data structures dynamically (for example, BSTs and linked lists built with pointers in C have no real memory limit unless their total size exceeds the RAM of the machine, and the second statement above succeeds), why can't we declare an array of a larger size, up to the memory limit? Is this because the space allocated for a statically sized array is contiguous? But then where do we get the guarantee that no other piece of code is using the next 1000000 words in RAM?
PS: I may be wrong in some of the statements I made; please correct me in that case.
Firstly, in a typical modern OS with virtual memory (Linux, Windows etc.) the amount of RAM makes no difference whatsoever. Your program is working with virtual memory, not with RAM. RAM is just a cache for virtual memory access. The absolute limiting factor for maximum array size is not RAM, it is the size of the available address space. Address space is the resource you have to worry about in OSes with virtual memory. In 32-bit OSes you have 4 gigabytes of address space, part of which is taken up for various household needs and the rest is available to you. In 64-bit OSes you theoretically have 16 exabytes of address space (less than that in practical implementations, since CPUs usually use less than 64 bits to represent the address), which can be perceived as practically unlimited.
Secondly, the amount of available address space in a typical C/C++ implementation depends on the memory type. There's static memory, there's automatic memory, there's dynamic memory. The address space limits for each memory type are pre-set in advance by the compiler. Which raises the question: where are you declaring your large array? Which memory type? Automatic? Static? You provided no information, but this is absolutely necessary. If you are attempting to declare it as a local variable (automatic memory), then no wonder it doesn't work, since automatic memory (aka "stack memory") has very limited address space assigned to it. Your array simply does not fit. Meanwhile, malloc allocates dynamic memory, which normally has the largest amount of address space available.
Thirdly, many compilers provide you with options that control the initial distribution of address space between different kinds of memory. You can request a much larger stack size for your program by manipulating such options. Quite possibly you can request a stack so large that your local array will fit in it without any problems. But in practice, for obvious reasons, it makes very little sense to declare huge arrays as local variables.
Assuming local variables, this is because on modern implementations automatic variables will be allocated on the stack which is very limited in space. This link gives some of the common stack sizes:
platform         default size
=====================================
SunOS/Solaris    8172K bytes
Linux            8172K bytes
Windows          1024K bytes
cygwin           2048K bytes
The linked article also notes that the stack size can be changed. For example, on Linux one possible way is to run this from the shell before starting your process:
ulimit -s 32768 # sets the stack size to 32M bytes
By contrast, malloc on modern implementations allocates from the heap, which is limited only by the memory available to the process, and in many cases you can even allocate more than is available due to overcommit.
I THINK you're missing the difference between total memory and your program's memory space. Your program runs in an environment created by your operating system, which grants the program a specific memory range, and the program has to make do with that.
The catch: Your compiler can't 100% know the size of this range.
That means your compiler will successfully build, and it will REQUEST that much room in memory when the time comes to make the call to malloc (or move the stack pointer when the function is called). When the function is called (creating a stack frame) you'll get a segmentation fault, caused by the stack overflow. When the malloc is called, you won't get a segfault unless you try USING the memory. (If you look at the manpage for malloc() you'll see it returns NULL when there's not enough memory.)
To explain the two failures: your program is granted two memory spaces, the stack and the heap. Memory requested with malloc() is created on the heap of your program; the allocator obtains it from the OS with system calls when needed. It dynamically accepts or rejects the request and returns either the start address on success or NULL on failure. The stack is used when you call a new function. Room for all the local variables is made on the stack by ordinary program instructions. Calling a function can't just FAIL, as that would break program flow completely. So when the stack runs out, the system says "You're now overstepping" and segfaults, stopping the execution.
When memory is allocated in a computer, how does it know which bytes are already occupied and can't be overwritten?
So if these are some bytes of memory that aren't being used:
[0|0|0|0]
How does the computer know whether they are or not? They could just be an integer that equals zero. Or it could be empty memory. How does it know?
That depends on the way the allocation is performed, but it generally involves manipulation of data belonging to the allocation mechanism.
When you allocate some variable in a function, the allocation is performed by decrementing the stack pointer. Via the stack pointer, your program knows that anything below the stack pointer is not allocated to the stack, while anything above the stack pointer is allocated.
When you allocate something via malloc() etc. on the heap, things are similar, but more complicated: all these allocators have some internal data structures which they never expose to the calling application, but which allow them to select which memory addresses to return on an allocation request. Some malloc() implementations, for instance, use a number of memory pools for small objects of fixed size, and maintain a linked list of free objects for each fixed size they track. That way, they can quickly pop one memory region off that list, only doing more expensive computations when they run out of regions to satisfy a certain request size.
In any case, each of the allocators has to request memory from the system kernel from time to time. This mechanism always works on complete memory pages (usually 4 KiB), and works via the syscalls brk() and mmap(). Again, the kernel keeps track of which pages are visible in which processes, and at which addresses they are mapped, so there is additional memory allocated inside the kernel for this.
These mappings are made available to the processor via the page tables; the processor uses them to resolve virtual memory addresses to physical addresses. So here, finally, you have some hardware involved in the process, but that is really far, far down in the guts of the mechanics, much below anything that a userspace process is ever able to see. Still, even the page tables are managed by the software of the kernel, not by the hardware; the hardware only interprets what the software writes into the page tables.
First of all, I have the impression that you believe that there is some unoccupied memory that doesn't hold any value. That's wrong. You can imagine the memory as a very large array where each box contains a value, whether or not someone has put something in it. If a memory location was never written, it contains an arbitrary value.
Now to answer your question: it's not the computer (meaning the hardware) but the operating system. It holds somewhere in its memory some tables recording which parts of the memory are used. Also, any byte of memory can be overwritten.
In general, you cannot tell by looking at content of memory at some location whether that portion of memory is used or not. Memory value '0' does not mean the memory is not used.
To tell what portions of memory are used you need some structure to tell you this. For example, you can divide memory into chunks and keep track of which chunks are used and which are not.
There are memory blocks, and each has a state: occupied or not occupied. On the heap, there are very complex data structures which organise this. But the answer to your question is too broad.
I have a C++ application that I am trying to iron the memory leaks out of and I realized I don't fully understand the difference between virtual and physical memory.
Results from top (so 16.8g = virtual, 111m = physical):
4406 um 20 0 16.8g 111m 4928 S 64.7 22.8 36:53.65 client
My process holds 500 connections, one for each user, and at these numbers it means there is about 30 MB of virtual overhead for each user. Without going into the details of my application, the only way this could sound remotely realistic, adding together all the vectors, structs, threads, functions on the stack, etc., is if I have no idea what virtual memory actually means. No -O optimization flags, btw.
So my questions are:
what operations in C++ would inflate virtual memory so much?
Is it a problem if my task is using gigs of virtual memory?
The stack and heap function variables, vectors, etc. - do those necessarily increase the use of physical memory?
Would removing a memory leak (via delete or free() or such) necessarily reduce both physical and virtual memory usage?
Virtual memory is what your program deals with. It consists of all of the addresses returned by malloc, new, et al. Each process has its own virtual-address space. Virtual address usage is theoretically limited by the address size of your program: 32-bit programs have 4GB of address space; 64-bit programs have vastly more. Practically speaking, the amount of virtual memory that a process can allocate is less than those limits.
Physical memory is the chips soldered to your motherboard, or installed in your memory slots. The amount of physical memory in use at any given time is limited to the amount of physical memory in your computer.
The virtual-memory subsystem maps virtual addresses that your program uses to physical addresses that the CPU sends to the RAM chips. At any particular moment, most of your allocated virtual addresses are unmapped; thus physical memory use is lower than virtual memory use. If you access a virtual address that is allocated but not mapped, the operating system invisibly allocates physical memory and maps it in. When you don't access a virtual address, the operating system might unmap the physical memory.
To take your questions in turn:
what operations in C++ would inflate virtual memory so much?
new, malloc, static allocation of large arrays. Generally anything that requires memory in your program.
Is it a problem if my task is using gigs of virtual memory?
It depends upon the usage pattern of your program. If you allocate vast tracts of memory that you never, ever touch, and if your program is a 64-bit program, it may be okay that you are using gigs of virtual memory.
Also, if your memory use grows without bound, you will eventually run out of some resource.
The stack and heap function variables, vectors, etc. - do those necessarily increase the use of physical memory?
Not necessarily, but likely. The act of touching a variable ensures that, at least momentarily, it (and all of the memory "near" it) is in physical memory. (Aside: containers like std::vector may be allocated on either stack or heap, but the contained objects are allocated on the heap.)
Would removing a memory leak (via delete or free() or such) necessarily reduce both physical and virtual memory usage?
Physical: probably. Virtual: yes.
Virtual memory is the address space used by the process. Each process has a full view of the 64-bit (or 32-bit, depending on the architecture) range of bytes addressable by a pointer, but not every byte maps to something real. The operating system manages the table that maps virtual addresses to real physical memory pages, or to whatever that address really is (even though it looks like memory to your application). For instance, for your application an address may point to some function, but in reality it has not yet been loaded from disk, and when you call it, it generates a page fault interrupt, which the kernel handles by loading the appropriate section from the executable and mapping it into the address space of your application, so it can be executed.
From a Linux perspective (and I believe most modern OS's):
Allocating memory inflates virtual memory. Actually using the allocated memory inflates physical memory usage. Do it too much and it will be swapped to disk, and eventually your process will be killed.
mmap-ing files will increase only virtual memory usage; this includes the size of the executables: the larger they are, the more virtual memory is used.
The only problem with using up virtual memory is that you may deplete it. It is mainly an issue on 32-bit systems, where you only have 4 GB of it (and 1 GB is reserved for the kernel, so application data only has 3 GB).
Function calls, which allocate variables on the stack, may increase physical memory usage, but you (usually) won't leak this memory.
Allocated heap variables take up virtual memory, but will only actually get physical memory if you read from or write to them.
Freeing or deleting variables does not necessarily reduce virtual/physical memory consumption; it depends on the allocator internals, but it usually does.
You can set the following environment variables to control how malloc allocates memory internally. Experimenting with them should help answer all four questions. For other options, please refer to:
http://man7.org/linux/man-pages/man3/mallopt.3.html
export MALLOC_MMAP_THRESHOLD_=8192
export MALLOC_ARENA_MAX=4