In my program I declare an initialized global variable (as an array).
But it only affects the size of executable file, the memory usage by the program was not affected.
My program looks like this:

char arr[1014*1024*100] = {1};

int _tmain(int argc, _TCHAR* argv[])
{
    while (true)
    {
    }
    return 0;
}
The size of the executable file is 118 MB, but the memory usage when running the program was only 0.3 MB.
Can anyone explain this to me?
Most operating systems use demand-paged virtual memory.
This means that when you load a program, the executable file for that program isn't all loaded into memory immediately. Instead, virtual memory pages are set up to map the file to memory. When (and if) you actually refer to an address, that causes a page fault, which the OS then handles by reading the appropriate part of the file into physical memory, then letting the instruction re-execute.
In your case, you don't refer to arr, so the OS never pulls that data into memory.
If you were to look at the virtual address space used by your program (rather than the physical memory you're apparently now looking at), you'd probably see address space allocated for all of arr. The virtual address space isn't often very interesting or useful to examine though, so most things that tell you about memory usage will tell you only about the physical RAM being used to store actual data, not the virtual address space that's allocated but never used.
Even if you do refer to the data, the OS can be fairly clever: depending on how often you refer to the data (and whether you modify it), only part of the data may ever be loaded into RAM at any given time. If it's been modified, the modified portions can be written to the paging file to make room in RAM for data that's being used more often. If it's not modified, it can be discarded (because the original data can be re-loaded from the original file on disk whenever it's needed).
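To watch demand paging happen, you can deliberately touch the array and see resident memory climb. A minimal sketch (assuming a 4 KiB page size, and plain main in place of the original _tmain; both are assumptions, not part of the original question):

#include <stdio.h>

/* Large initialized global: stored in the executable's data section,
   but not read into RAM until it is actually touched. */
char arr[1024 * 1024 * 100] = {1};

int main(void)
{
    /* Touch one byte per (assumed) 4 KiB page; each first touch causes a
       page fault, so resident memory grows toward the array's full size. */
    for (size_t i = 0; i < sizeof arr; i += 4096)
        arr[i] = 2;

    puts("array touched; inspect the process's memory usage now");
    getchar();  /* pause so a task manager or top can be consulted */
    return 0;
}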
The reason the memory in use while your executable is running is significantly smaller than the space required on your hard drive (or solid-state drive) to store the executable is that you're not pulling the array itself into memory.
In your program, you never read or write your array, let alone bring it all into memory at once. Because of that, the memory needed to run your executable is incredibly small compared to the size of the executable (which has to store your massively large array).
I hope that makes sense. The difference between the two is that one is executing and one is stored on your computer's internal disk. Something is only brought into memory when it is needed for execution.
Processes in an OS have their own virtual address spaces. Say I allocate some dynamic memory using the malloc() function call in a C program and subtract some positive value (say 1000) from the address it returns. Now I try to read what is written at that location, which should be fine, but what about writing to that location?
The virtual address space also has some read-only chunks of memory. How does it protect those?
TL;DR No, it's not allowed.
In your case, when you get a valid non-NULL pointer to a memory address returned by malloc(), only the requested amount of memory is allocated to your process, and you're allowed to use (read and/or write) only that much space.
In general, any allocated memory (compile-time or run-time) has an associated size. Either overrunning or underrunning the allocated memory area is an invalid memory access, which invokes undefined behavior.
Even if the memory is accessible and inside the process address space, there's nothing stopping the OS/memory manager from handing out a pointer to that particular address in a later allocation, so at best your earlier write will be overwritten, or you will be overwriting some other value. The worst case, as mentioned earlier, is UB.
Say I allocate some dynamic memory using the malloc() function call in a C program and subtract some positive value (say 1000) from the address it returns. Now I try to read what is written at that location, which should be fine, but what about writing to that location?
What addresses you can read, write, or execute from is determined by the process's current memory map, which is set up by the operating system.
On my linux box, if I run pmap on my current shell, I see something like this:
evaitl#bb /proc/13151 $ pmap 13151
13151: bash
0000000000400000 976K r-x-- bash
00000000006f3000 4K r---- bash
00000000006f4000 36K rw--- bash
00000000006fd000 24K rw--- [ anon ]
0000000001f25000 1840K rw--- [ anon ]
00007ff7cce36000 44K r-x-- libnss_files-2.23.so
00007ff7cce41000 2044K ----- libnss_files-2.23.so
00007ff7cd040000 4K r---- libnss_files-2.23.so
00007ff7cd041000 4K rw--- libnss_files-2.23.so
00007ff7cd042000 24K rw--- [ anon ]
[many more lines here...]
Each line has a base address, a size, and the permissions. These are considered memory segments. The last column says what is being mapped in: bash is my shell; anon means allocated memory, perhaps for bss, maybe heap from malloc, or it could be a stack.
Shared libraries are also mapped in; that is where the libnss_files lines come from.
When you malloc some memory, it will come from an anonymous program segment. If there isn't enough space in the current anon segment being used for the heap, the OS will increase its size. The permissions in those segments will almost certainly be rw.
If you try to read/write outside of space you allocated, behavior is undefined. In this case that means that you may get lucky and nothing happens, or you may trip over an unmapped address and get a SIGSEGV signal.
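On Linux you can also dump this map from inside the program itself by reading /proc/self/maps, which is essentially what pmap shows. A minimal sketch (Linux-specific; the file does not exist on other systems):

#include <stdio.h>

int main(void)
{
    /* /proc/self/maps lists every segment of this process's address
       space: base address, permissions, and what backs the mapping. */
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) { perror("fopen"); return 1; }

    int c;
    while ((c = fgetc(f)) != EOF)
        putchar(c);

    fclose(f);
    return 0;
}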
Now I try to read what is written at that location, which should be fine
It is not fine. According to the C++ standard, reading uninitialized memory has undefined behaviour.
but what about writing to that location?
Not fine either. Reading or writing unallocated memory also has undefined behaviour.
Sure, the memory address that you ended up at might be allocated; it's possible. But even if it happens to be, the pointer arithmetic outside the bounds of the allocation is already UB.
The virtual address space also has some read-only chunks of memory. How does it protect those?
This one is out of the scope of C++ (and C), since neither defines virtual memory at all. This may differ across operating systems, but at least one approach is that when the process requests memory from the OS, it passes flags that specify the desired protection type. See the prot argument in the man page of mmap as an example. The OS in turn sets up the virtual page table accordingly.
Once the protection type is known, the OS can raise an appropriate signal if the protection has been violated, and possibly terminate the process. Just like it does when a process tries to access unmapped memory. The violations are typically detected by the memory management unit of the CPU.
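To make that concrete, here is a minimal POSIX sketch of requesting a read-only mapping via mmap's prot argument; the read succeeds, and the write is exactly the protection violation described above (expect a SIGSEGV):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Ask the OS for one anonymous page with read-only protection;
       PROT_READ is what gets recorded in the page tables. */
    char *p = mmap(NULL, 4096, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("read ok: %d\n", p[0]);  /* anonymous pages read as zero      */
    p[0] = 1;                       /* write: protection fault -> SIGSEGV */
    return 0;
}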
Processes in an OS have their own virtual address spaces. Say I allocate some dynamic memory using the malloc() function call in a C program and subtract some positive value (say 1000) from the address it returns. Now I try to read what is written at that location, which should be fine, but what about writing to that location?
No, it should not be fine, since only the memory region allocated by malloc() is guaranteed to be accessible. There is no guarantee that the virtual address space is contiguous, and thus that the memory addresses before and after your region are accessible (i.e. mapped into the virtual address space).
Of course, no one is stopping you from trying, but the behaviour will be really undefined. If you access a non-mapped memory address, it will generate a page fault, which is a hardware CPU exception. When the operating system handles it, it will send a SIGSEGV signal or an access violation exception to your application (depending on the OS).
The virtual address space also has some read-only chunks of memory. How does it protect those?
First, it's important to note that virtual memory mapping is realized partly by a hardware component called a memory management unit (MMU). It might be integrated into the CPU chip, or not. In addition to being able to map various virtual memory addresses to physical ones, it also supports marking these addresses with different flags, one of which enables and disables write protection.
When the CPU tries to write to a virtual address marked as read-only, and thus write-protected (for example by a MOV instruction), the MMU fires a page fault exception on the CPU.
The same goes for trying to access non-present virtual memory pages.
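The protection flags are not fixed at allocation time, either; a process can ask the OS to change them later, and the MMU enforces whatever is current. A minimal POSIX sketch using mprotect (an illustration I'm adding, not something the answer above mentions):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    /* Map one page read-write, use it, then revoke write permission. */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    p[0] = 'x';                    /* fine: the page is writable         */
    mprotect(p, 4096, PROT_READ);  /* page is now write-protected        */
    printf("still readable: %c\n", p[0]);
    p[0] = 'y';                    /* MMU raises a page fault -> SIGSEGV */
    return 0;
}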
In the C language, doing arithmetic on a pointer to produce another pointer that does not point into (or one past the end of) the same object or array of objects is undefined behavior. From 6.5.6 Additive Operators:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

(For the purposes of this clause, a non-array object is treated as an array of length 1.)
You could get unlucky: the compiler could still produce a pointer you're allowed to do things with, and then doing things with it will do things, but precisely what those things are is anybody's guess, and it will be unreliable and often difficult to debug.
If you're lucky, the compiler produces a pointer into memory that "does not belong to you" and you get a segmentation fault to alert you to the problem as soon as you try to read or write through it.
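To spell out the question's scenario in code: the arithmetic itself is already the undefined step, before any dereference happens. A minimal sketch (the dereference is left commented out because nothing about its outcome is guaranteed):

#include <stdlib.h>

int main(void)
{
    char *p = malloc(100);
    if (!p) return 1;

    char *q = p - 1000;  /* UB already: the result points outside the
                            allocated object (and not one past the end) */
    (void)q;
    /* *q = 'x';            dereferencing would be further UB: it may
                            crash, corrupt data, or appear to "work"    */

    free(p);
    return 0;
}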
How the system behaves when you read or write an unmapped memory address depends on your operating system implementation. When you attempt an access to an address that is unmapped (or mapped to something other than plain memory, for example a file mapped into memory), the operating system takes control by means of a trap, and what happens then is completely operating-system dependent.

Suppose you have mapped the video framebuffer somewhere in your virtual address space: writing there makes the screen change. Suppose you have mapped a file: reading or writing that memory means reading or writing the file. Suppose you (the running process) try to access a swapped-out zone (because of a lack of physical memory your process has been partially swapped out): your process is stopped, work begins to bring that memory back from secondary storage, and then the instruction is restarted. Linux, for example, generates a SIGSEGV signal when you try to access memory you have not allocated. But you can install a signal handler to be called upon receiving this signal, and then trying to access unallocated memory means jumping into a piece of code in your own program that deals with that situation.

But consider that trying to access memory that has not been correctly acquired, particularly on a modern operating system, normally means that your program is behaving incorrectly; normally it will crash, the system will take control, and the process will be killed.
NOTE
malloc(3) is not a system call, but a library function that manages a variable-size allocation arena in your RAM, so accessing even the first address before the returned one, or anything past the last allocated memory cell, means undefined behaviour. It does not necessarily mean you have accessed unallocated memory: you will probably be reading a perfectly allocated piece of your code or your data (or the stack) without knowing it. malloc(3) tends to ask the operating system for large, contiguous amounts of memory and then manage them across many malloc calls, to avoid the cost of asking the operating system for more memory each time. See the sbrk(2) and mmap(2) system call man pages for more on this.
For example, both Linux and BSD Unix reserve page 0 (the page containing the NULL address) in the virtual address space of each process and leave it inaccessible, so that null pointer accesses are invalid; if you try to read or write that address (or anything in that page) you'll get a signal (or your process will be killed). Try this:
int main()
{
    char *p = 0;   /* p is pointing to the null address */
    p[0] = '\n';   /* a '\n' is being written to address 0x0000 */
    p[1] = '\0';   /* a '\0' is being written to address 0x0001 */
}
This program should fail at runtime on all modern operating systems (compile it without optimization so the compiler doesn't eliminate the code in main, as it effectively does nothing), because you are trying to access a page of memory that has been reserved for a specific purpose.
The program on my system (Mac OS X, a derivative of BSD Unix) just does the following:
$ a.out
Segmentation fault: 11
NOTE 2
Many modern operating systems (mostly Unix-derived) implement a type of memory access called copy on write. This means that you can access that memory and modify it as you like, and only the first time you write to it is a page fault generated (normally this is implemented by giving you a read-only page, letting the fault happen, and making a private copy of that individual page to store your modifications). This is very effective for fork(2), which is normally followed by an exec(2) syscall: only the pages modified by the program are actually copied before the process throws them all away, saving a lot of computing power. A small demonstration follows.
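Here is a minimal POSIX sketch of the fork(2) semantics that copy on write makes cheap: parent and child logically have separate copies of the buffer, but physically a page is duplicated only when the child first writes to it:

#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static char buf[1024 * 1024];  /* shared copy-on-write after fork */

int main(void)
{
    strcpy(buf, "parent data");

    pid_t pid = fork();
    if (pid == 0) {                 /* child: first write to the page */
        strcpy(buf, "child data");  /* triggers a private page copy   */
        printf("child sees:  %s\n", buf);
        return 0;
    }

    wait(NULL);
    printf("parent sees: %s\n", buf);  /* unaffected by the child */
    return 0;
}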
Another case is stack growth. The stack grows automatically as you enter and leave stack frames in your program, so the operating system has to deal with the page faults that happen when you push something onto the stack and that push crosses a virtual page boundary into the unknown. When this happens, the OS automatically allocates a page and converts that region (the page) into valid memory (normally read-write).
Technically, a process has a logical address space. However, that often gets conflated with a virtual address space.
The number of virtual addresses that can be mapped into that logical address space can be limited by:
Hardware
System resources (notably page file space)
System Parameters (e.g., limiting page table size)
Process quotas
Your logical address space consists of an array of pages that are mapped to physical page frames. Not every page needs to have such a mapping (nor is every page likely to).
The logical address space is usually divided into two (or more) areas: system (common to all processes) and user (created for each process).
Theoretically, there is nothing in the user space to begin a process with; only the system address space exists.
If the system does not use up its entire range of logical addresses (which is normal), unused addresses cannot be accessed at all.
Now your program starts running. The O/S has mapped some pages into your logical address space. Very little of that address space is likely to be mapped. Your application can map more pages into the unmapped parts of the logical address space.
Say I allocate some dynamic memory using the malloc() function call in a C program and subtract some positive value (say 1000) from the address it returns. Now I try to read what is written at that location, which should be fine, but what about writing to that location?
The processor uses a page table to map logical pages to physical page frames. If you do what you describe, a number of things can happen:
There is no page table entry for the address => Access violation. Your system may not set up a page table that can span the entire logical address space.
There is a page table entry for the address but it is marked invalid => Access Violation.
You are attempting to access a page that is not accessible in your current processor mode (e.g., user mode access to a page that only allows kernel mode access) => Access Violation.
virtual address space also has some read only chunk of memory. How does it protect that?
You are attempting to access a page in a manner not permitted for that page (e.g., writing to a read-only page, or executing a no-execute page) => Access Violation. The access allowed to a page is defined in the page table.
[Ignoring page faults]
If you make it through those tests, you can access the random memory address.
It does not. It's actually your duty as a programmer to handle this.
int A[10000000];                              // This gives a segmentation fault
int *A = (int*)malloc(10000000*sizeof(int));  // goes without any seg fault

Now my question is, just out of curiosity: if ultimately we are able to allocate this much space for our data structures (say, BSTs and linked lists created using the pointer approach in C have no particular memory limit, unless the total size exceeds the size of the machine's RAM), why is it that we can't declare an array of similarly large size (until it reaches the memory limit)? Is this because the space allocated for a statically sized array must be contiguous? But then where do we get the guarantee that the next 1000000 words in RAM are not in use by some other piece of code?
PS: I may be wrong in some of the statements I made; please correct me in that case.
Firstly, in a typical modern OS with virtual memory (Linux, Windows etc.) the amount of RAM makes no difference whatsoever. Your program is working with virtual memory, not with RAM. RAM is just a cache for virtual memory access. The absolute limiting factor for maximum array size is not RAM, it is the size of the available address space. Address space is the resource you have to worry about in OSes with virtual memory. In 32-bit OSes you have 4 gigabytes of address space, part of which is taken up for various household needs and the rest is available to you. In 64-bit OSes you theoretically have 16 exabytes of address space (less than that in practical implementations, since CPUs usually use less than 64 bits to represent the address), which can be perceived as practically unlimited.
Secondly, the amount of available address space in a typical C/C++ implementation depends on the memory type. There's static memory, there's automatic memory, there's dynamic memory. The address space limits for each memory type are pre-set in advance by the compiler. Which raises the question: where are you declaring your large array? Which memory type? Automatic? Static? You provided no information, but this is absolutely necessary. If you are attempting to declare it as a local variable (automatic memory), then no wonder it doesn't work, since automatic memory (aka "stack memory") has very limited address space assigned to it. Your array simply does not fit. Meanwhile, malloc allocates dynamic memory, which normally has the largest amount of address space available.
Thirdly, many compilers provide you with options that control the initial distribution of address space between the different kinds of memory. You can request a much larger stack size for your program by manipulating such options. Quite possibly you can request a stack so large that your local array will fit in it without any problems. But in practice, for obvious reasons, it makes very little sense to declare huge arrays as local variables.
Assuming local variables, this is because on modern implementations automatic variables are allocated on the stack, which is very limited in space. This link gives some of the common default stack sizes:
platform       default size
===========================
SunOS/Solaris  8172K bytes
Linux          8172K bytes
Windows        1024K bytes
cygwin         2048K bytes
The linked article also notes that the stack size can be changed for example in Linux, one possible way from the shell before running your process would be:
ulimit -s 32768 # sets the stack size to 32M bytes
Meanwhile, malloc on modern implementations allocates from the heap, which is limited only by the memory available to the process; in many cases you can even allocate more than is available, due to overcommit. See the sketch below for querying the stack limit programmatically.
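Besides ulimit, the same limit can be read from inside the program with getrlimit; RLIMIT_STACK is the resource that ulimit -s manipulates. A minimal POSIX sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    /* RLIMIT_STACK holds the soft limit on the main thread's stack. */
    if (getrlimit(RLIMIT_STACK, &rl) != 0) { perror("getrlimit"); return 1; }

    if (rl.rlim_cur == RLIM_INFINITY)
        puts("stack size: unlimited");
    else
        printf("stack size: %ld bytes\n", (long)rl.rlim_cur);
    return 0;
}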
I THINK you're missing the difference between total memory and your program's memory space. Your program runs in an environment created by your operating system, which grants a specific memory range to the program, and the program has to deal with that.
The catch: Your compiler can't 100% know the size of this range.
That means your compiler will build successfully, and the program will REQUEST that much room in memory when the time comes: at the call to malloc, or when the stack pointer moves as the function is called. When the function is called (creating a stack frame) you'll get a segmentation fault, caused by the stack overflow. When malloc is called, you won't get a segfault unless you try USING the memory. (If you look at the man page for malloc() you'll see it returns NULL when there's not enough memory.)
To explain the two failures: your program is granted two memory spaces, the stack and the heap. Memory allocated using malloc() is created on the heap of your program (obtained from the OS with a system call when the heap needs to grow). That request is dynamically accepted or rejected, and malloc returns either the start address or NULL, depending on success or failure. The stack is used when you call a function: room for all the local variables is made on the stack by program instructions. Calling a function can't just FAIL, as that would break program flow completely. That causes the system to say "you're now overstepping" and segfault, stopping the execution.
In programming languages like C and C++, people often refer to static and dynamic memory allocation. I understand the concept but the phrase "All memory was allocated (reserved) during compile time" always confuses me.
Compilation, as I understand it, converts high-level C/C++ code to machine language and outputs an executable file. How is memory "allocated" in a compiled file? Isn't memory always allocated in RAM with all the virtual memory management stuff?
Isn't memory allocation by definition a runtime concept ?
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount?
This is one of the pages where the phrase is used under the heading "Static allocation".
Back To Basics: Memory allocation, a walk down the history
Memory allocated at compile-time means the compiler resolves at compile-time where certain things will be allocated inside the process memory map.
For example, consider a global array:
int array[100];
The compiler knows at compile time the size of the array and the size of an int, so it knows the entire size of the array at compile time. Also, a global variable has static storage duration by default: it is allocated in the static memory area of the process memory space (the .data/.bss sections). Given that information, the compiler decides during compilation at what address of that static memory area the array will be.
Of course those memory addresses are virtual addresses. The program assumes that it has its own entire memory space (from 0x00000000 to 0xFFFFFFFF, for example). That's why the compiler can make assumptions like "Okay, the array will be at address 0x00A33211". At runtime those addresses are translated to real/hardware addresses by the MMU and the OS.
Value-initialized things with static storage are a bit different. For example:
int array[] = { 1, 2, 3, 4 };
In our first example, the compiler only decided where the array will be allocated, storing that information in the executable.
In the case of value-initialized things, the compiler also injects the initial value of the array into the executable, and adds code which tells the program loader that after the array allocation at program start, the array should be filled with these values.
Here are two examples of the assembly generated by the compiler (GCC 4.8.1 with an x86 target):
C++ code:
int a[4];
int b[] = { 1, 2, 3, 4 };
int main()
{}
Output assembly:
a:
    .zero 16
b:
    .long 1
    .long 2
    .long 3
    .long 4
main:
    pushq %rbp
    movq %rsp, %rbp
    movl $0, %eax
    popq %rbp
    ret
As you can see, the values of b are injected directly into the assembly. For the array a, the compiler generates a zero-initialization of 16 bytes, because the Standard says that statically stored things should be initialized to zero by default:
8.5.9 (Initializers) [Note]:
Every object of static storage duration is zero-initialized at program startup before any other initialization takes place. In some cases, additional initialization is done later.
I always suggest that people disassemble their code to see what the compiler really does with the C++ code. This applies from storage classes/durations (like this question) to advanced compiler optimizations. You could instruct your compiler to generate the assembly yourself, but there are wonderful tools that do this on the Internet in a friendly manner. My favourite is GCC Explorer.
Memory allocated at compile time simply means there will be no further allocation at run time -- no calls to malloc, new, or other dynamic allocation methods. You'll have a fixed amount of memory usage even if you don't need all of that memory all of the time.
Isn't memory allocation by definition a runtime concept?
The memory is not in use prior to run time, but immediately prior to execution starting, its allocation is handled by the system.
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount?
Simply declaring the static will not increase the size of your executable more than a few bytes. Declaring it with an initial value that is non-zero will (in order to hold that initial value). Rather, the linker simply adds this 1KB amount to the memory requirement that the system's loader creates for you immediately prior to execution.
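You can check this yourself with the size tool from binutils: a zero-initialized static lands in .bss and adds almost nothing to the file, while a non-zero initializer lands in .data and is stored byte for byte. A minimal sketch (the variable names are made up for the example):

/* Compile, then run `size a.out` and compare the bss and data columns. */
static char zeroed[1024];        /* .bss:  adds almost nothing to the file  */
static char filled[1024] = {1};  /* .data: 1 KB is stored in the executable */

int main(void)
{
    return zeroed[0] + filled[0];  /* reference both so they aren't dropped */
}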
Memory allocated in compile time means that when you load the program, some part of the memory will be immediately allocated and the size and (relative) position of this allocation is determined at compile time.
char a[32];
char b;
char c;
Those three variables are "allocated at compile time"; it means that the compiler calculates their sizes (which are fixed) at compile time. The variable a will be at some offset in memory, let's say address 0; b will be at address 32 and c at 33 (supposing no alignment padding). So, allocating 1 KB of static data does not increase the size of your code, since it just changes an offset inside it. The actual space will be allocated at load time.
Real memory allocation always happens in run time, because the kernel needs to keep track of it and to update its internal data structures (how much memory is allocated for each process, pages and so on). The difference is that the compiler already knows the size of each data you are going to use and this is allocated as soon as your program is executed.
Remember also that we are talking about relative addresses. The real address where the variable will be located will be different. At load time the kernel will reserve some memory for the process, let's say at address x, and all the hard-coded addresses contained in the executable file will be incremented by x bytes, so that variable a in the example will be at address x, b at address x+32, and so on.
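You can observe these fixed relative offsets by printing the addresses: the differences between them stay the same from run to run, while the common base may move (for example with ASLR). A minimal sketch; note the compiler is free to order and pad these variables, so the exact offsets are not guaranteed:

#include <stdio.h>

char a[32];
char b;
char c;

int main(void)
{
    /* Offsets between these are fixed at link time; the base address
       is chosen when the program is loaded. */
    printf("a at %p\n", (void *)a);
    printf("b at %p\n", (void *)&b);
    printf("c at %p\n", (void *)&c);
    return 0;
}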
Adding variables on the stack that take up N bytes doesn't (necessarily) increase the bin's size by N bytes. It will, in fact, add but a few bytes most of the time.
Let's start off with an example of how adding a thousand chars to your code will increase the bin's size in a linear fashion.
If the 1k is a string of a thousand chars, declared like so
const char *c_string = "Here goes a thousand chars...999";//implicit \0 at end
and you then were to vim your_compiled_bin, you'd actually be able to see that string in the bin somewhere. In that case, yes: the executable will be 1 k bigger, because it contains the string in full.
If, however you allocate an array of ints, chars or longs on the stack and assign it in a loop, something along these lines
int big_arr[1000];
for (int i=0;i<1000;++i) big_arr[i] = some_computation_func(i);
then, no: it won't increase the bin... by 1000*sizeof(int)
Allocation at compile time means what you've now come to understand it means (based on your comments): the compiled bin contains information the system requires to know how much memory each function/block will need when it gets executed, along with information on the stack size your application requires. That's what the system will allocate when it executes your bin, and your program becomes a process (well, the executing of your bin is the process that... well, you get what I'm saying).
Of course, I'm not painting the full picture here: the bin contains information about how big a stack the bin will actually be needing. Based on this information (among other things), the system will reserve a chunk of memory, called the stack, that the program gets sort of free rein over. Stack memory still is allocated by the system, when the process (the result of your bin being executed) is initiated. The process then manages the stack memory for you. When a function or loop (any type of block) is invoked/gets executed, the variables local to that block are pushed onto the stack, and they are removed (the stack memory is "freed", so to speak) to be used by other functions/blocks. So declaring int some_array[100] will only add a few bytes of additional information to the bin, telling the system that function X will be requiring 100*sizeof(int) plus some bookkeeping space extra.
On many platforms, all of the global or static allocations within each module will be consolidated by the compiler into three or fewer consolidated allocations (one for uninitialized data (often called "bss"), one for initialized writable data (often called "data"), and one for constant data ("const")), and all of the global or static allocations of each type within a program will be consolidated by the linker into one global for each type. For example, assuming int is four bytes, a module has the following as its only static allocations:
int a;
const int b[6] = {1,2,3,4,5,6};
char c[200];
const int d = 23;
int e[4] = {1,2,3,4};
int f;
it would tell the linker that it needed 208 bytes for bss, 16 bytes for "data", and 28 bytes for "const". Further, any reference to a variable would be replaced with an area selector and offset, so a, b, c, d, e, and f would be replaced by bss+0, const+0, bss+4, const+24, data+0, and bss+204, respectively.
When a program is linked, all of the bss areas from all the modules are concatenated together; likewise the data and const areas. For each module, the address of any bss-relative variables will be increased by the size of all preceding modules' bss areas (again, likewise with data and const). Thus, when the linker is done, any program will have one bss allocation, one data allocation, and one const allocation.
When a program is loaded, one of four things will generally happen depending upon the platform:
The executable will indicate how many bytes it needs for each kind of data and, for the initialized data area, where the initial contents may be found. It will also include a list of all the instructions which use a bss-, data-, or const-relative address. The operating system or loader will allocate the appropriate amount of space for each area and then add the starting address of that area to each instruction which needs it.
The operating system will allocate a chunk of memory to hold all three kinds of data, and give the application a pointer to that chunk of memory. Any code which uses static or global data will dereference it relative to that pointer (in many cases, the pointer will be stored in a register for the lifetime of an application).
The operating system will initially not allocate any memory to the application, except for what holds its binary code, but the first thing the application does will be to request a suitable allocation from the operating system, which it will forevermore keep in a register.
The operating system will initially not allocate space for the application, but the application will request a suitable allocation on startup (as above). The application will include a list of instructions with addresses that need to be updated to reflect where memory was allocated (as with the first style), but rather than having the application patched by the OS loader, the application will include enough code to patch itself.
All four approaches have advantages and disadvantages. In every case, however, the compiler will consolidate an arbitrary number of static variables into a fixed small number of memory requests, and the linker will consolidate all of those into a small number of consolidated allocations. Even though an application will have to receive a chunk of memory from the operating system or loader, it is the compiler and linker which are responsible for allocating individual pieces out of that big chunk to all the individual variables that need it.
The core of your question is this: "How is memory "allocated" in a compiled file? Isn't memory always allocated in the RAM with all the virtual memory management stuff? Isn't memory allocation by definition a runtime concept?"
I think the problem is that there are two different concepts involved in memory allocation. At its basic, memory allocation is the process by which we say "this item of data is stored in this specific chunk of memory". In a modern computer system, this involves a two step process:
Some system is used to decide the virtual address at which the item will be stored
The virtual address is mapped to a physical address
The latter process is purely run time, but the former can be done at compile time, if the data have a known size and a fixed number of them is required. Here's basically how it works:
The compiler sees a source file containing a line that looks a bit like this:
int c;
It produces output for the assembler that instructs it to reserve memory for the variable 'c'. This might look like this:
global _c
section .bss
_c: resb 4
When the assembler runs, it keeps a counter that tracks the offset of each item from the start of a memory 'segment' (or 'section'). This is like the parts of a very large 'struct' containing everything in the entire file; it doesn't have any actual memory allocated to it at this time, and could be anywhere. The assembler notes in a table that _c has a particular offset (say 510 bytes from the start of the segment) and then increments its counter by 4, so the next such variable will be at (e.g.) 514 bytes. For any code that needs the address of _c, it just puts 510 in the output file, and adds a note that the output needs the address of the segment that contains _c added to it later.
The linker takes all of the assembler's output files, and examines them. It determines an address for each segment so that they won't overlap, and adds the offsets necessary so that instructions still refer to the correct data items. In the case of uninitialized memory like that occupied by c (the assembler was told that the memory would be uninitialized by the fact that the compiler put it in the '.bss' segment, which is a name reserved for uninitialized memory), it includes a header field in its output that tells the operating system how much needs to be reserved. It may be relocated (and usually is) but is usually designed to be loaded more efficiently at one particular memory address, and the OS will try to load it at this address. At this point, we have a pretty good idea what the virtual address is that will be used by c.
The physical address will not actually be determined until the program is running. However, from the programmer's perspective the physical address is actually irrelevant—we'll never even find out what it is, because the OS doesn't usually bother telling anyone, it can change frequently (even while the program is running), and a main purpose of the OS is to abstract this away anyway.
An executable describes what space to allocate for static variables. This allocation is done by the system when you run the executable. So your 1 kB static variable won't increase the size of the executable by 1 kB:
static char buf[1024];
Unless of course you specify an initializer:
static char buf[1024] = { 1, 2, 3, 4, ... };
So, in addition to 'machine language' (i.e. CPU instructions), an executable contains a description of the required memory layout.
Memory can be allocated in many ways:
in application heap (whole heap is allocated for your app by OS when the program starts)
in operating system heap (so you can grab more and more)
in garbage collector controlled heap (same as both above)
on stack (so you can get a stack overflow)
reserved in code/data segment of your binary (executable)
in remote place (file, network - and you receive a handle not a pointer to that memory)
Now your question is, what is "memory allocated at compile time"? Definitely it is just an imprecisely phrased saying, which is supposed to refer to either binary segment allocation or stack allocation, or in some cases even to heap allocation, but in that case the allocation is hidden from the programmer's eyes by an invisible constructor call. Or probably the person who said it just wanted to say that memory is not allocated on the heap, but did not know about stack or segment allocations (or did not want to go into that kind of detail).
But in most cases, the person just wants to say that the amount of memory being allocated is known at compile time.
The binary size will only change when the memory is reserved in the code or data segment of your app.
You are right. Memory is actually allocated (paged in) at load time, i.e. when the executable file is brought into (virtual) memory. Memory can also be initialized at that moment. The compiler just creates a memory map. [By the way, stack and heap spaces are also allocated at load time!]
I think you need to step back a bit. Memory allocated at compile time... What can that mean? Can it mean that memory on chips that have not yet been manufactured, for computers that have not yet been designed, is somehow being reserved? No. No time travel, no compilers that can manipulate the universe.
So, it must mean that the compiler generates instructions to allocate that memory somehow at runtime. But if you look at it from the right angle, the compiler generates all instructions, so what can the difference be? The difference is that the compiler decides, and at runtime your code cannot change or modify its decisions. If it decided it needed 50 bytes at compile time, at runtime you can't make it decide to allocate 60; that decision has already been made.
If you learn assembly programming, you will see that you have to carve out segments for the data, the stack, and code, etc. The data segment is where your strings and numbers live. The code segment is where your code lives. These segments are built into the executable program. Of course the stack size is important as well... you wouldn't want a stack overflow!
So if your data segment is 500 bytes, your program has a 500 byte area. If you change the data segment to 1500 bytes, the size of the program will be 1000 bytes larger. The data is assembled into the actual program.
This is what is going on when you compile higher-level languages. The actual data area is allocated when it is compiled into an executable program, increasing the size of the program. The program can request memory on the fly as well; this is dynamic memory. You can request memory from the OS, which gives you RAM to use; you can let go of it, and your garbage collector will release it back to the OS. It can even be swapped to a hard disk, if necessary, by a good memory manager. These features are what high-level languages provide you.
I would like to explain these concepts with the help of a visualization.
It is true that memory cannot actually be allocated at compile time. But then what does happen at compile time? Here comes the explanation.
Say, for example, a program has four variables x, y, z and k. At compile time, the compiler simply makes a memory map, in which the locations of these variables with respect to each other are ascertained.
Now picture memory as a big empty rectangle while no instance of the program is running. When the first instance of the program is executed, that map is stamped into a region of memory; this is the moment when memory is actually allocated. When a second instance of the program runs, another copy of the same layout is placed elsewhere in memory, a third instance adds yet another copy, and so on and so forth.
I hope this visualization explains the concept well.
There is a very nice explanation given in the accepted answer. Just in case, I will post a link that I found useful.
https://www.tenouk.com/ModuleW.html
One of the many things a compiler does is create and maintain a symbol table (SYMTAB, stored in the .symtab section). It is created and maintained purely by the compiler, using some internal data structure (a list, a tree, etc.), and is not meant for the developer's eyes. Any reference the developer makes to a variable is resolved through this table first.
Now about the Symbol Table,
We only need to know about two of its columns, Symbol Name and Offset.
The Symbol Name column holds the variable names, and the Offset column holds their offset values.
Let's see this with an example:
int a, b, c;
Now we all know that the stack pointer register (sp) points to the top of the stack memory. Let that be sp = 1000.
Now the Symbol Name column will have three entries: a, then b, then c. Reminding you that variable a will be at the top of the stack memory.
Assuming 4-byte ints, a's offset value will be 0 (compile-time offset value), b's offset value will be 4 (compile-time offset value), and c's offset value will be 8 (compile-time offset value).
Now, calculating a's runtime memory address = (sp - a's offset value) = (1000 - 0) = 1000.
Calculating b's runtime memory address = (sp - b's offset value) = (1000 - 4) = 996.
Calculating c's runtime memory address = (sp - c's offset value) = (1000 - 8) = 992.
Therefore, at compile time we only have the offset values, and only at runtime are the actual addresses calculated.
Note:
The stack pointer value is assigned only after the program is loaded. Pointer arithmetic between the stack pointer register and a variable's offset yields that variable's runtime address.
"POINTERS AND POINTER ARITHMETIC, WAY OF THE PROGRAMMING WORLD"
Let me share what I learned about this question.
You can understand this issue in two steps:
First, the compilation step: the compiler generates the binary. On a Linux system, the binary is a file in ELF (Executable and Linkable Format). An ELF file contains several sections, including .bss and .data:
.data
Initialized data, with read/write access rights
.bss
Uninitialized data, with read/write access rights (=WA)
.data and .bss simply map to segments of the process's memory layout, which contain the static variables.
Second, the loading step. When the binary file gets executed, the ELF file is loaded into the process's memory. The loader can find the static variables' information in the ELF file.
Simply speaking, the compiler and the loader follow the same standard to communicate with each other, and the standard is ELF format.
I'm writing a simple program that accesses the memory of another process. I have been using a memory editor to find the addresses of the variables I want my program to retrieve and use with the ReadProcessMemory function. So far, there have been no problems, but I am unsure whether the addresses of the values may change depending on the environment the other program is being run on.
Aside from alterations to the program itself, should I be concerned about this? I have noticed that my memory editor saves the addresses relative to the location of the .exe (such as program.exe+198F6C), and I would love to implement my program like this, but I could not find any method for retrieving the current address of program.exe in C++.
Yes, they change.
The OS loads the process into different offsets each time it launches, and anything allocated with new or malloc is very likely to get different addresses each time the code is run.
There are two issues here: location of variables inside a process's memory space, and the location of a process in physical memory. The first should concern you, the second should not.
Local variables (as well as global/static variables) will have the same address relative to the program location in memory. Dynamically allocated variables (new/malloc) will have different addresses each time.
When I say "memory", I mean the virtual memory space of a specific process: the address 0x100 in one process doesn't equal 0x100 in another process, and in general is different than cell number 0x100 in your RAM.
The actual address isn't usually interesting, because both ReadProcessMemory and your memory editor only work with those relative addresses. You don't need the location of program.exe.
If you're interested in local variables, you can count on ReadProcessMemory returning a meaningful result each time. If you need memory which has been dynamically allocated, you need to find a local pointer, get the address of the allocated memory from it, and call ReadProcessMemory again.
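If you do want the base address of program.exe anyway (for example to rebase the program.exe+offset addresses your memory editor saved), the Win32 API can provide it: the first module returned by EnumProcessModules is the executable itself. A minimal sketch; the process id and the 0x198F6C offset are placeholders taken from the question, and you need to link against psapi:

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void)
{
    DWORD pid = 1234;  /* hypothetical target process id */
    HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ,
                           FALSE, pid);
    if (!h) return 1;

    HMODULE mod;   /* the first module in the list is the .exe itself */
    DWORD needed;
    if (EnumProcessModules(h, &mod, sizeof mod, &needed)) {
        unsigned char *base = (unsigned char *)mod;
        int value;
        SIZE_T got;
        /* rebase the saved "program.exe+198F6C" address at runtime */
        if (ReadProcessMemory(h, base + 0x198F6C, &value,
                              sizeof value, &got))
            printf("value at program.exe+198F6C: %d\n", value);
    }

    CloseHandle(h);
    return 0;
}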
Yes, they will change. Write a program that outputs the memory address of a few variables and run it a few times. Your output should differ each time, especially on other machines.
You are also going to run into concurrency problems with multiple accesses of the same memory area.
Correct order   - W1a, W1b, R1a, R1b, W2a, W2b, R2a, R2b
Incorrect order - W1a, W1b, R1a, W2a, W2b, R1b, R2a, R2b
To solve this problem you need to look at IPC, Inter-Process Communication:
http://en.wikipedia.org/wiki/Inter-process_communication