What is stored in Code memory and Data memory - C++

Can someone please explain the difference between Code and Data memory? I know code is stored in Flash and data is stored in RAM, but I am confused.
#include <iostream>
using namespace std;

int main()
{
    int a = 10, b = 20;
    int c = a + b;
    return 0;
}
Here a, b, c are stored in data memory (RAM), but what gets stored in Code memory? Is the entire code stored in Code memory? If yes, does this mean we are storing a, b, c in both data and code memory?

In your example, there are many possible scenarios, depending on the optimization level of your compiler.
Constants placed in "code memory"
In the code below:
int a =10, b=20;
int c = a+b;
return 0;
The variables a and b are effectively constants; they never change. A compiler could optimize them away, reducing the code to:
int c = 10 + 20;
So the values 10 and 20 can be placed into code memory, eliminating the variables a and b.
Registers not Memory
The compiler is allowed to assign the variables a and b to registers. Registers are within the processor, so they don't take up any RAM or data memory space. Registers are not part of the code space either.
(This can happen because there are no statements that require the addresses of a or b).
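As a rough sketch of that point (this example is mine, not the original poster's):

int sum() {
    int a = 10;   // may live purely in a CPU register
    int b = 20;   // may live purely in a CPU register
    int *p = &a;  // taking the address forces a into addressable memory,
                  // because a register has no address
    return *p + b;
}

Without the line taking &a, an optimizing compiler is free to keep both variables in registers for their whole lifetime.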
All code dropped
On higher optimization settings, the compiler can delete all your code and replace it with a return 0, because:
The variables a and b are not changed.
The variable c is changed but not used by any other statements.
Your program has no effect (nothing is printed, there are no external actions like writing to hardware).
Thus your program can be reduced down to return 0;.
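As a hedged illustration (my example, not the original code): one way to keep the computation alive is to make its result observable, for instance by printing it.

#include <cstdio>

int main()
{
    int a = 10, b = 20;
    int c = a + b;
    std::printf("%d\n", c); // observable output: the compiler must now
                            // preserve the effect of computing c
    return 0;
}

Even here, an optimizer may still fold the arithmetic down to printing the constant 30; what it can no longer do is delete the program's effect entirely.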
Code Memory vs. Data Memory
In general, processor instructions are placed in a segment you will call "code memory". This may actually reside in RAM and not in Flash or ROM. For example, on a PC, your code could be loaded from the hard drive into RAM and executed in RAM. Similarly with Flash, your code could be loaded from Flash into RAM and executed in RAM.
Constants, like numbers, can be placed into a Read-Only segment or into the Code Segment. Many processors can load constants from the Code Segment (see ARM and Intel assembly instructions). The Read-Only segment can live on a read-only device (ROM or Flash) or may live in RAM (or on a device like a hard drive). All you can guarantee is that the code will not write to the Read-Only segment.
Data Memory is different. The C++ language has at least 3 areas of "data" memory (where variables live): 1) local (a.k.a. stack), where short-lifetime variables reside; 2) dynamic memory (a.k.a. heap), allocated by using new or malloc; and 3) static/global variables. These memory areas can be placed anywhere, as long as the memory has read and write capabilities. They don't need to be fast, only readable and writable (for example, a hard drive can be used as data memory).
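A minimal sketch of those three areas (the variable names are mine; exact placement is up to the implementation):

int global_counter = 42;         // 3) static storage: exists for the whole program

void demo() {
    int local = 7;               // 1) local/stack: destroyed when demo() returns
    int *dynamic = new int(13);  // 2) dynamic/heap: lives until delete is called
    delete dynamic;
    (void)local;                 // silence the unused-variable warning
}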
Memory organization is more complicated than having Code, Stack and Heap. In the embedded systems world, memory can be placed in non-standard locations and there may be a need for more detailed memory segments so they can be placed in different areas. For example, an embedded system may want to place the constants into Flash so that they can be changed easily (even though they may be more efficiently accessed in the Code Segment). Some code may want to be placed into the Boot Area of the processor (which is programmed by the processor manufacturer). Some embedded systems may have non-volatile memory (e.g. battery-backed memory), which can behave like read-only memory.
Trust Your Compiler
Trust your compiler to place code, data and variables in the most efficient areas possible. Your compiler knows your platform and will make the best decisions for you. If you need to change your compiler's settings, you can, but you should really know what you are doing and why you need to change them. Most PC platforms load code from a hard drive (or SSD) into RAM and execute the code from RAM. Embedded systems are different and depend on the hardware devices. Code may be run from Flash because the platform has minimal RAM. Some may store the code compressed in a serial-access read-only device and have to decompress it into RAM before executing. In these situations, the compilers are configured for these specializations. So, trust your compiler and let it place the code and data into the correct segments and locations.

A quick oversimplification:
Code memory stores the sequence of machine-language instructions compiled from your piece of C++ code (the "ROM").
Actual data that is created and manipulated by the program is instead stored in RAM, which can be understood as made of a stack and a heap: dynamically allocated data is stored in the slower but larger heap, while its addresses there are retained by pointers on the stack. (The stack itself also lives in RAM; only a handful of hot values, such as the stack pointer, sit in the processor's faster registers.)
Pointers on the stack retrieve data in the heap when told to do so by the current instruction in code memory, or more generally whenever it is needed.

Related

Optimal Way of Using mlockall() for Real-time Application (nanosecond sensitive)

I am reading mlockall()'s manpage: http://man7.org/linux/man-pages/man2/mlock.2.html
It mentions:
Real-time processes that are using mlockall() to prevent delays on page faults should reserve enough locked stack pages before entering the time-critical section, so that no page fault can be caused by function calls. This can be achieved by calling a function that allocates a sufficiently large automatic variable (an array) and writes to the memory occupied by this array in order to touch these stack pages. This way, enough pages will be mapped for the stack and can be locked into RAM. The dummy writes ensure that not even copy-on-write page faults can occur in the critical section.
I am a bit confused by this statement:
This can be achieved by calling a function that allocates a sufficiently large automatic variable (an array) and writes to the memory occupied by this array in order to touch these stack pages.
All the automatic variables (variables on the stack) are created "on the fly" on the stack when the function is called. So how can I achieve what the last statement says?
For example, let's say I have this function:
void foo() {
    char a;
    uint16_t b;
    std::deque<int64_t> c;
    // do something with those variables
}
Or does it mean before I call any function, I should call a function like this in main():
void reserveStackPages() {
    int64_t stackPage[4096/8 * 1024 * 1024];
    memset(stackPage, 0, sizeof(stackPage));
}
If yes, does it make a difference if I first allocate the stackPage variable on the heap, write to it, and then free it? Probably yes, because heap and stack are 2 different regions in RAM?
The std::deque above is just to bring up another related question -- what if I want to reserve memory for things using both stack pages and heap pages? Will calling a "heap" version of reserveStackPages() help?
The goal is to minimize all the jitter in the application (yes, I know there are many other things to look at, such as TLB misses, etc.; I'm just trying to deal with one kind of jitter at a time and slowly work through all of them).
Thanks in advance.
P.S. This is for a low latency trading application if it matters.
You generally don't need to use mlockall, unless you are writing (more or less hard) real-time applications (I actually never used it).
If you do need it, you'd better code the most real-time parts of your code in C (not in genuine C++), because you surely want to understand the details of memory allocation. Notice that unless you dive into the std::deque implementation, you don't exactly know where it is sitting (probably most of the data is heap-allocated, even if your c is an automatic variable).
You should first understand in detail the virtual address space of your process. For that, proc(5) is useful: from inside your process, you'll read /proc/self/maps (see this); from outside (e.g. some terminal) you'd do cat /proc/1234/maps for a process of pid 1234. Or use pmap(1).
because heap and stack are 2 different regions in the RAM?
In fact, your process's address space contains many segments (listed in /proc/1234/maps), many more than two. Typically every dynamically linked shared library (such as libc.so) brings a few segments.
Try cat /proc/self/maps and cat /proc/$$/maps in your terminal to get a better intuition about virtual address spaces. On my machine, the first gives 19 segments of the cat process (each displayed as a line) and the second 97 segments of the zsh (my shell) process.
To ensure that your stack has enough space, you indeed could call a function allocating a large enough automatic variable, like your reserveStackPages. Beware that call stacks are practically of limited size (a few megabytes usually, see also setrlimit(2)).
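If you do go that route, a minimal sketch might look like the following. This is an assumption-laden illustration, not a vetted recipe: the 1 MiB figure and all names are mine (the reserve must stay below your stack limit but above your worst-case stack depth), and real code must check the return value of mlockall.

#include <sys/mman.h>  // mlockall, MCL_CURRENT, MCL_FUTURE (Linux/POSIX)
#include <cstring>     // std::memset
#include <cstddef>     // std::size_t

constexpr std::size_t kStackReserve = 1024 * 1024; // arbitrary; tune for your rlimit

void reserve_stack_pages() {
    unsigned char dummy[kStackReserve];
    // Dummy writes touch every page of the array so the stack pages get
    // mapped (and, with MCL_FUTURE, locked) before the critical section.
    std::memset(dummy, 0, sizeof(dummy));
}

int main() {
    mlockall(MCL_CURRENT | MCL_FUTURE); // check the result in real code
    reserve_stack_pages();
    // ... time-critical section: no stack page faults up to kStackReserve ...
}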
If you really need mlockall (which is unlikely) you might consider linking statically your program (to have less segments in your virtual address space).
Look also into madvise(2) (and perhaps mincore(2)). It is generally much more useful than mlockall. BTW, in practice, most of your virtual memory is in RAM (unless your system experiences thrashing, and then you'll see it immediately).
Read also Operating Systems: Three Easy Pieces to understand the role of paging.
PS. Nanosecond-sensitive applications do not make much sense (because of cache misses that the software does not control).

Why does a C++ global variable not affect the memory usage of the program?

In my program I declare an initialized global variable (an array).
But it only affects the size of the executable file; the memory usage of the running program was not affected.
My program is like this:
char arr[1014*1024*100] = {1};

int _tmain(int argc, _TCHAR* argv[])
{
    while (true)
    {
    }
    return 0;
}
The size of the executable file is 118MB, but memory usage when running the program was only 0.3MB.
Can anyone explain this for me?
Most operating systems use demand-paged virtual memory.
This means that when you load a program, the executable file for that program isn't all loaded into memory immediately. Instead, virtual memory pages are set up to map the file to memory. When (and if) you actually refer to an address, that causes a page fault, which the OS then handles by reading the appropriate part of the file into physical memory, then letting the instruction re-execute.
In your case, you don't refer to arr, so the OS never pulls that data into memory.
If you were to look at the virtual address space used by your program (rather than the physical memory you're apparently now looking at), you'd probably see address space allocated for all of arr. The virtual address space isn't often very interesting or useful to examine though, so most things that tell you about memory usage will tell you only about the physical RAM being used to store actual data, not the virtual address space that's allocated but never used.
Even if you do refer to the data, the OS can be fairly clever: depending on how often you refer to the data (and whether you modify it), only part of the data may ever be loaded into RAM at any given time. If it's been modified, the modified portions can be written to the paging file to make room in RAM for data that's being used more often. If it's not modified, it can be discarded (because the original data can be re-loaded from the original file on disk whenever it's needed).
The reason the memory in use while your executable is running is significantly smaller than the space required on your hard drive (or solid-state drive) to store the executable is that you're not pulling the array itself into memory.
In your program, you never access your array, let alone bring it all into memory at once. Because of that, the memory needed to run your executable is incredibly small compared to the size of the executable (which has to store your massively large array).
I hope that makes sense. The difference between the two is that one is executing and one is stored on your computer's internal disk. Something is only brought into memory when it's needed for execution.
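A small experiment (my sketch, not from the answers above) makes the demand paging visible: touch the array one page at a time and watch the process's resident memory climb in Task Manager or top. The 4096-byte page size is an assumption; query it properly in portable code.

#include <cstddef>

char arr[100 * 1024 * 1024]; // ~100 MB, zero-initialized, so it sits in .bss

int main() {
    for (std::size_t i = 0; i < sizeof(arr); i += 4096)
        arr[i] = 1;  // faulting in one byte per page maps the whole array
    for (;;) {}      // keep the process alive so memory usage can be inspected
}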

What does "Memory allocated at compile time" really mean?

In programming languages like C and C++, people often refer to static and dynamic memory allocation. I understand the concept, but the phrase "All memory was allocated (reserved) during compile time" always confuses me.
Compilation, as I understand it, converts high-level C/C++ code to machine language and outputs an executable file. How is memory "allocated" in a compiled file? Isn't memory always allocated in RAM with all the virtual memory management stuff?
Isn't memory allocation by definition a runtime concept ?
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount ?
This is one of the pages where the phrase is used under the heading "Static allocation".
Back To Basics: Memory allocation, a walk down the history
Memory allocated at compile-time means the compiler resolves at compile-time where certain things will be allocated inside the process memory map.
For example, consider a global array:
int array[100];
The compiler knows at compile time the size of the array and the size of an int, so it knows the entire size of the array at compile time. Also, a global variable has static storage duration by default: it is allocated in the static memory area of the process memory space (the .data/.bss sections). Given that information, the compiler decides during compilation at what address of that static memory area the array will be.
Of course, those memory addresses are virtual addresses. The program assumes that it has its own entire memory space (from 0x00000000 to 0xFFFFFFFF, for example). That's why the compiler can make assumptions like "Okay, the array will be at address 0x00A33211". At runtime those addresses are translated to real/hardware addresses by the MMU and OS.
Value-initialized static-storage things are a bit different. For example:
int array[] = { 1 , 2 , 3 , 4 };
In our first example, the compiler only decided where the array will be allocated, storing that information in the executable.
In the case of value-initialized things, the compiler also injects the initial value of the array into the executable, and adds code which tells the program loader that after the array allocation at program start, the array should be filled with these values.
Here are two examples of the assembly generated by the compiler (GCC4.8.1 with x86 target):
C++ code:
int a[4];
int b[] = { 1 , 2 , 3 , 4 };
int main()
{}
Output assembly:
a:
    .zero 16
b:
    .long 1
    .long 2
    .long 3
    .long 4
main:
    pushq %rbp
    movq %rsp, %rbp
    movl $0, %eax
    popq %rbp
    ret
As you can see, the values are directly injected into the assembly. For the array a, the compiler generates a zero-initialization of 16 bytes, because the Standard says that things with static storage should be zero-initialized by default:
8.5.9 (Initializers) [Note]:
Every object of static storage duration is zero-initialized at program startup before any other initialization takes place. In some cases, additional initialization is done later.
I always suggest that people disassemble their code to see what the compiler really does with their C++ code. This applies to everything from storage classes/duration (like this question) to advanced compiler optimizations. You could instruct your compiler to generate the assembly, but there are wonderful tools to do this on the Internet in a friendly manner. My favourite is GCC Explorer.
Memory allocated at compile time simply means there will be no further allocation at run time -- no calls to malloc, new, or other dynamic allocation methods. You'll have a fixed amount of memory usage even if you don't need all of that memory all of the time.
Isn't memory allocation by definition a runtime concept?
The memory is not in use prior to run time, but immediately prior to execution starting, its allocation is handled by the system.
If I make a 1KB statically allocated variable in my C/C++ code, will that increase the size of the executable by the same amount?
Simply declaring the static will not increase the size of your executable by more than a few bytes. Declaring it with an initial value that is non-zero will (in order to hold that initial value). Rather, the linker simply adds this 1KB amount to the memory requirement that the system's loader creates for you immediately prior to execution.
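A sketch of that distinction (the names are mine; exact section placement varies by toolchain):

static char zeroed[1024];               // zero-initialized: typically .bss, so the
                                        // executable only records "reserve 1 KB"
static char filled[1024] = { 1, 2, 3 }; // non-zero initializer: typically .data, so
                                        // the executable stores the initial bytes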
Memory allocated at compile time means that when you load the program, some part of the memory will be immediately allocated, and the size and (relative) position of this allocation were determined at compile time.
char a[32];
char b;
char c;
Those 3 variables are "allocated at compile time": it means that the compiler calculates their size (which is fixed) at compile time. The variable a will be at some offset in memory, let's say address 0; b will be at address 32 and c at address 33 (supposing no alignment padding). So, allocating 1Kb of static data will not increase the size of your code, since it will just change an offset inside it. The actual space will be allocated at load time.
Real memory allocation always happens in run time, because the kernel needs to keep track of it and to update its internal data structures (how much memory is allocated for each process, pages and so on). The difference is that the compiler already knows the size of each data you are going to use and this is allocated as soon as your program is executed.
Remember also that we are talking about relative addresses. The real address where the variable will be located will be different. At load time the kernel will reserve some memory for the process, let's say at address x, and all the hard-coded addresses contained in the executable file will be incremented by x bytes, so that variable a in the example will be at address x, b at address x+32, and so on.
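You can make those compile-time offsets visible with a few printf calls. This little program is mine, and the exact layout it shows is implementation-defined (padding, section ordering, ASLR), but the spacing between the addresses reflects decisions made at compile/link time:

#include <cstdio>

char a[32];
char b;
char c;

int main() {
    std::printf("a=%p b=%p c=%p\n", (void *)a, (void *)&b, (void *)&c);
}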
Adding variables on the stack that take up N bytes doesn't (necessarily) increase the bin's size by N bytes. It will, in fact, add but a few bytes most of the time.
Let's start off with an example of how adding 1000 chars to your code will increase the bin's size in a linear fashion.
If the 1k is a string of a thousand chars, declared like so
const char *c_string = "Here goes a thousand chars...999";//implicit \0 at end
and you then were to vim your_compiled_bin, you'd actually be able to see that string in the bin somewhere. In that case, yes: the executable will be 1k bigger, because it contains the string in full.
If, however you allocate an array of ints, chars or longs on the stack and assign it in a loop, something along these lines
int big_arr[1000];
for (int i=0;i<1000;++i) big_arr[i] = some_computation_func(i);
then, no: it won't increase the bin... by 1000*sizeof(int)
Allocation at compile time means what you've now come to understand it means (based on your comments): the compiled bin contains information the system requires to know how much memory each function/block will need when it gets executed, along with information on the stack size your application requires. That's what the system allocates when it executes your bin, and your program becomes a process (well, the executing of your bin is the process that... well, you get what I'm saying).
Of course, I'm not painting the full picture here: the bin contains information about how big a stack the bin will actually be needing. Based on this information (among other things), the system will reserve a chunk of memory, called the stack, that the program gets sort of free rein over. Stack memory is still allocated by the system, when the process (the result of your bin being executed) is initiated. The process then manages the stack memory for you. When a function or loop (any type of block) is invoked/gets executed, the variables local to that block are pushed to the stack, and they are removed (the stack memory is "freed" so to speak) to be used by other functions/blocks. So declaring int some_array[100] will only add a few bytes of additional information to the bin, that tells the system that function X will be requiring 100*sizeof(int) + some book-keeping space extra.
On many platforms, all of the global or static allocations within each module will be consolidated by the compiler into three or fewer consolidated allocations (one for uninitialized data (often called "bss"), one for initialized writable data (often called "data"), and one for constant data ("const")), and all of the global or static allocations of each type within a program will be consolidated by the linker into one global for each type. For example, assuming int is four bytes, a module has the following as its only static allocations:
int a;
const int b[6] = {1,2,3,4,5,6};
char c[200];
const int d = 23;
int e[4] = {1,2,3,4};
int f;
it would tell the linker that it needed 208 bytes for bss, 16 bytes for "data", and 28 bytes for "const". Further, any reference to a variable would be replaced with an area selector and offset, so a, b, c, d, e, and f would be replaced by bss+0, const+0, bss+4, const+24, data+0, or bss+204, respectively.
When a program is linked, all of the bss areas from all the modules are concatenated together; likewise the data and const areas. For each module, the address of any bss-relative variables will be increased by the size of all preceding modules' bss areas (again, likewise with data and const). Thus, when the linker is done, any program will have one bss allocation, one data allocation, and one const allocation.
When a program is loaded, one of four things will generally happen depending upon the platform:
The executable will indicate how many bytes it needs for each kind of data and--for the initialized data area, where the initial contents may be found. It will also include a list of all the instructions which use a bss-, data-, or const- relative address. The operating system or loader will allocate the appropriate amount of space for each area and then add the starting address of that area to each instruction which needs it.
The operating system will allocate a chunk of memory to hold all three kinds of data, and give the application a pointer to that chunk of memory. Any code which uses static or global data will dereference it relative to that pointer (in many cases, the pointer will be stored in a register for the lifetime of an application).
The operating system will initially not allocate any memory to the application, except for what holds its binary code, but the first thing the application does will be to request a suitable allocation from the operating system, which it will forevermore keep in a register.
The operating system will initially not allocate space for the application, but the application will request a suitable allocation on startup (as above). The application will include a list of instructions with addresses that need to be updated to reflect where memory was allocated (as with the first style), but rather than having the application patched by the OS loader, the application will include enough code to patch itself.
All four approaches have advantages and disadvantages. In every case, however, the compiler will consolidate an arbitrary number of static variables into a fixed small number of memory requests, and the linker will consolidate all of those into a small number of consolidated allocations. Even though an application will have to receive a chunk of memory from the operating system or loader, it is the compiler and linker which are responsible for allocating individual pieces out of that big chunk to all the individual variables that need it.
The core of your question is this: "How is memory "allocated" in a compiled file? Isn't memory always allocated in the RAM with all the virtual memory management stuff? Isn't memory allocation by definition a runtime concept?"
I think the problem is that there are two different concepts involved in memory allocation. At its basic, memory allocation is the process by which we say "this item of data is stored in this specific chunk of memory". In a modern computer system, this involves a two step process:
Some system is used to decide the virtual address at which the item will be stored
The virtual address is mapped to a physical address
The latter process is purely run time, but the former can be done at compile time, if the data have a known size and a fixed number of them is required. Here's basically how it works:
The compiler sees a source file containing a line that looks a bit like this:
int c;
It produces output for the assembler that instructs it to reserve memory for the variable 'c'. This might look like this:
global _c
section .bss
_c: resb 4
When the assembler runs, it keeps a counter that tracks the offset of each item from the start of a memory 'segment' (or 'section'). This is like the parts of a very large 'struct' containing everything in the entire file; it doesn't have any actual memory allocated to it at this time, and could be anywhere. The assembler notes in a table that _c has a particular offset (say 510 bytes from the start of the segment) and then increments its counter by 4, so the next such variable will be at (e.g.) 514 bytes. For any code that needs the address of _c, it just puts 510 in the output file, and adds a note that the output needs the address of the segment that contains _c added to it later.
The linker takes all of the assembler's output files and examines them. It determines an address for each segment so that they won't overlap, and adds the offsets necessary so that instructions still refer to the correct data items. In the case of uninitialized memory like that occupied by c (the assembler was told that the memory would be uninitialized by the fact that the compiler put it in the '.bss' segment, a name reserved for uninitialized memory), it includes a header field in its output that tells the operating system how much needs to be reserved. The result may be relocated (and usually is), but it is usually designed to be loaded most efficiently at one particular memory address, and the OS will try to load it at that address. At this point, we have a pretty good idea what the virtual address is that will be used by c.
The physical address will not actually be determined until the program is running. However, from the programmer's perspective the physical address is actually irrelevant—we'll never even find out what it is, because the OS doesn't usually bother telling anyone, it can change frequently (even while the program is running), and a main purpose of the OS is to abstract this away anyway.
An executable describes what space to allocate for static variables. This allocation is done by the system when you run the executable. So your 1kB static variable won't increase the size of the executable by 1kB:
static char buf[1024];
Unless of course you specify an initializer:
static char buf[1024] = { 1, 2, 3, 4, ... };
So, in addition to 'machine language' (i.e. CPU instructions), an executable contains a description of the required memory layout.
Memory can be allocated in many ways:
in application heap (whole heap is allocated for your app by OS when the program starts)
in operating system heap (so you can grab more and more)
in garbage collector controlled heap (same as both above)
on stack (so you can get a stack overflow)
reserved in code/data segment of your binary (executable)
in remote place (file, network - and you receive a handle not a pointer to that memory)
Now your question is what "memory allocated at compile time" means. It is really just an imprecise phrase, which is supposed to refer to binary segment allocation or stack allocation, or in some cases even to heap allocation where the allocation is hidden from the programmer's eyes by an implicit constructor call. Or perhaps the person who said it just wanted to say that memory is not allocated on the heap, but did not know about stack or segment allocations (or did not want to go into that kind of detail).
But in most cases the person just wants to say that the amount of memory being allocated is known at compile time.
The binary size will only change when the memory is reserved in the code or data segment of your app.
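One hedged way to see this for yourself (my example; the output columns vary by toolchain) is to compile a file like the one below and run the GNU size tool on the object file, e.g. g++ -c segments.cpp && size segments.o. The bss column should grow with the uninitialized array and the data column with the initialized one; neither counts as code.

char uninitialized[4096];        // reserved in .bss: no bytes stored in the binary
char initialized[4096] = { 1 };  // stored in .data: the binary carries the bytes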
You are right. Memory is actually allocated (paged in) at load time, i.e. when the executable file is brought into (virtual) memory. Memory can also be initialized at that moment. The compiler just creates a memory map. [By the way, stack and heap spaces are also allocated at load time!]
I think you need to step back a bit. Memory allocated at compile time... What can that mean? Can it mean that memory on chips that have not yet been manufactured, for computers that have not yet been designed, is somehow being reserved? No. No time travel, no compilers that can manipulate the universe.
So, it must mean that the compiler generates instructions to allocate that memory somehow at runtime. But if you look at it from the right angle, the compiler generates all instructions, so what can the difference be? The difference is that the compiler decides, and at runtime, your code cannot change or modify its decisions. If it decided it needed 50 bytes at compile time, then at runtime you can't make it decide to allocate 60 -- that decision has already been made.
If you learn assembly programming, you will see that you have to carve out segments for the data, the stack, and code, etc. The data segment is where your strings and numbers live. The code segment is where your code lives. These segments are built into the executable program. Of course the stack size is important as well... you wouldn't want a stack overflow!
So if your data segment is 500 bytes, your program has a 500 byte area. If you change the data segment to 1500 bytes, the size of the program will be 1000 bytes larger. The data is assembled into the actual program.
This is what is going on when you compile higher-level languages. The actual data area is reserved when the code is compiled into an executable program, increasing the size of the program. The program can request memory on the fly as well, and this is dynamic memory. You can request memory from RAM and the operating system will give it to you to use; you can let go of it, and your garbage collector will release it back to the operating system. It can even be swapped to a hard disk, if necessary, by a good memory manager. These features are what high-level languages provide you.
I would like to explain these concepts with the help of a few diagrams.
It is true that memory cannot be allocated at compile time, for sure. But then what does in fact happen at compile time? Here comes the explanation.
Say, for example, a program has four variables x, y, z and k. At compile time it simply makes a memory map, where the locations of these variables with respect to each other are ascertained. (A diagram illustrates this best.)
Now imagine no program is running in memory: picture a big empty rectangle. Next, the first instance of this program is executed. This is the time when memory is actually allocated. When a second instance of this program is running, the memory fills further, and so on and so forth for a third.
I hope this visualization explains the concept well.
There is a very nice explanation given in the accepted answer. Just in case, I will post a link which I have found useful:
https://www.tenouk.com/ModuleW.html
One among the many things a compiler does is create and maintain a symbol table (SYMTAB, stored under the section .symtab). This is created and maintained purely by the compiler, using some data structure (a list, a tree, etc.), and is not meant for the developer's eyes. Any access request made by the developer hits this table first.
Now about the Symbol Table,
We only need to know about two of its columns: Symbol Name and Offset.
The Symbol Name column will have the variable names and the Offset column will have their offset values.
Lets see this with an example:
int a, b, c;
Now we all know that the register Stack_Pointer (sp) points to the top of the stack memory. Let that be sp = 1000.
The Symbol Name column will have three values in it: a, then b, and then c. Reminding you all that the variable a will be at the top of the stack memory.
So a's equivalent offset value will be 0. (Compile-time offset value)
Then b, with an equivalent offset value of 4, assuming 4-byte ints. (Compile-time offset value)
Then c, with an equivalent offset value of 8. (Compile-time offset value)
Now calculating a's runtime memory address = (sp - offset value of a) = (1000 - 0) = 1000
Now calculating b's runtime memory address = (sp - offset value of b) = (1000 - 4) = 996
Now calculating c's runtime memory address = (sp - offset value of c) = (1000 - 8) = 992
Therefore, at the time of compilation we will only have the offset values, and only during runtime are the actual addresses calculated.
Note:
The Stack_Pointer value will be assigned only after the program is loaded. Pointer arithmetic happens between the Stack_Pointer register and the variable's offset to calculate the variable's runtime address.
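A quick way to observe those stack offsets (my sketch; whether the addresses ascend or descend, and their spacing, is entirely up to the compiler):

#include <cstdio>

int main() {
    int a, b, c;
    // Each address is effectively "stack pointer + compile-time offset";
    // printing them makes the relative offsets visible.
    std::printf("&a=%p &b=%p &c=%p\n", (void *)&a, (void *)&b, (void *)&c);
}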
"POINTERS AND POINTER ARITHMETIC, WAY OF THE PROGRAMMING WORLD"
Share what I learned about this question.
You can understand this issue in two steps:
First, the compilation step: the compiler generates the binary. On a Linux system, the binary is a file in ELF (Executable and Linkable Format) format. An ELF file contains several sections, including .bss and .data:
.data: initialized data, with read/write access rights
.bss: uninitialized data, with read/write access rights
.data and .bss map to segments of the process's memory layout, which contain the static variables.
Second, the loading step: when the binary file gets executed, the ELF file is loaded into the process's memory. The loader finds the static variables' information in the ELF file.
Simply speaking, the compiler and the loader follow the same standard to communicate with each other, and that standard is the ELF format.

How is a program loaded in ROM?

When you run a program on Windows, it is loaded into computer memory organized as:
Data Segment
Stack
Code segment
The data segment may contain data which is read only or has read and write access.
E.g.:
char *c = "Hello World";
The string "Hello World" is said to be stored in a read-only section of memory. Is it stored in physical memory, which is sometimes called RAM, or is it stored in ROM, which is read-only? How can it be written if it is read-only?
A PC has only one area of memory that really qualifies as ROM, and that is where the BIOS is stored. All of Windows and the programs loaded in Windows will be in RAM.
The x86 processor memory management is able to mark blocks of memory as read only, but the linker and OS have to work together to enable this. It happens after the program is loaded into memory.
It is stored in RAM. The Operating System, in cooperation with the processor itself, is able to protect regions of memory so that any attempt to write to them from user code causes an exception.
In many embedded systems, there is RAM and some type of read-only memory, often called Flash (it can be programmed multiple times without being removed from the printed circuit board).
Simple embedded applications place the executable and read-only data sections in Flash and execute out of Flash. The read/write variables are placed into RAM. Let us consider this model for your example code fragment:
char * c = "Hello World!";
In the above statement, the variable c lives in RAM because the default setting for variables is read & write access. If you specified that the variable was constant, it would live in ROM (actually, it would represent a location in ROM):
char * const c = "Hello World!"; // A constant pointer that lives in ROM.
The compiler treats the literal text "Hello World!" a little differently. The actual text lives in ROM, either in the executable area or in a data area, depending on the translator. Many compilers will allocate memory in RAM, copy the literal into RAM, and make the variable c point to the copy in RAM. This is because the literal was not specified as constant.
To avoid the copying of the literal into RAM, declare the variable pointing to constant data:
const char * c = "Hello World"; // A pointer to constant data.
The definition above still allows the pointer to point to different things during execution. If you want to refer to one instance of a text literal throughout the program, declare a constant pointer to constant data:
const char * const c = "Hello World!"; // A constant pointer to constant data
This technique allows executables to load into RAM (for faster execution) and still access read-only data from ROM (which frees up SRAM for true read/write variables).
On most PCs, everything lives either in non-volatile memory (hard drive, BIOS, etc.) or in RAM. The common method is to load programs from non-volatile storage (which here includes the hard drive) and execute them in RAM. When loading the executable into RAM, the OS usually loads the read-only data into RAM also. The read-only data may be guarded by the OS so that an exception is generated when the application writes to this area.
It's stored in RAM.
Char pointers which are initialised with a string literal, as in your example, point to read-only data.
You can cast away the const so that you can "write" to it, but the behaviour of your program will not be predictable.
Readonly regions of RAM are protected by the Operating system.
This probably varies depending on what kind of system you're on. The answer above would be for your standard PC. Embedded systems might actually write the constant data to some kind of non volatile memory.
If it helps, consider that there's two different ways in which memory can be read-only.
At the hardware level, you're referring to ROM. Once it's written/created, its physical properties don't allow its value to be changed.
At the software level, the OS (or other level higher than the user's program space) prevents the values stored in RAM from being modified.
When you're talking about read-only memory for program values, you're generally talking about the second type. The values are initialized when the program starts, and then the OS doesn't allow them to be modified.
Note: The above is an extremely simplistic explanation, but it does give the general idea.
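A small sketch of the second type in action (my example; the fault is typical behaviour, not guaranteed, since writing to a string literal is undefined behaviour):

const char *ro = "Hello World"; // literal usually placed in a read-only segment

int main() {
    char *p = const_cast<char *>(ro);
    // Uncommenting the next line is undefined behaviour; on a typical OS it
    // crashes, because the page holding the literal was mapped read-only.
    // *p = 'h';
    (void)p;
}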
It is stored in RAM. ]:>

Is runtime stack kept in data segment of memory?

I'm very curious about the organization of stack memory. After experimenting to see what goes on in the background, I saw that it matches the little knowledge I had acquired from books. I just wanted to check whether what I've understood is correct.
I have a minimal program -- it has 2 functions; the first one is foo and the other is main (the entry point).
void foo(){
    // do something here or dont
}

int main(){
    int i = 0;
    printf("%p %p %p\n",foo, &i, main);
    system("PAUSE");
    return EXIT_SUCCESS;
};
The output of the program is shown below; main's local variable i is located at a totally unrelated position. An int is a value type, but I checked again with a char * pointer local to main and obtained similar results.
00401390 0022FF44 00401396
Press any key to continue . . .
I mainly understand that code and variables are allocated into different segments of memory (code segment/data segment). So basically, is it right to say that the call stack keeps basic information about the execution of functions (their local variables, parameters, return points) in the data segment?
A little caveat at the start: all of these answers are somewhat affected by the operating system and hardware architecture. Windows does things fairly radically differently from UNIX-like systems, real-time operating systems and old small-system UNIX.
But the basic answer, as #Richie and #Paul have said, is "yes." When your compiler and linker get through with the code, it's broken up into what are known in UNIX as "text" and "data" segments. A text segment contains instructions and some kinds of static data; a data segment contains, well, data.
A big chunk of the data segment is then allocated for stack and heap space. Other chunks can be allocated to things like static or extern data structures.
So yes, when the program runs, the program counter is busily fetching instructions from a different segment than the data. Now we get into some architecture dependencies, but in general, if you have segmented memory, your instructions are constructed in such a way that fetching a byte from the segments is as efficient as possible. The old 360 architecture had base registers; x86 has a bunch of hair that grew as the address space went from the old 8080s to modern processors. But all of the instructions are very carefully optimized because, as you can imagine, fetching instructions and their operands is very intensive.
Now we get to more modern architectures with virtual memory and memory management units. Now the machine has specific hardware that lets the program treat the address space as a big flat range of addresses; the various segments simply get placed in that big virtual address space. The MMU's job is to take a virtual address and translate it to a physical address, including deciding what to do if that virtual address doesn't happen to be in physical memory at the moment. Again, the MMU hardware is very heavily optimized, but that doesn't mean there is no performance cost associated. But as processors have gotten faster and programs have gotten bigger, it's become less and less important.
Yes, that's exactly right. Code and data live in different parts of memory, with different permissions. The stack holds parameters, return addresses and local ("automatic") variables, and lives with the data.
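A tiny demonstration (mine, not the answerer's; the direction of growth is platform-dependent):

#include <cstdio>

void inner() {
    int x = 0;
    std::printf("inner x at %p\n", (void *)&x); // usually a lower address:
}                                               // the stack grows downward

int main() {
    int x = 0;
    std::printf("main  x at %p\n", (void *)&x);
    inner();
}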
Yes.
Imagine that your code memory is ROM, and your data memory is RAM (a common small-chip architecture). Then you see the stack must be in data memory.
Well, I can speak for SPARC:
Yes. When you run the program, the program is read twice (at least on SPARC). The program gets loaded into memory, and any array/stack allocations get loaded afterwards. In the second pass through the program, the stacks get allocated in separate memory.
I am not sure about CISC-based processors, but I suspect it doesn't vary too much.
Your program exhibits undefined behavior specifically because:
you fail to include <stdio.h> or <cstdio> depending on the language you are compiling your code as
printf and all variable-argument functions do not have the capacity to type-check their arguments. Hence it is obligatory on your part to pass correctly typed arguments; for %p that means casting the pointers to void *, as in the corrected code below.
system() has no declaration in scope. Include <stdlib.h> or <cstdlib> as the case may be.
Write your code as:
#include <stdio.h>
#include <stdlib.h>

int main() {
    /* ... */
    printf("%p %p %p\n", (void *)foo, (void *)&i, (void *)main);
    /* ... */
}
Also note that:
The definition of void foo() is not a prototype in C, but it is in C++. However, if you were to write void foo(void), you'd get a prototype in both languages.
system() is implementation dependent -- your code may not behave as expected across platforms.
The language proper (C or C++) does not put any restrictions on how to organize memory. It does not even have the concept of a stack or a heap; these are defined by implementations as they deem fit. You should ideally consult the documentation provided by your implementation to get a fair idea of what it does.