I wonder where constant variables are stored. Is it in the same memory area as global variables? Or is it on the stack?
How they are stored is an implementation detail (depends on the compiler).
For example, in the GCC compiler, on most machines, read-only variables, constants, and jump tables are placed in the text section.
Depending on the data segmentation that a particular processor follows, we have five segments:
Code segment - stores only code; it may live in ROM
Data segment - stores initialised global and static variables
Stack segment - stores all local variables and other information such as function return addresses
Heap segment - all dynamic allocations happen here
BSS (Block Started by Symbol) segment - stores uninitialised global and static variables
Note that the difference between the data and BSS segments is that the former stores initialised global and static variables and the latter stores uninitialised ones.
Now, why am I talking about data segmentation when I should just be telling you where constant variables are stored? There's a reason for it.
Every segment has a write-protected region where all the constants are stored.
For example:
If I have a const int that is a local variable, it is stored in the write-protected region of the stack segment.
If I have a global const variable that is initialised, it is stored in the data segment.
If I have an uninitialised const variable, it is stored in the BSS segment.
To summarize, "const" is just a data qualifier: the compiler first decides which segment the variable belongs in, and then, if the variable is const, it qualifies to be stored in the write-protected region of that segment.
Consider the code:
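void totherfunc(const int *p);  /* defined elsewhere; takes a pointer to a constant int */

const int i = 0;                /* external linkage: visible to other translation units */
static const int k = 99;        /* internal linkage: local to this file */

int function(void)
{
    const int j = 37;           /* automatic (local) constant */
    totherfunc(&j);
    totherfunc(&i);
    //totherfunc(&k);
    return j + 3;
}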
Generally, i can be stored in the text segment (it's a read-only variable with a fixed value). If it is not in the text segment, it will be stored beside the global variables. Given that it is initialized to zero, it might be in the 'bss' section (where zeroed variables are usually allocated) or in the 'data' section (where initialized variables are usually allocated).
If the compiler is convinced that k is unused (which it could be, since k is local to a single file), it might not appear in the object code at all. If the call to totherfunc() that references k were not commented out, then k would have to be allocated an address somewhere; it would likely be in the same segment as i.
The constant (if it is a constant, is it still a variable?) j will most probably appear on the stack of a conventional C implementation. (If you were asking in the comp.std.c news group, someone would mention that the standard doesn't say that automatic variables appear on the stack; fortunately, SO isn't comp.std.c!)
Note that I forced the variables to appear because I passed them by reference - presumably to a function expecting a pointer to a constant integer. If the addresses were never taken, then j and k could be optimized out of the code altogether. To remove i, the compiler would have to know all the source code for the entire program - it is accessible in other translation units (source files), and so cannot as readily be removed. Doubly not if the program indulges in dynamic loading of shared libraries - one of those libraries might rely on that global variable.
(Stylistically - the variables i and j should have longer, more meaningful names; this is only an example!)
Depends on your compiler, your system capabilities, your configuration while compiling.
gcc puts read-only constants in the .text section, unless instructed otherwise.
Usually they are stored in a read-only data section (while the global variables' section has write permissions). So, trying to modify a constant by taking its address may result in an access violation, a.k.a. a segfault.
But it depends on your hardware, OS and compiler really.
Of course not, because:
1) The BSS segment stores uninitialized variables, and it comes in two kinds of section:
(I) Large static and global variables that are non-constant and uninitialized are stored in the .BSS section.
(II) Small static and global variables that are non-constant and uninitialized are stored in the .SBSS section, which is included in the BSS segment.
2) The data segment stores initialized variables, and it comes in three kinds of section:
(I) Large static and global variables that are initialized and non-constant are stored in the .DATA section.
(II) Small static and global variables that are non-constant and initialized are stored in the .SDATA1 section.
(III) Small static and global variables that are constant, whether initialized or uninitialized, are stored in the .SDATA2 section.
What "small" and "large" mean above depends on the compiler; for example, small may mean less than 8 bytes and large may mean 8 bytes or more.
But my question is: where are local constants stored?
This is mostly an educated guess, but I'd say that constants are usually stored in the actual CPU instructions of your compiled program, as immediate data. So in other words, most instructions include space for the address to get data from, but if it's a constant, the space can hold the value itself.
This is specific to Win32 systems.
It's compiler dependent, but please be aware that it may not even be stored at all, since the compiler can optimize it away and put its value directly into the expressions that use it.
I added this code to a program and compiled it with gcc for an ARM Cortex-M4; check the difference in memory usage.
Without const:
int someConst[1000] = {0};
With const:
const int someConst[1000] = {0};
Global and constant are two completely separate properties. A variable can have one or the other, neither, or both.
Where your variable, then, is stored in memory depends on the configuration. Read up a bit on the heap and the stack, that will give you some knowledge to ask more (and if I may, better and more specific) questions.
It may not be stored at all.
Consider some code like this:
#define PI 3.14159265358979323846  /* standard <math.h> does not define PI */

double toRadian(int degree) {
    return degree * PI * 2 / 360.0;
}
This enables the programmer to gather the idea of what is going on, but the compiler can optimize away some of that, and most compilers do, by evaluating constant expressions at compile time, which means that the value PI may not be in the resulting program at all.
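To make the folding visible, here is a minimal sketch in C++ rather than C, because constexpr and static_assert let the compiler prove that the arithmetic happened at compile time. The names kPi, to_radian, kRightAngle and right_angle are made up for this illustration, and whether a given compiler also folds the plain C version above depends on the optimization level.

```cpp
// kPi is a hypothetical name; the literal stands in for the PI used above.
constexpr double kPi = 3.14159265358979323846;

// Same formula as toRadian above. constexpr allows the compiler to evaluate
// the call at compile time when the argument is itself a constant.
constexpr double to_radian(int degree) {
    return degree * kPi * 2 / 360.0;
}

// Initializing a constexpr variable forces compile-time evaluation, so no call
// to to_radian and no load of kPi is needed to produce this value.
constexpr double kRightAngle = to_radian(90);

// The check itself runs inside the compiler, not in the finished program.
static_assert(kRightAngle > 1.57 && kRightAngle < 1.58,
              "to_radian(90) is folded to roughly pi/2 at compile time");

// Run-time accessor so the folded value can be inspected from ordinary code.
double right_angle() { return kRightAngle; }
```

Calling right_angle() simply returns the precomputed number; inspecting the generated assembly (for example with objdump) is the way to confirm what a particular compiler actually emitted.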
Just as an add-on: as you know, the memory layout of the final executable is decided during the linking process. There is one more section, called COMMON, in which the common symbols from the different input files are placed. This COMMON section actually falls under the .bss section.
Some constants aren't even stored.
Consider the following code:
int x = foo();
x *= 2;
Chances are that the compiler will turn the multiplication into x = x+x; as that reduces the need to load the number 2 from memory.
I checked on an x86_64 GNU/Linux system. By dereferencing a pointer to the 'const' variable, the value can be changed. I used objdump and didn't find the 'const' variable in the text segment; it is stored on the stack.
'const' is a type qualifier in C that the compiler enforces: it reports an error when it comes across a statement that changes a 'const' variable directly.
I saw that constexpr variables are highly discussed on stack overflow. But there is one thing no one talks about:
Where are constexpr variables stored?
Everyone knows the memory location tables of C and C++ programs.
stack
heap
static
text
Operating Systems (like Linux)
On operating systems, the executable code and all the static variables get copied from the hard disk into the allocated text, static, etc. areas in RAM. From there the program starts as a process.
Embedded Systems (like Atmel Controller)
On an embedded system, this is different. Here the executable code and the literals are permanently stored in flash memory; only the static variables get copied into RAM.
text or static area
I understand the benefits of constexpr compared to #defines, but for an embedded system programmer, there is always the performance question. On embedded systems, RAM is an expensive resource. So, I need to know if constexpr variables get stored in the text or in the static area. Or more precisely, are they permanently stored in the flash memory or do they get really created as variable into the RAM of an embedded system?
constexpr does not change the storage class of a variable. So the presence of constexpr does not change where the compiler can put the variable.
There are however some important characteristics of constexpr variables. Any variable could have these characteristics, but the constexpr qualifier specifically requires:
The variable cannot be changed (is implicitly const).
The variable is initialized with a compile-time constant expression.
Given these facts, the compiler can put the storage basically anywhere. If the type matches a register type and the value is of sufficient size that it can be loaded via a single "load literal" opcode, the compiler could convert every use of the variable into simply loading the literal value into a register.
Of course, if you start doing things like getting the address of the variable, it may have to take up actual storage. But even that depends on how good the compiler is at inlining.
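A small sketch of the two situations described above, under the assumption of a typical optimizing toolchain. The names kLimit, buffer, buffer_size and limit_address are hypothetical; where the object ends up (flash, .rodata, or nowhere) is still the implementation's choice.

```cpp
#include <array>

constexpr int kLimit = 64;  // hypothetical constant, implicitly const

// Used purely as a value: the compiler only needs the number 64 here,
// so this use alone does not require any storage for kLimit.
std::array<unsigned char, kLimit> buffer{};

int buffer_size() { return static_cast<int>(buffer.size()); }

// Taking the address means kLimit may need a real location. Since it can
// never change, a typical toolchain is free to put it in a read-only section.
const int* limit_address() { return &kLimit; }
```

On an embedded target, comparing the linker map file with and without the limit_address function is one way to see whether the constant was given storage at all.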
I was reading this great post about the memory layout of C programs. It says that default-initialized global variables reside in the BSS segment, and that if you explicitly provide a value for a global variable it will reside in the data segment.
I've tested the following programs in C and C++ to examine this behaviour.
#include <iostream>
// Both i and s are having static storage duration
int i; // i will be kept in the BSS segment, default initialized variable, default value=0
int s(5); // s will be kept in the data segment, explicitly initialized variable,
int main()
{
std::cout<<&i<<' '<<&s;
}
Output:
0x488020 0x478004
So, from the output it clearly looks like the variables i and s reside in completely different segments. But if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.
Output:
0x488020 0x488024
So, from the output it clearly looks like both variables i and s reside in the same segment (in this case BSS).
This behaviour is also the same in C.
#include <stdio.h>
int i; // i will be kept in the BSS segment, default initialized variable, default value=0
int s=5; // s will be kept in the data segment, explicitly initialized variable,
int main(void)
{
printf("%p %p\n",(void*)&i,(void*)&s);
}
Output:
004053D0 00403004
So, again, by looking at the output (that is, by examining the addresses of the variables) we can say that i and s reside in completely different segments. But again, if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.
Output:
004053D0 004053D4
So, from the output it clearly looks like both variables i and s reside in the same segment (in this case BSS).
Why do C and C++ compilers place explicitly initialized and default initialized global variables in different segments? Why is there a distinction about where the global variable resides between default initialized and explicitly initialized variables? If I am not wrong, the C and C++ standards never talk about the stack, heap, data segment, code segment, BSS segment and all such things which are implementation-specific. So, is it possible for a C++ implementation to store explicitly initialized and default initialized variables in the same segments instead of keeping it in different segments?
Neither C nor C++ has any notion of "segments", and not all OSs do either, so your question is inevitably dependent on the platform and compiler.
That said, common implementations will treat initialized vs. uninitialized variables differently. The main difference is that uninitialized (or default 0-initialized) data does not have to be actually saved with the compiled module, but only declared/reserved for later use at run time. In practical "segment" terms, initialized data is saved to disk as part of the binary, while uninitialized data is not, instead it's allocated at startup to satisfy the declared "reservations".
The really short answer is "because it takes up less space". (As noted by others, the compiler doesn't have to do this!)
In the executable file, the data section contains each initialized object's value at the corresponding position. This means that for every byte of initialized data, the data section contains one byte.
For zero-initialized globals, there is no reason to store a lot of zeros. Instead, just store the size of the whole set of data in one single size value. So instead of storing 4132 bytes of zero in the data section, there is just a "BSS is 4132 bytes long" entry, and it's up to the OS/runtime to make sure that memory is zero. In some cases, the compiler's runtime will call memset(BSSStart, 0, BSSSize) or similar. On Linux, for example, all "unused" memory is filled with zero anyway when the process is created, so setting BSS to zero is just a matter of allocating the memory in the first place.
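As a rough illustration of the point above, the sketch below declares one zero-initialized array and one explicitly initialized variable. The names zeroed, answer, nonzero_bytes and stored_answer are invented for this example, and the section placement mentioned in the comments is what common toolchains do, not something the language requires.

```cpp
#include <cstddef>

// Zero-initialized: a typical toolchain records only the size (4096 bytes)
// for this object in .bss, not 4096 zero bytes in the executable file.
static unsigned char zeroed[4096];

// Explicitly initialized to a non-zero value: the value itself has to be
// stored in the executable, typically in .data.
static int answer = 42;

// Counts the non-zero bytes in the array. The language guarantees the result
// is 0 at startup, however the implementation arranges the zeroing.
std::size_t nonzero_bytes() {
    std::size_t count = 0;
    for (unsigned char byte : zeroed) {
        if (byte != 0) {
            ++count;
        }
    }
    return count;
}

int stored_answer() { return answer; }
```

Running the size tool on the object file with the array size changed from 4096 to a larger number should show the bss column grow while the file itself barely changes, assuming a GNU-style toolchain.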
And of course, shorter executable files have several benefits: Less space taken up on your hard-disk, faster loading time [extra bonus if the OS pre-fills the allocated memory with zero], faster compile time as the compiler/linker doesn't have to write the data to disk.
So there is an entirely practical reason for this.
By definition, BSS is not a different segment; it is a part of the data segment.
In C and C++, statically-allocated objects without an explicit initializer are initialized to zero, an implementation may also assign statically-allocated variables and constants initialized with a value consisting solely of zero-valued bits to the BSS section.
The reason to store them in BSS is that variables with uninitialized or default (zero) values can be set up at run time without wasting space in the binary file, unlike the variables placed in the data segment.
Quoting from C++ Primer:
The address of an object defined outside of any function is a constant expression, and so may be used to initialize a constexpr pointer.
In fact, each time I compile and run the following piece of code:
#include <iostream>
using namespace std;
int a = 1;
int main()
{
constexpr int *p = &a;
cout << "p = " << p << endl;
}
I always get the output:
p = 0x601060
Now, how is that possible? How can the address of an object (global or not) be known at compile time and be assigned to a constexpr? What if that part of the memory is being used for something else when the program is executed?
I always assumed that the memory is managed so that a free portion is allocated when a program is executed, but doesn't matter what particular part of the memory. However, since here we have a constexpr pointer, the program will always require a specific portion, that has to be free to allow the program execution. This doesn't make sense to me, could someone explain this behaviour please? Thanks.
EDIT: After reading your answers and a few articles online, I realized that I missed the whole concept of virtual memory... now it makes sense. It's quite surprising that neither C++ Primer nor Accelerated C++ mention this concept (maybe they will do it in later chapters, I'm still reading...).
However, quoting again C++ Primer:
A constant expression is an expression whose value cannot change and that can be evaluated at compile time.
Given that the linker has a major role in computing the fixed address of global objects, the book would have been more precise if it said "constant expression can be evaluated at link time", not "at compile time".
It's not actually true that the address of an object is known at compile time. What is known at compile time is the offset. When the program is compiled, the address is not emitted into the object file, but a marker to indicate the offset and the section.
To be simplistic about it, the linker then comes along, measures the size of each section, stitches them together and calculates the address of each marker in each object file now that it has a concrete 'base address' for each section.
Of course it's not quite that simple. A linker can also emit a map of the locations of all these adjusted values in its output, so that a loader or load-time linker can re-adjust them just prior to run time.
The point is, logically, for all intents and purposes, the address is a constant from the program's point of view. It's just that the constant isn't given a value until link/load time. When that value is available, every reference to that constant is overwritten by the linker/loader.
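The idea can be sketched with the same global a and constexpr pointer p as in the question; the two helper functions are hypothetical additions for illustration. The point of the sketch is that the pointer is fixed while the object it points to stays writable.

```cpp
int a = 1;  // global: static storage duration, one fixed location per program

// The address of `a` is a constant expression, even though its numeric value
// is only filled in by the linker/loader.
constexpr int* p = &a;

// constexpr applies to the pointer, not to the int it points at,
// so writing through p is allowed and changes the global.
int write_through_pointer(int value) {
    *p = value;
    return a;
}

// The pointer always refers to the same object for the life of the program.
bool pointer_refers_to_global() { return p == &a; }
```

Printing p in two runs may or may not show the same number, depending on whether the system randomizes load addresses; within one run it never changes.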
If your question is "why is it always the same address?" It's because your OS uses a standard virtual memory layout layered over the virtual memory manager. Addresses in a process are not real memory addresses - they are logical memory addresses. The piece of silicon at that 'address' is mapped in by the virtual memory management circuitry. Thus each process can use the "same" address, while actually using a different area of the memory chips.
I could go on about paging memory in and out, which is related, but it's a long topic. Further reading is encouraged.
It works because global variables are in static storage.
This is because the space for the global/static variable is allocated at compile time within the binary your compiler generates, in a region next to the program's machine code called the "data" segment. When the binary is copied and loaded into memory, the data segment becomes read-write.
This Wikipedia article includes a nice diagram of where the "data" segment fits into the virtual address space:
https://en.wikipedia.org/wiki/Data_segment
Automatic variables are not stored in the data segment because they may be instantiated as many times as their parent function is called. Moreover, they may be allocated at any depth of the stack. Thus it is not possible to know the address of an automatic variable at compile time in the general case.
This is not the case for global variables, which are clearly unique throughout the lifetime of the program. This allows the compiler to assign a fixed address for the variable which is separate from the stack.
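The difference can be checked with a small recursive sketch. The names global_counter, check_instances and instances_behave_as_described are made up for this example; it relies only on the rule that two objects alive at the same time have different addresses.

```cpp
int global_counter = 0;  // one instance for the whole program

// Returns true if every nested call gets its own `local` while all calls
// observe the same address for `global_counter`.
bool check_instances(int depth, const int* caller_local, const int* global_seen) {
    int local = depth;  // a fresh instance for each active call

    if (&local == caller_local) {
        return false;   // two live locals can never share an address
    }
    if (&global_counter != global_seen) {
        return false;   // the global never moves
    }
    if (depth == 0) {
        return true;
    }
    return check_instances(depth - 1, &local, global_seen);
}

bool instances_behave_as_described() {
    return check_instances(3, nullptr, &global_counter);
}
```

Because the address of local depends on how deep the call chain is, it cannot be a compile-time constant, whereas &global_counter can.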
I have seen code where variables with the register keyword are passed by reference into functions.
Version 1:
inline static void swap(register int &a, register int &b)
{
register int t = a;
a = b;
b = t;
}
Version 2:
inline static void swap(register int a, register int b)
{
register int t = a;
a = b;
b = t;
}
What are the differences between the two versions?
To my understanding, a and b are kept in registers so the reference operator shouldn't have any effect as the changes made to the values in these registers should persist across the caller-callee boundary, without the use of the reference operator.
In C programs, you cannot take the address of a register variable.
register int x;
int * p = &x; // Compiler error
This is sometimes useful in macros to prevent clients from taking the address of something that should only be used as a value.
The use of register is deprecated in the C++11 standard (see [depr.register]). In C++ it is legal to take the address of a register variable, but it is not legal in the latest revision of the C++11 standard to declare an alignment for a register variable with alignas. See 7.6.2 Alignment specifier.
Other than preventing the use of alignas() and causing a syntax error when used outside local scope, register does nothing in C++. Since it's deprecated, and because I can't imagine any reason you would want to prevent the alignment of a variable used inside a macro, you should avoid using register in C++ code.
To answer the question: In C++ there is no difference between your code and the equivalent code with register removed, so your "two versions" are different in the obvious way.
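For completeness, here is a sketch of that "obvious way", with register dropped since it has no effect. The function names swap_by_reference, swap_by_value and first_after_swap are chosen for this illustration and are not from the original code.

```cpp
// Reference parameters alias the caller's objects, so the swap is visible
// to the caller (this corresponds to Version 1).
inline void swap_by_reference(int& a, int& b) {
    int t = a;
    a = b;
    b = t;
}

// Value parameters are copies; only the copies are swapped and the caller's
// variables are untouched (this corresponds to Version 2).
inline void swap_by_value(int a, int b) {
    int t = a;
    a = b;
    b = t;
}

// Helper: starts with x = 1, y = 2, applies one of the swaps, returns x.
int first_after_swap(bool by_reference) {
    int x = 1;
    int y = 2;
    if (by_reference) {
        swap_by_reference(x, y);
    } else {
        swap_by_value(x, y);
    }
    return x;
}
```

first_after_swap(true) yields 2 and first_after_swap(false) yields 1, which is the whole difference between the two versions.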
For a C++ program, the memory of a computer is like a succession of memory cells, each one byte in size, and each with a unique address. These single-byte memory cells are ordered in a way that allows data representations larger than one byte to occupy memory cells that have consecutive addresses.
This way, each cell can be easily located in the memory by means of its unique address. For example, the memory cell with the address 1776 always follows immediately after the cell with address 1775 and precedes the one with 1777, and is exactly one thousand cells after 776 and exactly one thousand cells before 2776.
When a variable is declared, the memory needed to store its value is assigned a specific location in memory (its memory address). Generally, C++ programs do not actively decide the exact memory addresses where their variables are stored. Fortunately, that task is left to the environment where the program is run, generally an operating system that decides the particular memory locations at runtime. However, it may be useful for a program to be able to obtain the address of a variable during runtime in order to access data cells that are at a certain position relative to it.
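A brief sketch of "consecutive addresses" and of reaching a cell relative to a known address. All names here are hypothetical, and an int array is used only because its elements are guaranteed to be laid out back to back.

```cpp
#include <cstddef>

// Distance in bytes between two neighbouring elements of an int array.
// Elements are contiguous, so this equals sizeof(int).
std::ptrdiff_t byte_distance_between_neighbours() {
    int values[3] = {10, 20, 30};
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&values[0]);
    const unsigned char* second = reinterpret_cast<const unsigned char*>(&values[1]);
    return second - first;
}

// Reads the element stored one position after `base`.
int next_element(const int* base) { return *(base + 1); }

// Starts from the address of the first element and reads its neighbour.
int second_of_three() {
    int values[3] = {10, 20, 30};
    return next_element(&values[0]);
}
```

The actual numeric addresses differ from run to run and from system to system; only the relative layout is something the program can rely on.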
These are some silly questions I want to ask; please help me understand.
const int i=100; //1
///some code
long add=(long)&i; //2
Question: for the above code, will the compiler first go through the whole code to decide whether memory should be allocated or not? Or will it first store the variable in a read-only memory location and then also allocate storage at line 2?
Question: why does taking the address of a variable force the compiler to store the variable in memory, even though ROM and registers have addresses too?
In your code example, add contains the address, not the value, of i. I believe you may have thought that i was not stored in normal memory unless/until you take its address. This is not the case.
const does not mean the value is stored in ROM. It is stored in normal memory (often the stack) just like any other variable. const means the compiler will go to some lengths to prevent you from modifying the value.
const is not, and was never intended, to be some sort of security mechanism. If you obtain the address of the memory and want to modify it, you can do so. Of course this is almost always a bad idea, but if you really need to do it, it is possible.
I never wrote a compiler implementing this, but I think that it would be simple to just handle the variable as a normal variable but using the constant value where the variable value is used and using the address of the variable if the address is used.
If at the end of the scope of the variable no one took the address then I can just drop it instead of doing a real allocation because for all other uses the constant value has been used instead of compiling a variable loading operation.
Constant values (not the only use for const, but the one used here) are not 'stored in normal memory' (nor in ROM, of course). The compiler simply uses the value (100 in this case) whenever the code uses the variable.
Of course, if the value isn't stored anywhere, there's no meaning of an address for the constant.
Other uses of const are stored in 'normal memory', and you can take their address, but the result is a 'pointer to const value', so it's (in principle) unusable for modification of the value. A hard cast would of course change that, so they trigger a nasty compiler warning.
Also, remember that the C/C++ compiler operates entirely at compile time (by definition!), so it's nothing unusual for a use in a later part of the code to affect the code generated for an earlier part.
A very obvious example is the declaration of stack variables: the compiler has to take into account all the variables declared at any given level to be able to generate the stack allocation at the block entry.
I am a little confused about what you are asking, but looking at your code:
i = 100, with an address of 0x?????????????
add = whatever that address is, stored as a long int
There is no (dynamic) memory allocation in this code. The two local variables are created on stack. The address of i is taken and brutally cast into long, which is then assigned to the second variable.
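A minimal sketch of that cast, written with std::uintptr_t instead of long because long is not wide enough for a pointer on every platform (64-bit Windows is one example). The function name address_round_trips is invented for this illustration.

```cpp
#include <cstdint>

// Returns true if the integer form of &i converts back to a pointer to i.
bool address_round_trips() {
    const int i = 100;  // local constant; taking its address gives it a location

    // std::uintptr_t is an integer type able to hold a pointer value on
    // platforms that provide it.
    std::uintptr_t add = reinterpret_cast<std::uintptr_t>(&i);

    // Converting back yields a pointer to the same object; reading through it
    // is fine, writing through it would not be.
    const int* back = reinterpret_cast<const int*>(add);
    return back == &i && *back == 100;
}
```

Nothing is allocated dynamically here: both i and add are ordinary automatic variables, and add merely holds a number.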