These may be silly questions, but please help me understand them.
const int i=100; //1
///some code
long add=(long)&i; //2
Doubt: for the above code, does the compiler first go through the whole code
to decide whether memory should be allocated for i? Or does it first keep the
variable in some read-only place and then allocate storage as well at 2?
Doubt: why does taking the address of a variable force the compiler to store it in memory, even
though ROM and registers have addresses too?
In your code example, add contains the address, not the value, of i. I believe you may have thought that i was not stored in normal memory unless/until you take its address. This is not the case.
const does not mean the value is stored in ROM. A local const like this one is stored in ordinary memory (often the stack) just like any other variable. const means the compiler will go to some lengths to prevent you from modifying the value.
const is not, and was never intended to be, some sort of security mechanism. If you obtain the address of the object and cast away const, the compiler will let you write through it. Doing so is almost always a bad idea: modifying an object that was defined const is undefined behavior, so the result may not be what you expect.
I have never written a compiler that implements this, but I think it would be simple to treat the variable as a normal variable, substituting the constant value wherever the variable's value is used and using the variable's address wherever the address is used.
If nothing has taken the address by the end of the variable's scope, the compiler can simply drop the variable instead of allocating real storage, because every other use was compiled to the constant value rather than to a load from the variable.
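A minimal sketch of the behaviour described above, assuming a C++11 compiler. The names kLimit, table, value_of_limit, address_of_limit and table_size are made up for illustration. The constant can be used where the language requires a compile-time value, and it still has an address once something asks for one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant standing in for `i` from the question.
const int kLimit = 100;

// An array bound must be a compile-time constant, so the compiler has to
// know the value of kLimit without loading anything from memory.
int table[kLimit];
static_assert(kLimit == 100, "value is known at compile time");

// Plain use of the value: a compiler is free to emit the literal 100 here.
int value_of_limit() { return kLimit; }

// Taking the address means the object needs real storage somewhere.
const int* address_of_limit() { return &kLimit; }

// Number of elements in `table`, computed from its type.
std::size_t table_size() { return sizeof(table) / sizeof(table[0]); }
```

Whether the compiler actually omits the storage when the address is never taken is an optimization decision and may differ between compilers and settings.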
Constant values (not the only use for const, but the one used here) are not necessarily 'stored in normal memory' (nor in ROM, of course). The compiler simply uses the value (100 in this case) whenever the code uses the variable.
Of course, if the value isn't stored anywhere, an address for the constant has no meaning, which is why taking the address forces the compiler to give it storage.
Other uses of const are stored in 'normal memory', and you can take their address, but the result is a 'pointer to const value', so it's (in principle) unusable for modifying the value. A hard cast would of course change that, which is why some compilers can warn about such casts (GCC's -Wcast-qual, for example).
Also, remember that a C/C++ compiler does all of its work at compile time (by definition!), so it is nothing unusual that a use in a later part of the code affects the code generated for an earlier part.
A very obvious example is the declaration of stack variables: the compiler has to take into account all the variables declared at any given level to be able to generate the stack allocation at the block entry.
I am a little confused about what you are asking, but looking at your code:
i = 100, stored at some address 0x????????
add = that address, stored as a long int
There is no (dynamic) memory allocation in this code. The two local variables are created on the stack. The address of i is taken and brutally cast to long, which is then assigned to the second variable.
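If the goal is to hold an address in an integer, a sketch like the following avoids relying on the width of long, which is not guaranteed to match the width of a pointer (on 64-bit Windows, for instance, long is 32 bits). It assumes the implementation provides std::uintptr_t from <cstdint>, which the standard makes optional but mainstream platforms supply. The function name address_round_trips is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if an address survives the trip pointer -> integer -> pointer.
bool address_round_trips() {
    const int i = 100;

    // std::uintptr_t is an unsigned integer type able to hold a pointer value.
    std::uintptr_t add = reinterpret_cast<std::uintptr_t>(&i);

    // Converting back to the original pointer type yields the original pointer.
    const int* back = reinterpret_cast<const int*>(add);
    return back == &i && *back == 100;
}
```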
Quoting from C++ Primer:
The address of an object defined outside of any function is a constant expression, and so may be used to initialize a constexpr pointer.
In fact, each time I compile and run the following piece of code:
#include <iostream>
using namespace std;
int a = 1;
int main()
{
constexpr int *p = &a;
cout << "p = " << p << endl;
}
I always get the output:
p = 0x601060
Now, how is that possible? How can the address of an object (global or not) be known at compile time and be assigned to a constexpr? What if that part of the memory is being used for something else when the program is executed?
I always assumed that memory is managed so that a free portion is allocated when a program is executed, and that it doesn't matter which particular part of the memory that is. However, since here we have a constexpr pointer, the program will always require a specific portion, which has to be free for the program to run. This doesn't make sense to me. Could someone explain this behaviour please? Thanks.
EDIT: After reading your answers and a few articles online, I realized that I missed the whole concept of virtual memory... now it makes sense. It's quite surprising that neither C++ Primer nor Accelerated C++ mentions this concept (maybe they do in later chapters, I'm still reading...).
However, quoting again C++ Primer:
A constant expression is an expression whose value cannot change and that can be evaluated at compile time.
Given that the linker has a major role in computing the fixed address of global objects, the book would have been more precise if it said "constant expression can be evaluated at link time", not "at compile time".
It's not actually true that the address of an object is known at compile time. What is known at compile time is the offset. When the program is compiled, the object file does not contain the address itself but a marker that indicates the offset and the section.
To be simplistic about it, the linker then comes along, measures the size of each section, stitches them together and calculates the address of each marker in each object file now that it has a concrete 'base address' for each section.
Of course it's not quite that simple. A linker can also emit a map of the locations of all these adjusted values in its output, so that a loader or load-time linker can re-adjust them just prior to run time.
The point is, logically, for all intents and purposes, the address is a constant from the program's point of view. It's just that the constant isn't given a value until link/load time. When that value is available, every reference to that constant is overwritten by the linker/loader.
If your question is "why is it always the same address?", it's because your OS uses a standard virtual memory layout for each process, implemented on top of the virtual memory manager. Addresses in a process are not real memory addresses; they are logical memory addresses. The piece of silicon at that 'address' is mapped in by the virtual memory management circuitry. Thus each process can use the "same" address while actually using a different area of the memory chips. (If the executable is built as position-independent and the OS randomizes its load address, the printed value can differ from run to run.)
I could go on about paging memory in and out, which is related, but it's a long topic. Further reading is encouraged.
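To make the "constant from the program's point of view" idea concrete, here is a small sketch that assumes a C++11 compiler. The global a and the pointer p come from the question; the function write_and_read_through_p is made up for illustration. The compiler accepts &a as a constant expression even though the numeric value is only filled in at link or load time.

```cpp
#include <cassert>

int a = 1;  // static storage duration: placed by the linker/loader

// The initializer is an address constant expression, so this compiles
// without the compiler knowing the final numeric address.
constexpr int* p = &a;
static_assert(p == &a, "p is bound to the address of a during translation");

// The pointer is constant, the pointee is not: writes go through as usual.
int write_and_read_through_p(int value) {
    *p = value;
    return a;
}
```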
It works because global variables are in static storage.
This is because the space for the global/static variable is allocated at compile time within the binary your compiler generates, in a region next to the program's machine code called the "data" segment. When the binary is copied and loaded into memory, the data segment becomes read-write.
This Wikipedia article includes a nice diagram of where the "data" segment fits into the virtual address space:
https://en.wikipedia.org/wiki/Data_segment
Automatic variables are not stored in the data segment because they may be instantiated as many times as their parent function is called. Moreover, they may be allocated at any depth of the stack. Thus it is not possible to know the address of an automatic variable at compile time in the general case.
This is not the case for global variables, which are clearly unique throughout the lifetime of the program. This allows the compiler to assign a fixed address for the variable which is separate from the stack.
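The contrast between static and automatic storage can be sketched as follows, assuming a C++11 compiler. Both function names are hypothetical. A function-local static behaves like a global for this purpose, while each active call of a function gets its own copy of an automatic variable.

```cpp
#include <cassert>

// A function-local static has static storage duration, so its address is
// an address constant and is the same on every call.
int* address_of_static_local() {
    static int counter = 0;
    constexpr int* fixed = &counter;  // accepted: counter has static storage
    return fixed;
}

// An automatic variable is a fresh object per invocation. While two
// invocations are active at once, their locals have different addresses.
bool nested_calls_use_distinct_locals(const int* callers_local = nullptr) {
    int local = 0;
    // constexpr int* bad = &local;  // would not compile: not a constant expression
    if (callers_local == nullptr) {
        return nested_calls_use_distinct_locals(&local);
    }
    return callers_local != &local;
}
```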
Recently I was rereading Effective C++ by Scott Meyers (3rd edition). According to Meyers:
"Also, though good compilers won’t set
aside storage for const objects of integral types (unless you create a
pointer or reference to the object), sloppy compilers may, and you may
not be willing to set aside memory for such objects."
Here in my code I can print the address of a const variable, even though I have not created a pointer or reference to it. I use Visual Studio 2012.
#include <iostream>
int main()
{
    const int x = 8;
    std::cout << x << " " << &x << std::endl;
}
The output is:
8 0015F9F4
Can anybody explain the mismatch between the book and my code? Or have I made a mistake somewhere?
By using the address-of operator on a variable, you are in fact creating a pointer. The pointer is a temporary object, not a declared variable, but it's very much there.
Furthermore there is a declared variable of pointer type that points to your variable: the argument to the overloaded operator << that you used to print the pointer.
std::cout<<x<<" "<<&x<<std::endl;
You took the address of the variable x, so the compiler considers it necessary to generate code that sets aside storage for the const object.
By writing &x, you odr-used the variable, which makes allocating actual storage for x necessary.
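One place where the difference between "using the value" and "odr-using the object" becomes visible is a static const data member of integral type. The following is only a sketch with made-up names (Limits, kMax, twice_max, address_of_max). It includes the out-of-class definition, because once the address is taken the member needs a definition, and without one the program may fail to link.

```cpp
#include <cassert>

struct Limits {
    static const int kMax = 8;  // declaration with an in-class initializer
};

// Definition of the member: provides the storage that an odr-use refers to.
const int Limits::kMax;

// Uses only the value: the compiler may substitute 8 directly.
int twice_max() { return 2 * Limits::kMax; }

// Takes the address: this odr-uses kMax, so storage must exist somewhere.
const int* address_of_max() { return &Limits::kMax; }
```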
A good compiler (when using optimizations) will try to replace any compile-time constant by its value in your code to avoid making a memory access. However, if you do request the address of a constant (like you do) it can't do the optimization of not allocating memory to it.
However, one important thing to note is that this doesn't mean the search and replace wasn't done in your code. As you are not supposed to change the value of the constant, the compiler will assume it is safe to do a "search and replace" on it. If you do change the value after a const_cast, you get undefined behavior. It tends to appear to work if you compile in debug mode but usually fails once your compiler optimizes the code.
In C++, for constants of basic data types, the compiler can keep the value in its symbol table without allocating storage, whereas a const object of an ADT (Abstract Data Type) or UDT (User Defined Type) needs storage allocated (these tend to be large objects). Some other cases also require storage, such as symbolic constants explicitly declared extern, or symbolic constants whose address is taken.
This has quite likely been asked and answered before, but I'm not sure how best to phrase it; a link to a previously answered question would be great.
If you define something like
char myChar = 'a';
I understand that this will take up one byte in memory (depending on implementation and assuming no unicode and so on, the actual number is unimportant).
But I would assume the compiler/computer would also need to keep a table of variable types, addresses (i.e. pointers), and possibly more. Otherwise it would have the memory reserved, but would not be able to do anything with it. So that's already at least a few more bytes of memory consumed per variable.
Is this a correct picture of what happens, or am I misunderstanding what happens when a program gets compiled/executed? And if the above is correct, is it more to do with compilation, or execution?
The compiler will keep track of the properties of a variable - its name, lifetime, type, scope, etc. This information will exist in memory only during compilation. Once the program has been compiled and the program is executed, however, all that is left is the object itself. There is no type information at run-time (except if you use RTTI, then there will be some, but only because you required it for your program to function - such as is required for dynamic_casting).
Everything that happens in the code that accesses the object has been compiled into a form that treats it exactly as a single byte (because it's a char). The address that the object is located at can only be known at run-time anyway. However, variables with automatic storage duration (like local variables), are typically located simply by some fixed offset from the current stack frame. That offset is hard-baked into the executable.
Whether a variable carries extra information depends on the type of the variable and your compiler options. If you use RTTI, extra information is stored. If you compile with debug information, that also adds overhead.
For native data types like your example of char there is usually no overhead, unless they are members of structs, which can also contain padding bytes. If you define classes with virtual functions, there may be a virtual table associated with your class. And if you dynamically allocate memory, there will usually be some overhead along with your allocated memory.
Sometimes a variable may not even exist in memory, because the optimizer realizes that no storage is needed for it and keeps it in a register.
So in total, you cannot rely on counting the variables you use and summing up their sizes to calculate the amount of memory a program requires, because there is not necessarily a 1:1 relation.
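The padding and virtual-table overhead mentioned above can be observed with sizeof. This is a sketch with hypothetical types (Padded, Plain, WithVirtual); the exact numbers depend on the platform and ABI, so the code only computes the differences and does not assume particular values.

```cpp
#include <cassert>
#include <cstddef>

struct Padded { char c; int n; };   // usually has padding after c
struct Plain { int n; };            // just the int
struct WithVirtual {                // usually also carries a hidden vtable pointer
    virtual ~WithVirtual() {}
    int n;
};

static_assert(sizeof(char) == 1, "a char occupies exactly one byte");

// Bytes in Padded that belong to neither member (alignment padding).
std::size_t padding_in_padded() {
    return sizeof(Padded) - (sizeof(char) + sizeof(int));
}

// Extra bytes per object caused by having a virtual function.
std::size_t overhead_of_virtual() {
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

None of this stores the variable's name or declared type at run time; the sizes reflect layout only.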
Some types can be determined at compile time, say in this code:
void foo(char c) {...}
it is obvious at compile time what the type of variable c is.
In the case of inheritance you cannot know the real type of the object at compile time, as in:
void draw(Drawable* drawable); // where drawable can be Circle, Line etc.
But the C++ compiler can help determine the real type of the Drawable via dynamic_cast. In this case it uses the pointer to the virtual method table associated with the object to determine the real type.
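A small sketch of the dynamic_cast approach, using the Drawable, Circle and Line names from the answer. The class bodies, the Kind enum and the classify function are assumptions added for illustration. The base class needs at least one virtual function for dynamic_cast to work on it.

```cpp
#include <cassert>

struct Drawable { virtual ~Drawable() {} };  // polymorphic base
struct Circle : Drawable {};
struct Line : Drawable {};

enum Kind { kCircle, kLine, kUnknown };  // hypothetical result type

// Determines the dynamic type at run time; dynamic_cast returns a null
// pointer when the object is not of the requested type.
Kind classify(const Drawable* drawable) {
    if (dynamic_cast<const Circle*>(drawable)) return kCircle;
    if (dynamic_cast<const Line*>(drawable)) return kLine;
    return kUnknown;
}
```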
While browsing open source code (from OpenCV), I came across the following type of code inside a method:
// copy class member to local variable for optimization
int foo = _foo; //where _foo is a class member
for (...) //a heavy loop that makes use of foo
From another question on SO I've concluded that the answer to whether or not this actually needs to be done or is done automatically by the compiler may be compiler/setting dependent.
My question is if it would make any difference if _foo were a static class member? Would there still be a point in this manual optimization, or is accessing a static class member no more 'expensive' than accessing a local variable?
P.S. - I'm asking out of curiosity, not to solve a specific problem.
Accessing a member means dereferencing the object (through this) in order to reach it.
As the member may change during execution (through another pointer to the same object, or inside a function the compiler cannot see into), the compiler may have to re-read the value from memory each time it is accessed.
Using a local variable allows the compiler to keep the value in a register, as it can safely assume the value won't change from the outside. This way, the value is read from memory only once.
As for your question about the static member, it's the same: it can also be changed from elsewhere, so the compiler may likewise need to re-read the value from memory.
I think a local variable is more likely to participate in some optimization, precisely because it is local to the function: this fact can be used by the compiler, for example if it sees that nobody modifies the local variable, then the compiler may load it once, and use it in every iteration.
In the case of member data, the compiler may have to work harder to conclude that nobody modifies the member. Think of a multi-threaded application, and note that the C++11 memory model is thread-aware: some other thread might modify the member. Strictly speaking, an unsynchronized write from another thread would be a data race, so the more common obstacle is that the loop body writes through a pointer or calls a function the compiler cannot see into. Whenever the compiler cannot prove that the member is unchanged, it has to emit a load of the member for every expression that uses it, possibly several times in a single iteration, in order to work with the current value.
In this example _foo is copied into a new local variable, so both cases are the same.
Static values are like any other variable; they are just stored in a different memory segment dedicated to static storage.
Reading a static class member is effectively like reading a global variable. They both have a fixed address. Reading a non-static one means first reading the this-pointer, adding an offset to the result and then reading that address. In other words, reading a non-static one requires more steps and memory accesses.
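The pattern from the question can be sketched like this. The class Scaler and its methods are made up for illustration; only the member name _foo comes from the question. Because out is an int* and _foo is an int, the compiler cannot generally rule out that a store to out[i] changes _foo, which is one reason the hand-written local copy can help. Whether it makes a measurable difference depends on the compiler and its settings.

```cpp
#include <cassert>
#include <cstddef>

class Scaler {
public:
    explicit Scaler(int foo) : _foo(foo) {}

    // Reads the member inside the loop. A store to out[i] could, as far as
    // the compiler can tell, modify _foo, so it may reload it each iteration.
    void scale_with_member(const int* in, int* out, std::size_t n) const {
        for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * _foo;
    }

    // Copies the member to a local first. The local's address is never
    // taken, so nothing else can modify it and it can live in a register.
    void scale_with_local(const int* in, int* out, std::size_t n) const {
        const int foo = _foo;
        for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * foo;
    }

private:
    int _foo;
};
```

The same reasoning applies if _foo is a static member: it would have a fixed address, but a store through out could still alias it.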
I understand pointers are used to point to objects so that you don't have to copy the same object around in a program. But where and how are pointer names stored? Wouldn't it be overkill to declare a pointer whose name occupies more resources than the object it points to? For example:
int intOne = 0;
int *this_pointer_is_pointing_towards_intOne = &intOne;
I know this is a ridiculous example, but I was just trying to get the idea across.
Edit: the name of the pointer has to be stored somewhere, taking more bytes than the address of the pointed-to object.
The length of the variable name doesn't have any effect on the size of your program, just the length of time it takes to write the program.
The names of local variables are only needed for the compiler to find the variables you want to refer to. After compiling, those names are usually erased and completely replaced by numeric addresses, offsets or equivalents. In practice this happens for all names that have no linkage (of course, if you do a debug build, things may be different). So, the same is true for function parameters.
The names of global variables, for example, can't be erased at that stage, because you may use them from another translation unit in your program, and the linker has to be able to look them up. But after your program has been linked, even those names can be erased.
And after all, these do not occupy runtime memory. Those names are stored in a symbol table for the purpose of linking (see the strip program for how to remove them).
But anyway, we are talking about a few bytes which are already wasted by alignment and whatnot. Compare that to the hell-long names of template instantiations. Try out this:
readelf -sW /usr/lib/libboost_*-mt.so | awk '{ print length($0), $0 }' | sort -n
Pointer names are not stored. The pointer name (or any variable name, for that matter) is not compiled into the final binary (provided you don't compile with debug symbols turned on).
Pointers are simply integer-like addresses that are stored in memory and refer to the location of the item they point to.
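A quick way to convince yourself, reusing the declarations from the question: the second pointer p and the function both_point_to_intOne are added here for illustration. The size of a pointer object depends on its type, not on how long its name is.

```cpp
#include <cassert>

int intOne = 0;
int* this_pointer_is_pointing_towards_intOne = &intOne;
int* p = &intOne;  // same type, much shorter name

// Both objects have type int*, so they occupy the same number of bytes.
static_assert(sizeof(this_pointer_is_pointing_towards_intOne) == sizeof(p),
              "the name does not change the size of the object");

// Both hold the same address; only the source-level spelling differs.
bool both_point_to_intOne() {
    return this_pointer_is_pointing_towards_intOne == p;
}
```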
Microsoft uses the "p" prefix to indicate a pointer:
int intOne = 0;
int* pIntOne = &intOne;
They actually use Hungarian on everything.
It works pretty well once you get used to seeing it. Many people think it is ugly at first.
Whatever you decide, I think it's valuable to denote that something is a pointer type in its name, though I wouldn't go quite so far as your example. :)
I agree with previous posts but I would like to point out something around it. Sometimes people overuse pointers thinking that their use will automatically provide small memory footprints. This is not always the case. Consider this piece of code:
void myfunc(const char *var) {
// Function body
}
A pointer to char will take 4 bytes on a 32-bit architecture, while the char itself would usually take 1 byte. (Here var is assumed to point to a single byte, not to a string.) Can you see the point? On the other hand, you should always use pointers (or references) for complex objects:
void myfunc(const string &str) {
// Function body
}
Of course, in case you want to modify the variable inside your function you should remove the const keyword.
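The trade-off can be sketched as follows. The functions is_vowel and count_vowels are hypothetical examples: a single char is passed by value, because a pointer to it would be at least as large as the char itself, while the std::string is passed by const reference to avoid copying it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A pointer is never smaller than the single char it might point to.
static_assert(sizeof(const char*) >= sizeof(char),
              "passing a pointer to one char saves nothing");

// Small built-in type: pass by value.
bool is_vowel(char c) {
    return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}

// Larger object: pass by const reference so no copy is made.
std::size_t count_vowels(const std::string& str) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        if (is_vowel(str[i])) ++count;
    }
    return count;
}
```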
I don't see a reason why it should be preceded with anything at all. Your compiler/editor will do the job for you.
Well, I understand what you are asking here. It's good to remember that variable names only affect the size of the source file and the time it took you to write it. Variable names, pointer names, function names and so on all get lost in translation to machine code by the compiler (GCC, for instance), apart from symbol and debug information, which can be stripped. The final thing stored in the .o or .exe will be 0s and 1s, so you shouldn't worry about names being too big :)
Well pointers are most often used for things that are allocated dynamically, and/or functions.
In the case of dynamic allocation, you can just precede it with ptr, if you're insistent on using this notation. In the case of functions, fun works just as well.
There are two critical points here that I would like to attempt to distill:
The length of a variable name has no impact on the space allocated for the data in the final compiled C/C++ program as the compiler comes up with its own names for variables.
N.B. Some compilers, especially older ones, may only recognise the first few characters of a variable name, so very long, overly complicated variable names in your source code may lead to clashes.
Also, if you create symbol files for debugging, they will contain all your variable names and so become unnecessarily large, but this might not significantly slow down debugging; I've never checked!
The space needed to store a pointer may indeed be larger than the space needed for the data pointed to, but there are still circumstances in which you may wish to use one anyway. For instance, in an architecture such as COM the practice is that functions return only a result code (success or failure) and all data that is changed is passed back through pointers passed on the stack:
/*
pszString could be pointing to only one character which
is less space than the pointer
*/
HRESULT OneLetterSplat(char * pszString)
{
    *pszString = 'a';
    return S_OK; // the standard COM success code
}
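The same "result code plus out-parameter" shape can be tried without the Windows headers. In this sketch the ResultCode enum and the function one_letter_splat are hypothetical stand-ins, not part of COM. The null check is an addition, since a caller-supplied pointer is the one thing such a function can validate.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM-style result code.
enum ResultCode { kSuccess = 0, kInvalidPointer = 1 };

// Writes one character through the out-parameter and reports success or
// failure through the return value.
ResultCode one_letter_splat(char* out) {
    if (out == nullptr) return kInvalidPointer;
    *out = 'a';
    return kSuccess;
}
```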