As we all know, C++'s memory model can be divided into five blocks: stack, heap, free blocks, global/static blocks, and const blocks. I understand the first three blocks, and I also know that variables like static int xx are stored in the 4th block, as is the "hello world" string constant. But what is stored in the 5th block, the const block? And for something like int a = 10, where is the "10" stored? Can someone explain this to me?
Thanks a lot.
There is a difference between string literals and primitive constants. String literals are usually stored with the code in a separate area (for historical reasons this block is often called the "text block"). Primitive constants, on the other hand, are somewhat special: they can be stored in the "text" block as well, but their values can also be "baked" into the code itself. For example, when you write
// Global integer constant
const int a = 10;

int add(int b) {
    return b + a;
}
the return expression could be translated into a piece of code that does not reference a at all. Instead of producing binary code that looks like this
LOAD R0, <stack>+offset(b)
LOAD R1, <address-of-a>
ADD R0, R1
RET
the compiler may produce something like this:
LOAD R0, <stack>+offset(b)
ADD R0, #10 ; <<== Here #10 means "integer number 10"
RET
Essentially, even though a is stored with the rest of the constants, it is optimized out of the compiled code.
As far as integer literal constants go, they have no address at all: they are always "baked" into the code. When you reference them, instructions that load explicit values are generated, in the same way as shown above.
And, like int a = 10, where is the "10" stored?
It's an implementation detail. It will likely be part of the generated code and be turned into something like
mov eax, 10
in assembly.
The same will happen to definitions like
const int myConst = 10;
unless you try to take the address of myConst, like this:
const int *ptr = &myConst;
in which case the compiler will have to put the value 10 into a dedicated block of memory (presumably the 5th in your enumeration).
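A minimal sketch of that distinction (my own example; whether storage is actually emitted still depends on the compiler and optimization level):

const int myConst = 10;

int useValue() {
    return myConst + 5;   // typically folded: compiles to "return 15" with no memory load
}

const int *useAddress() {
    return &myConst;      // taking the address forces the compiler to give myConst real
                          // storage, usually in a read-only data section
}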
We know that if I have a variable int a{3}, I can get a's address with &a, and I can get the value pointed to by that address with *&a, which returns the integer 3. So at that point we already have the number 3, and we can't do something like &3 to get the literal's address, as that generates an error. But we can successfully use something like *&*&a to get the value 3 back. As I stated, *&a already returns the number 3, and you can't continue the chain on a plain 3. Why does it work when written as *&*&a?
There's a significant difference in between using a literal and using a variable on machine code (or assembler) level:
Typically, a literal can be encoded in the machine code directly, whereas the value of a variable may first have to be loaded from memory, if such a variable is involved.
Actually, your variant int b = *&a; is quite close to what a pure load/store architecture (lacking any kind of indirect addressing) would need to do (assuming both variables are located in memory, with 0xadda being the address of variable a and 0xdaad the address of variable b):
MOV Rx, ADDA; // move address of a into some register
LD Ry, Rx; // load value at address in register into a second one
// (maybe there's a direct addressing mode, then both operations
// could be a single one)
MOV Rx, DAAD;
ST Rx, Ry;
In comparison, int b = 3; is a bit simpler on the same machine:
MOV Rx, DAAD;
MOV Ry, 3; // 3 directly encoded in bit pattern
ST Rx, Ry;
OK, maybe you have some appropriate addressing mode, such that int b = a; can be encoded in one single instruction:
MOV #DAAD, #ADDA // # indicating indirect access via address
Still, you cannot argue away the fact that an additional memory access is necessary compared to using the literal...
Perhaps even more interesting for you: how would you even get the address of a value that is 'mystically' encoded in the bit pattern of a machine-code instruction (the &3 part)?
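At the language level, the chain works because *&a is an lvalue: it refers to the object a itself, not to a temporary copy of 3, so its address can be taken again. A short sketch (my own example, not from the question):

int main() {
    int a{3};
    int b = *&*&a;   // &a is a's address; *&a is the object a again (an lvalue),
                     // so applying & once more yields a's address again
    *&a = 7;         // legal for the same reason: *&a names the object a itself
    // int *p = &3;  // error: 3 is a prvalue and has no address to take
    return a + b;    // 7 + 3
}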
For example, in the file demo.c:
#include <stdio.h>

int a = 5;

int main() {
    int b = 5;
    int c = a;
    printf("%d", b + c);
    return 0;
}
For int a = 5, does the compiler translate this into something like "store 0x5 at the virtual memory address 0x0000000f (for example) in the const area", so that int c = a is translated into something like movl 0x0000000f, %eax?
Then for int b = 5, the number 5 is not put into the const area, but translated directly into an immediate in the assembly instruction, like mov $0x5, %ebx.
It depends. Your program has several constants:
int a = 5;
This is a "static" initialization (which occurs when the program text and data is loaded before running). The value is stored in the memory reserved by a which is in a read-write data "program section". If something changes a, the value 5 is lost.
int b=5;
This is a local variable with limited scope (only within main()). The storage could well be a CPU register or a location on the stack. The instructions generated for most architectures will place the value 5 in an instruction as "immediate data"; for an x86 example:
mov eax, 5
The ability of instructions to hold arbitrary constants is limited. Small constants are supported by most CPU instructions. "Large" constants are usually not directly supported. In that case the compiler stores the constant in memory and loads it instead. For example,
.psect rodata
k1 dd 3141592653
.psect code
mov eax, k1
The ARM family has a powerful scheme for encoding many constants directly: any 8-bit value rotated right by an even number of bit positions can be used as an immediate. See page 2-25 of the ARM architecture documentation.
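A hedged C++ illustration of the same point (small_const and large_const are my own names; whether each constant ends up as an immediate, in a literal pool, or in .rodata is entirely up to the compiler and target):

unsigned int small_const() {
    return 5u;              // almost always encoded directly as an immediate operand
}

unsigned int large_const() {
    return 3141592653u;     // fits a 32-bit x86 immediate, but cannot be expressed as a
                            // rotated 8-bit ARM immediate, so on classic ARM the compiler
                            // loads it from a literal pool or .rodata instead
}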
One not-as-obvious but totally different item is in the statement:
printf("%d", b+c);
The string "%d" is, by modern C semantics, a constant array of three char (two characters plus the terminating zero). Most modern implementations store it in read-only memory, so attempts to change it will cause a SEGFAULT, a low-level CPU error that usually makes the program abort instantly (a short C++ sketch follows the assembly below).
.psect rodata
s1 db '%', 'd', 0
.psect code
mov eax, s1
push eax
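A short C++ sketch of the same point (my own example; writing to the literal is formally undefined behaviour, and the SEGFAULT is just the typical outcome on implementations that map string literals read-only):

#include <cstdio>

int main() {
    const char *fmt = "%d\n";       // points at the read-only literal
    char *alias = (char *)"%d\n";   // the cast compiles, but the bytes stay read-only
    // alias[0] = 'x';              // if uncommented: undefined behaviour, usually SIGSEGV
    (void)alias;
    std::printf(fmt, 42);
    return 0;
}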
In OP's program, a is an "initialized" "global". I expect that it is placed in the initialized part of the data segment. See https://en.wikipedia.org/wiki/File:Program_memory_layout.pdf and http://www.cs.uleth.ca/~holzmann/C/system/memorylayout.gif (for more info, see "Memory layout of an executable program (process)"). The location of a is decided by the compiler-linker duo.
On the other hand, being automatic (stack) variables, b and c are expected to be in the stack segment.
That being said, the compiler/linker has the liberty to perform any optimization as long as the observable behavior is not violated (see "What exactly is the 'as-if' rule?"). For example, if a is never referenced, it may be optimized out completely.
I am writing in C++ for the Nintendo DS (with 4 MB of RAM). I have a button class that stores data like the x, y location and length. Which of the following would take less memory?
Method 1, class variables length, x, y, and halfPoint
Button::Button(int setX, int setY, int setLength)
{
    x = setX;
    y = setY;
    length = setLength;
    halfPoint = length/2;
}
//access variable with buttonName.halfPoint
Method 2, class variables length, x and y
Button::Button(int setX, int setY, int setLength)
{
    x = setX;
    y = setY;
    length = setLength;
}

int Button::getHalfPoint()
{
    return length/2;
}
//access variable with buttonName.getHalfPoint()
Any help is appreciated. (And in the real code I calculate a location much more complex than the half point)
The getHalfPoint() method will take up less room if there are a lot of Buttons. Why? Because member functions are actually just implemented by the compiler as regular functions with an implied first argument of a pointer to the object. So your function is rewritten by the compiler as:
int getHalfPoint(Button* this)
{
    return this->length/2;
}
(It is a bit more complicated, because of name mangling, but this will do for an explanation.)
You should carefully consider the extra amount of computation that will have to be done to avoid storing 4 extra bytes, however. And as Cameron mentions, the compiler might add extra space to the object anyway, depending upon the architecture (I think that is likely to happen with RISC architectures).
Well, that depends!
The method code exists exactly once in memory, but a member variable exists once for each object instance.
So you'll have to count the number of instances you create (multiplied by the size of the variable), and compare that to the size of the compiled method (using a tool such as objdump).
You'll also want to compare the size of your Button with and without the extra variable, because it's entirely possible that the compiler pads it to the same size anyway.
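A quick way to check this (ButtonA and ButtonB are my own stand-ins for the two layouts; the exact numbers depend on the target ABI and the member types):

#include <cstdio>

struct ButtonA { int x, y, length, halfPoint; };   // Method 1: stores halfPoint
struct ButtonB { int x, y, length; };              // Method 2: computes it on demand

int main() {
    // With 4-byte ints and 4-byte alignment this typically prints "16 12",
    // i.e. 4 bytes saved per instance, but padding rules can erase the difference.
    std::printf("%zu %zu\n", sizeof(ButtonA), sizeof(ButtonB));
    return 0;
}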
I suggest you define the getHalfPoint method inside your class definition. A member function defined in the class is implicitly inline, which encourages the compiler to inline the code.
There is a good chance that the body of the function is a single assembly instruction and, depending on your platform, takes 4 bytes or less. In that case there is probably no benefit to having a variable hold half of another variable. Research "right shifting". Also, to take full advantage, make the variable an unsigned int: right-shifting a negative signed integer is implementation-defined, so the compiler has to emit extra instructions for signed division.
The inline capability means that the body of the function is pasted wherever the function is called. This removes the overhead of a function call (the branch instruction, pushing and popping arguments). Removing a branch may even speed up the program because there is no disruption of the instruction pipeline.
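A sketch of that suggestion (my own simplification of the class; it assumes the members can be unsigned):

class Button {
public:
    unsigned int x, y, length;

    // Defined inside the class, so it is implicitly inline; with an unsigned
    // operand, length >> 1 is a single shift instruction on most targets.
    unsigned int getHalfPoint() const { return length >> 1; }
};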
I have a struct which holds values that are used as the arguments of a for loop:
struct ARGS {
    int endValue;
    int step;
    int initValue;
};

ARGS * arg = ...; //get a pointer to an initialized struct

for (int i = arg->initValue; i < arg->endValue; i += arg->step) {
    //...
}
Since the values of initValue and step are checked each iteration, would it be faster if I moved them into local variables before the for loop?
int initValue = arg->initValue;
int endValue = arg->endValue;
int step = arg->step;

for (int i = initValue; i < endValue; i += step) {
    //...
}
The clear-cut answer is that in 99.9% of the cases it does not matter, and you should not be concerned with it. There might be micro differences, but they won't matter to almost anyone; the gory details depend on the architecture and the optimizer. But bear with me: "mostly does not matter" means a very, very high probability that there is no difference.
// case 1
ARGS * arg = ...; //get a pointer to an initialized struct
for (int i = arg->initValue; i < endValue; i+=arg->step) {
//...
}
// case 2
initValue = arg->initValue;
step = arg->step;
for (int i = initValue; i < endValue; i+=step) {
//...
}
In the case of initValue, there will be no difference. The value will be loaded through the pointer and stored into the initValue variable, just to then be stored in i. Chances are the optimizer will skip initValue and write directly to i.
The case of step is a bit more interesting, in that the compiler can prove that the local variable step is not shared by any other thread and can only change locally. If the pressure on registers is small, it can keep step in a register and never have to access the real variable. On the other hand, it cannot assume that arg->step is not changing by external means, and is required to go to memory to read the value. Understand that memory here most probably means the L1 cache. An L1 cache hit on a Core i7 takes approximately 4 CPU cycles, which at roughly 0.5 * 10^-9 seconds per cycle (on a 2 GHz processor) is about 2 nanoseconds. And that is the worst-case difference, under the assumption that the compiler can keep step in a register, which may not be the case. If step cannot be held in a register, you will pay for the access to memory (cache) in both cases.
Write code that is easy to understand, then measure. If it is slow, profile and figure out where the time is really spent. Chances are that this is not the place where you are wasting cpu cycles.
This depends on your architecture. Whether it is a RISC or CISC processor affects how memory is accessed, and on top of that the available addressing modes affect it as well.
On the ARM code I work with, typically the base address of a structure is moved into a register, and then a load is executed from that address plus an offset. To access a plain variable, the address of the variable is moved into the register, and then the load is executed without an offset. In this case it takes the same amount of time.
Here's what the example assembly code might look like on ARM for accessing the second int member of a structure, compared to directly accessing a variable.
ldr r0, =MyStruct ; struct {int x, int y} MyStruct
ldr r0, [r0, #4] ; load MyStruct.y into r0
ldr r1, =MyIntY ; int MyIntX, MyIntY
ldr r1, [r1] ; directly load MyIntY into r1.
If your architecture does not allow addressing with offsets, then it would need to move the address into a register and then perform the addition of the offset.
Additionally, since you've tagged this as C++ as well: if you overload the -> operator for the type, this will invoke your own code, which could take longer (see the sketch below).
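For illustration (ArgsHandle is my own hypothetical wrapper, not something from the question):

struct ArgsHandle {
    ARGS *p;
    mutable long derefCount = 0;

    // Every handle->step in a loop now goes through this user-defined function
    // as well; it is usually inlined away, but it is still extra code.
    ARGS *operator->() const { ++derefCount; return p; }
};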
The problem is that the two versions are not identical. If the code in the ... part modifies the values in arg, then the two options behave differently (the "optimized" one will use the step and end values read before the loop, not the updated ones).
If the optimizer can prove by looking at the code that this cannot happen, then the performance will be the same, because moving loop-invariant loads out of loops is a common optimization today. However, it's quite possible that something in ... could POTENTIALLY change the contents of the structure, and in this case the optimizer must be paranoid and the generated code will reload the values from the structure at each iteration. How costly that is depends on the processor.
For example, if the arg pointer is received as a parameter and the code in ... calls any external function whose code is unknown to the compiler (including things like malloc), then the compiler must assume that MAYBE the external code knows the address of the structure and MAYBE it will change the end or step values. Thus the optimizer is forbidden from moving those loads out of the loop, because doing so would change the behavior of the code.
Even if it's obvious to you that malloc is not going to change the contents of your structure, this is not obvious at all to the compiler, for which malloc is just an external function that will be linked in at a later step.
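A small sketch of that situation (external_work and run are hypothetical names; the reloads described in the comment are what a conservative compiler is forced to emit):

void external_work();   // body unknown to the compiler at this point

void run(ARGS *arg) {
    for (int i = arg->initValue; i < arg->endValue; i += arg->step) {
        // For all the compiler knows, external_work() could reach *arg through
        // some other pointer and modify endValue or step, so both may have to be
        // reloaded from memory on every iteration.
        external_work();
    }
}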
Can someone help me get a better understanding of creating variables in C++? I'll state my understanding and then you can correct me.
int x;
Not sure what that does besides declare that x is an integer on the stack.
int x = 5;
Creates a new variable x on the stack and sets it equal to 5. So empty space was found on the stack and then used to house that variable.
int* px = new int;
Creates an anonymous variable on the heap. px is the memory address of the variable. Its value is 0 because, well, the bits are all off at that memory address.
int* px = new int;
*px = 5;
Same thing as before, except that the value of the integer at memory address px is set to 5. (Does this happen in one step? Or does the program create an integer with value 0 on the heap and then set it to 5?)
I know that everything I wrote above probably sounds naive, but I really am trying to understand this stuff.
Others have answered this question from the point of view of how the C++ standard works. My only additional comment concerns global or static variables. So if you have
int bar ()
{
    static int x;
    return x;
}
then x doesn't live on the stack. It will be initialized to zero at the "start of time" (this is done by the startup code, crt0, at least with GCC; look up "BSS" segment for more information), and bar will return zero.
I'd massively recommend looking at the assembled code to see how a compiler actually treats what you write. For example, consider this tiny snippet:
int foo (int a)
{
    int x, y;
    x = 3;
    y = a;
    return x + y;
}
I made sure to use the values of x and y (by returning their sum) to ensure the compiler didn't just elide them completely. If you stick that code in a file called tmp.cc and then compile it with
$ g++ -O2 -c -o tmp.o tmp.cc
then ask for the disassembled code with objdump, you get:
$ objdump -d tmp.o
tmp.o: file format elf32-i386
Disassembly of section .text:
00000000 <_Z3fooi>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 83 c0 03 add $0x3,%eax
7: c3 ret
Whoah! What happened to x and y? Well, the point is that the C and C++ standards merely require the compiler to generate code that has the same behaviour as what your program asks for. In fact, this program loads 32 bits from the stack (this is the contents of a, a fact dictated by the ABI on my particular platform) and sticks it in the eax register. Then it adds three and returns. Another important fact about the ABI on my laptop (and probably yours too) is that the return value of a function sits in eax. Notice, the function didn't allocate any memory on the stack at all!
In fact, I also put bar (from above) in my tmp.cc. Here's the resulting code:
00000010 <_Z3barv>:
10: 31 c0 xor %eax,%eax
12: c3 ret
"Huh, what happened to x?", I hear you say :-) Well, the compiler spotted that nothing in the code required x to actually exist, and it always had the value zero. So the function basically got transformed into
int bar ()
{
    return 0;
}
Magic!
When a new local variable is created, it does not have a defined value. It can be anything, pretty much depending on what was in that piece of stack or heap before. After int x;, the compiler will give you a warning if you try to use the value without setting it to something first. E.g. int y = x; will cause a warning unless you give x an explicit value first.
Creating an int on the heap works pretty much the same way: int *p = new int; default-initializes the int, which for a built-in type means doing nothing, leaving the value of *p up to chance until you set it explicitly. If you want to make sure your heap value is initialized, use int *p = new int(5); to tell it what value to put into the memory it allocates.
Unless you initialize an int variable explicitly, it is pretty much never initialized for you, unless it is a global, a namespace-scope variable, or a class static.
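A short sketch of the different forms, summarizing the rules above:

int *p1 = new int;      // default-initialized: the value of *p1 is indeterminate ("junk")
int *p2 = new int();    // value-initialized: *p2 is guaranteed to be 0
int *p3 = new int(5);   // direct-initialized: *p3 is 5
delete p1;
delete p2;
delete p3;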
In VS2010 specifically (other compilers may treat it differently), an int is not given a default value of 0. You can see this by trying to print out a non-initialized int. Memory the size of an int is allocated, but it is not initialized (just junk).
In both of your cases, the memory is allocated FIRST, and then the value is set. If a value is not set, you have an uninitialized piece of memory with "junk data" inside it, and you will get a compiler warning and possibly an error when running it.
Yes, it has an address in memory, but there is no valid (known) data inside it unless you specifically set it. It very well could be anything that the compiler recognizes as available memory to be overwritten. Since it is unknown and not reliable, it is considered junk, which is why compilers warn you about it.
Compilers WILL set static int and global int to 0.
EDIT: Due to Peter Schneider's comment.
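A minimal sketch of that distinction (my own example; reading the local before assigning to it would be undefined behaviour):

#include <cstdio>

int global_i;             // static storage duration: zero-initialized before main() runs

int main() {
    static int static_i;  // also zero-initialized
    int local_i;          // automatic storage: holds an indeterminate ("junk") value

    std::printf("%d %d\n", global_i, static_i);  // prints "0 0"
    // std::printf("%d\n", local_i);             // if uncommented: undefined behaviour
    (void)local_i;        // silences the unused-variable warning without reading the value
    return 0;
}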