If I declare an array in the global scope, it uses up memory to store it. However, if I declare an array (I am using two types, one is a char array, while the other is an int array) inside a function (such as setup()) will the memory be freed automatically once the array goes out of scope?
I believe this happens for some variables such as int or byte. I just wanted to know if this applies to arrays as well.
Also, since I read that for programs containing lots of strings, it is best to store them in program space, does a call such as
lcd.print("Hello")
still use up the memory for the "Hello" string after the function ends (assuming that the print function does not store it someplace else)?
To the second question:
The F() macro keeps the string in program memory (flash, PROGMEM) instead of copying it into RAM, so you do not have this problem anymore:
lcd.print(F("Hello"));
As to your 1st question:
Yes. All (non-static) variables declared inside a function exist only until the function returns and are released automatically at that point. This has some implications:
You must not use a pointer to a locally declared variable after the variable has gone out of scope, for instance after the function has returned. (Don't return a pointer to a local array from your function!) It is, however, perfectly legal to pass that pointer to other functions when calling them from within the declaring block/function.
Local variables are stored on the local stack so that there needs to be enough room left for the stack to grow by the corresponding number of bytes when the function is called.
The amount of memory used by those variables is not accounted for in the calculation of "used" RAM at compile time.
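The pointer rule above can be sketched in plain C++ (not Arduino-specific). The function names fillGreeting and greetingLength are made up for this illustration; the point is only that a local array may be handed down to a callee, but must not be handed back to the caller.

```cpp
#include <cstddef>
#include <cstring>

// WRONG (shown only as a comment): the array dies when the function returns,
// so the caller would receive a dangling pointer.
//   const char* makeGreeting() { char text[6] = "Hello"; return text; }

// OK: the caller owns the storage; this function only fills it in.
void fillGreeting(char* out, std::size_t size) {
    if (size == 0) return;
    std::strncpy(out, "Hello", size - 1);
    out[size - 1] = '\0';  // strncpy does not always terminate the string
}

// OK: a local array may be passed down to functions called from its own scope.
std::size_t greetingLength() {
    char text[16];                    // lives on the stack until we return
    fillGreeting(text, sizeof text);  // the pointer is valid for the whole call
    return std::strlen(text);
}
```

The 16 bytes of text are taken from the stack only while greetingLength is running, which is why they do not show up in the compile-time RAM figure.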
Let's take a concrete example:
I want to use a vector of strings and have the choice of defining that vector as one of the following:
std::vector<std::string> myVector;
or
std::vector<std::string>* myPtrVector = new std::vector<std::string>;
I understand that myPtrVector is a memory address, but what is myVector? When I call myVector.push_back(someString), does it also use the memory address?
To consolidate my questions:
1). I would like to know how the computer treats normal (non-pointer) variables and how they are different from pointer variables.
2). I would also like to know if there is any advantage to declaring data structures (vectors, maps, stacks, etc.) as pointers.
Thank y'all in advance for answering! I am new here and if you have any advice for how I should have asked my question differently I would be happy to hear it.
myVector is a local variable. Depending on the context of its use it may be stored in different places, but the important thing is that it is automatically destroyed when its scope ends (e.g. if it is local to a function, when the function returns). Internally, vector allocates from the free store, meaning that when you push_back something, it is added to non-local memory (and more is allocated if needed). The vector uses a pointer internally to get at that memory, but you don't need to worry about that because it guarantees that it cleans up after itself.
As a final note, accessing a vector through a pointer is likely to be slightly slower than using it directly: the extra indirection can cost a cache miss, and you also have to manage the vector's lifetime yourself.
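A small sketch of that split between the vector object and its elements. The helper name elementsLiveOutsideObject is invented for this example, and the address comparison relies on std::uintptr_t, which mainstream implementations provide.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Returns true when the vector's element storage lies outside the bytes of the
// vector object itself, i.e. the elements were allocated somewhere else.
bool elementsLiveOutsideObject(const std::vector<std::string>& v) {
    if (v.empty()) return false;  // nothing allocated worth checking
    auto objBegin = reinterpret_cast<std::uintptr_t>(&v);
    auto objEnd = objBegin + sizeof(v);
    auto data = reinterpret_cast<std::uintptr_t>(v.data());
    return data < objBegin || data >= objEnd;
}

bool demoLocalVector() {
    std::vector<std::string> myVector;  // the object itself is a local variable
    myVector.push_back("first");        // the elements go to the free store
    myVector.push_back("second");
    return elementsLiveOutsideObject(myVector);
}   // myVector's destructor releases the element storage here
```

So a plain myVector already gives you heap-backed elements with automatic cleanup; wrapping it in new only adds a second allocation that you have to delete yourself.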
I will use an example with this simple structure. We will assume that an int is 4 bytes, so this structure is 8 bytes in size.
struct Point {
    int x;
    int y;
    Point(int x, int y){
        this->x = x;
        this->y = y;
    }
};
Now we will create 2 variables inside our nice function. One will be a normal variable, the other will be a pointer.
void niceFunction(){
Point normalVariable(10, 5);
Point* pointer = new Point(10, 5);
}
niceFunction(); // we call our function
What is the difference?
The first variable contains our whole structure. If you use sizeof(normalVariable), you will get 8 bytes as the result.
The second contains the address of our structure. When you use sizeof(pointer), you will get 4 bytes on a 32-bit system (8 on a typical 64-bit system): our structure is somewhere in dynamic memory and this pointer only contains its address.
Allocation
Allocation is the act of requesting memory for our structure. The system must find free memory and give it to us. We then fill this memory with our data (in our case, the numbers 10 and 5).
In the first case (normal variable), allocation is faster, because the system uses the stack for normal variables and thus always knows where your data is stored and where the free area begins. This has one problem: the stack has a limited size, and if you allocate too many structures on it, it will be exhausted and your program will be terminated by the OS. This error is called a stack overflow.
In the second case (pointer), the system must find a free area in dynamic memory and place your structure there. This takes some time. In return you have far more memory available (you are limited only by the size of your RAM and by restrictions imposed by the operating system).
Deallocation
Deallocation is the act of returning memory to the system. Returned memory will be reused when needed.
In the first case, deallocation is automatic. Because local variables exist only inside the function, they are removed from the stack when the function ends; there is nothing for us to deallocate.
In the second case (pointer), our structure is somewhere in dynamic memory and the system cannot know when we are done with it. We must explicitly state that we want to deallocate our structure:
void niceFunction(){
Point normalVariable(10, 5);
Point* pointer = new Point(10, 5);
delete pointer; // this will deallocate our structure
}
If we forget to deallocate a structure in dynamic memory, it will stay there until the end of the program.
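One common way to avoid forgetting the delete, not used in the answer above, is std::unique_ptr from the standard library (std::make_unique needs C++14 or later). The sketch below uses a hypothetical TrackedPoint and a livePoints counter purely to make the automatic cleanup observable.

```cpp
#include <memory>

int livePoints = 0;  // hypothetical counter, only here to make the cleanup visible

struct TrackedPoint {
    int x;
    int y;
    TrackedPoint(int x, int y) : x(x), y(y) { ++livePoints; }
    ~TrackedPoint() { --livePoints; }
};

int niceFunction() {
    // The object lives in dynamic memory, but unique_ptr owns it.
    std::unique_ptr<TrackedPoint> pointer = std::make_unique<TrackedPoint>(10, 5);
    return livePoints;  // 1 while the object is alive
}   // unique_ptr's destructor runs delete here, so livePoints drops back to 0
```

The structure is still allocated dynamically; only the responsibility for calling delete moves from the programmer to the smart pointer.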
Argument passing / Copying
Imagine function that will have 2 parameters (point a and b) and it will return you their distance. We will use normal variables in first and pointer in second.
float distanceNormal(Point a, Point b){
// body of function is irrelevant for us
}
float distancePointer(Point* a, Point* b){
// body of function is irrelevant for us
}
In the first case, when we pass our points into the function, the system copies them. That means it allocates new memory on the stack for the function and copies each whole structure into it. This can be inefficient if your structures are massive or if their copy constructor does expensive work. It also means that two normal variables cannot share one structure: they can have the same values, but they are not the same structure.
In the second case, nothing is copied. Instead, the system passes only the address of your structure in dynamic memory. This has one consequence: when you modify the structure through the pointer, you change the original structure, because both pointers hold the same address.
Point* pointer = new Point(10, 5); // values of our point are (10, 5)
void changeX(Point* point){
point->x = 50;
}
changeX(pointer); // after this call, values of our point are (50, 5)
This is exploited by the methods of structures/objects. Do you see that odd variable called this in the constructor? It is also a pointer: it points to our structure, which lets us manipulate the structure's values directly.
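C++ references give the same "no copy, same object" behaviour as the pointer version without the pointer syntax. This is an alternative the answer does not mention; the sketch below redefines Point so it is self-contained, and distanceByReference is a name chosen for the example.

```cpp
#include <cmath>

struct Point {
    int x;
    int y;
    Point(int x, int y) : x(x), y(y) {}
};

// No copy is made, and const stops the function from modifying the originals.
double distanceByReference(const Point& a, const Point& b) {
    return std::hypot(static_cast<double>(a.x - b.x),
                      static_cast<double>(a.y - b.y));
}

// A non-const reference changes the caller's object, like the pointer version.
void changeX(Point& point) {
    point.x = 50;
}
```

With these, distanceByReference(p, q) and changeX(p) are called with ordinary variables; no new, no & at the call site and no -> inside the function.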
Conclusion
You can see that both of them have advantages and disadvantages. When should you use each?
Use normal variables when:
you don't use the content outside of the function: when something is used only inside a function, there isn't any reason to allocate it in dynamic memory. The stack is the better option for this.
you want automatic deallocation: in our case, the normal variable is deallocated automatically. The second one (pointer) must be deallocated manually.
you don't want to share it with a function: when you use a pointer, you are working with the original and changes are applied to the original. When you use a normal variable, the function works with a copy and the original structure is unaffected.
Use pointers when:
you want to work with the original: if you want to manipulate the original structure inside a function, use a pointer.
you want to avoid copying: if you have a large structure and a function that is called 1000 times, copying it on every call is inefficient. Use a pointer to avoid the copies.
you want to avoid automatic deallocation: sometimes automatic deallocation isn't what you want. For example, you allocate a structure for your game inside a function, and it would be deallocated when the function ends. This never happens with pointers; you must deallocate them manually using delete.
The golden rule is: use normal variables where you can, use pointers where you must.
I'm currently adapting some example Arduino code to fit my needs. The following snippet confuses me:
// Dont put this on the stack:
uint8_t buf[RH_RF95_MAX_MESSAGE_LEN];
What does it mean to put the buf variable on the stack? How can I avoid doing this? What bad things could happen if I did it?
The program stack has a limited size (even on desktop computers, it's typically capped in megabytes, and on an Arduino, it may be much smaller).
Function-local variables are stored there in LIFO order: the variables of main are at the bottom of the stack, the variables of the functions called from main sit on top of that, and so on. Space is (typically) reserved on entering a function and not reclaimed until that function returns. If a function allocates a truly huge buffer (or multiple functions in a call chain allocate slightly smaller buffers), you can rapidly approach the stack limit, which will cause your program to crash.
It sounds like your array is being allocated outside of a function, putting it at global scope. The downside is that there is only one shared buffer (so two functions can't use it simultaneously without coordinating access, while a stack buffer would be reserved independently for each call). The upside is that it costs no stack: it is allocated in a separate static data area whose size is fixed when the program is built. On desktop systems that area is effectively limited only by available memory (gigabytes rather than megabytes); on an Arduino it shares the same small RAM as the stack, but its size is known at build time instead of showing up as a crash at run time.
So to answer your questions:
What does it mean to put the buf variable on the stack?
It would be on the stack if it:
Is declared in function scope rather than global scope, and
Is not declared as static (or thread_local, though that's more complicated than you should care about right now); if it's declared static at function scope, it's basically global memory that can only be referenced directly in that specific function
How can I avoid doing this?
Don't declare huge non-static arrays at function scope.
What bad things could happen if I did it?
If the array is large enough, you could suffer a stack overflow from running out of available stack space, crashing your program.
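The two placements can be put side by side in a short sketch. kMaxMessageLen is an assumed stand-in for RH_RF95_MAX_MESSAGE_LEN, and both function names are invented for the illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed size; stands in for RH_RF95_MAX_MESSAGE_LEN from the question.
constexpr std::size_t kMaxMessageLen = 251;

// The static array is created once and lives for the whole program,
// so it costs no stack space and pointers to it stay valid after return.
std::uint8_t* sharedBuffer() {
    static std::uint8_t buf[kMaxMessageLen];  // zero-initialized static storage
    return buf;
}

// By contrast, this array needs room on the stack for every call.
std::size_t stackBytesNeededPerCall() {
    std::uint8_t local[kMaxMessageLen];
    local[0] = 0;         // touch it so the example does something with it
    return sizeof local;  // the space is released as soon as the function returns
}
```

Every call to sharedBuffer returns the same address, which is exactly the "only one shared buffer" trade-off described above.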
I have created a project file, and in int main I make a call to a function. In the function there is a character array, corr[40], which stores the user's input letter by letter (it's a hangman game). After the function has executed, the program goes back to main. If the function is called again, the array still has the inputs of the previous call and is not erased, so only a few characters of the previous input are overwritten by new ones.
So I want to know how to allocate memory from the heap for the array (using a pointer). Or is there any other way I can correct this issue?
You've got a char[40] as a local variable in a function. Since that's not a class type, there is no constructor. The initial values will depend on whatever used to be in that memory location before. That might very well be all or some of the previous letters.
If you want the array to be zero each time, you can just use std::fill(std::begin(foo), std::end(foo), 0);
Note that using heap memory is no solution. There's still no constructor to initialize the heap memory, so that too would have any old value. Using std::string, which does have a constructor, is a solution.
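Both fixes can be sketched briefly. Besides std::fill, an empty initializer (= {}) zeroes the array at its declaration; the function names here are hypothetical and the input handling of the real game is left out.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// "= {}" value-initializes every element, so each call starts from all zeros.
bool freshArrayIsZeroed() {
    char corr[40] = {};
    return std::all_of(std::begin(corr), std::end(corr),
                       [](char c) { return c == '\0'; });
}

// std::string starts out empty on every call and grows as letters are added.
std::string collectGuesses(const std::string& letters) {
    std::string corr;  // constructed empty, no leftovers from earlier calls
    for (char c : letters) corr.push_back(c);
    return corr;
}
```

Calling collectGuesses twice with different input never shows letters from the earlier call, which is the behaviour the question is after.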
Q1. In Java, all objects, arrays and class variables are stored on the heap? Is the same true for C++? Is data segment a part of Heap?
What about the following code in C++?
class MyClass{
private:
static int counter;
static int number;
};
MyClass::number = 100;
Q2. As far as my understanding goes, variables which are given a specific value by the compiler are stored in the data segment, and uninitialized global and static variables are stored in BSS (Block Started by Symbol). In this case, MyClass::counter, being static, is initialized to zero by the compiler and so it is stored in BSS, and MyClass::number, which is initialized to 100, is stored in the data segment. Am I correct in making this conclusion?
Q3. Consider following piece of codes:
void doHello(MyClass &localObj){
// 3.1 localObj is a reference parameter, where will this get stored in Heap or Stack?
// do something
}
void doHelloAgain(MyClass localObj){
// 3.2 localObj is a parameter, where will this get stored in Heap or Stack?
// do something
}
int main(){
MyClass *a = new MyClass(); // stored in heap
MyClass localObj;
// 3.3 Where is this stored in heap or stack?
doHello(localObj);
doHelloAgain(localObj);
}
I hope I have made my questions clear to all
EDIT:
Please refer this article for some understanding on BSS
EDIT1: Changed the class name from MyInstance to MyClass as it was a poor name. Sincere Apologies
EDIT2: Changed the class member variable number from non-static to static
This is somewhat simplified but mostly accurate to the best of my knowledge.
In Java, all objects are allocated on the heap (including all your member variables). Most other stuff (parameters) are references, and the references themselves are stored on the stack along with native types (ints, longs, etc) except string which is more of an object than a native type.
In C++, if you were to allocate all objects with the "new" keyword it would be pretty much the same situation as java, but there is one unique case in C++ because you can allocate objects on the stack instead (you don't always have to use "new").
Also note that Java's heap performance is closer to C's stack performance than C's heap performance, the garbage collector does some pretty smart stuff. It's still not quite as good as stack, but much better than a heap. This is necessary since Java can't allocate objects on the stack.
Q1
Java also stores variables on the stack but class instances are allocated on the heap. In C++ you are free to allocate your class instances either on the stack or on the heap. By using the new keyword you allocate the instance on the heap.
The data segment is not part of the heap, but is allocated when the process starts. The heap is used for dynamic memory allocations while the data segment is static and the contents is known at compile time.
The BSS segment is simply an optimization where all the data belonging to the data segment (e.g. strings, constant numbers, etc.) that is uninitialized or initialized to zero is moved to the BSS segment. The data segment has to be embedded in the executable, and by moving "all the zeros" to the end they can be left out of the executable. When the executable is loaded, the BSS segment is allocated and initialized to zero, and the compiler still knows the addresses of the various buffers, variables, etc. inside the BSS segment.
Q2
MyClass::number is stored where the instance of MyClass class is allocated. It could be either on the heap or on the stack. Notice in Q3 how a points to an instance of MyClass allocated on the heap while localObj is allocated on the stack. Thus a->number is located on the heap while localObj.number is located on the stack.
As MyClass::number is an instance variable you cannot assign it like this:
MyClass::number = 100;
However, you can assign MyClass::counter as it is static (except that it is private):
MyClass::counter = 100;
Q3
When you call doHello the variable localObj (in main) is passed by reference. The variable localObj in doHello refers back to that variable on the stack. If you change it the changes will be stored on the stack where localObj in main is allocated.
When you call doHelloAgain the variable localObj (in main) is copied onto the stack. Inside doHelloAgain the variable localObj is allocated on the stack and only exists for the duration of the call.
In C++, objects may be allocated on the stack...for example, localObj in your Q3 main routine.
I sense some confusion about classes versus instances. "MyInstance" makes more sense as a variable name than a class name. In your Q1 example, "number" is present in each object of type MyInstance. "counter" is shared by all instances. "MyInstance::counter = 100" is a valid assignment, but "MyInstance::number = 100" is not, because you haven't specified which object should have its "number" member assigned to.
Q1. In Java, all objects, arrays and class variables are stored on the heap? Is the same true for C++? Is data segment a part of Heap?
No, the data section is separate from the heap. Basically, the data section is allocated at load time, everything there has a fixed location after that. In addition, objects can be allocated on the stack.
The only time objects are on the heap is if you use the new keyword, or if you use something from the malloc family of functions.
Q2. As far as my understanding goes, variables which are given a specific value by compiler are stored in data segment, and uninitialized global and static variables are stored in BSS (Block started by symbol). In this case, MyInstance::counter being static is initialized to zero by the compiler and so it is stored at BSS and MyInstance::number which is initialized to 100 is stored in the data segment. Am I correct in making the conclusion?
Yes, your understanding of the BSS section is correct. However, since number isn't static the code:
MyInstance::number = 100;
isn't legal, it needs to be either made static or initialized in the constructor properly. If you initialize it in the constructor, it will exist wherever the owning object is allocated. If you make it static, it will end up in the data section... if anywhere. Often static const int variables can be inlined directly into the code used such that a global variable isn't needed at all.
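For the version of the question where both members are static, a minimal sketch of legal definitions looks like this. The members are made public here (they are private in the question) so that the small read helpers, which are not part of the original, can show the values. Where the linker actually places each variable is up to the toolchain; BSS and the data segment are the typical choices, not a guarantee.

```cpp
class MyClass {
public:
    static int counter;  // declaration only
    static int number;   // declaration only
};

// Definitions at namespace scope; note the type in front of the name.
int MyClass::counter;       // no initializer: zero-initialized, typically placed in BSS
int MyClass::number = 100;  // non-zero initializer: typically placed in the data segment

int readCounter() { return MyClass::counter; }
int readNumber() { return MyClass::number; }
```

Neither variable is part of any MyClass object, so creating instances on the stack or on the heap does not change where counter and number live.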
Q3. Consider following piece of codes: ...
void doHello(MyInstance &localObj){
localObj is a reference to the passed object. As far as the language is concerned, it has no storage of its own; it refers to wherever the variable being passed is. In reality, under the hood, a pointer may be passed on the stack to facilitate this, but the compiler may just as easily optimize that out if it can.
void doHelloAgain(MyInstance localObj){
a copy of the passed parameter is placed on the stack.
MyInstance localObj;
// 3.3 Where is this stored in heap or stack?
localObj is on the stack.
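The difference between the two parameter kinds can be checked by comparing addresses. This is a sketch with an empty stand-in MyClass and two invented helper functions; the extra pointer argument exists only so the functions can compare against the caller's object.

```cpp
struct MyClass {
    int value = 0;
};

// The reference is another name for the caller's object: same address.
bool sameObjectByReference(const MyClass& localObj, const MyClass* original) {
    return &localObj == original;
}

// The by-value parameter is a separate copy with its own address.
bool sameObjectByValue(MyClass localObj, const MyClass* original) {
    return &localObj == original;
}
```

Called as sameObjectByReference(obj, &obj) the first returns true, while sameObjectByValue(obj, &obj) returns false because the parameter is a distinct object that exists only for the duration of the call.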
All memory areas in C++ are listed here
I have this situation:
{
float foo[10];
for (int i = 0; i < 10; i++) {
foo[i] = 1.0f;
}
object.function1(foo); // stores the float pointer to a const void* member of object
}
object.function2(); // uses the stored void pointer
Are the contents of the float pointer unknown in the second function call? It seems that I get weird results when I run my program. But if I declare the float foo[10] to be const and initialize it in the declaration, I get correct results. Why is this happening?
For the first question: yes, using foo once it has gone out of scope is incorrect. Dereferencing a pointer to an object whose lifetime has ended is undefined behavior. The best-case scenario is that your program crashes immediately.
As for the second question, why does making it const work? This is an artifact of the implementation. What is likely happening is that the data is written out to the data section of the DLL and hence is valid for the life of the program. The original sample instead puts the data on the stack, where it has a much shorter lifetime. The code is still wrong; it just happens to work.
Yes, foo[] is out of scope when you call function2. It is an automatic variable, stored on the stack. When the code exits the block it was defined in, it is deallocated. You may have stored a reference (pointer) to it elsewhere, but that is meaningless.
In both cases you are getting undefined behaviour. Anything might happen.
You are storing a pointer to the locally declared array, but once the scope containing the array definition is exited the array - and all its members are destroyed.
The pointer that you have stored now no longer points to a float or even a valid memory address that could be used for a float. It might be an address that is reused for something else or it might continue to contain the original data unchanged. Either way, it is still not valid to attempt to dereference the pointer, either for reading or writing a float value.
For any declaration like this:
{
type_1 variable_name_1;
type_2 variable_name_2;
type_3 variable_name_3;
}
the variables are allocated on the stack.
You can print out the address of each variable:
printf("%p\n", (void*)&variable_name_1);
and you'll see that the addresses differ by small amounts, roughly (but not always exactly) equal to the amount of space each variable needs to store its data.
The memory used by stack variables is recycled when the '}' is reached and the variables go out of scope. This is done nicely and efficiently just by adjusting a special pointer called the 'stack pointer', which says where the data for new stack variables will be allocated. By incrementing and decrementing the stack pointer, programs have an extremely fast way of working out where the memory for variables will live. It's such an important concept that every major processor maintains a dedicated register just for the stack pointer.
The memory for your array is also pushed and popped from the program's data stack and your array pointer is a pointer into the program's stack memory. While the language specification says accessing the data owned by out-of-scope variables has undefined consequences, the result is typically easy to predict. Usually, your array pointer will continue to hold its original data until new stack variables are allocated and assigned data (i.e. the memory is reused for other purposes).
So don't do it. Copy the array.
I'm less clear about what the standard says about constant arrays (probably the same thing: the memory is invalid when the original declaration goes out of scope). However, your different behavior is explainable if your compiler allocated a chunk of memory for constants that is initialized when your program starts, and later, foo is made to point to that data when it comes into scope. At least, if I were writing a compiler, that's probably what I'd do, as it's both very fast and leads to using the smallest amount of memory. This theory is easily testable in the following way:
void f()
{
    const float foo[2] = {99, 101};
    printf( "-- %f\n", foo[0] );
    // Deliberately writes to a const object: undefined behavior, for this experiment only.
    const_cast<float*>(foo)[0] = 666;
}
Call f() twice. If the printed value changes between calls (or an invalid memory access exception is thrown), it's a fair bet that the data for foo is allocated in a special area for constants, which the code above overwrote.
Allocating the memory in a special area doesn't work for non-const data because recursive functions may cause many separate copies of a variable to exist on the stack at the same time, each of which may hold different data.
It's undefined behavior in both cases. You should consider the stack based variable deallocated when control leaves the block.
What's happening is currently you're probably just setting a pointer (can't see the code, so I can't be sure). This pointer will point to the object foo, which is in scope at that point. But when it goes out of scope, all hell can break loose, and the C standard can make no guarantees about what happens to that data once it goes out of scope. It can be overwritten by anything. It works for a const array because you're lucky. Don't do that.
If you want the code to work correctly as it is, function1() is going to need to copy the data into the object member. Which means you'll also have to know the length of the array, which means you'll have to pass it in or have some nice termination method.
The memory associated with foo goes out of scope and is reclaimed.
Outside the {}, the pointer is invalid.
It is a good idea to make objects manage their own memory rather than refer to an external pointer. In this specific case your object could allocate its own foo internally and copy the data into it. However it really depends on what you are trying to achieve.
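A minimal sketch of that approach, assuming the object can simply copy the floats into a std::vector. Holder is a hypothetical stand-in for the question's object, and function2 just sums the stored values so there is something to observe.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the question's object: it copies the data
// instead of storing a pointer to the caller's array.
class Holder {
public:
    void function1(const float* data, std::size_t count) {
        stored_.assign(data, data + count);  // deep copy into our own storage
    }
    float function2() const {
        return std::accumulate(stored_.begin(), stored_.end(), 0.0f);
    }
private:
    std::vector<float> stored_;
};

float demoCopyOutlivesScope() {
    Holder object;
    {
        float foo[10];
        for (int i = 0; i < 10; i++) foo[i] = 1.0f;
        object.function1(foo, 10);
    }   // foo is gone here, but object still holds its own copy
    return object.function2();
}
```

Note that function1 now takes the element count as well, since a bare pointer carries no length information.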
For simple problems like this it is better to give a simple answer, not 3 paragraphs about stacks and memory addresses.
There are 2 pairs of braces {}, one is inside the other. The array was declared after the first left brace { so it stops existing before the last brace }
The end
When answering a question you must answer it at the level of the person asking regardless of how well you yourself comprehend the issue or you may confuse the student.
-experienced ESL teacher