Memory allocation and char arrays

I still do not quite understand, what exactly will happen in the situation:
int i = 0;
for(i; i <100; i ++)
{
char some_array[24];
//...
strcpy(some_array,"abcdefg");
}
Will the some_array act as:
some_array = malloc(24);
At the beginning of the cycle and
free(some_array) at the end of the cycle?
Or will the array be allocated on the stack and destroyed when the function ends?

some_array is local to the block, so it's created at the beginning of each iteration of the loop, and destroyed again at the end of each iteration of the loop.
In the case of a simple array, "create" and "destroy" don't mean much. If (in C++) you replace it with (for example) an object that prints something out when it's created and destroyed, you'll see those side effects happen though.

There are two ways of storing character strings.
1. Allocating space with malloc and then storing the string in it
In this case the memory comes from the heap and is not freed until you free it explicitly. It stays allocated even after the scope ends.
2. Creating an array and storing the string in it
Here the memory comes from the stack and is freed implicitly when the scope ends.

There is no implicit malloc() or free calls introduced by the example shown.
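You can check this by adding a "free(some_array)" statement within the loop body. The call typically compiles, because the array decays to a pointer (some compilers warn about freeing a non-heap object), but the behaviour is undefined, because that pointer was never returned by malloc().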
The array is - as far as the program is concerned - created at the start of the block and destroyed at the end. Which means the C programmer must assume it is created and destroyed for every iteration of the loop.
It is up to the compiler as to whether the array is created on the stack - or how it optimises the repeated creation and destruction of the array. In practice, it often will be created on the stack.

Related

auto variables and auto objects memory allocation in a loop

Can anyone tell me how many times variable integer1 allocated and deallocated?
how about class_object? Is it true that both of them allocate and deallocate three times?
for(int i = 0; i < 3; i++){
int integer1;
Class1 class_object(some_parameter);
}

For local variables allocation and deallocation is something compiler specific. Allocation/deallocation for local variables means reserving space on the stack.
Most compilers though will move the allocation and deallocation of the variables out of the loop and reuse the same space for the variable every time.
So there would be one allocation, meaning changing the stack pointer, before the loop and one deallocation, meaning restoring the stack pointer, after the loop. Many compilers will compute the maximum space needed for the function and allocate it all only once on function entry. Stack space can also be reused when the compiler sees that the lifetime of a variable has ended or that it simply can't be accessed anymore by later code. So talking about allocation and deallocation is rather pointless here.
Aren't you more interested in the number of constructions and destructions happening? In that case yes, the constructor for Class1 is called 3 times and so is the destructor. But again the compiler can optimize that as long as the result behaves "as if" the constructor/destructor were called.
PS: if the address of something is never taken (or can be optimized away) then the compiler might not even reserve stack space and just keep the variable in a register for the whole lifetime.

For automatic (local stack) variables the compiler reserves some space on the stack.
In this case (if we ignore optimizations) the compiler will reserve space for integer1 and class_object that most probably will be reused in each loop iteration.
For basic data types nothing is done beyond this but for classes the compiler will call the constructor when entering the scope of the variable and call the destructor when the variable goes out of scope.
Most probably both variables get the same address on each loop iteration (but this does not have to be true from the standard's point of view).
The term allocation usually refers to requesting some heap memory or other resource from the operating system. By this definition nothing is allocated.
But assigning some stack space (or a register) to an automatic variable may also be called allocation. Most compilers will allocate that memory once, by making the stack frame big enough on entering the routine.
Summary:
In the end it is totally up to the compiler. You are just guaranteed to get a valid object in its scope.

Why does my program crash when I increment a pointer and then delete it?

I was solving some programming exercise when I realised that I have a big misunderstanding about pointers. Please could someone explain the reason that this code causes a crash in C++.
#include <iostream>
int main()
{
int* someInts = new int[5];
someInts[0] = 1;
someInts[1] = 1;
std::cout << *someInts;
someInts++; //This line causes program to crash
delete[] someInts;
return 0;
}
P.S I am aware that there is no reason to use "new" here, I am just making the example as small as possible.

It's actually the statement after the one you mark as causing the program to crash that causes the program to crash!
You must pass the same pointer to delete[] as you get back from new[].
Otherwise the behaviour of the program is undefined.

The problem is that with the someInts++; you are passing the address of the second element of an array to your delete[] statement. You need to pass the address of the first (original) element:
int* someInts = new int[5];
int* originalInts = someInts; // points to the first element
someInts[0] = 1;
someInts[1] = 1;
std::cout << *someInts;
someInts++; // points at the second element now
delete[] originalInts;

Without going into the specifics of a specific implementation here, the intuitive reason behind the crash can be explained simply by considering what delete[] is supposed to do:
Destroys an array created by a new[]-expression
You give delete[] a pointer to an array. Among other things, it has to free the memory that was allocated to hold the contents of that array.
How does the allocator know what to free? It uses the pointer you gave it as a key to look up the data structure which contains the bookkeeping information for the allocated block. Somewhere, there is a structure which stores the mapping between pointers to previously allocated blocks and the associated bookkeeping information.
You may wish this lookup to result in some kind of friendly error message if the pointer you pass to delete [] was not one returned by a corresponding new[], but there is nothing in the standard that guarantees that.
So, it is possible, given a pointer which had not been previously allocated by new[], delete[] ends up looking at something that really is not a consistent bookkeeping structure. Wires get crossed. A crash ensues.
Or, you might wish that delete[] would say "hey, it looks like this pointer points to somewhere inside a region I allocated before. Let me go back and find the pointer I returned when I allocated that region and use that to look up the bookkeeping information" but, again, there is no such requirement in the standard:
For the second (array) form, expression must be a null pointer value or a pointer value previously obtained by an array form of new-expression. If expression is anything else, including if it's a pointer obtained by the non-array form of new-expression, the behavior is undefined. [emphasis mine]
In this case, you are lucky because you found out you did something wrong instantaneously.
PS: This is a hand-wavy explanation

You can increment a pointer within the block and use that incremented pointer to access different parts of the block, that is fine.
However you must pass delete[] the pointer you got from new[]. Not an incremented version of it, not a pointer that was allocated by some other means.
Why? Well, the cop-out answer is because that is what the standard says.
The practical answer is because to free a block of memory the memory manager needs information about the block. For example where it starts and ends, and whether adjacent chunks are free (normally a memory manager will combine adjacent free chunks) and what arena it belongs to (important for locking in multithreaded memory managers).
This information is typically stored immediately prior to the allocated memory. The memory manager will subtract a fixed value from your pointer and look for a structure of allocation metadata at that location.
If you pass a pointer that does not point to the start of a block of allocated memory then the memory manager tries to perform the subtraction and read its control block, but what it ends up reading is not a valid control block.
If you are lucky then the code crashes quickly, if you are unlucky then you can end up with subtle memory corruption.

Why can't you free variables on the stack?

The languages in question are C/C++.
My prof said to free memory on the heap when you're done using it, because otherwise you can end up with memory that can't be accessed. The problem with this is you might end up with all your memory used up and be unable to access any of it.
Why doesn't the same concept apply to the stack? I understand that you can always access the memory you used on the stack, but if you keep creating new variables, you will eventually run out of space, right? So why can't you free variables on the stack to make room for new variables like you can on the heap?
I get that the compiler frees variables on the stack, but that's at the end of the scope of the variable, right? Doesn't it also free a variable on the heap at the end of its scope? If not, why not?

Dynamically allocated objects ("heap objects" in colloquial language) are never variables. Thus, they can never go out of scope. They don't live inside any scope. The only way you can handle them is via a pointer that you get at the time of allocation.
(The pointer is usually assigned to a variable, but that doesn't help.)
To repeat: Variables have scope; objects don't. But many objects are variables.
And to answer the question: You can only free objects, not variables.

The end of the closed "}" braces is where the stack "frees" its memory. So if I have:
{
int a = 1;
int b = 2;
{
int c = 3; // c gets "freed" at this "}" - the stack shrinks
// and c is no longer on the stack.
}
} // a and b are "freed" from the stack at this last "}".
You can think of c as being "higher up" on the stack than "a" and "b", so c is getting popped off before them. Thus, every time you write a "}" symbol, you are effectively shrinking the stack and "freeing" data.

There are already nice answers but I think you might need some more clarification, so I'll try to make this a more detailed answer and also try to keep it simple (if I manage to). If something isn't clear (which is likely, since I'm not a native English speaker and sometimes have trouble formulating answers), just ask in the comments. I'm also going to use the Variables vs Objects idea that Kerrek SB uses in his answer.
To make that clearer: I consider Variables to be named Objects, with an Object being something that stores data within your program.
Variables on the stack have automatic storage duration: they are automatically destroyed and their memory reclaimed once their scope ends.
{
std::string first_words = "Hello World!";
// do some stuff here...
} // first_words goes out of scope and the memory gets reclaimed.
In this case first_words is a Variable (since it got its own name) which means it is also an Object.
Now what about the heap? Lets describe what you might consider being "something on the heap" as a Variable pointing to some memory location on the heap where an Object is located. Now these things got what's called dynamic storage duration.
{
std::string * dirty = nullptr;
{
std::string * ohh = new std::string{"I don't like this"}; // ohh is a std::string* and a Variable
// The actual std::string is only an unnamed
// Object on the heap.
// do something here
dirty = ohh; // dirty points to the same memory location as ohh does now.
} // ohh goes out of scope and gets destroyed since it is a Variable.
// The actual std::string Object on the heap doesn't get destroyed
std::cout << *dirty << std::endl; // Will work since the std::string on the heap that dirty points to
// is still there.
delete dirty; // now the object being pointed to gets destroyed and the memory reclaimed
dirty = nullptr; // can still access dirty since it's still in its scope.
} // dirty goes out of scope and get destroyed.
As you can see objects don't adhere to scopes and you got to manually manage their memory. That's also a reason why "most" people prefer to use "wrappers" around it. See for example std::string which is a wrapper around a dynamic "String".
Now to clarify some of your questions:
Why can't we destroy objects on the stack?
Easy answer: Why would you want to?
Detailed answer: It would be destroyed by you and then destroyed again once it leaves the scope, which isn't allowed. Also, you should generally only have variables in your scope that you actually need for your computation, and how would you destroy one if you actually need it to finish your computation? But if you really only need a variable for a short time within a computation, you can just make a new smaller scope with { } so your variable gets automatically destroyed once it isn't needed anymore.
Note: If you got a lot of variables that you only need for a small part of your computation it might be a hint that that part of the computation should be in its own function/scope.
From your comments: Yeah I get that, but thats at the end of the scope of the variable right. Doesn't it also free a variable on the heap at the end of its scope?
They don't. Objects on the heap got no scope, you can pass their address out of a function and it still persists. The pointer pointing to it can go out of scope and be destroyed but the Object on the heap still exists and you can't access it anymore (memory leak). That's also why it's called manual memory management and most people prefer wrappers around them so that it gets automatically destroyed when it isn't needed anymore. See std::string, std::vector as examples.
From your comments: Also how can you run out of memory on a computer? An int takes up like 4 bytes, most computers have billions of bytes of memory... (excluding embedded systems)?
Well, computer programs don't always just hold a few ints. Let me just answer with a little "fake" quote:
640K [of computer memory] ought to be enough for anybody.
But that isn't enough, as we should all know. And how much memory is enough? I don't know, but certainly not what we have now. There are many algorithms, problems and other things that need tons of memory. Just think about something like computer games. Could we make "bigger" games if we had more memory? Just think about it... You can always make something bigger with more resources, so I don't think there's any limit where we can say it's enough.

So why can't you free variables on the stack to make room for new variables like you can on the heap?
All the information the "stack allocator" has is ESP, the stack pointer, which marks the boundary between used and free stack space (the stack grows toward lower addresses here).
N: used
N-1: used
N-2: used
N-3: used <- **ESP**
N-4: free
N-5: free
N-6: free
...
That makes "stack allocation" very efficient - just decrease ESP by the size of allocation, plus it is locality/cache-friendly.
If you would allow arbitrary deallocations, of different sizes - that will turn your "stack" into "heap", with all associated additional overhead - ESP would not be enough, because you have to remember which space is deallocated and which is not:
N: used
N-1: free
N-2: free
N-3: used
N-4: free
N-5: used
N-6: free
...
Clearly, ESP alone is no longer enough. And you also have to deal with fragmentation problems.
I get that the compiler frees variables on the stack, but thats at the end of the scope of the variable right. Doesn't it also free a variable on the heap at the end of its scope? If not, why not?
One of the reasons is that you don't always want that - sometimes you want to return allocated data to caller of your function, that data should outlive scope where it was created.
That said, if you really need scope-based lifetime management for "heap" allocated data (and most of time it is scope-based, indeed) - it is common practice in C++ to use wrappers around such data. One of examples is std::vector:
{
std::vector<int> x(1024); // internally allocates array of 1024 ints on heap
// use x
// ...
} // at the end of the scope destructor of x is called automatically,
// which does deallocation

Read about function calls: each call pushes its data and a return address onto the stack (the details depend on the calling convention), and that data is popped again when the function returns.
The stack has a limited size, usually set by the OS or the linker, and yes, it can be exhausted. Just try doing something like this:
int main(int argc, char **argv)
{
int table[1000000000];
return 0;
}
That should end quickly enough.

Local variables on the stack don't actually get freed. The registers pointing at the current stack are just moved back up and the stack "forgets" about them. And yes, you can occupy so much stack space that it overflows and the program crashes.
Variables on the heap do get freed automatically - by the operating system, when the program exits. If you do
int x;
for(x=0; x<=99999999; x++) {
int* a = malloc(sizeof(int));
}
each iteration overwrites a with a new pointer, so the address of the previously allocated heap block is lost. That memory is NOT freed while the program keeps running. This is called a "memory leak". Eventually, you may use up all the memory available to the heap, and allocations will fail or the program will crash.

The heap is managed by code: deleting a heap allocation is done by calling the heap manager. The stack is managed by adjusting the stack pointer register, which compiler-generated code and the hardware do directly. There is no manager to call.

C pointer array scope and function calls

I have this situation:
{
float foo[10];
for (int i = 0; i < 10; i++) {
foo[i] = 1.0f;
}
object.function1(foo); // stores the float pointer to a const void* member of object
}
object.function2(); // uses the stored void pointer
Are the contents of the float pointer unknown in the second function call? It seems that I get weird results when I run my program. But if I declare the float foo[10] to be const and initialize it in the declaration, I get correct results. Why is this happening?

For the first question, yes using foo once it goes out of scope is incorrect. I'm not sure if it's defined behavior in the spec or not but it's definitely incorrect to do so. Best case scenario is that your program will immediately crash.
As for the second question, why does making it const work? This is an artifact of implementation. Likely what's happening is the data is being written out to the data section of the DLL and hence is valid for the life of the program. The original sample instead puts the data on the stack where it has a much shorter lifetime. The code is still wrong, it just happens to work.

Yes, foo[] is out of scope when you call function2. It is an automatic variable, stored on the stack. When the code exits the block it was defined in, it is deallocated. You may have stored a reference (pointer) to it elsewhere, but that is meaningless.

In both cases you are getting undefined behaviour. Anything might happen.
You are storing a pointer to the locally declared array, but once the scope containing the array definition is exited, the array and all its elements are destroyed.
The pointer that you have stored now no longer points to a float or even a valid memory address that could be used for a float. It might be an address that is reused for something else or it might continue to contain the original data unchanged. Either way, it is still not valid to attempt to dereference the pointer, either for reading or writing a float value.

For any declaration like this:
{
type_1 variable_name_1;
type_2 variable_name_2;
type_3 variable_name_3;
}
the variables are allocated on the stack.
You can print out the address of each variable:
printf("%p\n", (void *)&variable_name);
and you'll see that the addresses differ by a small amount, roughly (but not always exactly) the amount of space each variable needs to store its data.
The memory used by stack variables is recycled when the '}' is reached and the variables go out of scope. This is done nicely and efficiently just by adjusting a special pointer called the 'stack pointer', which says where new stack variables will have their data allocated. By incrementing and decrementing the stack pointer, programs have an extremely fast way of working out where the memory for variables will live. It's such an important concept that every major processor has a dedicated register just for the stack pointer.
The memory for your array is also pushed and popped from the program's data stack and your array pointer is a pointer into the program's stack memory. While the language specification says accessing the data owned by out-of-scope variables has undefined consequences, the result is typically easy to predict. Usually, your array pointer will continue to hold its original data until new stack variables are allocated and assigned data (i.e. the memory is reused for other purposes).
So don't do it. Copy the array.
I'm less clear about what the standard says about constant arrays (probably the same thing: the memory is invalid when the original declaration goes out of scope). However, your different behavior is explainable if your compiler allocated a chunk of memory for constants that is initialized when your program starts, and later, foo is made to point to that data when it comes into scope. At least, if I were writing a compiler, that's probably what I'd do as it's both very fast and leads to using the smallest amount of memory. This theory can be tested in the following way (note that writing to a const object is itself undefined behavior, so treat this purely as an experiment):
void f()
{
const float foo[2] = {99, 101};
printf("-- %f\n", foo[0]);
const_cast<float*>(foo)[0] = 666; // deliberately writes to const data
}
Call f() twice. If the printed value changes between calls (or an invalid memory access exception is thrown), it's a fair bet that the data for foo is allocated in a special area for constants that the above code wrote over.
Allocating the memory in a special area doesn't work for non-const data because recursive functions may cause many separate copies of a variable to exist on the stack at the same time, each of which may hold different data.

It's undefined behavior in both cases. You should consider the stack based variable deallocated when control leaves the block.

What's happening is currently you're probably just setting a pointer (can't see the code, so I can't be sure). This pointer will point to the object foo, which is in scope at that point. But when it goes out of scope, all hell can break loose, and the C standard can make no guarantees about what happens to that data once it goes out of scope. It can be overwritten by anything. It works for a const array because you're lucky. Don't do that.
If you want the code to work correctly as it is, function1() is going to need to copy the data into the object member. Which means you'll also have to know the length of the array, which means you'll have to pass it in or have some nice termination method.

The memory associated with foo goes out of scope and is reclaimed.
Outside the {}, the pointer is invalid.
It is a good idea to make objects manage their own memory rather than refer to an external pointer. In this specific case your object could allocate its own foo internally and copy the data into it. However it really depends on what you are trying to achieve.

For simple problems like this it is better to give a simple answer, not 3 paragraphs about stacks and memory addresses.
There are 2 pairs of braces {}, one is inside the other. The array was declared after the first left brace { so it stops existing before the last brace }
The end
When answering a question you must answer it at the level of the person asking regardless of how well you yourself comprehend the issue or you may confuse the student.
-experienced ESL teacher

variable allocation in a nested loop question

Because obj, the playingCard object, is created inside a nested for loop, does that mean that obj gets deallocated from the stack each time the second for loop completes?
And a small side question:
does a compiler use the stack (similar to recursion) to keep track of loops and nested loops?
for(int c = 0;c<nElems;c++) {
for(int z = c + 1;z<nElems;z++) {
playingCard obj;
}
}

It gets constructed and destroyed every iteration.
However, on the stack, the concept of allocation is (for at least VS and GCC) more hazy. Since the stack is a contiguous block of memory, premanaged by the compiler, there's no real concept of allocating and deallocating in the way that there is for heap allocations (new/delete or malloc/free). The compiler uses the memory it needs on the stack, and simply rolls back the pointer later on.

The scope of the object is within the enclosing braces (whether they belong to a function or a loop). So as soon as the scope ends, the destructor of the object gets called and the object gets deallocated.
As for your second question, it is up to the compiler to choose its own strategy for handling the loops and keeping track of the objects.

It gets allocated/deallocated on every iteration of your inner loop.
I'm not clear on your side question, but the compiler uses the stack to keep track of all local variables that it can't otherwise just squeeze into registers.

Objects in the stack are allocated or deallocated once (even if they are nested in loops). However, constructors and destructors are called on every iteration.
