Confused between structures and structure pointers - c++

I wanted to know the difference between
struct file_operations {
} a;
and
struct file_operations {
} *a;
As in, how are they allocated in memory? How does the compiler know the memory location of 'a' in the first case? Is it from the symbol table? If so, how is the address of the symbol table (or any other table) found?
In the second case I'm assuming the memory address gets stored in a variable of size 32 bits. How is this variable's location (the address of 'a' in the second snippet) figured out?

In the first case, an instance of the structure is allocated in memory; the number of bytes allocated equals the value returned by sizeof(a).
In the second case only a pointer is allocated; the number of bytes allocated equals the size of a pointer to the struct, which is typically the same as sizeof(void *).
As you might guess, the second case doesn't let you access the fields of the structure yet, because the pointer does not point to valid memory until you request enough memory from the heap or make it point to an existing instance like the one in your first example.
Suppose that we had the following structure
struct Data {
int quantity;
double value;
char name[100];
};
If you do the following
struct Data data;
then an instance of struct Data is allocated, and you can access its fields immediately, for example
data.quantity = 1;
data.value = 3.0;
strcpy(data.name, "My Name Is ...");
If you declare a pointer, like
struct Data *pointer;
then you can't access the fields until you make the pointer point to a valid instance of struct Data; dereferencing it before that is undefined behavior. You can get such an instance by taking the address of the struct Data data; object that we already initialized above, like this
pointer = &data;
The lifetime of the pointee limits how long the pointer stays valid: once execution leaves the scope where data was declared, the pointer is dangling, because data no longer exists.
Another way to make the pointer valid is malloc(), i.e. requesting memory from the heap by asking for sizeof(struct Data) bytes, like here (see note 1)
pointer = malloc(sizeof(struct Data));
After that, first check that the memory was actually allocated. When allocation fails, malloc() is guaranteed to return the null pointer, NULL, a special value that never points to a valid object, so you can test for it before using the pointer:
if (pointer != NULL)
{
pointer->quantity = 1;
pointer->value = 3.0;
strcpy(pointer->name, "My Name Is ...");
}
In this case the pointer is valid until you decide it isn't; when you are done with it, you must call free() like this
free(pointer);
After that, accessing the memory through the pointer again is undefined behavior.
Note 1: You can also make it independent of the type of pointer by using this syntax pointer = malloc(sizeof(*pointer));, since sizeof(*pointer) is equal to sizeof(struct Data). In C++, unlike C, the result of malloc() additionally needs a cast to struct Data *.

how are they allocated in memory?
In both cases, it depends on where the variable is defined.
At namespace scope (C++) or file scope (C), it's a global variable with static storage; its address is fixed when the program is linked and loaded. Typically, that's resolved through the symbol table, as you say.
At block scope, it's an automatic variable, and memory is typically allocated on the function's stack frame, some time before the program reaches the definition.
At class scope, it's part of the class that contains it.
In the second case I'm assuming the memory address gets stored in a variable of size 32 bits
It's however large a pointer is. On a 32-bit platform, that will be 32 bits. In this case, there is no file_operations object, only a pointer.

Both statements do two things: (1) define a struct called struct file_operations, and (2) declare a variable named a: an object of that type in the first case, a pointer to that type in the second. At block scope the variable is uninitialized; at file scope it is zero-initialized.
The first allocates space the size of the struct on the stack (or in static storage, if outside a function). Data members of the structure can then be accessed like a.member1 = 1. When inside a function, a is on the stack, just like any other local variable declaration, for example int a. If outside a function, it declares a global variable. Its memory address can be obtained with &a. While compiling, the compiler uses a symbol table that records the type, relative address, etc. of each identifier.
When declaring a global variable and compiling into a library, it also generates a symbol in the binary's symbol table so that it can be linked by the linker.
The second case declares a pointer to a struct file_operations. A pointer is a variable holding a memory address, so it has a size of 4 bytes in your case. Its type is struct file_operations *, which indicates that the data it points to must be of type struct file_operations. At block scope the variable is uninitialized: the pointer does not contain a valid address, and dereferencing it is undefined behavior (often a crash). To use it:
struct file_operations a;
struct file_operations* pa = &a;
This makes pa point to a. Then a's members can be accessed through pa via pa->member1 = 1. The pointer pa itself is also stored on the stack (or in static memory when outside a function). &pa is the address of the pointer, i.e. a pointer to a pointer; pa is the address that the pointer points to.

Related

Only the stack can store local (pointer) variables, not Heap?

I am new to C and C++. I understand that whenever a function is called, its variables get memory allocated on the stack, that includes the case where the variable happens to be a pointer that points to data allocated on the heap via malloc or new (but I heard it is not guaranteed that the storage allocated by malloc is 100% on the Heap, please correct me if I am wrong). For example,
void fn() {
    Member *p = new Member();
}
Or
void fn() {
    int *p = (int*) malloc( sizeof(int) * 10 );
}
Please correct if I am wrong, in both cases, variable p (which holds the address to the object allocated on the heap) is on the stack, and it points to the object on the heap.
So is it correct to say that all the variables we declare are on the stack even though they might point to something on the heap?
Let’s say the address of local variable pointer p is loaded at memory address 001, it has the address of the member object located on Heap, and that address is 002. We can draw a diagram like this.
If that is correct, my next question is: can we have a pointer that is actually located on the heap and points to a variable located on the stack? If that is not possible, can that pointer point to a variable located on the heap?
Maybe another way to phrase this question is: in order to access something on the heap, can we only access it via pointers on the stack?
A possible diagram could look like this
If that is possible, Can I have an example here?
Yes, you can put your pointer on the free store (heap) and have it point to a variable on the stack. The trick is to create a pointer to a pointer (int**):
int main()
{
int i = 0; // int on the stack
int** ip = new int*; // create an int* (int pointer) on the free store (heap)
// ip (the int**) is still on the stack
*ip = &i;
// Now your free store (heap) located pointer points
// to your stack based variable i
delete ip; // clean up
}
NOTE: The terms "heap" and "stack" are general, well understood, computing terms. In C++ they are referred to in the Standard as the "free store" and (although not directly named) a "stack" is 100% implied (eg. through references to "stack-unwinding") and therefore required.
Stack and heap are not specifically defined by the standard. Those are implementation details.
Heap refers to a region of memory from which storage is handed out and released on request; operating systems and runtime allocators manage it so that programs can obtain memory safely while running. Despite the shared name, it is generally unrelated to the heap (priority queue) data structure.
Here is a diagram for a simple heap so that you can have a mental model of it:
Keep in mind that this is not exactly what real systems use. In practice, allocators rely on far more elaborate bookkeeping that allows them to perform many sorts of complex memory-related tasks. Also, not every implementation provides the free store in the same way. Some may use different techniques.
Whereas a stack is much simpler:
can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?
Yes, it's possible but rarely needed:
#include <iostream>
int main( )
{
int a_variable_on_stack { 5 };
int** ptr_on_stack { new int*( &a_variable_on_stack ) };
std::cout << "address of `a_variable_on_stack`: " << &a_variable_on_stack << '\n'
<< "address of ptr on the heap: " << ptr_on_stack << '\n'
<< "value of ptr on the heap: " << *ptr_on_stack << '\n';
std::cin.get( );
}
Possible output:
address of `a_variable_on_stack`: 0x47eb5ffd2c
address of ptr on the heap: 0x1de33cc3810
value of ptr on the heap: 0x47eb5ffd2c
Notice how the address of a_variable_on_stack and value of ptr stored on heap are both 0x47eb5ffd2c. In other words, a pointer on the heap is holding the address of a variable that is on the stack.
In short:
Variables declared within a function are allocated on the stack, and can point to whatever you want (to the addresses of other variables on the stack or of objects on the heap).
The same holds for objects allocated on the heap. They can hold the addresses of other objects on the heap or of variables on the stack. There is no limitation here.
However, variables declared on the stack are temporary by nature: when the function returns, their memory is reclaimed. Therefore it is not good practice to keep pointers to stack variables unless you know the function has not finished yet (i.e. the address is used within the same function, or by functions called from it). A common mistake of novice C/C++ developers is to return the address of a variable declared on the stack. When the function returns, that memory is reclaimed and will soon be reused by other function calls, so accessing it through the old address is undefined behavior.
I am new to C and C++.
Your question is not C or C++ specific, but it is about programming languages in general.
... whenever a function is called, its variables get memory allocated on the stack ...
This is correct: Nearly all compilers do it this way.
However, there are exceptions - for example on SPARC or TriCore CPUs, which have a special feature...
... allocated on the heap via malloc ...
malloc never allocates memory on the stack but on the heap.
... is not guaranteed that the storage allocated by malloc is 100% on the heap ...
Unlike the word "stack", the meaning of the word "heap" differs a bit from situation to situation.
In some cases, the word "heap" is used to specify a certain memory area that is used by malloc and new.
If there is not enough memory in that memory area, malloc (or new) asks the operating system for memory in a different memory area.
However, other people would also call that memory area "heap".
... in both cases, variable p is on the stack, and it points to the object on the heap.
This is correct.
... can we have a pointer that is actually located on the heap, and it points to a variable located on Stack?
Sure:
int ** allocatedMemory;
void myFunction()
{
int variableOnStack;
allocatedMemory = (int **)malloc(sizeof(int *));
*allocatedMemory = &variableOnStack;
...
}
The variable allocatedMemory points to some data on the heap and that data is a pointer to a variable (variableOnStack) on the stack.
However, when the function myFunction() returns, the variable variableOnStack no longer exists. Let's say the function otherFunction() is called after myFunction():
void otherFunction()
{
int a;
int b;
...
}
Now we don't know whether *allocatedMemory points to a, to b, or even to the "return address", because we don't know which of them is stored at the address that variableOnStack used to occupy.
Bad things may happen if we write to **allocatedMemory now...
In order to access something in heap, we can only access it via pointers on the stack??
... diagram "B" ...
To access some data on the heap, you definitely need some pointer that is not stored on the heap.
This pointer can be:
A global or static variable
In my example above, allocatedMemory is a global variable.
Global and static variables are stored in a completely different memory area (neither heap nor stack)
A local variable on the stack
A local variable in a CPU register
(I already wrote that local variables are not always stored on the stack)
Theoretically, the situation in diagram "B" is possible: Simply overwrite the variable allocatedMemory by NULL (or another pointer).
However, a program cannot directly access data on the heap.
This means that *p (which is some data on the heap) cannot be accessed any more once there is no pointer "outside" the heap left that points to *p.

Memory Leakage when accessing, using pointer, elements from a struct received by argument

A little context: I have recently developed a generic function that updates a display image based only on the parts of it that have changed. For that I receive as a parameter a known struct whose members are arrays of chars (display lines), and I compare it with the currently displayed information in order to update only what is necessary, improving performance. It is an embedded system for ARM Cortex-M0, in case it matters.
Below is the code snippet for the struct.
struct displayLines {
char firstLine[13];
char secondLine[13];
char thirdLine[13];
char fourthLine[13];
char fivethLine[13];
char sixthLine[13];
};
typedef struct displayLines st_displayLines;
Based on that struct I use a pointer, initialized with the first element address and operate with it to access the whole struct data (all lines). Below is the code snippet showing how I initialize the pointer.
void updateScreen(st_displayLines st_toDisplay)
{
char *ptrtoUpdate = st_toDisplay.firstLine;
char *ptrDisplayed = st_currentlyDisplayed.firstLine;
//Here is the update code which consumes the pointers.
}
After operating with the same pointer variables (using pointer arithmetic) and consuming their data, I just return from the function without calling free() on the pointer variables.
The question finally is:
Is this a memory leak? Or will the memory allocated for the pointers be released once I'm outside the function scope?
If it is a leak, would it hold 1 byte of my memory per pointer variable every time I call this function? So that (making up numbers now) if I have just 10 bytes of memory available, my program would crash on the 6th call of the function?
Memory leaks happen when memory allocated with malloc no longer has a pointer pointing to it and the memory is not free'ed. The code as you've shown it doesn't allocate memory, so there's no leak.
Simply using a pointer variable is no different from using an int. The variable is in scope at the point it is declared and goes out of scope at the end of the enclosing compound statement. So no need to free anything.

Implicit memory allocation and initialization when declaring a pointer [duplicate]

Possible Duplicate:
Declaring a pointer to struct in C++ automatically allocates memory for its members. Am I wrong?
Say if I define the structure Human as:
struct Human{int year, Human* Mom};
does the expression
Human* Bob;
automatically allocate a memory for both Bob and the Human object it is pointing to?
Because I noticed that
Bob == NULL
is false;
Does this mean that the above expression creates a static memory of the object Human?
Also, I noticed that
Bob->year
is NOT initialized to 0 automatically, but
Bob->Mom
is being initialized to NULL, why is that?
Another thing, if I allocate the memory dynamically, e.g.
Human* Bob = new Human;
Then I found that
Bob->Mom
is no longer NULL. How is this happening?
At minimum, the structure declaration should be corrected to:
struct Human { int year; Human *Mom; };
The statement:
Human *Bob;
then creates a storage location, but does not initialize it if it is created inside a function. If it is at global scope, it would be initialized to zero (NULL), but you say Bob == NULL is false, so it must be an uninitialized local variable. It doesn't point at anything. Any use of it other than as the target for an assignment invokes undefined behaviour because the value in Bob is undefined.
No; the definition shown does not allocate storage for Bob to point at.
Your other observations all depend on the quirks of your runtime system. Because the behaviour is undefined, anything can happen and it is 'OK' according to the standard. You need a default constructor for the type to get the values set sensibly when you use:
Human *Bob = new Human;
You have not provided one, and the system doesn't need to provide one, so the object pointed at is uninitialized.
No, it doesn't allocate memory. Bob is a pointer with garbage value, so who knows what it's pointing to.
does the expression Human* Bob; automatically allocate a memory for both Bob and the Human object it is pointing to? Because I noticed that Bob==NULL is false; Does this mean that the above expression creates a static memory of the object Human?
No, it doesn't. Bob is not initialised, so will just point to some random garbage. You must explicitly allocate memory for Bob.
Another thing, if I allocate the memory dynamically, e.g. Human* Bob=new Human; Then I found that Bob->Mom is no longer NULL, how this is happening?
As above, uninitialised memory could have anything in it. Trying to follow the pointer will lead to disaster. Good practice is to initialise all pointers to NULL or to immediately point them at an allocated block of memory.
In your code, the only instruction you have given to the compiler is to create a pointer (either a 32 bit or 64 bit variable) which you intend to point to a structure of type "Human".
Since you have not initialized the pointer, its value is most likely what was in that memory space - that could be a 0 (NULL), or anything else.
When you evaluate Bob->Mom, you are dereferencing the pointer (Bob) and looking at an arbitrary area of memory (potentially one you are not allowed to access). You will most likely get a segmentation fault (on a *nix machine).
What you should do is:
Human* Bob = new Human();
That will create dedicated space for the structure and return the address of that structure and assign it to the pointer Bob. Now, when you dereference Bob, it will actually point to a space which has been allocated specifically for the Human structure.
It's your responsibility to initialize pointer values. Unless you have set them to something (like 0 or NULL), their value is indeterminate and you may get a different uninitialized pointer value with each allocation.
In terms of allocation, you have defined a recursive or self-referential data structure. Consider what happens if it in fact did allocate memory for an additional Human for the Mom member; it would then have to allocate another Human for Bob->Mom->Mom and so on...
So, no. Only one Human is allocated. Any native pointer is just a memory address location, nothing more.
You can make pointer initialization easier on yourself if you use a constructor:
struct Human {
int year;
Human *Mom;
Human() : year(0), Mom(NULL) {}
};

the scope of a pointer?

Ok, so I did find some questions that were almost similar but they actually confused me even more about pointers.
C++ Pointer Objects vs. Non Pointer Objects
In the link above, they say that if you declare a pointer it is actually saved on the heap and not on the stack, regardless of where it was declared. Is this true? Or am I misunderstanding? I thought that, pointer or not, if it's a global variable, it lives as long as the application. If it's a local variable declared within a loop or function, its life is only as long as the code within it.
The variable itself is stored on the stack or DATA segment, but the memory it points to after being allocated with new is within the heap.
int main()
{
    int* p; // p is stored on the stack
    p = new int[20]; // 20 ints are stored on the heap
}
// p no longer exists, but the 20 ints DO EXIST (and are leaked, since nothing called delete[])!
Hope that helps.
void func()
{
int x = 1;
int *p = &x;
}
// p goes out of scope, so does the memory it points to
void func()
{
int *p = new int;
}
// p goes out of scope, the memory it points to DOES NOT
void func()
{
int x = 1;
int **pp = new int*;
*pp = &x;
}
// pp goes out of scope, the pointer it points to does not, the memory it points to does
And so forth. A pointer is a variable that contains a memory location. Like all variables, it can be on the heap or the stack, depending on how it's declared. Its value (the memory location) can also refer to the heap or the stack.
Typically, if you declare something as a local (automatic) variable, it's on the stack. If you dynamically allocate something (using either new or malloc) then it's on the heap. Generally speaking you can only access dynamically allocated memory using a pointer. This is probably where the confusion arises.
It is necessary to distinguish between the pointer (a variable that holds a memory location) and the object to which the pointer points (the object at the memory address held by the pointer). A pointer can point to objects on the stack or on the heap. If you use new to allocate the object, it will be on the heap. The pointer can, likewise, live on the heap. If you declare it in the body of a function, then it will be a local variable and live in local storage (i.e. on the stack), whereas if it is a global variable, it will live somewhere in your application's data section. You can also have pointers to pointers, and similarly one can allocate a pointer on the heap (and have a pointer-to-a-pointer pointing to that), etc. Note that while I have referenced the heap and stack, the C++ standard only mentions local/automatic storage and dynamic storage; it does not speak to the implementation. In practice, though, local=stack and dynamic=heap.
A pointer is a variable containing the address of some other object in memory. The pointer variable can be allocated:
on the stack (as a local auto variable in a function or statement block)
statically (as a global variable or static class member)
on the heap (as a new object or as a class object member)
The object that the pointer points to (references) can likewise be allocated in these three places as well. Generally speaking, though, a pointed-to object is allocated using the new operator.
Local variables go out of scope when the program flow leaves the block (or function) that they are declared within, i.e., their presence on the stack disappears. Similarly, member variables of an object disappear when their parent object goes out of scope or is deleted from the heap.
If a pointer goes out of scope or its parent object is deleted, the object that the pointer references still exists in memory. Thus the rule of thumb that any code that allocates (news) an object owns the object and should also delete that object when it's no longer needed.
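Auto-pointers take some of the drudgery out of the management of the pointed-to object. An object that has been allocated through an auto_ptr is deleted when that pointer goes out of scope. The object can be assigned from its owning auto_ptr to another auto_ptr, which transfers object ownership to the second pointer. Note that std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement, with ownership transferred explicitly through std::move.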
References are essentially pointers in disguise, but that's a topic for another discussion.
I thought that regardless of a pointer or non pointer, if its a global variable, it lives as long as the application. If its a local variable or declared within a loop or function, its life is only as long as the code within it.
That's true.
they say that if you declare a pointer it is actually saved on the heap and not on the stack
That's wrong, partially. You can have a pointer on the heap or the stack. It's a matter of where and how you declare it.
struct StructWithPointer { char *p; }; // assumed definition; the original answer does not show it
int main()
{
    char c = 0x25;
    char *p_stack = &c; // pointer on stack
    StructWithPointer struct_stack; // stack
    // heap, thus its pointer member "p" (see next line) is also on the heap
    StructWithPointer *struct_heap = new StructWithPointer();
    struct_heap->p = &c; // pointer on heap points to a stack variable
    delete struct_heap;
}
... and, a compiler might decide to use a register for a pointer!
Looks like you need to grab the classic K&R C book and read through chapters 4 & 5 for thorough understanding of the differences between declaration and definition, scope of a variable and about pointers.

sending address of a variable declared on the stack?

I have a doubt concerning declaring variables, their scope, and if their address could be sent to other functions even if they are declared on the stack?
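class B {}; // placeholder; the question does not show B
class AA {
public:
    B* mb;
    void f2(B* b) {
        mb = b;
        //...
    }
};
class A {
public:
    AA a;
    void f1() {
        B b;
        a.f2(&b); // stores the address of a local variable
    }
};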
Afterwards, I use my AA::mb pointer in the code.
So here is what I would like to know. When the program exits A::f1(), the variable b, being a local variable placed on the stack, can't be used anymore.
What happens with the validity of the AA::mb pointer?
It contains the address of the local variable which could not be available anymore, so the pointer isn't valid anymore?
Suppose B is a std::vector, and AA::mb is no longer a pointer to that vector but a vector itself. I would like to avoid copying all of its contents into the member AA::mb in AA::f2() (the line mb = b). Which solution would you recommend, since I can't store a pointer to it because the vector will be destroyed when the program exits A::f1()?
It contains the address of the local variable which could not be available anymore, so the pointer isn't valid anymore?
Yes. It becomes a dangling pointer.
You could try vector::swap, as in:
class AA {
    B mb; // not a pointer
    void f2(B* b){
        mb.swap(*b); // swap the content with b, which is just a few pointer assignments.
    }
};
The address of a variable is a pointer. If the variable was allocated on the stack, then the pointer refers to some address on the stack. When a function returns, the next function (or some future function) that is called creates local variables in the same place on the stack. Nothing happened to the pointer, but the data pointed to has now changed.
When you allocate memory with new or malloc, you are reserving space in the heap. Nothing else should use that space until you call delete or free. Anything that may be referenced after a function returns must outlive that call: allocate it on the heap, give it static storage, or let the caller own it.