I am trying to learn how pointers behave, and I tried a couple of examples.
From my understanding, "Program should throw error when we try to print the pointer while it is not assigned with the address of a value".
I wrote a block of code with pointer variables 'a' and 'b' and directly assigned a value through pointer 'a'. I expected this to result in a segmentation fault. Also, the second pointer 'b' ends up holding the same address as pointer 'a'. Why do I see this behavior?
Below is my Block of Code:
int *a; // What exactly happens behind the scenes here ? What will "a" contain ?
int *b; // Why does "b" take address of "a" ?
*a = 5; // Why don't I get a segmentation fault here ?
cout<<a<<endl;
cout<<*a<<endl;
cout<<b<<endl;
cout<<*b<<endl;
And my output is,
0x246ff20
5
0x246ff20
5
"Program should throw error when we try to print the pointer while it is not assigned with the address of a value".
In fact, when you attempt to use a variable whose value is not initialized, the behavior (i.e. what happens) is undefined.
In this case, this means that your program could crash with a segmentation fault, or it could print something vaguely meaningful. (Or nasal demons :-) )
What you are probably seeing is some values that happen to be in the memory locations that now correspond to those variables. Those values most likely got there because somewhere earlier in the execution the memory locations were used for something else.
You've got a lot of undefined behaviour there.
int* a;
What is contained in a? Undefined. Completely up to the compiler. In this case it seems to default to 0x246ff20. Why do I say that? Because b is also set to this. You can't rely on this and it may just be coincidence.
It just so happens that the value seems to be valid (pointing to a data page in memory), which doesn't throw a seg fault when written to.
That also answers the rest of your questions. The behaviour you observe is completely undefined.
You should be pointing your pointers to something. For instance:
int a = 5;
int* pA = &a;
int* pB = pA;
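A minimal sketch of the same idea, wrapped in a function so the effect can be checked. The function name write_through_alias is hypothetical and not from the question. Both pointers hold the address of a, so a write through one is visible through the other:

```cpp
#include <cassert>

// Hypothetical helper, not part of the original question.
int write_through_alias() {
    int a = 5;
    int* pA = &a;      // pA holds the address of a
    int* pB = pA;      // pB is a copy of that address, so it also points at a
    assert(pA == pB);  // both pointers store the same address
    *pB = 7;           // writes to a through pB
    return *pA;        // reads a through pA
}
```

Because every pointer here refers to a live object, nothing in this sketch relies on undefined behaviour.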
int *a; declares a pointer that may refer to anywhere!
We can use it only after we assign a valid address to it or allocate a new object for it, for example:
int x = 5;
int *a = &x; // a holds the address of x
int *b = a; // b holds the same address as a (if *a changes, *b sees the same change)
example 2 :
int *a = new int(5); // allocates memory for one int (typically 4 bytes) and initializes it to 5
cout<<a<<endl; // prints the address stored in 'a'
cout<<*a<<endl; // prints the value stored at that address
delete a; // releases the memory once it is no longer needed
Everything in C/C++ has a value by default. That value, however, is completely undefined.
Some OS-compiler combinations might zero it (ie, make your pointers null pointers), some might insert debug 'messages' into them (like baadf00d), but most merely leave whatever was previously in the memory there.
Whatever is in those variables, the compiler will treat them as if you had defined them. Reading an uninitialized value compiles without error in C/C++, but the result is unpredictable and in most cases formally undefined behaviour, so it is not a good thing to do. Some compilers have warning flags which will tell you if you are using an uninitialized variable, but these are not foolproof, as it can be difficult for compilers to determine when some variables are initialized.
Whenever you define a value, you should give it a default value. For pointers, that would be NULL or nullptr. In more complex situations, where a "zero-value" might be legitimate (like the number of child objects to an object), you may wish to use a second variable, such as bool initialized, to store if the values are legitimate or junk.
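As a rough sketch of that advice: the helper names value_or and demo_value_or are made up for this example. A pointer that starts as nullptr can be tested before it is dereferenced:

```cpp
// Hypothetical helper: returns the pointed-to value, or fallback when p is null.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}

// Hypothetical demonstration of the null-then-assign pattern.
int demo_value_or() {
    int* p = nullptr;              // known "points at nothing" state
    int before = value_or(p, -1);  // no dereference happens, yields -1
    int n = 42;
    p = &n;                        // p now points at a live object
    int after = value_or(p, -1);   // yields 42
    return before + after;
}
```

Note that a null check only detects null pointers. It cannot detect a pointer that was never initialized or one that points at an object that no longer exists.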
De-referencing an undefined pointer like you are doing, much less writing to it like you are, will probably be caught by the OS and result in your program being killed for memory access violations. I'm somewhat surprised you got any output.
A handy little tip: when using pointers, always initialize the pointer to null before you point it at a value. This way you always know that the pointer isn't pointing at an obscure memory address or at the same thing as another pointer.
And always use delete on pointers obtained from new once you are done with them.
I have difficulties understanding pointers and how/when they fail. So I made a tiny program which creates a pointer, assigns a value to it and then prints that value. It compiles fine with both gcc and clang and does not give any warnings when using the -Wall switch. Why does it segfault, and why does it not segfault when I assign a different value to the pointer first? I thought I had initialized the pointer to somewhere. When I just declare the pointer without initialization, then I rightfully get a compiler warning. However, here I do not get a compiler warning, but it still segfaults.
#include<iostream>
int main(){
int *b = (int*) 12; //pointer gets initialized, so it points to somewhere
//int a = 13; //works fine when uncommenting this and the next line
//b = &a;
*b = 11;
std::cout << "*b = " << *b << "\n";
return 0;
}
A pointer is a variable that stores a memory address.
You "can" have any memory address in your pointer, but trying to read from memory outside of what your application is allowed to read will trigger the OS to kill your application with a segfault error.
If you allow me the metaphor:
You can write on a paper the address of any person in your country. But if you try to enter that house without permission, you most probably will get stopped by the police.
Back to code:
int *b = (int*) 123; // ok, you can save what you want.
std::cout << *b << std::endl; // SEGFAULT: you are not allowed to read at 123.
When you uncomment the two lines of code:
int a = 13;
b = &a;
Basically, b is no longer pointing to that forbidden 123 address, but to the address of a. a is in your own code, and its memory is not forbidden to you, so accessing *b after this is allowed.
Reading at any hard-coded address is not forbidden by C++ (in fact it is useful in some situations), but your OS may not allow you to mess with that memory.
On the other hand, the compiler is able to detect a variable being used without initialization, and can warn about this case. This has nothing to do with raw pointers.
A pointer is a variable to store an address.
int *b = (int*) 12;
This declares b as a pointer to a value of type int, and initializes it with the address 12. Do you know what resides at address 12? No, you don't. Thus, you should not use that address.
*b = 11;
This stores an integer 11 at the address pointed to by the pointer b. However, since pointer b points at address 12, the integer 11 overwrites something at that address (we don't even know what it overwrites, because we don't know what is there to begin with). This could corrupt the heap, the stack or the program's code, or just cause an access violation; anything can happen.
But if you first do this:
b = &a;
Then pointer b now points at the address at which the variable a is stored. Thus, subsequently writing 11 at that address will just overwrite a's value (from 13 to 11), a perfectly valid operation, no problems.
I just need it to point to some location that can hold an int
Indeed, but this is not enough. Not only shall the location be able to hold an int, that location must also be available for your program to store such an int. This means the location needs to be either that of an existing object (e.g., of an existing variable of type int) or a newly allocated memory chunk capable of holding an int. For example:
b = new int;
This will dynamically allocate the memory needed to store an int, and assign the address of the newly allocated memory to the pointer b.
Remember that after you're done with such dynamically allocated memory, you should deallocate it:
delete b;
Otherwise, there will be a memory leak (at least until the whole process exits anyway).
Nowadays, in modern C++, there is rarely a need to use raw pointers to manage dynamic memory allocations/deallocations manually. You'd use standard containers and/or smart pointers for that purpose instead.
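For illustration only, here is one way the same store could look with a smart pointer. The function name store_with_unique_ptr is hypothetical, and std::make_unique assumes a C++14 compiler:

```cpp
#include <memory>

// Hypothetical helper, assuming C++14 or later for std::make_unique.
int store_with_unique_ptr() {
    // Allocates an int initialized to 13 and ties its lifetime to b.
    std::unique_ptr<int> b = std::make_unique<int>(13);
    *b = 11;     // valid: b points at memory owned by this program
    return *b;   // the value is copied out before b is destroyed
}                // b's destructor releases the int here, no delete needed
```

The pointer can never be left dangling by a forgotten delete, which is the main reason this style is usually preferred over manual new and delete.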
I have a question regarding the code snippet below:
double d = 20.1;
double* pd = new double;
...
pd = &d;
delete pd;
The last line throws an error in Visual C++ 2015. Does this mean that the pointer "pd" points to the stack address of "d" and its original pointed-to address in the heap (right-hand side of the equal sign in the second line) has leaked?
Yes, you leaked the double originally pointed to by pd, but that's not what's causing the error.
The error is a result of you trying to delete the address of a stack-allocated variable. That is strictly undefined behavior. delete will only work for pointers which were created with new.
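A small sketch of an ordering that avoids both the leak and the invalid delete. The function name reassign_safely is made up for this example. The heap object is released while pd still holds its address, and only then is pd pointed at d:

```cpp
// Hypothetical helper, not part of the original question.
double reassign_safely() {
    double d = 20.1;
    double* pd = new double(0.0);  // heap object, must be released with delete
    delete pd;                     // release it while pd still holds its address
    pd = &d;                       // pd now points at an automatic variable
    return *pd;                    // fine to read, but pd must not be deleted
}
```

After pd = &d, the pointer no longer refers to anything created with new, so there is nothing left to delete.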
Variables hold values. Pointer values are just that, values. Not names. Not variables.
When you manipulate a pointer value, for example to pass it to delete, you only care about the value.
It's essentially equivalent to:
int one=1;
one=2;
std::cout << one; // you won't be surprised to see 2, will you?
The value currently in the variable matters; not some previous value. Not the name.
Your program tries to deallocate (delete) an object that wasn't allocated by you (the user), but by the compiler.
Does this mean that the pointer "pd" points to the stack address of "d" and its original pointed-to address in the heap (right-hand side of the equal sign in the second line) has leaked?
Yes, that is exactly what this means.
I get a bad feeling about this code
widget* GetNewWidget()
{
widget* theWidget = (widget *) malloc(sizeof(widget));
return theWidget;
}
Firstly, one should never cast the result of malloc() (nor, I suspect, use it in C++ (?)).
Secondly, won't theWidget be allocated on the stack?
If so, won't the caller trying to access after this function returns be undefined behaviour?
Can someone point to an authoritative URL explaining this?
[Update] I am thinking of this question Can a local variable's memory be accessed outside its scope?
In summary: this code is perfectly fine
Returning a pointer is like returning an int: the very act of returning creates a bitwise copy.
Step by step, the code works as follows:
malloc(sizeof(widget));
Allocates a block of memory on the heap[1], starting at some address (let's call it a), and sizeof(widget) bytes long.
widget* theWidget = (widget *) malloc(sizeof(widget));
Stores the address a on the stack[2] in the variable theWidget. If malloc allocated a block at address 0x00001248, then theWidget now contains the value 0x00001248, as if it were an integer.
return theWidget;
Now causes the value of a to be returned, i.e., the value 0x00001248 gets written to wherever the return value is expected.
At no point is the address of theWidget used. Hence, there is no risk of accessing a dangling pointer to theWidget. Note that if your code would return &theWidget;, there would have been an issue.
[1] Or it might fail, and return NULL
[2] Or it might keep it in a register
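The steps above can be sketched as a complete example. The widget definition, the id member and the UseWidget caller are hypothetical stand-ins, since the question does not show them. The cast is written as static_cast because C++, unlike C, does not convert void* implicitly:

```cpp
#include <cstdlib>

struct widget { int id; };  // hypothetical stand-in for the real widget type

widget* GetNewWidget() {
    // The heap block outlives this function; only the local pointer variable does not.
    widget* theWidget = static_cast<widget*>(std::malloc(sizeof(widget)));
    if (theWidget != nullptr) {   // footnote [1]: malloc may fail
        theWidget->id = 42;       // arbitrary value for the demonstration
    }
    return theWidget;             // returns a copy of the address, not &theWidget
}

// Hypothetical caller: uses the block after GetNewWidget has returned.
int UseWidget() {
    widget* w = GetNewWidget();
    if (w == nullptr) {
        return -1;                // allocation failed
    }
    int id = w->id;               // the heap block is still valid here
    std::free(w);                 // memory from malloc is released with free
    return id;
}
```

The caller only ever sees the copied address of the heap block, so the end of theWidget's own lifetime does not matter.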
On the stack you just allocated a pointer, it's not related to the object itself. :)
I never use malloc (it's a C thing, you shouldn't use it in C++), so I am not sure, but I doubt it's undefined behaviour.
If you would write this: widget* theWidget = new widget(); it should work correctly.
Even better if you use smart pointers, if you have C++11 or later (note that std::make_unique requires C++14):
std::unique_ptr<widget> GetNewWidget()
{
std::unique_ptr<widget> theWidget(std::make_unique<widget>());
return theWidget;
}
Or in this case you can write even smaller code, like this:
std::unique_ptr<widget> GetNewWidget()
{
return std::make_unique<widget>();
}
The above version will free the memory as soon as the unique pointer goes out of scope (unless you move it to another unique_ptr). It's worth some time to read about memory management in C++11.
Here is a code snippet I was working with:
int *a;
int p = 10;
*(a+0) = 10;
*(a+1) = 11;
printf("%d\n", a[0]);
printf("%d\n", a[1]);
Now, I expect it to print
10
11
However, a window appears that says program.exe has stopped working.
Then, if I comment out the second line of code, int p = 10;, and run the code again, it works.
Why is this happening? (What I wanted to do was create an array of dynamic size.)
There are probably at least 50 duplicates of this, but finding them may be non-trivial.
Anyway, you're defining a pointer, but no memory for it to point at. You're writing to whatever random address the pointer happened to contain at startup, producing undefined behavior.
Also, your code won't compile, because int *a, int p = 10; isn't syntactically correct -- the comma needs to become a semicolon (or you can get rid of the second int, but I wouldn't really recommend that).
In C, you probably want to use an array instead of a pointer, unless you need to allocate the space dynamically (oops, rereading, you apparently do want to -- so you need to use malloc to allocate the space, like a = malloc(2 * sizeof *a); -- but you also want to check the return value before you use it -- at least in theory, malloc can return a null pointer). In C++, you probably want to use a std::vector instead of an array or pointer (it'll manage dynamic allocation for you).
No memory is being allocated for a, it's just an uninitialized pointer to an int (so there are two problems).
Therefore, when data is stored in that location, the behavior is undefined. That means you may sometimes not even get a segmentation fault/program crash, or you may -> undefined. (Since C doesn't do any bounds checking, it won't alert you to these sorts of problems. Unfortunately, one of the strengths of C is also one of its major weaknesses: it will happily do what you ask of it.)
You're not even allocating memory so you're accessing invalid memory...
Use malloc to allocate enough memory for your array:
int* a = (int*) malloc(sizeof(int)*arraySize);
//Now you can change the contents of the array
You will need to use malloc to allocate memory for that array.
If you want the size to be dynamic, you will need to use realloc every time you wish to increase the size of the array without destroying the data that is already there.
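A minimal sketch of that approach, written in C++ with the C allocation functions from <cstdlib>. The helper names grow and grow_demo are hypothetical. The result of realloc goes into a separate pointer first, so the original block is not lost if the call fails:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: resizes an int buffer to new_count elements.
// Returns the (possibly moved) buffer, or nullptr on failure, in which case
// the old buffer is untouched and still owned by the caller.
int* grow(int* data, std::size_t new_count) {
    return static_cast<int*>(std::realloc(data, new_count * sizeof(int)));
}

// Hypothetical demonstration: start with two ints, then grow to four.
int grow_demo() {
    int* a = static_cast<int*>(std::malloc(2 * sizeof(int)));
    if (a == nullptr) {
        return -1;                // allocation failed
    }
    a[0] = 10;
    a[1] = 11;
    int* bigger = grow(a, 4);     // keep the old pointer until realloc succeeds
    if (bigger == nullptr) {
        std::free(a);             // the old block is still ours to release
        return -1;
    }
    a = bigger;                   // existing values 10 and 11 are preserved
    a[2] = 12;
    a[3] = 13;
    int sum = a[0] + a[1] + a[2] + a[3];
    std::free(a);
    return sum;
}
```

Note that the sizes passed to malloc and realloc are in bytes, hence the multiplication by sizeof(int).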
You have to allocate storage for the array. Use malloc if you're in C, new if it's C++, or use a C++ std::vector<int> if you really need the array size to be dynamic.
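For the C++ route, a short sketch with std::vector<int>. The function name make_values is made up for this example. The vector allocates and grows its own storage, so no manual malloc or free is needed:

```cpp
#include <vector>

// Hypothetical helper: builds the two values from the question.
std::vector<int> make_values() {
    std::vector<int> a;   // starts empty, owns its own storage
    a.push_back(10);      // the vector grows as needed
    a.push_back(11);
    return a;             // storage is released automatically when the vector dies
}
```

Elements can then be read with a[0] and a[1], much like the array in the question.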
First a has not been initialized. What does it point to? Nothing, hopefully zero, but you do not know.
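Then you are adding 1 to it and accessing that location. If a were 0, a+1 would be the address sizeof(int) bytes past 0. What is in that memory location?
Note that adding 1 to an int* does not bump the address by one byte: pointer arithmetic moves in units of the pointed-to type, so the address advances by sizeof(int) bytes, whatever that is on your machine.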
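Pointer arithmetic on an int* moves in units of sizeof(int). A small sketch to check this on a given machine, using a real array so that the arithmetic itself is valid (the function name int_pointer_step is hypothetical):

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes between a and a + 1 for an int*.
std::size_t int_pointer_step() {
    int values[2] = {10, 11};
    int* a = values;  // a points at the first element of a real array
    // Viewing both addresses as byte pointers measures the distance in bytes.
    const char* first = reinterpret_cast<const char*>(a);
    const char* second = reinterpret_cast<const char*>(a + 1);
    return static_cast<std::size_t>(second - first);
}
```

The returned value equals sizeof(int), which is commonly 4 but is not fixed by the language.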
struct MyRect
{
int x, y, cx, cy;
char name[100];
};
int main()
{
MyRect mr;
mr.x = 100;
mr.y = 150;
mr.cx = 600;
mr.cy = 50;
strcpy(mr.name, "Rectangle1");
MyRect* ptr;
{
unsigned char bytes[256];
memcpy(bytes, &mr, 256);
ptr = (MyRect*)bytes;
}
printf("X = %d\nY = %d\nCX = %d\nCY = %d\nNAME = %s\n",
ptr->x, ptr->y, ptr->cx, ptr->cy, ptr->name);
return 0;
}
I was just testing how to put a struct/class in an array of bytes, and was surprised when it compiled and worked: the printf prints all the values which I set in the mr variable.
I'm just a little confused as to what exactly "ptr" is pointing to. Has it allocated memory for ptr somewhere?
It works by pure chance.
Firstly, you're basically making a byte-by-byte copy of the struct and placing it in a stack-allocated buffer using memcpy. However, you shouldn't do this in practice. It happened to work this time, because your struct is a POD (plain-old-data or C-struct), but if your struct was a C++ object with constructors/copy-constructors or what have you, you may have gotten a nasty surprise.
Secondly, the stack-allocated buffer containing the struct goes out of scope by the time you use it via your pointer, so what you're doing is totally undefined behavior. It only works by pure chance, and is not guaranteed to work again on a different computer or with a different compiler, or even at a different time of day.
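One possible way to do the same experiment without relying on chance, as a sketch only. The helper names round_trip and round_trip_demo are hypothetical. It copies exactly sizeof(MyRect) bytes and reads them back into a real MyRect object while the buffer is still alive:

```cpp
#include <cstring>

struct MyRect {
    int x, y, cx, cy;
    char name[100];
};

// Hypothetical helper: serializes a MyRect into bytes and back again.
MyRect round_trip(const MyRect& source) {
    unsigned char bytes[sizeof(MyRect)];          // exactly as large as the struct
    std::memcpy(bytes, &source, sizeof(MyRect));  // copy only the struct's own bytes
    MyRect copy;
    std::memcpy(&copy, bytes, sizeof(MyRect));    // rebuild an object while bytes is alive
    return copy;                                  // returned by value, no dangling pointer
}

// Hypothetical demonstration using the values from the question.
int round_trip_demo() {
    MyRect mr{};
    mr.x = 100;
    mr.y = 150;
    mr.cx = 600;
    mr.cy = 50;
    std::strcpy(mr.name, "Rectangle1");
    MyRect out = round_trip(mr);
    if (std::strcmp(out.name, "Rectangle1") != 0) {
        return -1;                                // name did not survive the round trip
    }
    return out.x + out.y + out.cx + out.cy;
}
```

This only works for trivially copyable types such as this struct. Copying the bytes back into an object also avoids reading the buffer through a cast pointer.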
The unsigned char bytes[256] array is allocated on the stack, i.e. 256 bytes are reserved on the stack for the variable bytes (typically when the enclosing function, in this case main, is entered). Through the cast, ptr now points to this area on the stack, which is interpreted as being of type MyRect. Since you first copied such a struct to that area, the values are there to be read. But the lifetime of bytes ends at the closing brace of its block, so you may not use a pointer to that area after that point, let alone store one outside of the function.
Well, your program causes undefined behaviour, so you should probably not be surprised that it happens to work. Or if it happened to not work or caused the universe to end, for that matter. After the block containing the definition of bytes, bytes is out of scope, and ptr may or may not still point to valid memory. In your case, it does. But you can't rely on that behaviour.
ptr is still pointing to the address of bytes. Or, what was once called bytes. Even though you've put bytes into its own block and the variable is semantically inaccessible outside of that block, the memory sticks around unmodified until the function exits. This is a typical implementation technique, but is undefined by the standard, so don't depend on it.
ptr = (MyRect*)bytes;
"bytes" is the address of the array in memory.
ptr gets assigned that address in this code.
The cast tells the compiler to ignore the difference in data types.
If you understand in detail what the compiler is doing under the covers this can certainly work just fine. The only problem is changing compilers or compiler settings might cause this code to fail. It can be a bit fragile.
It works because, though the 'bytes' array is out of scope, the stack space it resides in has not been stepped on by the time you call printf(). It also works because, though 'mr' is not 256 bytes large, the memory following it (on the stack) doesn't care that you are reading it.
C is a very loose, non-type-safe language. Pointers can point to just about any memory location and you can cast to any pointer type you like.
So I agree, your program basically works by accident. But it does so, because C permits some wild things to be done with pointers.