I have difficulty understanding pointers and how/when they fail. So I made a tiny program which creates a pointer, assigns a value to it and then prints that value. It compiles fine with both gcc and clang and does not give any warnings when using the -Wall switch. Why does it segfault, and why does it not segfault when I assign a different value to the pointer first? I thought I had initialized the pointer to somewhere. When I just declare the pointer without initialization, I rightfully get a compiler warning. Here I do not get a compiler warning, but it still segfaults.
#include<iostream>
int main(){
int *b = (int*) 12; //pointer gets initialized, so it points to somewhere
//int a = 13; //works fine when uncommenting this and the next line
//b = &a;
*b = 11;
std::cout << "*b = " << *b << "\n";
return 0;
}
A pointer is a variable that stores a memory address.
You "can" put any memory address in your pointer, but trying to read from or write to memory outside of what your application is allowed to access will make the OS kill your application with a segfault.
If you allow me the metaphor:
You can write the address of any person in your country on a piece of paper. But if you try to enter that house without permission, you will most probably get stopped by the police.
Back to code:
int *b = (int*) 123; // ok, you can save what you want.
std::cout << *b << std::endl; // SEGFAULT: you are not allowed to read at 123.
When you uncomment the two lines of code:
int a = 13;
b = &a;
Basically, b no longer points to that forbidden address 123, but to the address of a. a belongs to your own program and its memory is not forbidden to you, so accessing *b after this is allowed.
Reading at any hard-coded address is not forbidden by C++ (in fact it is useful in some situations), but your OS may not allow you to mess with that memory.
On the other hand, the compiler is able to detect a variable being used without initialization, and can warn about that case. This has nothing to do with raw pointers.
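To make the difference concrete, here is a minimal sketch (the function name write_through_owned_object is made up for illustration). It stores the arbitrary address 12 in a pointer without ever dereferencing it, then retargets the pointer at a real variable before writing:

```cpp
#include <cstdint>

// Hypothetical helper, not part of the original question.
int write_through_owned_object() {
    // Merely holding an arbitrary address is fine as long as it is never dereferenced.
    int* b = reinterpret_cast<int*>(static_cast<std::uintptr_t>(12));

    int a = 13;
    b = &a;   // b now holds the address of a, which belongs to this program
    *b = 11;  // writes to a through the pointer
    return a;
}
```

Moving the *b = 11 line above b = &a would bring back the original problem: the write would go to address 12.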
A pointer is a variable to store an address.
int *b = (int*) 12;
This declares b as a pointer to a value of type int, and initializes it with the address 12. Do you know what resides at address 12? No, you don't. Thus, you should not use that address.
*b = 11;
This stores the integer 11 at the address held by the pointer b. However, since b holds the address 12, the integer 11 overwrites whatever is at that address (we don't even know what it overwrites, because we don't know what is there to begin with). This could corrupt the heap, the stack, or the program's code, or just cause an access violation; anything can happen.
But if you first do this:
b = &a;
Then pointer b now points at the address at which the variable a is stored. Thus, subsequently writing 11 at that address will just overwrite a's value (from 13 to 11), a perfectly valid operation, no problems.
I just need it to point to some location that can hold an int
Indeed, but this is not enough. Not only shall the location be able to hold an int, that location must also be available for your program to store such an int. This means the location needs to be either that of an existing object (e.g., of an existing variable of type int) or a newly allocated memory chunk capable of holding an int. For example:
b = new int;
This will dynamically allocate the memory needed to store an int, and assign the address of the newly allocated memory to the pointer b.
Remember that after you're done with such dynamically allocated memory, you should deallocate it:
delete b;
Otherwise, there will be a memory leak (at least until the whole process exits anyway).
Nowadays, in modern C++, there is rarely a need to use raw pointers to manage dynamic memory allocations/deallocations manually. You'd use standard containers and/or smart pointers for that purpose instead.
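As a rough sketch of that last point, the two functions below do the same thing, once with manual new/delete and once with std::unique_ptr. The function names are made up for illustration, and std::make_unique assumes C++14 or later:

```cpp
#include <memory>

// Manual management: every new needs exactly one matching delete.
int store_with_raw_pointer() {
    int* b = new int;   // allocate storage for one int
    *b = 11;
    int result = *b;
    delete b;           // without this line the int would leak
    return result;
}

// The same idea with std::unique_ptr.
int store_with_unique_ptr() {
    auto b = std::make_unique<int>(0);
    *b = 11;
    return *b;          // the int is freed automatically when b goes out of scope
}
```

The smart pointer version has no delete to forget, which is the main reason it is usually preferred.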
In the following code, I found that the same kind of pointer operation crashes the application in one situation but not in another.
#include <iostream>
using namespace std;
int main()
{
int *p;
*p = 50; //this instruction causes the crash
int* q = new int;
*q = 50; //this instruction executes ok
cout << "p:" << p << endl;
cout << "q:" << q << endl;
return 0;
}
I want to know why this is the case?
The first pointer is uninitialized. It doesn't point to a memory location that has an int value. So when you deref it on the next line, you get a crash.
The second pointer is initialized to an int that has an actual space in memory. So when you deref it, it finds the value held in that space.
int *p;
This pointer is uninitialized, so it does not point to any valid address of the process. That's why it crashes.
int* q = new int;
This points to a valid address returned by new int, hence it works.
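One common way to make the first case detectable is to initialize pointers to nullptr and check before writing. This is only a sketch: store_if_valid is a made-up helper, and a null check cannot catch a pointer that holds a garbage (uninitialized) address.

```cpp
// Hypothetical helper: writes value through p only if p is not null.
bool store_if_valid(int* p, int value) {
    if (p == nullptr) {
        return false;  // nothing to write to
    }
    *p = value;
    return true;
}
```

With int* p = nullptr; the call store_if_valid(p, 50) returns false instead of crashing, while with int* q = new int; it stores 50 in the allocated int.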
I see you need some links to documentation:
http://en.cppreference.com/w/cpp/language/pointer
http://en.cppreference.com/w/cpp/language/operator_member_access
http://en.cppreference.com/w/cpp/language/new
http://en.cppreference.com/w/cpp/language/delete
http://en.cppreference.com/w/cpp/language/storage_duration
To wrap it up:
You can use the indirection operator (*) to return the object the pointer points to (dereference the pointer).
You can only access (read or modify) an object via a pointer, when the pointer actually points to an object (which by default they don't).
You can assign the address of an object to the pointer to let the pointer point at it.
You can use the address-of operator (&) to acquire the address of an object for assigning it to a pointer.
You can use the new operator to create a new object and return the address to it for assigning it to a pointer.
You must use delete to eventually destroy objects created using new.
When you use pointers, you alone are responsible for the validity of the objects your pointers point to. Don't expect the compiler to warn you when objects are leaked or accessed beyond the end of their lifetime. If you do it wrong, you might observe undefined behavior.
Smart pointers can help to keep track of object ownership and take care of proper destruction.
For further reading:
Can a local variable's memory be accessed outside its scope?
What does "dereferencing" a pointer mean?
Undefined, unspecified and implementation-defined behavior
What is a smart pointer and when should I use one?
https://ericlavesson.blogspot.de/2013/03/c-ownership-semantics.html
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#e6-use-raii-to-prevent-leaks
I am trying to learn how pointers behave and I tried a couple of examples.
From my understanding, "Program should throw error when we try to print the pointer while it is not assigned with the address of a value".
I wrote a block of code with pointer variables 'a' and 'b' and directly assigned a value through pointer 'a'. I expected this to result in a segmentation fault. Also, the second pointer 'b' ends up holding the same address as pointer 'a'. Why do I see this behavior?
Below is my Block of Code:
int *a; // What exactly happens behind the scenes here ? What will "a" contain ?
int *b; // Why does "b" take address of "a" ?
*a = 5; // Why don't I get a segmentation fault here ?
cout<<a<<endl;
cout<<*a<<endl;
cout<<b<<endl;
cout<<*b<<endl;
And my output is,
0x246ff20
5
0x246ff20
5
"Program should throw error when we try to print the pointer while it is not assigned with the address of a value".
In fact, when you attempt to use a variable whose value is not initialized, the behavior (i.e. what happens) is undefined.
In this case, this means that your program could crash with a segmentation fault, or it could print something vaguely meaningful. (Or nasal demons :-) )
What you are probably seeing is some values that happen to be in the memory locations that now correspond to those variables. Those values most likely got there because somewhere earlier in the execution the memory locations were used for something else.
You've got a lot of undefined behaviour there.
int* a;
What is contained in a? Undefined. Completely up to the compiler. In this case it seems to default to 0x246ff20. Why do I say that? Because b is also set to this. You can't rely on this and it may just be coincidence.
It just so happens that the value seems to be valid (pointing to a data page in memory), which doesn't throw a seg fault when written to.
That also answers the rest of your questions. The behaviour you observe is completely undefined.
You should be pointing your pointers to something. For instance:
int a = 5;
int* pA = &a;
int* pB = pA;
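Once both pointers hold the address of a real object, a write through one is visible through the other. A small sketch (the function name is made up for illustration):

```cpp
// Hypothetical demo: two pointers to the same int see each other's writes.
int write_through_alias() {
    int a = 5;
    int* pA = &a;
    int* pB = pA;   // copies the address, not the int
    *pA = 7;        // modifies a
    return *pB;     // reads a through the other pointer
}
```

Here the shared address is intentional, unlike in the question, where a and b only happened to contain the same leftover value.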
int *a; declares a pointer that could refer to anywhere!
We can only use it once we have assigned it the address of an existing object or of a newly created one, for example:
int x = 5;
int *a = &x; // a holds the address of x
int *b = a; // b holds the same address as a (if *a changes, *b reads the changed value)
example 2 :
int *a = new int(5); // a points to a newly allocated int (typically 4 bytes) whose initial value is 5
cout<<a<<endl; // prints the address stored in a
cout<<*a<<endl; // prints the value stored at that address (5)
Everything in C/C++ has a value by default. That value, however, is completely undefined.
Some OS-compiler combinations might zero it (ie, make your pointers null pointers), some might insert debug 'messages' into them (like baadf00d), but most merely leave whatever was previously in the memory there.
Whatever is in those values, the compiler will treat them as if you had defined them. Reading an uninitialized value is "legitimate" in C/C++ only in the sense that the compiler will not reject it; the behavior is undefined, so it is not a good thing to do. Some compilers have warning flags which will tell you when you are using an uninitialized variable, but these are not foolproof, as it can be difficult for compilers to determine when a variable has been initialized.
Whenever you define a value, you should give it a default value. For pointers, that would be NULL or nullptr. In more complex situations, where a "zero-value" might be legitimate (like the number of child objects to an object), you may wish to use a second variable, such as bool initialized, to store if the values are legitimate or junk.
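As a sketch of this advice, the struct below gives every member a default. Note that it swaps in std::optional (C++17) in place of the separate bool initialized flag suggested above; Node and child_count_or are made-up names for illustration.

```cpp
#include <optional>

// Hypothetical type for illustration only.
struct Node {
    int* parent = nullptr;            // pointers start out null
    std::optional<int> child_count;   // empty until a real count is stored
};

// Returns the stored count, or fallback if none has been set yet.
int child_count_or(const Node& node, int fallback) {
    return node.child_count.value_or(fallback);
}
```

This keeps "no value yet" distinct from a legitimate count of zero without a second variable that could get out of sync.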
De-referencing an undefined pointer like you are doing, much less writing to it like you are, will probably be caught by the OS and result in your program being killed for memory access violations. I'm somewhat surprised you got any output.
A handy little tip: when using pointers, always initialize the pointer to null before you set it to a real address; this way you always know that the pointer isn't pointing at some obscure memory address or at the same thing as another pointer.
And always use delete once you are done with dynamically allocated objects.
I have a few silly questions (probably for most), it's less about 'how' do I do them and more about 'why' do they work this way? I know you are not supposed to ask multiple questions, but these are fairly small and related to one topic. Seems like it'd be a waste to separate them out.
I understand for the most part pointers, and the operators themselves. (Although I am curious why the * is called the de-reference operator, since isn't it referring to a reference of what it contains?)
I.e:
int x = 25;
int *p = &x;
So & makes sense, since *p is a place in the stack of type int that contains the address of x (which we know is 25).
So by saying *p are we 'referencing' the address of p which is pointing to the 25. Maybe it's just an English semantics thing? Maybe someone can explain why it's called 'de-reference'?
Take this valid code sample:
int *p = new int;
*p = 5;
Simple enough, we're making a new place in the stack with the size of an int pointer (whatever that may be). p will contain some address that has a 5 for a value.
Thing is, I haven't actually declared anything that's storing a 5, so what the heck is p pointing to? I mean, this does indeed work, but I never made a int x = 5 or anything like that, and gave the address to p to point to? How does it even know?
p contains an address that points to something, but I never made that 'address' it's pointing to? Does the compiler just know to create another address somewhere else? (Sorry if this is a really stupid question.)
I was reading on here about some memory leaks on another question:
A *object1 = new A();
pretending A is a class first of all. Someone was saying the object1 stores the value of A. Isn't that incorrect? Doesn't object1 store the address of what new A() created and is pointing to it?
Because delete object1 deletes the pointer, which points to the new A() (but according to the other question, delete object1 would indeed be the correct syntax). So does that leave the new A() hanging out there? Or does it get automatically deleted? (As you can tell I'm a bit confused.)
If delete object1 does indeed delete what the pointer is pointing to and not the pointer itself, doesn't this just leave a dangling pointer?
Here
int x = 25;
int *p = &x;
* is not the dereferencing operator. It's part of the type of p. It basically says that p is a pointer.
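The two roles of * can be seen side by side in a small sketch (the function name is made up for illustration):

```cpp
// Hypothetical demo of the two roles of '*'.
int read_through_pointer() {
    int x = 25;
    int* p = &x;   // '*' is part of the type here: p is a pointer to int
    return *p;     // '*' is the dereference operator here: yields the int p points to
}
```

The value of p itself is the address of x; only *p gives 25.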
Here
int *p = new int;
*p = 5;
The key is new int. It dynamically creates an int and returns its address, so p now points to that address. *p = 5 (btw, here * is the dereferencing operator) sets the value of that dynamically allocated int to 5.
Indeed object1 holds the address of the newly created A. Since we're here we should clarify this: A is a (user defined) type. So it makes no sense to say that A has a value. Objects of type A have value.
delete p doesn't delete a pointer. It does 2 things:
Destroys an object created by a new-expression
Deallocates storage previously allocated by a matching operator new
The pointer isn't actually changed, i.e. it still points to the same address. Only now that address isn't allocated, i.e. can't be dereferenced anymore.
You can further refer to this SO answer - Static, automatic and dynamic storage duration to further understand objects, pointers, new/delete.
1: I'm not sure what you're implying when you say "which we know is 25". The address of x is not 25; that is the value of x. You do not set the address of x in a statement like int x = 25.
But to answer your question, p is a reference in the general sense (in C++ terms, a pointer), which is to say its value is an address. Accessing the value stored at that address requires *p, which dereferences the pointer: it follows the reference to get at the thing referred to.
2: You have allocated memory for p; you executed a new. You have allocated 4 (or 8) bytes of memory for a new integer on the heap, so p is pointing to a newly allocated block of memory. Saying *p = 5; tells the compiler to set the value stored at that address to 5.
3: Your assumption is correct; object1 does not store the value of a new A, but rather points to a block of memory large enough to hold an instance of A.
delete object1; does not delete the pointer. The pointer is simply an address-sized value on the stack. Instead, delete destroys the A and gives the memory allocated for it back to the system. The A as you knew it is gone, but the pointer still exists, and dereferencing it at this point is undefined behavior. You are correct in assuming you now have a dangling pointer; that is why you should always set deleted pointers to NULL.
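A minimal sketch of that habit, using int instead of A to keep it short (destroy_and_reset is a made-up helper name):

```cpp
// Hypothetical helper: frees the int and clears the caller's pointer.
void destroy_and_reset(int*& p) {
    delete p;      // destroys the int and releases its storage; deleting a null pointer is a no-op
    p = nullptr;   // the pointer variable itself still exists, so clear it
}
```

Because the pointer is passed by reference, the caller's variable ends up null, and calling the helper a second time is harmless.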
Why can I do:
int i = *(new int (5));
and successfully use i after it,
but when I'm trying:
delete &i;
I get a run time error:
Unhandled exception at 0x5ddccaf7 (msvcr100d.dll) in Test.exe:
0xC00000FD: Stack overflow.
If i was a reference:
int & i = *(new int (5));
, all this (including delete) works fine.
I know that it's not good to keep a handle to allocated memory in something other than a pointer, and that *(new ...) is awful, but I'm just wondering why new works fine but delete fails.
//Below are just my guesses about the reason for this behavior:
Is it because whatever executes the program (probably not the "compiler", since this happens at run time), when it encounters delete, searches for some information such as the length of the data pointed to by &i (in some internal table of information about all pointers), and either doesn't find it or interprets some garbage data as this information? (I suppose pointers and references have it, but variables don't.)
Your original version does not assign i to an address. It allocates a new int on the heap and initializes its value to 5, then copies that value into i which is on the stack. The memory that you allocated (new'ed) is inaccessible and gets leaked.
The reference version works because i refers to the otherwise-anonymous new'ed memory. Hence &i gives the heap address. In your first version, &i gives the address of the stack variable, not the heap memory, and deleting stack memory is bad news.
You get a run-time error because the address of i is not the same as the address returned by the operator new. Once you dereference the result of new, you make a copy of the value. That copy is then placed into variable i, which has an address in automatic storage, which cannot be passed to delete legally without triggering undefined behavior.
When you make a reference, however, you do not make a copy: the result of new becomes referenced through a variable i of type "reference to int". Hence, the reference has the same address as has been returned by operator new, so the code that uses the reference works fine.
The line:
int i = *(new int (5));
Is equivalent to:
int* p = new int(5);
int i = *p;
The address of i (which you are trying to delete) is not the address of the allocated memory.
In mathematical notation, &x == &y implies x == y, but x == y does not imply &x == &y.
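The difference between the copy and the reference can be checked directly by comparing addresses. This is only a sketch with a made-up function name:

```cpp
// Hypothetical demo: a copy gets its own address, a reference does not.
bool reference_aliases_heap_int() {
    int* p = new int(5);
    int i = *p;    // copy: i lives in automatic storage
    int& r = *p;   // reference: another name for the heap int

    bool copy_is_elsewhere = (&i != p);
    bool reference_is_same = (&r == p);

    delete &r;     // valid: &r is exactly the address new returned
    return copy_is_elsewhere && reference_is_same && i == 5;
}
```

Calling delete &i here instead would hand delete an address that new never returned, which is the undefined behavior from the question.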
In many tutorials, the first code samples about dynamic memory start along the lines of:
int * pointer;
pointer = new int; // version 1
//OR
pointer = new int [20]; // version 2
They always proceed to explain how the second version works, but totally avoid talking about the first version.
What I want to know is, what does pointer = new int create? What can I do with it? What does it mean? Every tutorial without fail will avoid talking about the first version entirely. All I've found out (through messing about) is this:
#include <iostream>
using namespace std;
int main()
{
int * pointer;
pointer = new int;
pointer[2] = 1932; // pointer [2] exists? and i can assign to it?!
cout << pointer[2] << endl; // ... and access it successfully?!
};
The fact that I can subscript pointer tells me so far that pointer = new int implicitly creates an array. But if so, then what size is it?
If someone could help clear this all up for me, I'd be grateful...
My teacher explained it like this.
Think of cinema. The actual seats are memory allocations and the ticket you get are the pointers.
int * pointer = new int;
This would be a cinema with one seat, and pointer would be the ticket to that seat
pointer = new int [20];
This would be a cinema with 20 seats and pointer would be the ticket to the first seat. pointer[1] would be the ticket to the second seat and pointer[19] would be the ticket to the last seat.
When you do int* pointer = new int; and then access pointer[2] you're letting someone sit in the aisle, meaning undefined behaviour
This is a typical error for beginners in C and C++. The first statement creates space for holding just one int. The second one creates space for holding 20 of those ints. In both cases, however, it assigns the address of the beginning of the dynamically reserved area to the pointer variable.
To add to the confusion, you can access pointers with indices (as you put pointer[2]) even when the memory they're pointing is not valid. In the case of:
int* pointer = new int;
you can write pointer[2], but you'd have undefined behavior. Note that you have to make sure yourself that such accesses never occur; the compiler can usually do little to prevent this type of error.
This creates only one integer.
pointer = new int; // version 1
This creates 20 integers.
pointer = new int [20]; // version 2
The code below is invalid, since pointer[2] translates to *(pointer + 2), which has not been created/allocated.
int main()
{
int * pointer;
pointer = new int;
pointer[2] = 1932; // pointer [2] exists? and i can assign to it?!
cout << pointer[2] << endl; // ... and access it successfully?!
};
Cheers!
new int[20] allocates memory for an integer array of size 20, and returns a pointer to it.
new int simply allocates memory for one integer, and returns a pointer to it. Implicitly, that is the same as new int[1].
You can dereference (i.e. use *p) on both pointers, but you should only use p[i] on the pointer returned by the new int[20].
p[0] will still work on both, but you might mess up and put a wrong index by accident.
Update: Another difference is that you must use delete[] for the array, and delete for the integer.
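A short sketch of both forms with their matching deallocation (the function names are made up for illustration):

```cpp
#include <cstddef>

// Hypothetical helper: fills a dynamic array with 0..count-1 and returns the sum.
int sum_of_indices(std::size_t count) {
    int* values = new int[count];          // array form of new
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i);   // every index in [0, count) is valid
        sum += values[i];
    }
    delete[] values;                       // array form of delete
    return sum;
}

// Single-object form: only *p (equivalently p[0]) is valid.
int single_int() {
    int* p = new int(1932);
    int result = *p;
    delete p;                              // plain delete for plain new
    return result;
}
```

In everyday code a std::vector<int> would usually replace the first function's manual new[]/delete[] pair.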
pointer = new int allocates enough memory on the heap to store one int.
pointer = new int [20] allocates memory to store 20 ints.
Both calls return a pointer to the newly allocated memory.
Note: Do not rely on the allocated memory being initialized, it may contain random values.
pointer = new int; allocates an integer and stores its address in pointer. pointer[2] is a synonym for *(pointer + 2). To understand it, read about pointer arithmetic. This line is actually undefined behavior, because you are accessing memory that you did not previously allocate, and it works only because you got lucky.
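The equivalence between subscripting and pointer arithmetic can be checked on an array that really has three elements. This is a sketch with a made-up function name:

```cpp
// Hypothetical demo: p[i] is defined as *(p + i).
bool subscript_matches_arithmetic() {
    int* p = new int[3]{10, 20, 30};
    bool same_address = (&p[2] == p + 2);
    bool same_value = (p[2] == *(p + 2)) && (p[2] == 30);
    delete[] p;
    return same_address && same_value;
}
```

With a pointer from plain new int, the same expression p[2] would compute an address two ints past the only element that exists.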
int* p = new int allocates memory for one integer. It does not implicitly create an array. Accessing the pointer as p[2] causes undefined behavior, as you are writing to an invalid memory location. You get an array only if you use the new[] syntax, in which case you need to release the memory using delete[]. If you have allocated memory using plain new, you have created a single object and need to release the memory using delete.
"The fact that I can subscript pointer tells me so far that pointer = new int implicitly creates an array. But if so, then what size is it?"
This was the part of the question which I liked the most and which you emphasize.
As we all know, dynamic memory allocation makes use of space on the heap (the free store), not the stack.
When we take a closer look at the declaration of the array form of operator new:
void* operator new[] (std::size_t size) throw (std::bad_alloc);
This allocates storage for an array of objects of the requested total size, and if that succeeds, the new[] expression constructs each of the objects in the array. Thus we are free to use the objects within the bounds of the array, because they have already been constructed.
int * pointer = new int;
On the other hand for the above example there's every possibility of an undefined behaviour when any of
*(pointer + k) or *(k + pointer)
are used. Though the particular memory location can be accessed with the use of pointers, there's no guarantee because the particular Object for the same was not created nor constructed.This can be thought of as a space which was not allocated on the Stack for the particular program.
Hope this helps.
It does not create an array. It creates a single integer and returns a pointer to that integer. When you write pointer[2], you refer to memory which you have not allocated. You need to be careful not to do this. That memory may be in use by some other part of your program, which could modify it, and I believe you don't want that.
int * pointer; pointer = new int; // version 1
//OR
pointer = new int [20]; // version 2
what I want to know is, what does pointer = new int create? what can I do with it? what does it mean? Every tutorial without fail will avoid talking about the first version entirely
The reason the tutorial doesn't tell you what to do with it is that it really is totally useless! It allocates a single int and gives you a pointer to that.
The problem is that if you want an int, why don't you just declare one?
int i;