Does free() operator delete the address from the dynamic variable? - c++

Let's consider below program:
int main ()
{
int *p, *r;
p = (int*)malloc(sizeof(int));
cout<<"Addr of p = "<<p <<endl;
cout<<"Value of p = "<<*p <<endl;
free(p);
cout<<"After free(p)"<<endl;
r = (int*)malloc(sizeof(int));
cout<<"Addr of r = "<<r <<endl;
cout<<"Value of r = "<<*r <<endl;
*p = 100;
cout<<"Value of p = "<<*p <<endl;
cout<<"Value of r = "<<*r <<endl;
return 0;
}
Output:
Addr of p = 0x2f7630
Value of p = 3111728
free(p)
Addr of r = 0x2f7630
Value of r = 3111728
*p = 100
Value of p = 100
Value of r = 100
In the above code, p and r are dynamically created.
p is created and freed. r is created after p is freed.
On changing the value in p, r's value also gets changed. But I have already freed p's memory, then why on changing p's value, r's value also gets modified with the same value as that of p?
I have come to below conclusion. Please comment if I am right?
Explanation:
Pointer variables p and q are dynamically declared. Garbage values are stored initially. Pointer variable p is freed/deleted. Another pointer variable r is declared. The addresses allocated for r is same as that of p (p still points to the old address). Now if the value of p is modified, r’s value also gets modified with the same value as that of p (since both variables are pointing to the same address).
The operator free() only frees the memory address from the pointer variable and returns the address to the operating system for re-use, but the pointer variable (p in this case) still points to the same old address.

The free() function and the delete operator do not change the content of a pointer, as the pointer is passed by value.
However, the stuff in the location pointed to by the pointer may not be available after using free() or delete.
So if we have memory location 0x1000:
+-----------------+
0x1000 | |
| stuff in memory |
| |
+-----------------+
Lets assume that the pointer variable p contains 0x1000, or points to the memory location 0x1000.
After the call to free(p), the operating system is allowed to reuse the memory at 0x1000. It may not use it immediately or it could allocate the memory to another process, task or program.
However, the variable p was not altered, so it still points to the memory area. In this case, the variable p still has a value, but you should not dereference (use the memory) because you don't own the memory any more.

Your analysis is superficially close in some ways but not correct.
p and r are defined to be pointers in the first statement of main(). The are not dynamically created. They are defined as variables of automatic storage duration with main(), so they cease to exist when (actually if, in the case of your program) main() returns.
It is not p that is created and freed. malloc() dynamically allocates memory and, if it succeeds, returns a pointer which identifies that dynamically allocated memory (or a NULL pointer if the dynamic allocation fails) but does not initialise it. The value returned by malloc() is (after conversion into a pointer to int, which is required in C++) assigned to p.
Your code then prints the value of p.
(I have highlighted the next para in italic, since I'll refer back to it below).
The next statement prints the value of *p. Doing that means accessing the value at the address pointed to by p. However, that memory is uninitialised, so the result of accessing *p is undefined behaviour. With your implementation (compiler and library), at this time, that happens to result in a "garbage value", which is then printed. However, that behaviour is not guaranteed - it could actually do anything. Different implementations could give different results, such as abnormal termination (crash of your program), reformatting a hard drive, or [markedly less likely in practice] playing the song "Crash" by the Primitives through your computer's loud speakers.
After calling free(p) your code goes through a similar sequence with the pointer r.
The assignment *p = 100 has undefined behaviour, since p holds the value returned by the first malloc() call, but that has been passed to free(). So, as far as your program is concerned, that memory is no longer guaranteed to exist.
The first cout statement after that accesses *p. Since p no longer exists (having being passed to free()) that gives undefined behaviour.
The second cout statement after that accesses *r. That operation has undefined behaviour, for exactly the same reason I described in the italic paragraph above (for p, as it was then).
Note, however, that there have been five occurrences of undefined behaviour in your code. When even a single instance of undefined behaviour occurs, all bets are off for being able to predict behaviour of your program. With your implementation, the results happen to be printing p and r with the same value (since malloc() returns the same value 0x2f7630 in both cases), printing a garbage value in both cases, and then (after the statement *p = 100) printing the value of 100 when printing *p and *r.
However, none of those results are guaranteed. The reason for no guarantee is that the meaning of "undefined behaviour" in the C++ standard is that the standard describes no limits on what is permitted, so an implementation is free to do anything. Your analysis might be correct, for your particular implementation, at the particular time you compiled, linked, and ran your code. It might even be correct next week, but be incorrect a month from now after updating your standard library (e.g. applying bug fixes). It is probably incorrect for other implementations.
Lastly, a couple of minor points.
Firstly, your code is incomplete, and would not even compile in the form you have described it. In discussion above, I have assumed your code is actually preceded by
#include <iostream>
#include <cstdlib>
using namespace std;
Second, malloc() and free() are functions in the standard library. They are not operators.

Your analysis of what actually happened is correct; however, the program is not guaranteed to behave this way reliably. Every use of p after free(p) "provokes undefined behavior". (This also happens when you access *p and *r without having written anything there first.) Undefined behavior is worse than just producing an unpredictable result, and worse than just potentially causing the program to crash, because the compiler is explicitly allowed to assume that code that provokes undefined behavior will never execute. For instance, it would be valid for the compiler to treat your program as identical to
int main() {}
because there is no control flow path in your program that does not provoke undefined behavior, so it must be the case that the program will never run at all!

free() frees the heap memory to be re-used by OS. But the contents present in the memory address are not erased/removed.

Related

Pointers in c++ after delete

After reading many posts about this, I want to clarify the next point:
A* a = new A();
A* b = a;
delete a;
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
So the question is about using the value of copy of deleted pointer.
I've read, that in c++ 11 reading the value of a leads to undefined behaviour - but what about reading the value of b?
Trying to read the value of the pointer (note: this is different to
dereferencing it) causes implementation-defined behaviour since C++14,
which may include generating a runtime fault. (In C++11 it was
undefined behaviour)
What happens to the pointer itself after delete?
Both:
A* c = a;
A* d = b;
are undefined in C++11 and implementation defined in C++14. This is because a and b are both "invalid pointer values" (as they point to deallocated storage space), and "using an invalid pointer value" is either undefined or implementation defined, depending on the C++ version. ("Using" includes "copying the value of").
The relevant section ([basic.stc.dynamic.deallocation]/4) in C++11 reads (emphasis added):
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.
with a non-normative note stating:
On some implementations, it causes a system-generated runtime
In C++14 the same section reads:
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.
with a non-normative note stating:
Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault
These 2 lines do not have any difference (meaning legality for C++):
A* c = a; //illegal - I know it (in c++ 11)
A* d = b; //I suppose it's legal, is it true?
Your mistake (and it is pretty common) to think if you call delete on a it makes it any different than b. You should remember that when you call delete on a pointer you pass argument by value, so memory, where a points to after delete is not usable anymore, but that call does not make a any different than b in your example.
You should not use the pointer after delete. My below example with acessing a is based on implementation-defined behaviour.
(thanks to for M.M and Mankarse for pointing this)
I feel that it is not the variable a (or b, c, d) that is important here, but that the value (=the memory address of a deallocated block) which in some implementations can trigger a runtime fault when used in some 'pointer context'.
This value may be an rvalue/expression, not necessarily the value stored in a variable - so I do not believe the value of a ever changes (I am using the loose 'pointer context' to distinguish from using the same value, i.e. the same set of bits, in non-pointer related expressions - which will not cause a runtime fault).
------------My original post is below.---------------
Well, you are almost there with your experiment. Just add some cout's like here:
class A {};
A* a = new A();
A* b = a;
std::cout << a << std::endl; // <--- added here
delete a;
std::cout << a << std::endl; // <--- added here. Note 'a' can still be used!
A* c = a;
A* d = b;
Calling delete a does not do anything to the variable a. This is just a library call. The library that manages dynamic memory allocation keeps a list of allocated memory blocks and uses the value passed by variable a to mark one of the previously allocated blocks as freed.
While it is true what Mankarse cites from C++ documentation, about: "rendering invalid all pointers referring to any part of the deallocated storage" - note that the value of variable a remains untouched (you did not pass it by reference, but by value !).
So to sum up and to try to answer your question:
Variable a still exists in the scope after delete. The variable a still contains the same value, which is the address of the beginning of the memory block allocated (and now already deallocated) for an object of class A. This value of a technically can be used - you can e.g. print it like in my above example – however it is hard to find a more reasonable use for it than printing/logging the past...
What you should not do is trying to de-reference this value (which you also keep in variables b, c, and d) – as this value is not a valid memory pointer any longer.
You should never rely on the object being in the deallocated storage (while it is quite probable that it will remain there for some while, as C++ does not require to clear the storage freed after use) - you have no guarantees and no safe way to check this).

I don't understand C++ pointer arithmetic

I have the following program which defines 2 integers and a pointer to an integer.
#include <stdio.h>
int main() {
int bla=999;
int a=42;
int* pa=&a;
printf("%d \n", *pa);
printf("%d \n", pa);
pa++;
//*pa=666; //runs (no error), but the console is showing nothing at all
printf("%d \n", *pa);
printf("%d \n", pa);
pa++;
//*pa=666; //runs and changes the value of *pa to 666;
printf("%d \n", *pa);
printf("%d \n", pa);
}
The output is:
42
2686740
2686744
2686744 //this value is strange, I think
999
2686748
The adresses are making sense to me, but the fourth value is strange, because it is exactly the adress of the int. Can somebody explain that behaviour ?
When I comment *pa=666 (the first apperance) in, the console shows nothing, so here is some sort of error, but the compiler does not show an error. Maybe this is because of the size of int on my system, I have a 64bit-windows-os, so maybe the int is 64 bit and not 32 ? And because of that the *pa-value is 999 after the second increment and not the first ?
I am sure, there are a lot of C-programmers out there who can explain what is going on :)
int* pa=&a;
pa is pointer to an integer and accessing *pa is defined.
Once you increment your pointer, then the pointer is pointing to some memory(after p) which is not allocated by you or not known to you so dereferencing it leads to undefined beahvior.
pa++;
*pa is UB
Edit:
Use proper format specifier to print the pointer value %p as pointed out by #haccks
The output is not strange, it is to be expected: You have three variables in main(), all of which are stored on the stack, and which happen to be right one after the other. One of these variables is the pointer itself. So, when you dereference the pointer in the third line, you get the current value of the pointer itself.
Nevertheless, this output is not predictable, it is undefined behavior: You are only allowed to use pointer arithmetic to access data within a single memory object, and in your case, the memory object is just a single int. Consequently, accessing *pa after the first pa++ is illegal, and the program is allowed to do anything from that point on.
More specifically, there is no guarantee which other variables follow a certain variable, in which order they follow, or if there is accessable memory at all. Even reading *pa after the first pa++ is allowed to crash your program. As you have witnessed, you will not experience a crash in many cases (which would be easy to debug), yet the code is still deeply broken.
You are using wrong format specifier to print address. This will invoke undefined behavior and once the UB is invoked, all bets are off. Use %p instead.
printf("%p \n", (void *)pa);
Another problem is that after execution of pa++;, you are accessing unallocated memory and another reason for UB.
You're not smarter than your compiler.
As said by another answer what you do is Undefined Behaviour. With pa you are just doing non-sense, it does'nt correspond to any reasonlable algorithm for a defined goal: it's non sense.
However I will propose you a possible scenario of what's happenning. Though much of it could be false because compilers do optimizations.
int bla=999;
int a=42;
int* pa=&a;
These variables are allocated on the stack.
When writing pa = &a you say "I want pointer pa to be equal to the address of a".
Probably the compiler could have allocated the memory in the order or declaration, which would give something like:
bla would have address 0x00008880
a would have address 0x00008884
pa would have address 0x00008888
when you do pa++ you're telling: move my pointer of int to the next position of int in memory.
As ints are 32 bits, you're doing pa = pa + 4bytes i.e. pa = 0x00008888
Notice that, by chance !,you're probably pointing to the address of the pa pointer.
So now the pointer pa contains its own address... which is pretty esoteric and could be called ouroboros.
Then you're asking again pa++... so pa = pa + 4 bytes i.e. pa = 0x0000888c
So now you are probably accessing an unknown memory zone. It could be an access violation. It's undefined behaviour if you ever want to read or write.
When you first assigned the pointer it pointed to 2686740. The pointer is an integer pointer and integers use 4 bytes (usually, on your machine it used 4). That means pa++ is going to increase the value to be 4 more which is 2686744. Doing it again resulted in 2686748
If you were to look at the resulting assembly code the order of your local variables would be switched around. The ordering was a, pa, bla when the code ran. Because you don't have explicit control over this ordering the output of your printing is considered to be undefined
After the first time you did pa++ the pointer pointed at itself, that is why you got the "strange value"
As mentioned by many of the other answers, this is not good use of pointers and should be avoided. You don't have control over what the pointer is pointing to in this situation. A much better use of pointer arithmetic would be pointing at the beginning of an array and then doing pa++ to point to the next element in the array. The only problem you could experience then would be incrementing past the last element of the array
Are you trying to increment the value in a through the pointer *pa?
If so, do: (*pa)++. The brackets are crucial as they mean "take the value of the pointer", then use that address to increment whatever it is referencing.
This is completely different to *pa++ which simply returns the value pointed to by *pa and then increments the pointer (not the thing that it is referencing).
One of the little traps of C syntax. K&R has a few pages devoted to this, I suggest you try some of the examples there.

Memory location of null pointer

After reading a lot of questions about null pointers, I still have confusion about memory allocation in null pointer.
If I type following code-
int a=22;
int *p=&a;//now p is pointing towards a
std::cout<<*p;//outputs 22
std::cout<<p;//outputs memory address of object a;
int *n=nullptr;// pointer n is initialized to null
std::cout<<n;
After compiling this code pointer n outputs literal constant 0, and if i try this,
std::cout<<*n;
this line of code is compiled by compiler but it is unable to execute, what is wrong in this code, it should print memory location of this pointer.
std::cout<<p;
does this output location of pointer in memory or location of an object in memory.
Since many or all of these answers are already answered in previous questions but somehow i am unable to understand because I am beginner in C++.
A nullptr pointer doesn't point to anything. It doesn't contain a valid address but a "non-address". It's conceptual, you shouldn't worry about the value it has.
The only thing that matters is that you can't dereference a nullptr pointer, because this will cause undefined behavior, and that's why your program fails at runtime (std::cout<<*n)
std::cout<<p;
In general outputs value of variable p, what that value means depends on p's type. In your case type or p is pointer to int (int *) so value of it is address of int. As pointer itself is an lvalue you can get address of it, so if you want to see where your pointer n located in memory just output it's address:
std::cout << &n << std::endl;
As said on many other answers do not dereference null pointer, as it leads to UB. So again:
std::cout << n << std::endl; // value of pointer n, ie address, in your case 0
std::cout << &n << std::endl; // address of pointer n, will be not 0
std::cout << *n << std::endl; // undefined behavior, you try to dereference nullptr
If you want to see address of nullptr itself, you cannot - it is a constant, not lvalue, and does not have address:
std::cout << &nullptr << std::endl; // compile error, nullptr is not lvalue
When you compile:
std::cout << *n;
The compiler will typically build some code like this:
mov rax, qword ptr [rbp - 0x40]
mov esi, dword ptr [rax]
call cout
The first line looks up the address of the pointer (rdp - 0x40) and stores it in the CPU register RAX. In this case the address of the nullptr is 0. RAX now contains 0.
The second line tries to read memory from the location (0) specified by RAX. On a typical computer setup memory location 0 is protected (it isn't a valid data memory location). This causes an invalid operation and you get a crash*.
It never reaches the third line.
*However, this isn't necessary true in all circumstances: on a micro-controller where you don't have an operating system in place, this might successfully dereference and read the value of memory location 0. However *nullptr wouldn't be a good way of expressing this intention! See http://c-faq.com/null/machexamp.html for more discussion. If you want the full detail on nullptr: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf
nullptr is a special value, selected in such a way that no valid pointer could get this value. On many systems, the value is equal to numeric zero, but it is not a good idea to think of nullptr in terms of its numeric value.
To understand the meaning of nullptr you should first consider the meaning of a pointer: it is a variable that refers to something else, which may also refer to nothing at all. You need to be able to distinguish the state "my pointer refers to something" from the state "my pointer refers to nothing at all". This is where nullptr comes in: if a pointer is equal to nullptr, you know that it references "nothing at all".
Note: dereferencing nullptr (i.e. applying the unary asterisk operator to it) is undefined behavior. It may crash, or it may print some value, but it would be a "garbage value".
Dereferencing a null pointer is undefined behavior, so anything at all can happen. But, a null pointer still has to have a place in memory. So what you're seeing is just that. Typically compilers implement a null pointer as its value being all 0's.
Just because it's a gold quote, here's what Scott Meyer's has to say about UD behavior, from his book Effective C++ 2nd Ed.
"Nevertheless, there is something very troubling here. Your program's
behavior is undefined -- you have no way of knowing what will
happen... That means compilers may generate code to do whatever they
like: reformat your disk, send suggestive email to your boss, fax
source code to your competitors, whatever."
It is undefined behavior to dereference a null pointer. Any behavior that the compiler chooses or unintentionally happens is valid.
Changing the program in other places may also change the behavior of this code line.
I suspect that your confusion revolves around the fact that there are two memory locations involved here.
In this code:
int *n=nullptr;// pointer n is initialized to null
There is one variable, n, and that variable occupies space in memory. You can take the address of n and prove this to yourself:
std::cout << &n << "\n";
And you'll see that the address-of n is something legitimate. As in, not NULL.
n happens to be of type pointer-to-int, and the thing it points to is NULL. That means it doesn't point to anything at all; it's in a state where you can't dereference it.
But "dereference it" is exactly what you are doing here:
std::cout<<*n;
n is valid, but the thing it points to is not. That is why your program is ill-formed.

Can a unique_ptr be used with a negative index without leaking memory?

I read Are negative array indexes allowed in C? and found it interesting that negative values can be used for the index of an array. I tried it again with the c++11 unique_ptr and it works there as well! Of course the deleter must be replaced with something which can delete the original array. Here is what it looks like:
#include <iostream>
#include <memory>
int main()
{
const int min = -23; // the smaller valid index
const int max = -21; // the highest valid index
const auto deleter = [min](char* p)
{
delete [](p+min);
};
std::unique_ptr<char[],decltype(deleter)> up(new char[max-min+1] - min, deleter);
// this works as expected
up[-23] = 'h'; up[-22] = 'i'; up[-21] = 0;
std::cout << (up.get()-23) << '\n'; // outputs:hi
}
I'm wondering if there is a very, very small chance that there is a memory leak. The address of the memory created on the heap (new char[max-min+1]) could overflow when adding 23 to it and become a null pointer. Subtracting 23 still yields the array's original address, but the unique_ptr may recognize it as a null pointer. The unique_ptr may not delete it because it's null.
So, is there a chance that the previous code will leak memory or does the smart pointer behave in a way which makes it safe?
Note: I wouldn't actually use this in actual code; I'm just interested in how it would behave.
Edit: icepack brings up an interesting point, namely that there are only two valid pointer values that are allowed in pointer arithmetic:
§5.7 [expr.add] p5
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
As such, the new char[N] - min of your code already invokes UB.
Now, on most implementations, this will not cause problems. The destructor of std::unique_ptr, however, will (pre-edit answer from here on out):
§20.7.1.2.2 [unique.ptr.single.dtor] p2
Effects: If get() == nullptr there are no effects. Otherwise get_deleter()(get()).
So yes, there is a chance that you will leak memory here if it indeed maps to whatever value represents the null pointer value (most likely 0, but not necessarily). And yes, I know this is the one for single objects, but the array one behaves exactly the same:
§20.7.1.3 [unique.ptr.runtime] p2
Descriptions are provided below only for member functions that have behavior different from the primary template.
And there is no description for the destructor.
new char[max-min+1] doesn't allocate memory on the stack but rather on heap - that's how standard operator new behaves. The expression max-min+1 is evaluated by the compiler and results in 3, so eventually this expression is equal to allocating 3 bytes on the heap. No problem here.
However, subtracting min results in pointer which is 23 bytes beyond the beginning of the allocated memory returned by new and since in new you allocated only 3 bytes, this will definitely point to a location not owned by you --> anything following will result in undefined behavior.

What does "dereferencing" a pointer mean?

Please include an example with the explanation.
Reviewing the basic terminology
It's usually good enough - unless you're programming assembly - to envisage a pointer containing a numeric memory address, with 1 referring to the second byte in the process's memory, 2 the third, 3 the fourth and so on....
What happened to 0 and the first byte? Well, we'll get to that later - see null pointers below.
For a more accurate definition of what pointers store, and how memory and addresses relate, see "More about memory addresses, and why you probably don't need to know" at the end of this answer.
When you want to access the data/value in the memory that the pointer points to - the contents of the address with that numerical index - then you dereference the pointer.
Different computer languages have different notations to tell the compiler or interpreter that you're now interested in the pointed-to object's (current) value - I focus below on C and C++.
A pointer scenario
Consider in C, given a pointer such as p below...
const char* p = "abc";
...four bytes with the numerical values used to encode the letters 'a', 'b', 'c', and a 0 byte to denote the end of the textual data, are stored somewhere in memory and the numerical address of that data is stored in p. This way C encodes text in memory is known as ASCIIZ.
For example, if the string literal happened to be at address 0x1000 and p a 32-bit pointer at 0x2000, the memory content would be:
Memory Address (hex) Variable name Contents
1000 'a' == 97 (ASCII)
1001 'b' == 98
1002 'c' == 99
1003 0
...
2000-2003 p 1000 hex
Note that there is no variable name/identifier for address 0x1000, but we can indirectly refer to the string literal using a pointer storing its address: p.
Dereferencing the pointer
To refer to the characters p points to, we dereference p using one of these notations (again, for C):
assert(*p == 'a'); // The first character at address p will be 'a'
assert(p[1] == 'b'); // p[1] actually dereferences a pointer created by adding
// p and 1 times the size of the things to which p points:
// In this case they're char which are 1 byte in C...
assert(*(p + 1) == 'b'); // Another notation for p[1]
You can also move pointers through the pointed-to data, dereferencing them as you go:
++p; // Increment p so it's now 0x1001
assert(*p == 'b'); // p == 0x1001 which is where the 'b' is...
If you have some data that can be written to, then you can do things like this:
int x = 2;
int* p_x = &x; // Put the address of the x variable into the pointer p_x
*p_x = 4; // Change the memory at the address in p_x to be 4
assert(x == 4); // Check x is now 4
Above, you must have known at compile time that you would need a variable called x, and the code asks the compiler to arrange where it should be stored, ensuring the address will be available via &x.
Dereferencing and accessing a structure data member
In C, if you have a variable that is a pointer to a structure with data members, you can access those members using the -> dereferencing operator:
typedef struct X { int i_; double d_; } X;
X x;
X* p = &x;
p->d_ = 3.14159; // Dereference and access data member x.d_
(*p).d_ *= -1; // Another equivalent notation for accessing x.d_
Multi-byte data types
To use a pointer, a computer program also needs some insight into the type of data that is being pointed at - if that data type needs more than one byte to represent, then the pointer normally points to the lowest-numbered byte in the data.
So, looking at a slightly more complex example:
double sizes[] = { 10.3, 13.4, 11.2, 19.4 };
double* p = sizes;
assert(p[0] == 10.3); // Knows to look at all the bytes in the first double value
assert(p[1] == 13.4); // Actually looks at bytes from address p + 1 * sizeof(double)
// (sizeof(double) is almost always eight bytes)
++p; // Advance p by sizeof(double)
assert(*p == 13.4); // The double at memory beginning at address p has value 13.4
*(p + 2) = 29.8; // Change sizes[3] from 19.4 to 29.8
// Note earlier ++p and + 2 here => sizes[3]
Pointers to dynamically allocated memory
Sometimes you don't know how much memory you'll need until your program is running and sees what data is thrown at it... then you can dynamically allocate memory using malloc. It is common practice to store the address in a pointer...
int* p = (int*)malloc(sizeof(int)); // Get some memory somewhere...
*p = 10; // Dereference the pointer to the memory, then write a value in
fn(*p); // Call a function, passing it the value at address p
(*p) += 3; // Change the value, adding 3 to it
free(p); // Release the memory back to the heap allocation library
In C++, memory allocation is normally done with the new operator, and deallocation with delete:
int* p = new int(10); // Memory for one int with initial value 10
delete p;
p = new int[10]; // Memory for ten ints with unspecified initial value
delete[] p;
p = new int[10](); // Memory for ten ints that are value initialised (to 0)
delete[] p;
See also C++ smart pointers below.
Losing and leaking addresses
Often a pointer may be the only indication of where some data or buffer exists in memory. If ongoing use of that data/buffer is needed, or the ability to call free() or delete to avoid leaking the memory, then the programmer must operate on a copy of the pointer...
const char* p = asprintf("name: %s", name); // Common but non-Standard printf-on-heap
// Replace non-printable characters with underscores....
for (const char* q = p; *q; ++q)
if (!isprint(*q))
*q = '_';
printf("%s\n", p); // Only q was modified
free(p);
...or carefully orchestrate reversal of any changes...
const size_t n = ...;
p += n;
...
p -= n; // Restore earlier value...
free(p);
C++ smart pointers
In C++, it's best practice to use smart pointer objects to store and manage the pointers, automatically deallocating them when the smart pointers' destructors run. Since C++11 the Standard Library provides two, unique_ptr for when there's a single owner for an allocated object...
{
std::unique_ptr<T> p{new T(42, "meaning")};
call_a_function(p);
// The function above might throw, so delete here is unreliable, but...
} // p's destructor's guaranteed to run "here", calling delete
...and shared_ptr for share ownership (using reference counting)...
{
auto p = std::make_shared<T>(3.14, "pi");
number_storage1.may_add(p); // Might copy p into its container
number_storage2.may_add(p); // Might copy p into its container } // p's destructor will only delete the T if neither may_add copied it
Null pointers
In C, NULL and 0 - and additionally in C++ nullptr - can be used to indicate that a pointer doesn't currently hold the memory address of a variable, and shouldn't be dereferenced or used in pointer arithmetic. For example:
const char* p_filename = NULL; // Or "= 0", or "= nullptr" in C++
int c;
while ((c = getopt(argc, argv, "f:")) != -1)
switch (c) {
case f: p_filename = optarg; break;
}
if (p_filename) // Only NULL converts to false
... // Only get here if -f flag specified
In C and C++, just as inbuilt numeric types don't necessarily default to 0, nor bools to false, pointers are not always set to NULL. All these are set to 0/false/NULL when they're static variables or (C++ only) direct or indirect member variables of static objects or their bases, or undergo zero initialisation (e.g. new T(); and new T(x, y, z); perform zero-initialisation on T's members including pointers, whereas new T; does not).
Further, when you assign 0, NULL and nullptr to a pointer the bits in the pointer are not necessarily all reset: the pointer may not contain "0" at the hardware level, or refer to address 0 in your virtual address space. The compiler is allowed to store something else there if it has reason to, but whatever it does - if you come along and compare the pointer to 0, NULL, nullptr or another pointer that was assigned any of those, the comparison must work as expected. So, below the source code at the compiler level, "NULL" is potentially a bit "magical" in the C and C++ languages...
More about memory addresses, and why you probably don't need to know
More strictly, initialised pointers store a bit-pattern identifying either NULL or a (often virtual) memory address.
The simple case is where this is a numeric offset into the process's entire virtual address space; in more complex cases the pointer may be relative to some specific memory area, which the CPU may select based on CPU "segment" registers or some manner of segment id encoded in the bit-pattern, and/or looking in different places depending on the machine code instructions using the address.
For example, an int* properly initialised to point to an int variable might - after casting to a float* - access memory in "GPU" memory quite distinct from the memory where the int variable is, then once cast to and used as a function pointer it might point into further distinct memory holding machine opcodes for the program (with the numeric value of the int* effectively a random, invalid pointer within these other memory regions).
3GL programming languages like C and C++ tend to hide this complexity, such that:
If the compiler gives you a pointer to a variable or function, you can dereference it freely (as long as the variable's not destructed/deallocated meanwhile) and it's the compiler's problem whether e.g. a particular CPU segment register needs to be restored beforehand, or a distinct machine code instruction used
If you get a pointer to an element in an array, you can use pointer arithmetic to move anywhere else in the array, or even to form an address one-past-the-end of the array that's legal to compare with other pointers to elements in the array (or that have similarly been moved by pointer arithmetic to the same one-past-the-end value); again in C and C++, it's up to the compiler to ensure this "just works"
Specific OS functions, e.g. shared memory mapping, may give you pointers, and they'll "just work" within the range of addresses that makes sense for them
Attempts to move legal pointers beyond these boundaries, or to cast arbitrary numbers to pointers, or use pointers cast to unrelated types, typically have undefined behaviour, so should be avoided in higher level libraries and applications, but code for OSes, device drivers, etc. may need to rely on behaviour left undefined by the C or C++ Standard, that is nevertheless well defined by their specific implementation or hardware.
Dereferencing a pointer means getting the value that is stored in the memory location pointed by the pointer. The operator * is used to do this, and is called the dereferencing operator.
int a = 10;
int* ptr = &a;
printf("%d", *ptr); // With *ptr I'm dereferencing the pointer.
// Which means, I am asking the value pointed at by the pointer.
// ptr is pointing to the location in memory of the variable a.
// In a's location, we have 10. So, dereferencing gives this value.
// Since we have indirect control over a's location, we can modify its content using the pointer. This is an indirect way to access a.
*ptr = 20; // Now a's content is no longer 10, and has been modified to 20.
In simple words, dereferencing means accessing the value from a certain memory location against which that pointer is pointing.
A pointer is a "reference" to a value.. much like a library call number is a reference to a book. "Dereferencing" the call number is physically going through and retrieving that book.
int a=4 ;
int *pA = &a ;
printf( "The REFERENCE/call number for the variable `a` is %p\n", pA ) ;
// The * causes pA to DEREFERENCE... `a` via "callnumber" `pA`.
printf( "%d\n", *pA ) ; // prints 4..
If the book isn't there, the librarian starts shouting, shuts the library down, and a couple of people are set to investigate the cause of a person going to find a book that isn't there.
Code and explanation from Pointer Basics:
The dereference operation starts at
the pointer and follows its arrow over
to access its pointee. The goal may be
to look at the pointee state or to
change the pointee state. The
dereference operation on a pointer
only works if the pointer has a
pointee -- the pointee must be
allocated and the pointer must be set
to point to it. The most common error
in pointer code is forgetting to set
up the pointee. The most common
runtime crash because of that error in
the code is a failed dereference
operation. In Java the incorrect
dereference will be flagged politely
by the runtime system. In compiled
languages such as C, C++, and Pascal,
the incorrect dereference will
sometimes crash, and other times
corrupt memory in some subtle, random
way. Pointer bugs in compiled
languages can be difficult to track
down for this reason.
void main() {
int* x; // Allocate the pointer x
x = malloc(sizeof(int)); // Allocate an int pointee,
// and set x to point to it
*x = 42; // Dereference x to store 42 in its pointee
}
I think all the previous answers are wrong, as they
state that dereferencing means accessing the actual value.
Wikipedia gives the correct definition instead:
https://en.wikipedia.org/wiki/Dereference_operator
It operates on a pointer variable, and returns an l-value equivalent to the value at the pointer address. This is called "dereferencing" the pointer.
That said, we can dereference the pointer without ever
accessing the value it points to. For example:
char *p = NULL;
*p;
We dereferenced the NULL pointer without accessing its
value. Or we could do:
p1 = &(*p);
sz = sizeof(*p);
Again, dereferencing, but never accessing the value. Such code will NOT crash:
The crash happens when you actually access the data by an
invalid pointer. However, unfortunately, according the the
standard, dereferencing an invalid pointer is an undefined
behaviour (with a few exceptions), even if you don't try to
touch the actual data.
So in short: dereferencing the pointer means applying the
dereference operator to it. That operator just returns an
l-value for your future use.