I have just started using C++ and saw that there is a null value for pointers. I am curious as to what this is used for. It seems like it would be pointless to add a pointer to point to nothing.
Well, the null pointer value has the remarkable property that, despite it being a well-defined and unique constant value, the exact value depending on machine-architecture and ABI (on most modern ones all-bits-zero, not that it matters), it never points to (or just behind) an object.
This allows it to be used as a reliable error-indicator when a valid pointer is expected (functions might throw an exception or terminate execution instead), as well as a sentinel value, or to mark the absence of something optional.
On many implementations, accessing memory through a null pointer will reliably cause a hardware exception (some even trap on arithmetic), though on many others, especially those without paging and/or segmentation, it will not.
Generally it's a placeholder. If you just declare a pointer, int *a;, there's no guarantee what is in the pointer when you want to access it. So if your code may or may not set the pointer later, there's no way to tell if the pointer is valid or just pointing to garbage memory. But if you declare it as NULL, such as int *a = NULL; you can then check later to see if the pointer was set, like if(a == NULL).
Most of the time we assign a null value to a pointer during initialization so that we can check later whether it is still null or whether an address has been assigned to it.
It seems like it would be pointless to add a pointer to point to nothing.
No, it is not. Suppose you have a function returning an optional, dynamically allocated value. When you want to return "nothing", you return null. The caller can check for null and distinguish between two different cases: the return value is "nothing", or the return value is some valid, usable object.
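A minimal sketch of that pattern follows. The names `find_first_negative` and `first_negative_or` are hypothetical, and to keep it short the function returns a pointer into an existing array instead of allocating anything.

```cpp
#include <cstddef>

// Hypothetical helper: returns a pointer to the first negative element,
// or nullptr when there is none (the "nothing" case).
const int* find_first_negative(const int* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (data[i] < 0) {
            return &data[i];   // a valid, usable object
        }
    }
    return nullptr;            // nothing to return
}

// The caller tells the two cases apart with a null check
// before it ever dereferences the result.
int first_negative_or(const int* data, std::size_t count, int fallback) {
    const int* hit = find_first_negative(data, count);
    return hit != nullptr ? *hit : fallback;
}
```

The important part is the order in the caller: test against nullptr first, dereference only afterwards.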
The null value in C and C++ compares equal to 0. In C++, nullptr is different: it has its own type (std::nullptr_t) and converts implicitly only to pointer types, never to an integer. We assign a null value to a pointer variable for various reasons:
To check whether memory has been allocated for the pointer or not
To neutralize a dangling pointer so that it cannot cause side effects
To check whether a returned address is a valid address or not, etc.
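The second item in that list can be sketched as follows. `release` and `is_set` are hypothetical helpers, and the sketch assumes the int was allocated with plain `new`.

```cpp
// Hypothetical helper: frees the int and resets the caller's pointer,
// so later code can test it instead of using a dangling address.
void release(int*& p) {
    delete p;      // deleting a null pointer is a no-op, so calling this twice is safe
    p = nullptr;   // neutralize the now-dangling pointer
}

// Reports whether the pointer currently refers to something.
bool is_set(const int* p) {
    return p != nullptr;
}
```

After `release(p)`, a check such as `is_set(p)` gives a reliable answer, which it could not do if `p` still held the old address.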
Basically, pointers are just integers. The null pointer is a pointer with a value of 0. It doesn't strictly point to nothing, it points to absolute address 0, which generally isn't accessible to your program; dereferencing it causes a fault.
It's generally used as a flag value, so that you can, for example, use it to end a loop.
Update:
There seem to be a lot of people confused by this answer, which is, strictly, completely correct. See C11 (ISO/IEC 9899:201x) §6.3.2.3 Pointers, paragraph 3:
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
So, what's an address? It's a number n where 0 ≤ n ≤ max_address. And how do we represent such a number? Why, it's an integer, just like the standard says.
The C11 standard makes it clear that there's never anything to reference at address 0, because in some old pathologically non-portable code in BSD 4.2, you often saw code like this:
/* DON'T TRY THIS AT HOME */
int
main(){
    char target[100] ;
    char * tp = &target ;
    char * src = "This won't do what you think." ;
    void exit(int);
    while((*tp++ = *src++))
        ;
    exit(0);
}
This is still valid C:
$ gcc -o dumb dumb.c
dumb.c:6:12: warning: incompatible pointer types initializing 'char *' with an
expression of type 'char (*)[100]' [-Wincompatible-pointer-types]
char * tp = &target ;
^ ~~~~~~~
1 warning generated.
$
In 4.2BSD on a VAX, you could get away with that nonsense, because address 0 reliably contained the value 0, so the assignment evaluated to 0, which is of course FALSE.
Now, to demonstrate:
/* Very simple program dereferencing a NULL pointer. */
int
main() {
    int * a_pointer ;
    int a_value ;
    void exit(int); /* To avoid any #includes */
    a_pointer = ((void*)0);
    a_value = *a_pointer ;
    exit(0);
}
Here are the results:
$ gcc -o null null.c
$ ./null
Segmentation fault: 11
$
PS: I know what a pointer is and how to use one, but I am confused about one thing. I have already tried searching Stack Overflow for this question:
int *ptr = 20;       // why illegal or seg fault or crash?
printf("%i", *ptr);  // Seg Fault
printf("%i", ptr);   // Output -> 20
printf("%p", &ptr);  // Returns a valid address.
and found that directly assigning a value to a pointer, without initializing it with malloc or null, supposedly tells the compiler: "make space in memory to store an integer at the exact address given as the value", which in this case is 20. In other words, we are asking for an int to be placed in RAM at address 20, and by doing this we are touching system memory or illegal space.
But what I don't get is,
How the integer 20 can directly be referenced as a memory?
What happens when we do the same for float or char? for example float *ptr = 20.25
I tried directly converting c code to assembly with a website, for a legal and illegal pointer example, where I see that the same registers are called, same MOV operations are done, And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
Lastly, what exactly happens when we declare strings by doing char *ptr = "Hello"?
I have tried every possible way to understand this, but couldn't. Can you guys point me to the right direction? Thanks ...
How the integer 20 can directly be referenced as a memory?
Using my C++ compiler, it doesn't compile: I get this error instead:
temp.cpp:22:14: error: cannot initialize a variable of type 'int *' with an
rvalue of type 'int'
int * x = 20;
It does compile as C, albeit with this warning:
temp.c:12:11: warning: incompatible integer to pointer conversion initializing
'int *' with an expression of type 'int' [-Wint-conversion]
int * x = 20;
However, this does compile under both C and C++:
int * x = (int *) 20;
... it compiles because 20 is a well-formed memory-address (it specifies a memory location 20 bytes from the start of the process's memory space).
Note that on most operating systems it is not a usable memory-address though; most operating systems mark the first few pages of the address space as "unreadable/unwritable" specifically so that they can crash the process when someone tries to dereference a NULL-pointer (which otherwise would cause the process to read or write memory at a small offset from the start of the memory space)
What happens when we do the same for float or char? for example float *ptr = 20.25
Those won't compile, because floating point (or char) values don't make sense as memory addresses. In most environments, memory addresses are integer offsets from the start of the memory space, so if you want to specify one as a constant (which, by the way, you usually don't want to do unless you are working at a very low level, e.g. addressing DMA hardware directly in an embedded controller), it needs to be an integer constant.
And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
That's to be expected -- setting a pointer to a value doesn't implicitly make space for anything, it only sets the pointer to point at the memory-address the constant specified.
Lastly, what exactly happens when we declare strings by doing char *ptr = "Hello"?
In this case, the compiler recognizes that you have declared a string-constant and adds that string as a read-only array to the process's memory-space. Having done that, it can then set the pointer to point to the start of that array. Note that this behavior is specific to string constants, and doesn't carry over to other data types like int or float.
Also note that it is the declaration of the string constant that triggers the addition of that constant, not the setting of the pointer to point at that constant. For example, if you had this code:
const char * s1 = "Hello";
const char * s2 = "Hello";
printf("s1=%p s2=%p\n", s1, s2);
... you will see output something like this:
s1=0x104608f4e s2=0x104608f4e
... note that both pointers are pointing to the same memory location; since the two strings are identical and read-only, the compiler is free to save memory by only allocating a single instance of the string-data.
Contrariwise, if you did this:
const char * x = (const char *) 20;
... you'd run into the exact same problems you saw with your int * example.
Regarding C++:
int *ptr = 20 //why illegal or seg fault or crash?
This program is ill-formed. Compilers are not required to successfully compile this program, and they are required to inform you of the issue. There is no implicit conversion from an integer literal to a pointer (except for the literal zero).
How the integer 20 can directly be referenced as a memory?
It cannot be referenced in general. Only if the memory at the address has been allocated, can the pointer be meaningfully used. Furthermore, some usage such as reading the value of the pointed object require that an object exists within its lifetime at the pointed address. Otherwise the behaviour of the program is undefined.
What happens when we do the same for float or char?
Mostly the same as with a pointer to int. Unless there is an object of compatible type at the pointed-to address, the behaviour is undefined when you access the object by indirecting through the pointer. char is slightly different in that it is compatible with objects of all types. But even char cannot be used to read unallocated memory.
... no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
Well, you didn't tell C++ to allocate any memory, so why would there be any "space made" at the given address?
Lastly, What exactly happens when we declare strings by doing char *ptr = "Hello"?
The program will be ill-formed, since an array of const char doesn't implicitly convert to a pointer to non-const char. The exception is when the standard in use is pre-C++11: there the conversion exists, so the program is well-formed and you would get a deprecation warning instead.
int *ptr = 20 //why illegal or seg fault or crash?
It is illegal because C and C++ standards say so.
by directly assigning a value to a pointer without initializing with malloc or null, means that we are saying to the compiler, Hey CPU, Make a space in the memory to store an integer at the exact address given as value, which in this case 20.
Nothing like this happens. It is simply illegal, full stop.
How the integer 20 can directly be referenced as a memory?
This question is unclear.
What happens when we do the same for float or char? for example float *ptr = 20.25
It is just as illegal as the one above.
I tried directly converting c code to assembly with a website, for a legal and illegal pointer example, where I see that the same registers are called, same MOV operations are done, And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
There is normally no "MAKE SPACE AT given ADDRESS" instruction that can be controlled by a C or C++ program.
What exactly happens when we declare strings by doing char *ptr = "Hello"?
In C++, the implementation produces a diagnostic message. What happens next depends on the implementation. In C, the implementation does whatever magic is necessary to cause ptr to point at the first character of a null-terminated character array that contains "Hello".
Now for the questions you didn't ask.
What happens in this line
int *ptr = (int*)20;
The number 20 is interpreted as an address and converted, in an implementation-defined way, to a pointer of type int*. No space is allocated at this address. ptr is just made to point there.
How can I allocate an int worth of memory at address 20?
You cannot as far as C and C++ languages go.
I have come across a very confusing thing. I have made a local char array in a function, and return the array name, but the return value is null?
char* get_string(){
    char local[] ="hello world\n";
    cout<<"1"<<(int)local<<endl;//shows a reasonable value
    return local;
}
int main(){
    char* p = get_string();
    cout<<"2"<<(int) p<<endl;//shows 0
    return 0;
}
I know it is not good to use a local variable, because when the function returns, the stack part that the local variable occupies would be used by other function calls, but I think this should return the address of the first element of the array, should not be null. I'm very confused; any help would be appreciated.
I use the 32-bit version of Qt, and the compiler is MSVC2015 (I am at the baby stage with compilers; I'm not even sure that MSVC is the compiler's name).
--updated: I think this question is not a duplicate of "Returning an array using C". I know it is not valid to use automatic/local storage outside its scope; my question is why the return value becomes 0 despite the inappropriate use.
--ok, thank you, everyone. I think I found the answer. I see the assembly code of the function char* get_string(), the last part of the assembly code is this
0x44bce7 mov $0x0,%eax
0x44bcec leave
0x44bced ret
I think this is implementation defined, hard coded in the compiler, if I return the address of a local variable, then %eax or %rax is set to 0.
The C++ standard says (quoting the latest draft):
[basic.stc]
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values.
Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior.
Any other use of an invalid pointer value has implementation-defined behavior.
p contains an invalid pointer value, and printing the value of the pointer is included in "any other use", and thus the behaviour is implementation defined. In the observed case, the behaviour was to output 0.
Note to readers that in the code in the example there is no indirection through the invalid pointer and the behaviour is not undefined.
P.S. Converting pointer to int is not correct. int isn't guaranteed to be sufficiently large to represent all pointer values, and on most 64 bit systems, it isn't sufficiently large. Standard only specifies the behaviour for conversion to sufficiently large integer type. I would suggest converting to void* instead for this case.
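A small sketch of the suggestion above. `format_pointer` and `pointer_as_integer` are hypothetical helper names, and `std::uintptr_t` is an optional type, although common desktop implementations provide it.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Format a pointer for display without truncating it.
std::string format_pointer(const char* p) {
    std::ostringstream out;
    // Cast to const void*: streaming a char* would print the string it points to.
    out << static_cast<const void*>(p);
    return out.str();
}

// If an integer really is needed, std::uintptr_t (where provided) can hold
// a void* value and convert back to the same pointer.
std::uintptr_t pointer_as_integer(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
```

The exact text produced by `format_pointer` is implementation-defined (typically a hexadecimal address), so it is only suitable for diagnostics.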
This is a related question to the discussion around Example of error caused by UB of incrementing a NULL pointer
Suppose I define this data structure:
union UPtrMem
{
    void* p;
    char ach[sizeof(void*)];
}
UPtrMem u;
u.p = nullptr;
u.p++; // UB according to standards
u.ach[0]++; // why is this OK then??
p and ach share the same memory, so is merely the act of modifying a memory location (that happens to contain a pointer) UB? I would think it only gets undefined once you try to dereference the pointer.
This is still UB because
it's undefined behavior to read from the member of the union that wasn't most recently written.
(from here). So you have UB, regardless of the value of p. To conclude:
why is this OK then??
It is not.
Your example doesn't contain any UB, because you don't get that far: it's invalid code that just won't compile.
To have the kind of UB you're thinking about, the title's “UB when manipulating nullptr”, you need to have the code executed.
That doesn't happen when it doesn't compile.
Just in case the question is changed after I answer, which isn't uncommon with these kinds of apparently designed-to-trap-the-responder questions, this is the code presented as I'm writing this:
union UPtrMem
{
    void* p;
    char ach[sizeof(void*)];
}
UPtrMem u;
u.p = nullptr;
u.p++; // UB according to standards
u.ach[0]++; // why is this OK then??
Incrementing a void* is just invalid, not a supported operation, and won't compile.
The reason why the standard makes incrementing a null pointer undefined is because it is not always the case that a null pointer contains an arithmetically meaningful value like 0. It could contain a specific bit pattern that indicates non-addressable memory to the CPU.
Your example has other problems too.
When you increment an allocated pointer, it adds the size of the thing it points to to its value.
So on a 32-bit computer an int* will likely advance 4 bytes (sizeof(int)) when you add 1 to it.
The problem with void* is that the compiler has no size information and so cannot know how far to increment its value.
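To make the step size visible, here is a small sketch. `step_in_bytes` is a hypothetical helper that measures how far `p + 1` is from `p` in bytes; it assumes `p` points at an array element, so that `p + 1` is a valid pointer.

```cpp
#include <cstddef>

// Distance in bytes between p and p + 1 for a pointer to T.
// The compiler scales the "+ 1" by sizeof(T), which is exactly
// the information it lacks for void*.
template <typename T>
std::ptrdiff_t step_in_bytes(const T* p) {
    const char* here = reinterpret_cast<const char*>(p);
    const char* next = reinterpret_cast<const char*>(p + 1);
    return next - here;
}
```

For an `int` array the result equals `sizeof(int)`, and for a `double` array it equals `sizeof(double)`. Instantiating the template with `void` would fail to compile for the reason given above.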
In your example you then do this:
u.ach[0]++;
That doesn't increment a pointer at all; it increments whatever char value is contained in the first element of the char array. This, of course, is undefined, so even though it works, you cannot rely on it having any specific value.
Seems to me u.p++; isn't even valid because void has no size, so there is nothing to increment by. But u.ach[0]++; is valid because you're incrementing a char.
edit yes it takes up space in the structure... but what it points to has no size... what would it increment by?
I am wondering how would I deal with a call to a function when an integer is passed into a function that accepts a pointer? In my case hasPlayedInTeam() accepts a pointer to Team, however, received an int. This causes the Q_ASSERT to hang.
In addition, is my problem also known as a null pointer? My professor has used that term several times in lecture, but I am not sure what he was referring to.
//main.cpp
Person p1("Jack", 22, "UCLA");
Q_ASSERT(p1.hasPlayedInTeam(0) == false);
//person.cpp
bool Person::hasPlayedInTeam(Team *pTeam) {
    bool temp = false;
    foreach (Team* team, teamList) {
        if (team->getName() == pTeam->getName()) {
            temp = true;
        }
    }
    return temp;
}
In your call:
p1.hasPlayedInTeam(0)
the integer literal 0 is converted to a null pointer. So you are not actually "receiving" an integer: you are passing an integer literal, and the compiler implicitly converts it to the null pointer (given the definition of NULL).
I think you can fix the definition of hasPlayedInTeam by either asserting that its argument is not NULL, or by returning a default value when NULL is passed in:
//person.cpp
#include <cassert>
bool Person::hasPlayedInTeam(Team *pTeam) {
    assert(pTeam != NULL); //-- in this case, your program will assert and halt
    // ... rest of the function unchanged ...
}
or:
//person.cpp
bool Person::hasPlayedInTeam(Team *pTeam) {
    if (pTeam == NULL)
        return false; //-- in this case, your program will not assert and will continue with a sensible (it actually depends on how you define "sensible") return value
    // ... rest of the function unchanged ...
}
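For reference, here is a self-contained sketch of the second variant. The `Team` and `Person` types below are simplified, hypothetical stand-ins for the question's Qt classes (a plain `name` member instead of `getName()`, and `std::vector` instead of the Qt container), so treat it as an illustration and not as the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's classes.
struct Team {
    std::string name;
};

struct Person {
    std::vector<const Team*> teamList;

    bool hasPlayedInTeam(const Team* pTeam) const {
        if (pTeam == nullptr) {
            return false;   // nobody has played in "no team"
        }
        for (const Team* team : teamList) {
            // Guard the stored pointers too before dereferencing them.
            if (team != nullptr && team->name == pTeam->name) {
                return true;
            }
        }
        return false;
    }
};
```

With this guard in place, a call such as `p1.hasPlayedInTeam(0)` returns false instead of dereferencing a null pointer.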
Yes, it sounds like your problem is a null pointer. A null pointer means that you have a pointer which isn't actually pointing to anything:
Team* team = NULL;
It so happens that in C++ NULL is a macro for the integer 0. Stroustrup has some comments on which one he prefers to use in code.
Function hasPlayedInTeam() looks for the argument of type "Team" whereas you are passing the argument of type "integer" which is wrong....
Yes, I think that you are referring to a null pointer in that situation.
To treat the case when an int is passed, you can overload the function and make it behave as you want it to do, when an int is passed.
In C++ there is NULL which is defined as 0 (in some standard header file, cstddef I think) so yes the integer you are passing is the null pointer. 0 is the only (as far as I know) integer that will automatically (implicitly) be converted to a pointer of whatever type is needed.
In practice, I think most people prefer to use NULL instead of 0 for the null pointer.
I'm not sure why it is hanging however, dereferencing the NULL pointer (in your statement pTeam->getName()) should cause the program to crash if you pass it NULL, not just hang.
Unfortunately the null pointer literal is one of the confused parts of the language. Let me try to recap:
For any type there is the concept of "pointer to that type". For example you can have integers and pointer to integers (int x; int *y;), doubles and pointer to doubles (double x; double *y;), Person and pointer to Person (Person x,*y;). If X is a type then "pointer to X" is a type itself and therefore you can even find pointers to pointers to integers (int **x;) or pointers to pointers to pointers to chars (char ***x;).
For any pointer type there is a null pointer value. It's a value that doesn't really point to an object, so it's an error to try to dereference it ("dereferencing" a pointer means reading or writing the object that is being pointed to). Note that the C++ language doesn't guarantee that you will get a message or a crash when you use a null pointer to try to get to a non-existent pointed-to object, only that you should not do it in a program because the consequences are unpredictable. The language simply assumes you are not going to make this kind of error.
How is the null pointer expressed in a program? Here comes the tricky part. For reasons that are beyond comprehension the C++ language uses a strange rule: if you get any constant integer expression with value zero then that can be (if needed) considered to be a null pointer for any type.
The last rule is extremely strange and illogical and for example means that
char *x = 0; // x is a pointer to char with the null pointer value (ok)
char *y = (1-1); // exactly the same (what??)
char *z = !! !! !! !! !! !!
!!! !! !! !! !! !!
!!!! !! !! !! !! !!
!! !! !! !! !! !! !!
!! !!!! !! !! !! !!
!! !!! !! !! !! !!
!! !! !!!!!! !!!!!!! !!!!!!1; // Same again (!)
and this is true for any pointer type.
Why is the standard mandating that any expression and not just a zero literal can be considered the null pointer value? Really no idea.
Apparently Stroustrup also found the thing amusing instead of disgusting like it should be (the last example with the "NULL" text written with an odd number of negations is present on "The C++ Programming Language" book).
Also note that there is a NULL symbol defined in standard headers that provide a valid definition for a null pointer value for any type. In "C" a valid definition could have been (void *)0 but this is not valid in C++ because void pointers cannot be converted implicitly to other pointer types like they do in "C".
Note also that you may find in literature the term NUL (only one L) but this is the ASCII character with code 0 (represented in C/C++ with '\0') and is a logically distinct thing from a pointer or an integer number.
Unfortunately in C++ characters are integers too and therefore for example '\0' is a valid null pointer value and the same goes for ('A'-'A') (they are integer constant expressions evaluating to zero).
C++11 increases complexity of these already questionable rules with std::nullptr_t and nullptr. I cannot explain those rules because I didn't understand them myself (and I'm not yet 100% sure I want to understand them).
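One practical difference can still be shown with a small sketch. The two `which` overloads below are hypothetical and exist only to reveal which one a call selects.

```cpp
#include <string>

// Two hypothetical overloads, used only to show which one a call picks.
std::string which(int)         { return "int"; }
std::string which(const char*) { return "pointer"; }

// which(0)       selects the int overload, because the literal 0 is an int first.
// which(nullptr) selects the pointer overload, because nullptr has type
//                std::nullptr_t, which converts to pointer types but not to int.
```

So `nullptr` lets a call say "null pointer" unambiguously, whereas `0` only becomes a pointer when a pointer is the sole candidate.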
What is the meaning of
*(int *)0 = 0;
It does compile successfully
It has no meaning. That's an error. It's parsed as this
(((int)0) = 0)
Thus, it tries to assign to an rvalue. In this case, the left side is a cast of 0 to int (it's an int already, anyway). The result of a cast to something that is not a reference is always an rvalue, and you try to assign 0 to that. What rvalues lack is object identity. The following would work:
int a;
(int&)a = 0;
Of course, you could equally well write it as the following
int a = 0;
Update: Question was badly formatted. The actual code was this
*(int*)0 = 0
Well, now it is an lvalue. But a fundamental invariant is broken. The Standard says
An lvalue refers to an object or function
The lvalue you assign to is neither an object nor a function. The Standard even explicitly says that dereferencing a null-pointer ((int*)0 creates such a null pointer) is undefined behavior. A program usually will crash on an attempt to write to such a dereferenced "object". "Usually", because the act of dereferencing is already declared undefined by C++.
Also, note that the above is not the same as the below:
int n = 0;
*(int*)n = 0;
While the above writes to something where certainly no object is located, this one will write to something that results from reinterpreting n to a pointer. The mapping to the pointer value is implementation defined, but most compilers will just create a pointer referring to address zero here. Some systems may keep data on that location, so this one may have more chances to stay alive - depending on your system. This one is not undefined behavior necessarily, but depends on the compiler and runtime-environment it is invoked in.
If you understand the difference between the above dereference of a null pointer (only constant expressions valued 0 converted to pointers yield null pointers!) and the below dereference of a reinterpreted zero-valued integer, I think you have learned something important.
It will usually cause an access violation at runtime. The following is done: first 0 is cast to an int * and that yields a null pointer. Then a value 0 is written to that address (null address) - that causes undefined behaviour, usually an access violation.
Effectively it is this code:
int* address = reinterpret_cast<int*>( 0 );
*address = 0;
It's a compilation error. You can't modify a non-lvalue.
It puts a zero on address zero. On some systems you can do this. Most MMU-based systems will not allow this in run-time. I once saw an embedded OS writing to address 0 when performing time(NULL).
There is no valid lvalue in that operation, so it shouldn't compile.
The left-hand side of an assignment must be... err... assignable.