What does directly setting a pointer with a variable mean? [closed] - c++

PS, I know what a pointer is and how to use one, but I am confused about one thing. I have already tried searching Stack Overflow for this question:
int *ptr = 20 //why illegal or seg fault or crash?
printf("%i", *ptr) // Seg Fault
printf("%i", ptr) // Output -> 20
printf("%p", &ptr) // Returns a valid address.
and found that by directly assigning a value to a pointer without initializing it with malloc or NULL, we are telling the compiler: "Hey CPU, make space in memory to store an integer at the exact address given as the value, which in this case is 20." So we are basically telling the compiler to make a place for an int in RAM at address 20. By doing this we are touching system memory or illegal space.
But what I don't get is,
How the integer 20 can directly be referenced as a memory?
What happens when we do the same for float or char? for example float *ptr = 20.25
I tried directly converting c code to assembly with a website, for a legal and illegal pointer example, where I see that the same registers are called, same MOV operations are done, And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
Lastly, What exactly happens when we declare strings by doing char *ptr = "Hello"?
I have tried every possible way to understand this, but couldn't. Can you guys point me to the right direction? Thanks ...

How the integer 20 can directly be referenced as a memory?
Using my C++ compiler, it doesn't compile: I get this error instead:
temp.cpp:22:14: error: cannot initialize a variable of type 'int *' with an
rvalue of type 'int'
int * x = 20;
It does compile as C, albeit with this warning:
temp.c:12:11: warning: incompatible integer to pointer conversion initializing
'int *' with an expression of type 'int' [-Wint-conversion]
int * x = 20;
However, this does compile under both C and C++:
int * x = (int *) 20;
... it compiles because the explicit cast tells the compiler to treat 20 as a memory address, and 20 is a well-formed one (on typical platforms it specifies the memory location 20 bytes from the start of the process's memory space).
Note that on most operating systems it is not a usable memory-address though; most operating systems mark the first few pages of the address space as "unreadable/unwritable" specifically so that they can crash the process when someone tries to dereference a NULL-pointer (which otherwise would cause the process to read or write memory at a small offset from the start of the memory space)
What happens when we do the same for float or char? for example float *ptr = 20.25
A floating-point value such as 20.25 won't compile, even with a cast, because floating-point values don't make sense as memory addresses. (A char constant is an integer under the hood, so it needs the same explicit cast that 20 does.) In most environments, memory addresses are integer offsets from the start of the memory space, so if you want to specify one as a constant (which, by the way, you usually don't want to do unless you are working at a very low level, e.g. addressing DMA hardware directly in an embedded controller), it needs to be an integer constant.
And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
That's to be expected -- setting a pointer to a value doesn't implicitly make space for anything, it only sets the pointer to point at the memory-address the constant specified.
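As a contrast to the (int *) 20 case, here is a minimal sketch of two ways to give a pointer something real to point at: the address of an object that already exists, or storage that is requested explicitly. The function names are made up for illustration.

```cpp
#include <memory>

// Points at an object that already exists, so dereferencing is fine.
int read_through_address_of_local() {
    int value = 20;
    int* ptr = &value;  // ptr holds the address of value, not the number 20
    return *ptr;
}

// Explicitly asks for storage for one int; this is the step that "makes space".
int read_through_heap_allocation() {
    std::unique_ptr<int> ptr(new int(20));
    return *ptr;  // the storage is released when ptr goes out of scope
}
```

In both functions the pointer is only ever set to an address that is known to hold an int, which is what the cast-from-20 version lacks.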
Lastly, What exactly happens when we declare strings by doing char *ptr = "Hello"?
In this case, the compiler recognizes that you have declared a string-constant and adds that string as a read-only array to the process's memory-space. Having done that, it can then set the pointer to point to the start of that array. Note that this behavior is specific to string constants, and doesn't carry over to other data types like int or float.
Also note that it is the declaration of the string constant that triggers the addition of that constant, not the setting of the pointer to point at that constant. For example, if you had this code:
const char * s1 = "Hello";
const char * s2 = "Hello";
printf("s1=%p s2=%p\n", s1, s2);
... you will see output something like this:
s1=0x104608f4e s2=0x104608f4e
... note that both pointers are pointing to the same memory location; since the two strings are identical and read-only, the compiler is free to save memory by only allocating a single instance of the string-data.
Contrariwise, if you did this:
const char * x = (const char *) 20;
... you'd run into the exact same problems you saw with your int * example.
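A small sketch of the string-literal behaviour described above, assuming nothing beyond the standard library; greeting and greeting_length are illustrative names. Because the literal is stored for the lifetime of the program, a pointer to it remains usable after the function returns. Whether two identical literals share one address is up to the compiler, so the sketch does not rely on that.

```cpp
#include <cstring>

// The literal lives in static, read-only storage, so the returned
// pointer stays valid after the function returns.
const char* greeting() {
    return "Hello";
}

// Reads the characters through the pointer; no copy is made.
std::size_t greeting_length() {
    return std::strlen(greeting());
}
```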

Regarding C++:
int *ptr = 20 //why illegal or seg fault or crash?
This program is ill-formed. Compilers are not required to successfully compile this program, and they are required to inform you of the issue. There is no implicit conversion from integer literal to pointer (except the literal zero).
How the integer 20 can directly be referenced as a memory?
It cannot be referenced in general. Only if the memory at the address has been allocated can the pointer be meaningfully used. Furthermore, some uses, such as reading the value of the pointed-to object, require that an object exists within its lifetime at the pointed-to address. Otherwise the behaviour of the program is undefined.
What happens when we do the same for float or char?
Mostly the same as with pointer to int. Unless there is an object of compatible type at the pointed address, the behaviour is undefined when you access the object by indirecting through the pointer. char is slightly different in that it is compatible with objects of all types. But even char cannot be used to read unallocated memory.
... no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
Well, you didn't tell C++ to allocate any memory, so why would there be any "space made" at the given address?
Lastly, What exactly happens when we declare strings by doing char *ptr = "Hello"?
The program will be ill-formed, since an array of const char doesn't implicitly convert to a pointer to non-const char. The exception is pre-C++11 code: there the program is well-formed because such a conversion existed, and you would get a deprecation warning instead.
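For reference, a sketch of the two forms that are well-formed in current C++: a pointer to const char for read-only access, and an array initialized from the literal when the characters need to be modified. The function names are illustrative only.

```cpp
#include <string>

// The pointee type matches the literal's const char elements.
const char* read_only_view() {
    const char* ptr = "Hello";
    return ptr;
}

// Initializing an array copies the literal into writable storage,
// so changing a character is allowed here.
std::string modified_copy() {
    char buffer[] = "Hello";
    buffer[0] = 'J';
    return std::string(buffer);
}
```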

int *ptr = 20 //why illegal or seg fault or crash?
It is illegal because C and C++ standards say so.
by directly assigning a value to a pointer without initializing with malloc or null, means that we are saying to the compiler, Hey CPU, Make a space in the memory to store an integer at the exact address given as value, which in this case 20.
Nothing like this happens. It is simply illegal, full stop.
How the integer 20 can directly be referenced as a memory?
This question is unclear.
What happens when we do the same for float or char? for example float *ptr = 20.25
It is just as illegal as the one above.
I tried directly converting c code to assembly with a website, for a legal and illegal pointer example, where I see that the same registers are called, same MOV operations are done, And no explicit "MAKE SPACE AT given ADDRESS" instructions were set.
There is normally no "MAKE SPACE AT given ADDRESS" instruction that can be controlled by a C or C++ program.
What exactly happens when we declare strings by doing char *ptr = "Hello"?
In C++, the implementation produces a diagnostic message. What happens next depends on the implementation. In C, the implementation does whatever magic is necessary to cause ptr to point at the first character of a null-terminated character array that contains "Hello".
Now for the questions you didn't ask.
What happens in this line
int *ptr = (int*)20;
The number 20 is interpreted as an address and converted, in an implementation-defined way, to a pointer of type int*. No space is allocated at this address. ptr is just made to point there.
How can I allocate an int worth of memory at address 20?
You cannot as far as C and C++ languages go.
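What the language does allow is constructing an int at an address you already own. The sketch below uses placement new on a static buffer; it is a related technique, not a way to claim an arbitrary address such as 20. The names storage and construct_in_storage are hypothetical.

```cpp
#include <new>

// Storage owned by the program, sized and aligned for one int.
alignas(int) unsigned char storage[sizeof(int)];

// Constructs an int inside storage and returns a pointer to it.
// No memory is allocated; the object is placed in the existing buffer.
int* construct_in_storage(int value) {
    return new (storage) int(value);
}
```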

Related

I have a local char array in a function — when I return the array name, why is the return value null? [duplicate]

This question already has answers here:
Returning an array using C
I have come across a very confusing thing. I have made a local char array in a function, and return the array name, but the return value is null?
char* get_string(){
char local[] ="hello world\n";
cout<<"1"<<(int)local<<endl;//shows a reasonable value
return local;
}
int main(){
char* p = get_string();
cout<<"2"<<(int) p<<endl;//shows 0
return 0;
}
I know it is not good to use a local variable, because when the function returns, the stack part that the local variable occupies would be used by other function calls, but I think this should return the address of the first element of the array, should not be null. I'm very confused; any help would be appreciated.
I use QT 32 version, compiler is MSVC2015 (I am at baby stage about compiler; not even sure that MSVC is compiler name).
--updated: I think this question is not a duplicate of "Returning an array using C". I know it is not valid to use automatic/local storage outside its scope; my question is why the return value becomes 0 despite the inappropriate use.
--ok, thank you, everyone. I think I found the answer. I see the assembly code of the function char* get_string(), the last part of the assembly code is this
0x44bce7 mov $0x0,%eax
0x44bcec leave
0x44bced ret
I think this is implementation defined, hard coded in the compiler, if I return the address of a local variable, then %eax or %rax is set to 0.
The C++ standard says (quoting the latest draft):
[basic.stc]
When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values.
Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior.
Any other use of an invalid pointer value has implementation-defined behavior.
p contains an invalid pointer value, and printing the value of the pointer is included in "any other use", and thus the behaviour is implementation defined. In the observed case, the behaviour was to output 0.
Note to readers that in the code in the example there is no indirection through the invalid pointer and the behaviour is not undefined.
P.S. Converting pointer to int is not correct. int isn't guaranteed to be sufficiently large to represent all pointer values, and on most 64 bit systems, it isn't sufficiently large. Standard only specifies the behaviour for conversion to sufficiently large integer type. I would suggest converting to void* instead for this case.
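A hedged sketch of how the question's function could avoid the invalid pointer altogether: return the characters by value, and print pointers through void* rather than narrowing them to int. format_pointer is an illustrative helper, and the text it produces is implementation-defined.

```cpp
#include <sstream>
#include <string>

// Returning by value copies the characters out before local is destroyed.
std::string get_string() {
    char local[] = "hello world\n";
    return std::string(local);
}

// Formats a pointer without narrowing it to int.
std::string format_pointer(const void* p) {
    std::ostringstream out;
    out << p;  // selects the operator<< overload for const void*
    return out.str();
}
```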

What are null pointers used for [closed]

I have just started using C++ and saw that there is a null value for pointers. I am curious as to what this is used for. It seems like it would be pointless to add a pointer to point to nothing.
Well, the null pointer value has the remarkable property that, despite it being a well-defined and unique constant value, the exact value depending on machine-architecture and ABI (on most modern ones all-bits-zero, not that it matters), it never points to (or just behind) an object.
This allows it to be used as a reliable error-indicator when a valid pointer is expected (functions might throw an exception or terminate execution instead), as well as a sentinel value, or to mark the absence of something optional.
On many implementations accessing memory through a null pointer will reliably cause a hardware exception (some even trap on arithmetic), though on many others, especially those without paging and/or segmentation, it will not.
Generally it's a placeholder. If you just declare a pointer, int *a;, there's no guarantee what is in the pointer when you want to access it. So if your code may or may not set the pointer later, there's no way to tell if the pointer is valid or just pointing to garbage memory. But if you declare it as NULL, such as int *a = NULL; you can then check later to see if the pointer was set, like if(a == NULL).
Most of the time we assign a null value to a pointer during initialization so that we can later check whether it is still null or whether an address has been assigned to it.
It seems like it would be pointless to add a pointer to point to nothing.
No, it is not. Suppose you have a function returning optional dynamically allocated value. When you want to return "nothing" you return null. The caller can check for null and distinguish between 2 different cases: when the return value is "nothing" and when the return value is some valid usable object.
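A minimal sketch of that pattern, with a hypothetical function find_first_negative: it returns a pointer to an element when there is something to return and nullptr when there is not, so the caller can tell the two cases apart.

```cpp
#include <vector>

// Returns a pointer to the first negative element, or nullptr if none exists.
const int* find_first_negative(const std::vector<int>& values) {
    for (const int& v : values) {
        if (v < 0) {
            return &v;  // points into the caller's vector
        }
    }
    return nullptr;  // "nothing" is signalled with a null pointer
}
```

The caller checks the result against nullptr before dereferencing it.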
The null value in C and C++ is equal to 0. But nullptr in C++ is different from it: nullptr is always a pointer type in C++. We assign a null value to a pointer variable for various reasons:
To check whether a memory has been allocated to the pointer or not
To neutralize a dangling pointer so that it should not create any side effect
To check whether a return address is a valid address or not etc.
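Two of these points can be sketched briefly. The overloads below (named which purely for illustration) show that nullptr selects the pointer overload while the literal 0 selects the int one, and release shows one way to neutralize a pointer after freeing what it pointed to.

```cpp
#include <string>

// Overload set used to see how 0 and nullptr are treated.
std::string which(int) { return "int"; }
std::string which(int*) { return "pointer"; }

// Frees the int and nulls the caller's pointer, so later checks
// can see that nothing is attached any more.
void release(int*& p) {
    delete p;     // deleting a null pointer is a no-op
    p = nullptr;
}
```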
Basically, pointers are just integers. The null pointer is a pointer with a value of 0. It doesn't strictly point to nothing, it points to absolute address 0, which generally isn't accessible to your program; dereferencing it causes a fault.
It's generally used as a flag value, so that you can, for example, use it to end a loop.
Update:
There seem to be a lot of people confused by this answer, which is, strictly, completely correct. See C11(ISO/IEC 9899:201x) §6.3.2.3 Pointers Section 3:
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
So, what's an address? It's a number n where 0 ≤ n ≤ max_address. And how do we represent such a number? Why, it's an integer, just like the standard says.
The C11 standard makes it clear that there's never anything to reference at address 0, because in some old pathologically non-portable code in BSD 4.2, you often saw code like this:
/* DON'T TRY THIS AT HOME */
int
main(){
char target[100] ;
char * tp = &target ;
char * src = "This won't do what you think." ;
void exit(int);
while((*tp++ = *src++))
;
exit(0);
}
This is still valid C:
$ gcc -o dumb dumb.c
dumb.c:6:12: warning: incompatible pointer types initializing 'char *' with an
expression of type 'char (*)[100]' [-Wincompatible-pointer-types]
char * tp = &target ;
^ ~~~~~~~
1 warning generated.
$
In 4.2BSD on a VAX, you could get away with that nonsense, because address 0 reliably contained the value 0, so the assignment evaluated to 0, which is of course FALSE.
Now, to demonstrate:
/* Very simple program dereferencing a NULL pointer. */
int
main() {
int * a_pointer ;
int a_value ;
void exit(int); /* To avoid any #includes */
a_pointer = ((void*)0);
a_value = *a_pointer ;
exit(0);
}
Here's the results:
$ gcc -o null null.c
$ ./null
Segmentation fault: 11
$

Is the name of a two dimensional array address of the address of its first element in C++?

When implementing a two dimensional array like this:
int A[3][3];
these hold: A=&A[0], at the same time A[0]=&A[0][0]. So, A=&(&A[0][0]), what basically says that A is the address of the address of the first element of the array, which is not quite true. What is my mistake here? Does A really decay to a pointer to a pointer?
Your mistake is that you have an incorrect understanding of the relationship between arrays and pointers. An array is not a pointer. It is an array. However, an array is implicitly convertible to a pointer to its own first element. So, while this expression does evaluate to true:
A == &A[0]
It is not correct to say that A is &A[0]. The conversion does not happen in all expressions. For example:
&A
This does not take the address of the address of the first element of A (that doesn't even make sense). It takes the actual address of A, whose type is int[3][3]. So the type of &A is int(*)[3][3], read as "pointer to array of 3 arrays of 3 ints".
The primary difference between &A and &A[0] is that if you add 1 to &A, you will get an address that is 3 * 3 * sizeof(int) bytes away, while if you add 1 to &A[0], you will get a pointer that is only 3 * sizeof(int) bytes away.
With all this in mind, you should be able to see where your mistake is. A[0] is not &A[0][0], but it is implicitly convertible to it. However, like all conversions, this results in a temporary, which you cannot take the address of. So the expression &(&A[0][0]) doesn't even make sense.
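The types and step sizes described in this answer can be checked by the compiler. The sketch below uses static_assert for the types and two illustrative helper functions for the byte distances.

```cpp
#include <cstddef>
#include <type_traits>

int A[3][3];

// The three address expressions have three different types.
static_assert(std::is_same<decltype(&A), int (*)[3][3]>::value,
              "&A points to the whole 3x3 array");
static_assert(std::is_same<decltype(&A[0]), int (*)[3]>::value,
              "&A[0] points to one row of 3 ints");
static_assert(std::is_same<decltype(&A[0][0]), int*>::value,
              "&A[0][0] points to a single int");

// Bytes skipped when 1 is added to &A.
std::size_t bytes_step_whole_array() {
    const char* first = reinterpret_cast<const char*>(&A);
    const char* next = reinterpret_cast<const char*>(&A + 1);
    return static_cast<std::size_t>(next - first);
}

// Bytes skipped when 1 is added to &A[0].
std::size_t bytes_step_row() {
    const char* first = reinterpret_cast<const char*>(&A[0]);
    const char* next = reinterpret_cast<const char*>(&A[0] + 1);
    return static_cast<std::size_t>(next - first);
}
```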
Because of reactions on my previous answer I did some research to learn more on whatever was wrong in my explanation.
Found a rather elaborate explanation of the topic here :
http://eli.thegreenplace.net/2009/10/21/are-pointers-and-arrays-equivalent-in-c
I'll try to summarize :
if you have following :
char array_place[100] = "don't panic";
char* ptr_place = "don't panic";
the way that this is represented in memory is entirely different.
whereas ptr_place is a real pointer, array_place is just a label.
char a = array_place[7];
char b = ptr_place[7];
The semantics of arrays in C dictate that the array name is the address of the first element of the array, which is not the same as saying that it is a pointer. Hence in the assignment to a, the 8th character of the array is taken by offsetting the value of array_place by 7, and moving the contents pointed to by the resulting address into the al register, and later into a.
The semantics of pointers are quite different. A pointer is just a regular variable that happens to hold the address of another variable inside. Therefore, to actually compute the offset of the 8th character of the string, the CPU will first copy the value of the pointer into a register and only then increment it. This takes another instruction [1].
This point is frequently ignored by programmers who don't actually hack on compilers. A variable in C is just a convenient, alphanumeric pseudonym of a memory location. Were we writing assembly code, we would just create a label in some memory location and then access this label instead of always hard-coding the memory value - and this is what the compiler does.
Well, actually the address is not hard-coded in an absolute way because of loading and relocation issues, but for the sake of this discussion we don't have to get into these details.
A label is something the compiler assigns at compile time. Hence the great difference between arrays and pointers. This also explains why sizeof(array_place) gives the full size of the array, whereas sizeof applied to a pointer gives the size of a pointer.
I must say, I was not aware of these subtle differences myself, and I have been coding for quite a long time in C and C++ and with arrays too.
Nevertheless, the name of the array is the address of the first element of the array, so you can create a pointer and initialise it with that value:
char* p = array_place;
p will point to the memory location where the characters are.
to conclude :
There is one difference between an array name and a pointer that must be kept in mind. A pointer is a variable, so p=array_place and p++ are legal. But an array name is not a variable; constructions like array_place=p and array_place++ are illegal. That I did know ;-)
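The differences summarized here can be sketched in C++ as follows. Note one assumption: ptr_place is declared const char* because current C++ does not let a string literal initialize a plain char*; the helper function names are illustrative.

```cpp
#include <cstddef>

char array_place[100] = "don't panic";
const char* ptr_place = "don't panic";

// sizeof sees the whole array, but only the pointer itself for ptr_place.
static_assert(sizeof(array_place) == 100, "array size is the full 100 bytes");
static_assert(sizeof(ptr_place) == sizeof(const char*), "pointer size only");

// Both forms index the same way at the source level.
char eighth_via_array() { return array_place[7]; }
char eighth_via_pointer() { return ptr_place[7]; }

// The array name converts to a pointer to its first element.
const char* first_element() {
    const char* p = array_place;
    return p;
}
```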

Why is it allowed to cast a pointer to a reference?

Originally being the topic of this question, it emerged that the OP just overlooked the dereference. Meanwhile, this answer got me and some others thinking - why is it allowed to cast a pointer to a reference with a C-style cast or reinterpret_cast?
int main() {
char c = 'A';
char* pc = &c;
char& c1 = (char&)pc;
char& c2 = reinterpret_cast<char&>(pc);
}
The above code compiles without any warning or error (regarding the cast) on Visual Studio while GCC will only give you a warning, as shown here.
My first thought was that the pointer somehow automagically gets dereferenced (I work with MSVC normally, so I didn't get the warning GCC shows), and tried the following:
#include <iostream>
int main() {
char c = 'A';
char* pc = &c;
char& c1 = (char&)pc;
std::cout << *pc << "\n";
c1 = 'B';
std::cout << *pc << "\n";
}
With the very interesting output shown here. So it seems that you are accessing the pointed-to variable, but at the same time, you are not.
Ideas? Explanations? Standard quotes?
Well, that's the purpose of reinterpret_cast! As the name suggests, the purpose of that cast is to reinterpret a memory region as a value of another type. For this reason, using reinterpret_cast you can always cast an lvalue of one type to a reference of another type.
This is described in 5.2.10/10 of the language specification. It also says there that reinterpret_cast<T&>(x) is the same thing as *reinterpret_cast<T*>(&x).
The fact that you are casting a pointer in this case is totally and completely unimportant. No, the pointer does not get automatically dereferenced (taking into account the *reinterpret_cast<T*>(&x) interpretation, one might even say that the opposite is true: the address of that pointer is automatically taken). The pointer in this case serves as just "some variable that occupies some region in memory". The type of that variable makes no difference whatsoever. It can be a double, a pointer, an int or any other lvalue. The variable is simply treated as memory region that you reinterpret as another type.
As for the C-style cast - it just gets interpreted as reinterpret_cast in this context, so the above immediately applies to it.
In your second example you attached reference c1 to the memory occupied by the pointer variable pc. When you did c1 = 'B', you forcefully wrote the value 'B' into that memory, thus destroying the original pointer value (by overwriting one byte of that value). Now the destroyed pointer points to some unpredictable location. Later you tried to dereference that destroyed pointer. What happens in such a case is a matter of pure luck. The program might crash, since the pointer is generally non-dereferenceable. Or you might get lucky and make your pointer point to some unpredictable yet valid location. In that case your program will output something. No one knows what it will output, and there's no meaning in it whatsoever.
One can rewrite your second program into an equivalent program without references
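#include <iostream>
int main(){
char* pc = new char('A');
char* c = (char *) &pc; // c points at the first byte of the pointer variable pc itself
std::cout << *pc << "\n";
*c = 'B'; // overwrites one byte of the address stored in pc
std::cout << *pc << "\n"; // undefined behaviour: pc may no longer point at valid memory
}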
From the practical point of view, on a little-endian platform your code would overwrite the least-significant byte of the pointer. Such a modification will not make the pointer to point too far away from its original location. So, the code is more likely to print something instead of crashing. On a big-endian platform your code would destroy the most-significant byte of the pointer, thus throwing it wildly to point to a totally different location, thus making your program more likely to crash.
It took me a while to grok it, but I think I finally got it.
The C++ standard specifies that a cast reinterpret_cast<U&>(t) is equivalent to *reinterpret_cast<U*>(&t).
In our case, U is char, and t is char*.
Expanding those, we see that the following happens:
we take the address of the argument to the cast, yielding a value of type char**.
we reinterpret_cast this value to char*
we dereference the result, yielding a char lvalue.
reinterpret_cast allows you to cast from any pointer type to any other pointer type. And so, a cast from char** to char* is well-formed.
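The equivalence can be observed without writing through the reference. This sketch (the function name is made up) only compares addresses: the reference produced by the cast refers to the storage of pc itself, not to the char that pc points at.

```cpp
// Returns true when reinterpret_cast<char&>(pc) names the first byte of
// the pointer variable pc, and that byte is not the pointed-to char c.
bool reference_cast_aliases_pointer_storage() {
    char c = 'A';
    char* pc = &c;
    char& viaRef = reinterpret_cast<char&>(pc);   // the cast from the question
    char* viaPtr = reinterpret_cast<char*>(&pc);  // the expanded form, before dereferencing
    return &viaRef == viaPtr && viaPtr != &c;
}
```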
I'll try to explain this using my ingrained intuition about references and pointers rather than relying on the language of the standard.
C didn't have reference types, it only had values and pointer types (addresses) - since, physically in memory, we only have values and addresses.
In C++ we've added references to the syntax, but you can think of them as a kind of syntactic sugar - there is no special data structure or memory layout scheme for holding references.
Well, what "is" a reference from that perspective? Or rather, how would you "implement" a reference? With a pointer, of course. So whenever you see a reference in some code you can pretend it's really just a pointer that's been used in a special way: if int x; and int& y{x}; then we really have an int* y_ptr = &x; and if we say y = 123; we merely mean *(y_ptr) = 123;. This is not dissimilar from how, when we use C array subscripts (a[1] = 2;) what actually happens is that a is "decayed" to mean pointer to its first element, and then what gets executed is *(a + 1) = 2.
(Side note: Compilers don't actually always hold pointers behind every reference; for example, the compiler might use a register for the referred-to variable, and then a pointer can't point to it. But the metaphor is still pretty safe.)
Having accepted the "reference is really just a pointer in disguise" metaphor, it should now not be surprising that we can ignore this disguise with a reinterpret_cast<>().
PS - std::ref is also really just a pointer when you drill down into it.
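The metaphor can be sketched directly, using the same names x, y and y_ptr as in the explanation; the wrapping functions are illustrative. It shows observable behaviour only and says nothing about how a particular compiler represents a reference.

```cpp
// Writes through the reference and through the pointer land in the same int.
bool reference_and_pointer_agree() {
    int x = 0;
    int& y = x;
    int* y_ptr = &x;
    y = 123;
    bool seenThroughPointer = (*y_ptr == 123);
    *y_ptr = 456;
    bool seenThroughReference = (y == 456);
    return seenThroughPointer && seenThroughReference && (&y == y_ptr);
}

// a[1] = 2 written as the pointer arithmetic it stands for.
int second_element_via_pointer_arithmetic() {
    int a[3] = {0, 0, 0};
    *(a + 1) = 2;
    return a[1];
}
```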
It's allowed because C++ allows pretty much anything when you cast.
But as for the behavior:
pc is a 4-byte pointer (assuming a 32-bit build)
(char)pc tries to interpret the pointer as a single byte, in particular its least-significant byte
(char&)pc is the same, but returns a reference to that byte (on a little-endian machine, the first byte of pc in memory)
When you first print *pc, nothing has happened yet and you see the letter you stored
c1 = 'B' modifies that byte of the 4-byte pointer, so it now points to something else
When you print again, you are now pointing to a different location, which explains your result.
Since only the low byte of the pointer is modified, the new memory address is nearby, making it unlikely to be in a piece of memory your program isn't allowed to access. That's why you don't get a seg-fault. The actual value obtained is undefined, but is highly likely to be a zero, which explains the blank output when it's interpreted as a char.
when you're casting, with a C-style cast or with a reinterpret_cast, you're basically telling the compiler to look the other way ("don't you mind, I know what I'm doing").
C++ allows you to tell the compiler to do that. That doesn't mean it's a good idea...

Casting between integers and pointers in C++

#include<iostream>
using namespace std;
int main()
{
int *p,*c;
p=(int*)10;
c=(int*)20;
cout<<(int)p<<(int)c;
}
Somebody asked me "What is wrong with the above code?" and I couldn't figure it out. Someone please help me.
The fact that int and pointer data types are not required to have the same number of bits, according to the C++ standard, is one thing - that means you could lose precision.
In addition, casting an int to an int pointer then back again is silly. Why not just leave it as an int?
I actually did try to compile this under gcc and it worked fine but that's probably more by accident than good design.
Some wanted a quote from the C++ standard (I'd have put this in the comments of that answer if the format of comments wasn't so restricted), here are two from the 1999 one:
5.2.10/3
The mapping performed by reinterpret_cast is implementation defined.
5.2.10/5
A value of integral type or enumeration type can be explicitly converted to a pointer.
A pointer converted to an integer of sufficient size (if any such exists on the implementation)
and back to the same pointer type will have its original value; mappings between pointers and
integers are otherwise implementation-defined.
And I see nothing mandating that such implementation-defined mapping must give a valid representation for all input. Otherwise said, an implementation on an architecture with address registers can very well trap when executing
p = (int*)10;
if the mapping does not give a representation valid at that time (yes, what is a valid representation for a pointer may depend on time. For instance delete may make invalid the representation of the deleted pointer).
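The round trip that the quoted paragraph does guarantee can be sketched with std::uintptr_t, which, where the implementation provides it, is an unsigned integer type wide enough to hold a pointer. to_integer and to_pointer are illustrative names.

```cpp
#include <cstdint>

// Pointer to integer: the numeric value is implementation-defined.
std::uintptr_t to_integer(int* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Integer back to the same pointer type: yields the original pointer
// when the integer came from to_integer on a still-valid pointer.
int* to_pointer(std::uintptr_t bits) {
    return reinterpret_cast<int*>(bits);
}
```

This starts from a valid pointer, unlike p = (int*)10, which starts from an arbitrary integer.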
Assuming I'm right about what this is supposed to be, it should look like this:
#include <iostream>
using namespace std;
int main()
{
int *p, *c;
// Something that creates whatever p and c point to goes here; a trivial example would be:
int pValue, cValue;
p = &pValue;
c = &cValue;
// The & operator retrieves the memory addresses of pValue and cValue.
*p = 10;
*c = 20;
cout << *p << *c;
}
In order to assign or retrieve a value to a variable referenced by a pointer, you need to dereference it.
What your code is doing is casting 10 into pointer to int (which is the memory address where the actual int resides).
The addresses held in p and c may be larger than an int.
On some platforms you therefore need an intermediate cast to a wider integer type:
p = (int*) (long) 10;
See the GLib documentation on type conversion macros.
And for people who might not see a use for this type of expression: it makes it possible to return data inside the pointer value of a pointer-returning function. You can find real-world examples where this idiom is preferable to allocating a new integer on the heap and returning it, which means poor performance, memory fragmentation, and ugly code.
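A rough sketch of that idiom, using std::intptr_t instead of long as the intermediate type. The mapping between integers and pointers is implementation-defined, so this is an assumption that holds on mainstream platforms rather than a portable guarantee; pack_int and unpack_int are hypothetical helpers, not the GLib macros.

```cpp
#include <cstdint>

// Stores a small integer in the bits of a pointer. The result must never
// be dereferenced; it is only a carrier for the value.
void* pack_int(int value) {
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(value));
}

// Recovers the integer that pack_int stored.
int unpack_int(void* p) {
    return static_cast<int>(reinterpret_cast<std::intptr_t>(p));
}
```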
You're assigning values (10 and 20) to the pointers which obviously is a potential problem if you try to read the data at those addresses. Casting the pointer to an integer is also really ugly. And your main function does not have a return statement. That is just a few things.
there is more or less everything wrong with it:
int *p,*c;
p=(int*)10;
c=(int*)20;
afterwards p is pointing to memory address 10
afterwards c is pointing to memory address 20
This doesn't look very intentional.
And I suppose that the whole program will simply crash.