Pointer (trick): Memory Reference - c++

A trick question about C pointers. Read the code snippet below and try to explain why the value of list changes (this question was based on this code):
tail holds the memory address of list.
How is it possible for list to be changed below?
typedef struct _node {
struct _node *next;
int value;
}Node;
int main(){
Node *list, *node, *tail;
int i = 100;
list = NULL;
printf("\nFirst . LIST value = %d", list);
tail =(Node *) &list;
node = malloc (sizeof (Node));
node->next = NULL;
node->value = i;
//tail in this point contains the memory address of list
tail->next = node;
printf("\nFinally. LIST value = %d", list);
printf("\nLIST->value = %d", (list->value));
return 0;
}
---- Output
First . List value = 0
Why these values? I was not expecting this:
Finally . LIST value = 16909060
LIST->value = 100

Let's look at what happens to the memory in your program. You start with 3 local variables, all of type Node*. At the moment they all point to garbage, as they have been declared but not initialised.
An ASCII art diagram of the memory might be (the layout is implementation dependent):
list node tail
--------------------------
... | 0xFE | 0x34 | 0xA3 | ...
--------------------------
You then set list to NULL, and tail to the address of list (casting away its type, a bad idea), giving you
list node tail
--------------------------
... | NULL | 0xFE | &list | ...
--------------------------
^ |
+-------------+
You then malloc a new Node, setting node to its address.
list node tail next value
--------------------------- ------------------
... | NULL | &next | &list | ... | NULL | 100 | ...
--------------------------- ------------------
^ | | ^
| +---------------------+
+--------------+
You next try to set tail->next to node. You've said that you know tail points to a Node when you did the typecast, so the compiler believes you. The Node tail points to starts at list's address, like so
tail list
next value next value
---------------------------------- ------------------
... | NULL | &list->next | &list | ... | NULL | 100 | ...
---------------------------------- ------------------
You then set tail->next to node, making both list and node point to the newly allocated Node structure.
list node tail next value
--------------------------- ------------------
... | &next | &next | &list | ... | NULL | 100 | ...
--------------------------- ------------------
| ^ | | ^
| | +---------------------|
| +-------------+ |
+-----------------------------+
You've printed list as a signed integer ("%d"). This is a bad idea: on a 64-bit machine a pointer is wider than an int, so any other arguments in the printf call may be clobbered. Use the pointer format ("%p") instead. list->value is the same as node->value, so it's still going to be 100.
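To illustrate the "%p" advice, here is a minimal sketch. The helper name format_pointer is made up for this example; the only real API used is std::snprintf.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: renders a pointer value as text using "%p".
// The exact text produced is implementation-defined.
std::string format_pointer(const void* p) {
    char buffer[64];
    // "%p" expects a void*, so cast explicitly instead of passing a typed pointer.
    int written = std::snprintf(buffer, sizeof buffer, "%p", const_cast<void*>(p));
    assert(written > 0);
    return std::string(buffer);
}
```

In the question's code the equivalent call would be printf("%p", (void *) list).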
Pointers become easier if you think about how they actually are represented in the machine - as an index to a huge array which holds all of your data (modulo pointer sizes, virtual memory etc.).
Next time it might be easier just to use list = node.

The following line is wrong:
tail =(Node *) &list;
You take the address of the variable list, which is actually of type Node **.
Then you cast it to a Node *. Although you can do this in C/C++, this is probably not what you intended.
To get the wanted behavior, tail should be of type Node **. Then no cast is needed, and at the end you write *tail = node. (Writing (*tail)->next = node would dereference list, which is still NULL at that point.)

The reason tail has memory address of list is in this line
tail =(Node *) &list;
which means: assign the address of the pointer variable list to the pointer variable tail.
Since tail now points at list itself, writing through tail->next overwrites list. That is the trick being used here to set up the linked-list data structure.
Edit:
Speaking of which, there is NO reference to Node as you have a struct _node declared... Amended this to take into account of the OP's code posting that left out Node....

The problem is in setting
tail = (Node*) &list
list is a Node*, so &list is a Node**, which is then cast to Node*. Because next is the first member of the struct (offset 0), we have
tail->next == *(Node **)((char *)tail + 0) == *&list
thus
tail->next == list
Thus changing tail->next is the same as changing list.
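The "offset 0" argument can be checked directly. This is only a sketch: the Node struct is retyped from the question, and first_member_shares_address is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    Node* next;
    int value;
};

// next is the first member, so it sits at offset 0 of this standard-layout struct.
static_assert(offsetof(Node, next) == 0, "next must be the first member");

// Returns true when a Node and its next member start at the same address.
bool first_member_shares_address(Node& n) {
    return static_cast<void*>(&n) == static_cast<void*>(&n.next);
}
```

This is why a write to tail->next lands on exactly the bytes tail points at, which in the question are the bytes of list.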

The line
tail =(Node *) &list;
assigns the address of list to tail. Since &list is a Node **, the compiler doesn't like this assignment by default, so you add an explicit cast to silence it. Then
tail->next = node;
changes a member value in the struct supposedly pointed to by tail (at least the compiler believes it is a struct, since you explicitly told it so). Since next is the first member of the struct, its address is most likely the same as that of the struct itself. And since tail points to the address of list, in effect this assignment changes the value of list, which is a pointer to a _node. That is, it makes list point to node.
What you probably want is
Node** tail;
...
tail = &list;
...
*tail = node;
That is, declare tail as a pointer to pointer to _node, and assign through it with one level of indirection (*). Note that (*tail)->next = node would dereference list, which is still NULL at this point.
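For completeness, a minimal sketch of the pointer-to-pointer tail idiom, assuming the list starts out empty (list == NULL). The function names append and destroy are made up for this example. Because list is still null on the first call, the write goes through *tail itself rather than through (*tail)->next.

```cpp
#include <cassert>

struct Node {
    Node* next;
    int value;
};

// Appends a node and returns the new tail slot.
// tail must point at the pointer that currently terminates the list:
// either the head variable itself or the last node's next member.
Node** append(Node** tail, int value) {
    assert(tail != nullptr && *tail == nullptr);
    Node* node = new Node{nullptr, value};
    *tail = node;        // on the first call this fills in the head pointer
    return &node->next;  // the next append writes here
}

// Frees every node in the list.
void destroy(Node* list) {
    while (list != nullptr) {
        Node* next = list->next;
        delete list;
        list = next;
    }
}
```

Usage would look like Node* list = nullptr; Node** tail = &list; tail = append(tail, 100); with no cast anywhere.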

By assigning the address of list to tail, you cause list and tail->next to refer to the same location in memory; when you assign to one, you clobber the other.
Let's start by looking at a hypothetical memory map of node after allocation and assignments (assuming 4 byte pointers and ints):
Object Address 0x00 0x01 0x02 0x03
------ ------- ---- ---- ---- ----
node 0x08000004 0x10 0x00 0x00 0x00 // points to address 0x10000000
...
node->next 0x10000000 0x00 0x00 0x00 0x00 // points to NULL
node->value 0x10000004 0x00 0x00 0x00 0x64 // value = 100 decimal
When you write node->next = NULL, you're assigning NULL to memory location 0x10000000. In other words, the value of node is the address where node->next will be found.
Now let's look at a hypothetical layout of list, node, and tail after you've assigned list and tail
Object Address 0x00 0x01 0x02 0x03
------ ------- ---- ---- ---- ----
list 0x08000000 0x00 0x00 0x00 0x00 // after list = NULL
node 0x08000004 0x10 0x00 0x00 0x00 // after node = malloc(sizeof *node);
tail 0x08000008 0x08 0x00 0x00 0x00 // after tail = (Node*) &list;
So now here's the memory map of tail after you've assigned tail->next:
Object Address 0x00 0x01 0x02 0x03
------ ------- ---- ---- ---- ----
tail 0x08000008 0x08 0x00 0x00 0x00 // points to address 0x08000000,
... // which is where list lives
tail->next 0x08000000 0x10 0x00 0x00 0x00 // same bytes as list; now holds 0x10000000, the value of node
tail->value 0x08000004 0x10 0x00 0x00 0x00 // same bytes as node; value = some big number
Presto: list now contains the address of the new Node.
Please for the love of God never do this in production code.

Isn't this line...
tail =(Node *) &list;
assigning to tail the address of the pointer to list, not the address of list?

If memory is changing unexpectedly the quickest way to track down the issue is to configure a breakpoint conditional on the memory change, including the size of the memory block of interest - 4 in this case, assuming it's a 32-bit platform pointer. In Windows (Visual Studio IDE or Windbg) this is easy to do - I have no info on other systems.
Usually you will find what's causing the issue very quickly this way.

Related

What is the difference between following two snippets?

char *t=new char();
and
char *t=new char[102];
as my code was accepted by using the latter one?
//BISHOPS SPOJ
char *t=new char();
Allocates memory for a single character and value-initializes it (to zero).
char *t=new char[102];
Creates an array of 102 chars, default-initialized (the contents are indeterminate).
Since char has no constructor, the main difference is the amount of memory allocated (a single char vs. an array of 102 chars).
Both are pointers to char, but the second points to the first element of a char array,
which lets you store 102 characters in the array.
char *t=new char[102];
0 1 2 3 101
+---+---+---+---+ ... +---+
| | | | | |
+---+---+---+---+ ... +---+
^
|
-----------------------------+---+
| * |
+---+
t
It allows you to dereference these indexes 0-101.
While first one allows you to store only one character.
char *t=new char();
0
+---+
| |
+---+
^
|
-----------------------------+---+
| * |
+---+
t
Dereferencing any index other than 0 would access memory out of bounds, which is undefined behavior.
Deleting
To delete a single character allocated with char *t=new char();
delete t;
Whereas to delete an array allocated with char *t=new char[102]; you have to write empty brackets, to explicitly say it's an array.
delete [] t;
The same distinction applies here:
char *t = new char[10]; // Pointer to array of 10 characters
char *t = new char(10); // Pointer to one character with value of 10
Memory initialization
char *t = new char; // default initialized (ie nothing happens)
char *t = new char(); // zero initialized (ie set to 0)
Arrays:
char *t = new char[10]; // default initialized (ie nothing happens)
char *t = new char[10](); // zero initialized (ie all elements set to 0)
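The forms that give a defined value can be checked with a short sketch. The function names are invented for this example. The default-initialized forms (new char and new char[10]) are left out because reading their contents would not be meaningful.

```cpp
#include <cassert>
#include <cstddef>

// new char() value-initializes the single char, so it reads back as 0.
char value_initialized_char() {
    char* t = new char();
    char result = *t;
    delete t;            // single object: plain delete
    return result;
}

// new char(10) creates one char holding the value 10.
char direct_initialized_char() {
    char* t = new char(10);
    char result = *t;
    delete t;
    return result;
}

// new char[n]() creates n chars, all set to 0.
bool array_is_zeroed(std::size_t n) {
    char* t = new char[n]();
    bool all_zero = true;
    for (std::size_t i = 0; i < n; ++i) {
        if (t[i] != 0) all_zero = false;
    }
    delete[] t;          // array: delete[]
    return all_zero;
}
```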

Understanding address of values and dereferecing pointers

I'm trying to have a better understanding of pointers and how they work. I'm also trying to understand the idea of dereferencing pointers. This is my understanding of pointers, and what may be possible of pointers.
This is a table representing 3 cells of memory. In each cell there is an address, cell name, and value.
+----------+----------+----------+
| 1672  x  | 1673  y  | 5        |
| 1673     | 5        | 65       |
+----------+----------+----------+
This is the initialization for the three cell blocks, assuming address 5 hasn't been initialized.
int x = 1673;
int y = 5;
This is how pointers are commonly used.
int* p_x = &x; // &x == 1672
*p_x == 1673;
This would be true if you forgot to place the ampersand in front of the variable name.
int* p_y = y; // y == 5
*p_y == 65;
If everything else is true would this also be true? If you were using the pointer to the address of x for a simple return on a member function, could you just skip a declaration of the pointer and just send the dereferenced address?
*&x == &y;
The lines with == will not compile, and are not actual C++ code. They are just an effective way to show equivalency.
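The relationships that do hold in real C++ can be written out as a small sketch. The struct and function names are made up for this example, and actual addresses will of course differ from the 1672/1673 in the table. Two caveats: int* p_y = y; does not compile in C++ without a cast, and *&x is simply x (an int), so it cannot be compared with &y (an int*) even if the numbers happen to match.

```cpp
#include <cassert>

// Collects a few checks about one int and a pointer to it.
struct PointerFacts {
    bool pointer_holds_address;      // p_x == &x
    bool deref_reads_value;          // *p_x == 1673
    bool deref_of_address_is_value;  // *&x == x
    bool write_through_pointer;      // *p_x = 5 changes x
};

PointerFacts check_pointer_facts() {
    int x = 1673;
    int* p_x = &x;  // p_x stores the address of x

    PointerFacts facts;
    facts.pointer_holds_address = (p_x == &x);
    facts.deref_reads_value = (*p_x == 1673);
    facts.deref_of_address_is_value = (*&x == x);

    *p_x = 5;       // writes to x through the pointer
    facts.write_through_pointer = (x == 5);
    return facts;
}
```

So yes, for a simple return you can skip the pointer variable: *&x and x name the same object.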

Recursively copying all leaves from BST to another BST

I've worked out the algorithm on how to copy the leaves from a BST to another BST.
Check if the tree is empty
If we reach a leaf, copy the data to the destination BST.
Call the recursive function with source BST -> left and source BST -> right with destination's pointer to the respective direction
Otherwise, recursively call without the dest -> left or dest -> right (since we would be going into null).
Would this algorithm work?
int copy_leaves(node * source, node * & dest)
{
    if (!source)
    {
        dest = NULL;
        return 0;
    }

    if (!(source -> left) && !(source -> right))
    {
        dest = new node;
        dest -> data = source -> data;
        dest -> left = dest -> right = NULL;
    }

    return copy_leaves(source -> left, dest) + //???
           copy_leaves(source -> right, dest) + 1; //???
}
Ok I tried implementing my algorithm and there are several faults. I do not quite know where to do the recursive call. I know that I am reaching null after two invocations (then we know the node is a leaf) which means that I copy the data. I don't understand where to pass dest->right and dest -> left for the recursive calls.
I don't see how this works, as written. I'll echo this with some indentation:
BST_copy(src, dst)
    # step 1
    if tree is empty
        <action not specified; assume return>
    else
        # step 2
        if src is a leaf
            copy data to the destination tree
            # step 3
            BST_copy(src->left, dst->left)
            BST_copy(src->right, dst->right)
        # step 4
        else
            BST_copy(src->left, dst)
            BST_copy(src->right, dst)
Step 1: you haven't specified the action; please fill in.
Step 2: Does the destination tree already have a full structure identical to that of the source tree? If not, how are you managing the copy?
Step 3: If the tree is a leaf, then there are no left & right subtrees; why are you recurring when you know the links are null?
Step 4: This gets your structures out of synch; you've gone down one level in the source tree, along two branches, but you haven't descended in the destination tree. If this works in your set-up, then there's something about the tree structure or copy operation that is not yet in this algorithm.
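One way to sidestep the synchronization problem raised in step 4 is to not mirror the source structure at all: walk the source tree and, for every leaf, do an ordinary BST insert into the destination. This is only a sketch of that alternative, not the asker's algorithm. The node layout, the insert helper, and the choice to send duplicates to the right are all assumptions.

```cpp
#include <cassert>

struct node {
    int data;
    node* left;
    node* right;
};

// Ordinary BST insertion; duplicates go to the right (an assumption).
void insert(node*& root, int data) {
    if (!root) {
        root = new node{data, nullptr, nullptr};
        return;
    }
    if (data < root->data) insert(root->left, data);
    else insert(root->right, data);
}

// Inserts every leaf of source into dest; returns the number of leaves copied.
int copy_leaves(const node* source, node*& dest) {
    if (!source) return 0;                    // empty subtree: nothing to copy
    if (!source->left && !source->right) {    // leaf: copy it and stop descending
        insert(dest, source->data);
        return 1;
    }
    return copy_leaves(source->left, dest) +
           copy_leaves(source->right, dest);
}

// Frees a whole tree.
void destroy(node* root) {
    if (!root) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

The destination always receives its own root reference, so the two trees never need to be descended in step.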

How binary search tree insertion works using recursion?

I'm having some trouble understanding the recursive part of binary search tree insertion.
bstnode* insert(bstnode* root,int data)
{
if(root==NULL){
bstnode* tmp= new bstnode();
tmp->data=data;
tmp->left=tmp->right=NULL;
return tmp;
}
if(data<root->data)
root->left = insert(root->left, data);
else
root->right = insert(root->right, data); //can't understand the logic here
return root;
}
/* consider following BST with their addresses[]:
15 [100]
/ \
10 20 [200]
\
tmp [300]
*/
According to me root->right = insert(root->right, data); should store the address of the newly created node in root->right so this code shouldn't work for tree with height>2.
However, it is working perfectly for any number of nodes.
I must be missing some crucial details here.
suppose I want to insert 25 in BST i.e. insert(root,25);
as 25>15:- I'm breaking down the recursive part here:
root->right = insert(root->right, 25);
or 15->right = insert(15->right,25); Here, recursively calling it again because 25>20
insert(root->right, 25) => root->right->right = insert(root->right->right, 25);
or insert(15->right, 25) => 20->right = insert(20->right, 25);
insert(20->right,25) is NULL so a new node tmp is created.
insert(20->right,25); returns tmp.
unwinding the recursion now.
//20->right = insert(20->right, 25);
so,
20->right= 300 (tmp address);
//insert(15->right, 25) => 20->right
//and 15->right = insert(15->right,25);
15->right = 20->right;
therefore 15->right = [300] address.
or
root->right = [300] address.
what's wrong with my approach?
Again an overview of recursive calls:
15->right = insert(15->right,25);
15->right = [20->right = insert(20->right,25)]; //20->right is NULL so creating new node
15->right = [20->right= 300 address of tmp];
15->right = [20->right or 300]
15->right = [300] // but in reality 15->right = [200]
You are forgetting that root->right is the right child of whatever node was passed into that particular call as root. Every call to insert passes in root->right or root->left, depending on which way you traverse.
This statement is incorrect:
root->right = root->right->right = tmp;
Once an invocation of the function returns, it is removed from the stack. In this case we have 3 calls; I will put your numbers in place of the pointer values.
insert(15->right,25)
insert(20->right,25)
The last one is passed NULL, so it creates the node with 25 and returns it. The statement 20->right = insert(20->right,25) then sets 25 as 20->right, so you have a tree that looks like this
/* consider following BST with their addresses[]:
20 [200]
\
25 [300]
*/
That subtree (rooted at 20) is then returned to the statement 15->right = insert(15->right,25), which sets 15's right child to the tree we just got back, so we get your final tree
/* consider following BST with their addresses[]:
15 [100]
/ \
10 20 [200]
\
25 [300]
*/
EDIT: let me see if I can clarify. Lets look at your tree again
/* consider following BST with their addresses[]:
15 [100]
/ \
10 20 [200]
\
tmp [300]
*/
we want to insert 25 so we call (again I will use the value at that node of the tree to represent the pointer we are passing)
insert(15, 25)
this then calls insert on root->right which happens to be 20
insert(20, 25)
this calls insert again on 20 right node now which happens to be null
insert(null,25)
so lets now look at the returns
insert(null,25) returns a node with 25 in it and is then removed from the stack
return 25;
insert(20,25) gets its return of a node with 25. it sets its right child to 25 which looks like this
20->right = 25;
return 20;
now we are back to the original call of insert(15,25). it got returned 20. so it does
15->right = 20;
return 15;
I think the confusion may be coming from two different sources for you.
First the tree commented into your code would not be possible. Second is that a new node is only created when the function is passed in a null pointer. Only values less than 15 can go to the left. It would be something like this instead (depending on add order):
15
/ \
20
/ \
30
When you go to add 25 to this it will end up as follows:
15
/ \
20
/ \
30
/
25
I will try and step through the code on this to explain. When adding 25 to the original tree on the first function call the first node is not NULL and 25 > 15 so the
else
{
root->right = insert(root->right, data);
}
is called. This calls the same insert function recursively, but now uses the 20 node as its comparison. Again it is not null and 25 > 20, so insert is called on the right node as above. This again calls the recursive function, but now on 30. 25 < 30, so it calls the function on the left node. At this point the function has been passed a NULL pointer, as there is nothing there, and a new node is created and placed in this spot.
Note that insert() always returns the root that was passed to it as an argument unless root == NULL. There's therefore no way for the new node you insert to "walk up the tree". What happens in the recursive call doesn't matter -- you always return the same root that you were passed in the non-NULL case.
Despite the way some people teach recursion, I think it helps (for my brain anyway) to not try to unroll the recursion, and instead consider whether the logic makes sense:
If you are passed a non-NULL node and data < root->data, would you get the correct result if you do root->left = insert(root->left, data) and assume the insert() magically "just works" (i.e., that it inserts data into the left tree and returns the root of that tree)?
If the logic checks out for both the left and right case, you then consider the base case: If you are passed a NULL node, will you return the correct one-element tree?
If the logic checks out for the base case too, then you know your code must be correct, since the recursive steps make sense and you know that you will land in a base case that also makes sense (since you will eventually reach a NULL node as you walk down the tree).
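The claim that a non-NULL call always returns the root it was given can be checked on the tree from the question. This sketch retypes the question's insert with nullptr; the bstnode layout is assumed.

```cpp
#include <cassert>

struct bstnode {
    int data;
    bstnode* left;
    bstnode* right;
};

// Same logic as the function in the question.
bstnode* insert(bstnode* root, int data) {
    if (root == nullptr) {
        bstnode* tmp = new bstnode();
        tmp->data = data;
        tmp->left = tmp->right = nullptr;
        return tmp;                      // only this path returns a new node
    }
    if (data < root->data)
        root->left = insert(root->left, data);
    else
        root->right = insert(root->right, data);
    return root;                         // non-null input: same root comes back
}

// Frees a whole tree.
void destroy(bstnode* root) {
    if (root == nullptr) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

After inserting 15, 10, 20 and then 25, the pointer stored in 15->right is still the 20 node. The assignment 15->right = insert(15->right, 25) just writes back the value it already held, and the new node hangs off 20->right.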
In a way you are correct. You can never have a sub-tree (not tree) of height >2.
In this code, you will never have a root->right->right because, as far as the code is concerned, when you call
root->left = insert(root->left, data);
inside the recursive call, the (local) root pointer refers to the caller's root->left.
Therefore, you CAN have a tree of any height (however, each call only looks at its local root pointer and that node's immediate children).

How do you move between nodes of a linked list?

This is a piece of code that tries to build a linked list.
struct node {
char name[20];
int age;
int height;
node* next; // Pointer to the next node
};
node* startPTR = NULL;
void addNode_AT_END() {
node *temp1;
node *temp2;
temp1 = new node;
cout << "Enter the name : ";
cin >> temp1->name;
cout << endl << "Enter the age : ";
cin >> temp1->age;
cout << endl << "Enter height : ";
cin >> temp1->height;
temp1->next = NULL;
if( startPTR == NULL) {
startPTR = temp1;
} else {
temp2 = startPTR;
while( temp2->next != NULL )
temp2 = temp2->next;
temp2->next = temp1;
}
}
The following is the diagram after two back-to-back calls to the above function.
start = addr1;
|
V
(addr1) ----> (addr2) ----> (NULL) at end
^
|
temp2
where addr1 and addr2 are the address of the first and second nodes respectively.
What happens after the third call? How will the iteration go for the third call? I am unable to understand how the list links up after the second call. As I see it, everything that has been built up till now will vanish. Then how will the list grow further? How is the node placed during the third call?
Here is where all the magic happens:
1. temp2 = startPTR;
2. while( temp2->next != NULL )
3. temp2 = temp2->next;
4. temp2->next = temp1;
First, temp2 will point to the beginning of the list. In lines 2 and 3, you change temp2 to the next node until you reach the node where temp2->next is NULL. This node is the last node of the list, regardless of the size of the list.
Finally, in line 4 you change temp2->next to temp1 so now it points to the new node (that is last node now points to the new node). temp1->next is also NULL, so temp1 now represents the end of the list.
After line 1 you have
start = addr1;
|
V
(addr1) ----> (addr2) ----> (NULL)
^
|
temp2
temp2->next is not NULL (it is addr2), so you iterate and execute line 3 and you get:
start = addr1;
|
V
(addr1) ----> (addr2) ----> (NULL)
^
|
temp2
temp2->next is now NULL. So you stop the loop and execute line 4 and you get:
start = addr1;
|
V
(addr1) ----> (addr2) ----> (addr3) ----> (NULL)
^ ^
| |
temp2 temp1
Note: Do you know how pointers work? Imagine this: you have a node, which is some data in memory. Variables in memory have addresses. Let's say addr1 is 10, addr2 is 150 and addr3 (the node just created with new) is 60. start has value 10, so it is "pointing" to the first node of the list (that is, using this address you have access to its data). One of these data is the next field. The first node's next field has value 150, thus pointing to the next node.
When you say temp2 = start, you put the number 10 in temp2; at this point temp2->next has value 150. When you say temp2 = temp2->next, you simply put the value 150 in temp2, overwriting the previous value. This way you have effectively moved your pointer from the first node to the second node. Now temp2->next is NULL (that is, 0). When you now say temp2->next = temp1, you put the value 60 in the next field of the node temp2 points to. So now temp2->next is 60 and temp2->next->next is NULL.
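To see that nothing "vanishes" between calls, here is a trimmed-down sketch of the same append logic. The node is reduced to one int field, and the head is passed by reference instead of being a global; both are simplifications for this example.

```cpp
#include <cassert>

struct node {
    int age;
    node* next;  // pointer to the next node
};

// Appends a new node at the end of the list that start refers to.
void add_node_at_end(node*& start, int age) {
    node* temp1 = new node{age, nullptr};
    if (start == nullptr) {
        start = temp1;            // first call: the new node becomes the head
        return;
    }
    node* temp2 = start;
    while (temp2->next != nullptr)
        temp2 = temp2->next;      // walk to the current last node
    temp2->next = temp1;          // link the new node after it
}

// Frees every node in the list.
void destroy(node* start) {
    while (start != nullptr) {
        node* next = start->next;
        delete start;
        start = next;
    }
}
```

temp1 and temp2 disappear when the function returns, but the nodes live on the heap and stay reachable through start and the next fields, so a third call finds both earlier nodes still linked.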
It's pretty simple. The while loop moves temp2 to the last element. Then the node you created, pointed to by temp1, is assigned as temp2's next node.
I don't get what's bothering you. During any call, the while() loop goes through all the nodes in the list until it reaches the end, and then sets the pointer in the last one to the newly allocated node (temp1).
temp1 and temp2 are pointers. They do not store the data of the node; they store the address in memory where the data is stored. So at the end of the first call, after startPTR = temp1, startPTR points to the same address that temp1 pointed to. It doesn't matter whether temp1 still exists, since startPTR now points to the node. At the end of the second call, temp2->next = temp1 (temp2 == startPTR at this moment) makes the next field of the first node point to the newly allocated node.