Deep Copy Linked List - O(n) - c++

I'm trying to deep copy a linked list. I need an algorithm that runs in linear time, O(n). This is what I have so far, but I can't figure out what's going wrong with it. My application crashes, and I suspect a memory leak that I haven't been able to track down yet. Here is the code:
struct node {
struct node *next;
struct node *ref;
};
struct node *copy(struct node *root) {
struct node *i, *j, *new_root = NULL;
for (i = root, j = NULL; i; j = i, i = i->next) {
struct node *new_node;
if (!new_node)
{
abort();
}
if (j)
{
j->next = new_node;
}
else
{
new_root = new_node;
}
new_node->ref = i->ref;
i->ref = new_node;
}
if (j)
{
j->next = NULL;
}
for (i = root, j = new_root; i; i = i->next, j = j->next)
j->ref =i->next->ref;
return new_root;
}
Can anyone point out where I'm going wrong with this?

This piece alone:
struct node *new_node;
if (!new_node)
{
abort();
}
That would explain a random abort(): new_node is never assigned, so it holds an indeterminate value. Merely evaluating the !new_node expression could already be fatal (on some systems).
As a general hint, you should need only one for loop, plus some code up front to establish new_root.
But a truly deep copy would also require cloning whatever ref points to. It seems to me that the second loop assigns something from the original into the copy, but I'm not sure. What is ref?

One thing I immediately noticed was that you never allocate space for new_node. Since auto variables are not guaranteed to be initialized, new_node will be set to whatever value was in that memory before. You should probably start with something like:
struct node *new_node = (struct node *) malloc(sizeof(struct node));
in C, or if you're using C++:
node* new_node = new node;
Copying the list is simple enough to do. However, the requirement that the ref pointers point to the same nodes in the new list relative to the source list is going to be difficult to meet in any sort of efficient manner. First, you need some way to identify which node, relative to the source list, each ref points to. You could put some kind of identifier in each node, say an int which is set to 0 in the first node, 1 in the second, etc. Then, after you've copied the list, you could make another pass over it to set up the ref pointers. The problem with this approach (other than adding another variable to each node) is that it will make the time complexity of the algorithm jump from O(n) to O(n^2).

This is possible, but it takes some work. I'll assume C++, and omit the struct keyword in struct node.
You will need to do some bookkeeping to keep track of the "ref" pointers. Here, I'm converting them to numerical indices into the original list and then back to pointers into the new list.
node *copy_list(node const *head)
{
// maps "ref" pointers in old list to indices
std::map<node const *, size_t> ptr_index;
// maps indices into new list to pointers
std::map<size_t, node *> index_ptr;
size_t length = 0;
node *curn; // ptr into new list
node const *curo; // ptr into old list
node *copy = NULL;
for (curo = head; curo != NULL; curo = curo->next) {
ptr_index[curo] = length;
length++;
// construct copy, disregarding ref for now
curn = new node;
curn->next = copy;
copy = curn;
}
curn = copy;
for (size_t i=0; i < length; i++, curn = curn->next)
index_ptr[i] = curn;
// set ref pointers in copy; a null ref stays null
for (curo = head, curn = copy; curo != NULL; ) {
curn->ref = (curo->ref != NULL) ? index_ptr[ptr_index[curo->ref]] : NULL;
curo = curo->next;
curn = curn->next;
}
return copy;
}
This algorithm runs in O(n lg n) because it stores all n list elements in an std::map, which has O(lg n) insert and retrieval complexity. It can be made linear by using a hash table instead.
NOTE: not tested, may contain bugs.
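The hash-table variant mentioned above might look roughly like the following. This is only a sketch: the name copy_list_hashed is made up, it maps each old node directly to its copy instead of going through indices, and it assumes every non-null ref points at a node of the same list. With std::unordered_map the lookups are O(1) on average, so the whole copy is expected (not worst-case) linear time.

```cpp
#include <cassert>
#include <unordered_map>

struct node {
    node *next;
    node *ref;
};

// Deep-copies the list starting at head, including the ref links.
// Assumption: every non-null ref points at a node of this same list.
node *copy_list_hashed(node const *head)
{
    // maps each original node to its copy
    std::unordered_map<node const *, node *> clone_of;

    node *copy = nullptr;
    node **link = &copy; // where the next new node gets attached
    for (node const *cur = head; cur != nullptr; cur = cur->next) {
        node *fresh = new node{nullptr, nullptr};
        clone_of[cur] = fresh;
        *link = fresh;
        link = &fresh->next;
    }

    // second pass: translate ref pointers through the map
    node *curn = copy;
    for (node const *cur = head; cur != nullptr;
         cur = cur->next, curn = curn->next) {
        if (cur->ref != nullptr) {
            auto it = clone_of.find(cur->ref);
            assert(it != clone_of.end()); // ref must stay inside the list
            curn->ref = it->second;
        }
    }
    return copy;
}
```

The caller owns the returned nodes and has to delete them one by one.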

Related

Using an array of pointers-to-pointers to manipulate the pointers it points to (C++)

I've been doing this as an exercise on my own to get better at C++ (messing around with a linked list I wrote). What I want to do is to reverse the list by twisting the pointers around, rather than just 'printing' the data out in reverse (which is relatively straightforward).
I have an array of pointers-to-pointers, each pointing to a node in a linked list. But this is less a question about linked-list dynamics (which I understand), and more about pointer magick.
A node looks like this,
template<class T>
struct node {
T data;
node *next;
node(T value) : data(value), next(nullptr) {}
};
And the code in question,
node<T> **reverseArr[listLength];
node<T> *parser = root;
for (auto i : reverseArr) {
i = &parser;
parser = parser->next;
}
root = *(reverseArr[listLength - 1]);
for (int ppi = listLength - 1; ppi >= 0; --ppi) {
if (ppi == 0) {
(*reverseArr[ppi])->next = nullptr;
//std::cout << "ppi is zero!" << "\t";
}
else {
(*reverseArr[ppi])->next = (*reverseArr[ppi - 1]);
//std::cout << "ppi, 'tis not zero!" << "\t";
}
}
My logic:
The new root is the last element of the list,
Iterate through the array in reverse,
Set the current node's next pointer to the previous one by setting the current node's nextNode to the next node in the loop.
What's happening:
If I leave the debug print statements commented, nothing. The function's called but the linked list remains unchanged (not reversed)
If I uncomment the debug prints, the program seg-faults (which doesn't make a whole lot of sense to me but seems to indicate a flaw in my code)
I suspect there's something I'm missing that a fresh pair of eyes might catch. Am I, perhaps, mishandling the array (not accounting for the decay to a pointer or something)?
You're overthinking the problem. The correct way to reverse a single-linked list is much simpler than you think, and does not involve arrays at all.
All you need to do is walk through the list setting each node's next pointer to the head of the list, then set the head of the list to that node. Essentially, you are unlinking each node and inserting it at the start of the list. Once you reach the end, your list is reversed.
It just requires a bit of care, because the order that you do things is important. Something like this should do it:
template <class T>
node<T> * reverse( node<T> * head )
{
node<T> *current = head;
head = NULL;
while( current != NULL )
{
// store remainder of list
node<T> *remain = current->next;
// re-link current node at the head
current->next = head;
head = current;
// continue iterating remainder of list
current = remain;
}
return head;
}
The operation has a linear time complexity. You would invoke it by passing your list's head node as follows:
root = reverse( root );
It should go without saying that it would be a bad idea to call this function with any node that is not the head of a list, or to pass in a list that contains cycles.
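For a quick check of the function above, here is a small self-contained sketch. The node and reverse definitions are repeated from this thread; make_list and to_vector are helper names invented for the example.

```cpp
#include <initializer_list>
#include <vector>

template <class T>
struct node {
    T data;
    node *next;
    node(T value) : data(value), next(nullptr) {}
};

// Same algorithm as in the answer: move each node to the front in turn.
template <class T>
node<T> *reverse(node<T> *head)
{
    node<T> *current = head;
    head = nullptr;
    while (current != nullptr) {
        node<T> *remain = current->next; // store remainder of list
        current->next = head;            // re-link current node at the head
        head = current;
        current = remain;                // continue with the remainder
    }
    return head;
}

// Hypothetical helper: builds a list holding the given values in order.
template <class T>
node<T> *make_list(std::initializer_list<T> values)
{
    node<T> *head = nullptr;
    node<T> **link = &head;
    for (const T &v : values) {
        *link = new node<T>(v);
        link = &(*link)->next;
    }
    return head;
}

// Hypothetical helper: collects the stored values for easy comparison.
template <class T>
std::vector<T> to_vector(const node<T> *head)
{
    std::vector<T> out;
    for (; head != nullptr; head = head->next)
        out.push_back(head->data);
    return out;
}
```

Calling root = reverse(make_list<int>({1, 2, 3, 4})) and then to_vector(root) should give the values in the opposite order. Note that no array of pointers is involved at any point.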

Inserting a Node in a linked list using double pointers

I recently saw an implementation that deletes a node from a singly linked list using a double pointer. Other than making the code more beautiful, does this implementation have any efficiency benefits? Also, how can I implement a similar approach for inserting a node into a linked list (without keeping track of the previous node)? I am really curious whether there is a better algorithm for this.
Node* Delete(Node *head, int value)
{
Node **pp = &head; /* pointer to a pointer */
Node *entry = head;
while (entry )
{
if (entry->value == value)
{
*pp = entry->next; /* unlink; pp stays on the same link */
}
else
{
pp = &entry->next; /* advance only past nodes that are kept */
}
entry = entry->next;
}
return head;
}
For insertion to the back of a list storing only the head, no tail (which would imply a small list where linear-time insertion is acceptable), you can do this by introducing the extra pointer indirection to eliminate special cases:
Simple Version (Pointers to Pointers to Nodes)
void List::push_back(int value)
{
// Point to the node link (pointer to pointer to node),
// not to the node.
Node** link = &head;
// While the link is not null, point to the next link.
while (*link)
link = &(*link)->next;
// Set the link to the new node.
*link = new Node(value, nullptr);
}
... which you can reduce to just:
void List::push_back(int value)
{
Node** link = &head;
for (; *link; link = &(*link)->next) {}
*link = new Node(value, nullptr);
}
As opposed to, say:
Complex Version (Pointers to Nodes)
void List::push_back(int value)
{
if (head)
{
// If the list is not empty, walk to the back and
// insert there.
Node* node = head;
while (node->next)
node = node->next;
node->next = new Node(value, nullptr);
}
else
{
// If the list is empty, set the head to the new node.
head = new Node(value, nullptr);
}
}
Or to be fair and remove comments:
void List::push_back(int value)
{
if (head)
{
Node* node = head;
for (; node->next; node = node->next) {}
node->next = new Node(value, nullptr);
}
else
head = new Node(value, nullptr);
}
No Special Case for Simple Version
The main reason the first version doesn't have to special case empty lists is because if we imagine head is null:
Node** link = &head; // pointer to pointer to null
for (; *link; link = &(*link)->next) {}
*link = new Node(value, nullptr);
Then the for loop condition is immediately false and we then assign the new node to the head. We don't have to check for that case separately outside the loop when we use pointers to pointers.
Insertion Sort
If you want to do an insertion sort instead of simply inserting to the back, then this:
void List::insert_sorted(int value)
{
Node** link = &head;
for (; *link && (*link)->value < value; link = &(*link)->next) {}
// New node will be inserted to the back of the list
// or before the first node whose value >= 'value'.
*link = new Node(value, *link);
}
Performance
As for performance, I'm not sure that eliminating the extra branch makes much difference, but it definitely makes the code tighter and reduces its cyclomatic complexity. The reason Linus considers this style to be "good taste" is that in C you have to write linked list logic often: with no class templates, generalizing linked lists is neither easy nor necessarily worthwhile. It is therefore handy to favor a smaller, more elegant, less error-prone way to write this stuff. Plus, it demonstrates that you understand pointers quite well.
Other than making the code more beautiful does this implementation
have any benefits efficiency wise.
I don't have anything to compare this to, so it's hard to say, but this is about as efficient as removing a node from a linked list gets. Note that Remove would be a more accurate function name than Delete, since the function does not actually clean up the node it removes from the list.
Also how can I implement a similar approach towards inserting a node
to a linked list ( without keeping track of previous Node ).
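One way is to look ahead. This is best shown with an example that follows the format of your Delete function. Note that, because head is passed by value, this version cannot insert into an empty list or in front of the first node.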
void insert(Node *head, int value)
{
Node *entry = head;
while (entry)
{
if (entry->next == NULL)
{
entry->next = new Node(NULL, value);
return;
}
else if (value < entry->next->value)
{
entry->next = new Node(entry->next, value);
return;
}
entry = entry->next;
}
}

Splitting Doubly linked list C++

I was wondering if I could get any insight as to why I get the error: terminating with uncaught exception of type std::out_of_range: coordinates (400,0) outside valid range of [0, 399] X [0, 239] when I run the following code.
The method is supposed to split the DLL at the index passed to it. So split(2) would leave the list it was called on with 2 nodes, while the rest of the old list would be returned as a new list. I've tried many different ways to write this and none of them work (we must use unique_ptrs for the assignment).
template <class T>
auto dl_list<T>::split(node* start, unsigned split_point)
-> std::unique_ptr<node>
{
assert(split_point > 0);
if(start == nullptr)
return nullptr;
node* curr = start;
uint64_t i = 0;
while(i < split_point)
{
curr = (curr->next).get();
i++;
}
std::unique_ptr<node> temp{nullptr};
temp = std::move(curr->prev->next);
(temp)->prev = nullptr;
return temp;
}
Here is the wrapper method, if that helps:
template <class T>
auto dl_list<T>::split(unsigned split_point) -> dl_list
{
if (split_point >= size_)
return {};
if (split_point == 0)
{
dl_list lst;
swap(*this);
return lst;
}
auto old_size = size_;
auto new_head = split(head_.get(), split_point);
// set up current list
size_ = split_point;
for (tail_ = head_.get(); tail_->next; tail_ = tail_->next.get())
;
// set up returned list
dl_list ret;
ret.head_ = std::move(new_head);
for (ret.tail_ = ret.head_.get(); ret.tail_->next;
ret.tail_ = ret.tail_->next.get())
;
ret.size_ = old_size - split_point;
return ret;
}
Here’s code to unlink a specified node plus rest of list, on the node’s left (prev) side, assuming that there always is a previous node (i.e. a list with header node):
auto unlinked_list_from( Node* p )
-> Node*
{
p->prev->next = nullptr;
p->prev = nullptr;
return p;
}
Finding the n’th node is a separate problem.
Combining the two, in finding the n’th node and unlinking it and the rest of the list, is a higher level problem where you use the two lower level solutions.
It’s generally not a good idea to use ownership smart pointers such as std::unique_ptr or std::shared_ptr for the internal pointers of a linked list or tree. It’s simply not the case that each node owns the previous and next node, in the sense of having responsibility for destruction. This is one situation where raw pointers rule.
In general, though, except for learning and for advanced programming, use standard library containers such as std::vector instead.

Randomly shuffling a linked list

I'm currently working on a project and the last piece of functionality I have to write is to shuffle a linked list using the rand function.
I'm very confused on how it works.
Could someone clarify on how exactly I could implement this?
I've looked at my past code examples and what I did to shuffle an array but the arrays and linked lists are pretty different.
Edit:
For further clarifications my Professor is making us shuffle using a linked list because he is 'awesome' like that.
You can always add another level of indirection... ;)
(see Fundamental theorem of software engineering in Wikipedia)
Just create an array of pointers, sized to the list's length, unlink items from the list and put their pointers to the array, then shuffle the array and re-construct the list.
EDIT
If you must use lists you might use an approach similar to merge-sort:
split the list into halves,
shuffle both sublists recursively,
merge them, picking randomly next item from one or the other sublist.
I don't know if it gives a reasonable random distribution :D
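// Caution: a random comparator is not a strict weak ordering,
// so passing it to std::list::sort is undefined behavior.
bool randcomp(int, int)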
{
return (rand()%2) != 0;
}
mylist.sort(randcomp);
You can try iterating over the list several times, swapping adjacent nodes with a certain probability. Something like this:
const float swapchance = 0.25;
const int itercount = 100;
struct node
{
int val;
node *next;
};
node *first = 0;
{ // Creating example list
for (int i = 0; i < 20; i++)
{
node *tmp = new node;
tmp->val = i;
tmp->next = first;
first = tmp;
}
}
// Shuffling
for (int i = 0; i < itercount; i++)
{
node *ptr = first;
node *prev = 0;
while (ptr && ptr->next)
{
if (std::rand() % 1000 / 1000.0 < swapchance)
{
// Swap ptr with its successor; the head changes when ptr is the first node
if (prev)
prev->next = ptr->next;
else
first = ptr->next;
node *t = ptr->next->next;
ptr->next->next = ptr;
ptr->next = t;
}
prev = ptr;
ptr = ptr->next;
}
}
The big difference between an array and a linked list is that with an array you can directly access a given element using pointer arithmetic, which is how operator[] works.
That, however, does not preclude you from writing your own operator[] or similar that walks the list and counts out the nth element. Once you've got this far, removing the element and placing it into a new list is quite simple.
The catch is the cost: a shuffle that is O(n) for an array becomes O(n^2) for a linked list, because reaching the nth element means walking there from the head each time.

What is the pointer-to-pointer technique for the simpler traversal of linked lists?

Ten years ago, I was shown a technique for traversing a linked list: instead of using a single pointer, you used a double pointer (pointer-to-pointer).
The technique yielded smaller, more elegant code by eliminating the need to check for certain boundary/edge cases.
Does anyone know what this technique actually is?
I think you mean double pointer as in "pointer to a pointer", which is very efficient for inserting at the end of a singly linked list or a tree structure. The idea is that you don't need a special case or a "trailing pointer" to follow your traversal pointer once you find the end (a NULL pointer), since you can just dereference your pointer to a pointer (it points to the last node's next pointer!) to insert. Something like this:
T **p = &list_start;
while (*p) {
p = &(*p)->next;
}
*p = new T;
instead of something like this:
T *p = list_start;
if (p == NULL) {
list_start = new T;
} else {
while (p->next) {
p = p->next;
}
p->next = new T;
}
NOTE: It is also useful for making efficient removal code for a singly linked list. At any point doing *p = (*p)->next will remove the node you are "looking at" (of course you still need to clean up the node's storage).
By "double-pointer", I think you mean "pointer-to-pointer". This is useful because it allows you to eliminate special cases for either the head or tail pointers. For example, given this list:
struct node {
struct node *next;
int key;
/* ... */
};
struct node *head;
If you want to search for a node and remove it from the list, the single-pointer method would look like:
if (head->key == search_key)
{
removed = head;
head = head->next;
}
else
{
struct node *cur;
for (cur = head; cur->next != NULL; cur = cur->next)
{
if (cur->next->key == search_key)
{
removed = cur->next;
cur->next = cur->next->next;
break;
}
}
}
Whereas the pointer-to-pointer method is much simpler:
struct node **cur;
for (cur = &head; *cur != NULL; cur = &(*cur)->next)
{
if ((*cur)->key == search_key)
{
removed = *cur;
*cur = (*cur)->next;
break;
}
}
I think you mean doubly-linked lists where a node is something like:
struct Node {
(..) data // The data being stored in the node, it can be of any data type
Node *next; // A pointer to the next node; null for last node
Node *prev; // A pointer to the previous node; null for first node
}
I agree with the comments about using the STL containers for handling your list dirty work. However, this being Stack Overflow, we're all here to learn something.
Here's how you would normally insert into a list:
typedef struct _Node {
void * data;
struct _Node * next; /* the Node typedef is not visible yet at this point */
} Node;
Node * insert( Node * root, void * data ) {
Node * list = root;
Node * listSave = root;
while ( list != NULL ) {
if ( data < list->data ) {
break;
}
listSave = list;
list = list->next;
}
Node * newNode = (Node*)malloc( sizeof(Node) );
newNode->data = data;
/* Insert at the beginning of the list */
if ( listSave == list ) {
newNode->next = list;
list = newNode;
}
/* Insert at the end of the list */
else if ( list == NULL ) {
listSave->next = newNode;
newNode->next = NULL;
list = root;
}
/* Insert at the middle of the list */
else {
listSave->next = newNode;
newNode->next = list;
list = root;
}
return list;
}
Notice all the extra checking you have to do depending on whether the insertion occurs at the beginning, end or middle of the list. Contrast this with the double pointer method:
void insert( Node ** proot, void * data ) {
Node ** plist = proot;
while ( *plist != NULL ) {
if ( data < (*plist)->data ) {
break;
}
plist = &(*plist)->next;
}
Node * newNode = (Node *)malloc( sizeof(Node) );
newNode->data = data;
newNode->next = *plist;
*plist = newNode;
}
As Evan Teran indicated, this works well for singly linked lists, but when the list is doubly linked, you end up going through just as many manipulations as in the single pointer case, if not more. The other drawback is that you're going through two pointer dereferences for each step of the traversal. While the code looks cleaner, it probably doesn't run as quickly as the single pointer code.
You probably mean a doubly-linked list, with one of the pointers going forward and the other going backward. This allows you to get to the next and previous nodes for a given node without having to remember the last one or two nodes encountered (as in a singly-linked list).
But the one thing I discovered which made the code even more elegant was to always have two dummy elements in the list at all times, the first and the last. This gets rid of the edge cases for insertion and deletion since you're always acting on a node in the middle of the list.
For example, an empty list is created:
first = new node
last = new node
first.next = last
first.prev = null
last.next = null
last.prev = first
// null <- first <-> last -> null
Obviously, traversing the list is slightly modified (forward version shown only):
curr = first.next
while curr <> last:
do something with curr
curr = curr.next
The insertions are much simpler since you don't have to concern yourself with whether you're inserting at the start or end of the list. To insert before the current point:
if curr = first:
raise error
add = new node
add.next = curr
add.prev = curr.prev
curr.prev.next = add
curr.prev = add
Deletions are also simpler, avoiding the edge cases:
if curr = first or curr = last:
raise error
curr.prev.next = curr.next
curr.next.prev = curr.prev
delete curr
All very much cleaner code and at the cost of only having to maintain two extra nodes per list, not a great burden in today's huge memory space environments.
Caveat 1: If you're doing embedded programming where space still might matter, this may not be a viable solution (though some embedded environments are also pretty grunty these days).
Caveat 2: If you're using a language that already provides linked list capabilities, it's probably better to do that rather than roll your own (other than for very specific circumstances).