Delete duplicates from a doubly linked list - C++

Hello,
I stumbled upon the following question:
You are given an unsorted doubly linked list. You should find and delete the duplicates from it.
What is the best way to do it with minimum algorithmic complexity?
Thank you.

If space is abundant and you really have to optimize for time, you can use a hash set (std::unordered_set in C++). Read each element and insert it into the hash set. If the hash set reports that the element is already present, the node is a duplicate and you simply delete it.
The complexity is O(n) expected time, at the cost of O(n) extra space.
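A minimal sketch of the hash set approach, assuming the list is a std::list<T> (which is doubly linked) and that T works with std::hash. The function name dedupWithHashSet is made up for illustration.

```cpp
#include <list>
#include <unordered_set>

// Removes every later duplicate, keeping the first occurrence of each value
// and preserving the original order. Expected O(n) time, O(n) extra space.
// Assumption: T is hashable with std::hash and comparable with ==.
template <class T>
void dedupWithHashSet(std::list<T>& l) {
    std::unordered_set<T> seen;
    for (auto it = l.begin(); it != l.end(); ) {
        if (seen.insert(*it).second) {
            ++it;              // value seen for the first time: keep the node
        } else {
            it = l.erase(it);  // duplicate: unlink the node in O(1)
        }
    }
}
```

Because erase returns the iterator to the next node, the loop never touches an invalidated iterator.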

Think of it as two singly linked lists instead of one doubly linked list, with one set of links going first to last and another set going last to first. You can sort the second list with a merge sort, which will be O(n log n). Now traverse the list using the first link. For each node, check if (node.back)->key==node.key and if so remove it from the list. Restore the back pointer during this traversal so that the list is properly doubly linked again.
This isn't necessarily the fastest method, but it doesn't use any extra space.

Assuming that the potential employer believes in the C++ library:
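// untested O(n*log(n)); note that the result comes out sorted, so the original order is lost
#include <list>
#include <set>
template <class T>
void DeDup(std::list<T>& l) {
    std::set<T> s(l.begin(), l.end());         // the set keeps one copy of each value
    std::list<T>(s.begin(), s.end()).swap(l);  // rebuild the list from the unique values
}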

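With minimum complexity? Simply traverse the list up to X times (where X is the number of items): for each node, starting at the head, walk the rest of the list and delete every node with the same value (reassigning the pointers as you go). That is O(n^2) time in the worst case, since each of the n nodes may be compared against every node after it, but it uses no extra space and is really easy to code.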

Related

What is the purpose of sorting a linked list?

I am wondering what the purpose of sorting a linked list is. Whether you need to find an element in an unsorted linked list or in a sorted one, the search is O(n) either way.
Please forgive me if my question is stupid.
The purpose of sorting isn't always to search in logarithmic time. There are lots of other applications of sorted data.
Suppose you have to de-duplicate (remove the duplicate elements from) a large linked list, and the list is so big that you don't have enough space to load its items into a hash table. In this case, you can sort the list and then remove consecutive elements that are equal, which de-duplicates the list.
If you want to insert an element at its appropriate position in a sorted container, a sorted linked list is very handy: it guarantees linear time and constant extra space. With an array, you have to shift all the subsequent elements one by one (or copy them through a temporary array). In fact, an LRU cache is a doubly linked list under the hood, kept ordered by how recently each item was hit. A newly used item, or an old item that has just been accessed again, is moved to the front, which keeps the already ordered list in order. If an array-like structure were used here, the LRU cache couldn't offer constant-time updates.
These are just some classic applications. You can find a lot of others.
Suppose a linked list is used to implement a priority queue. We can add elements of different priorities in any order, but we want to process the elements of the queue according to priority. It is then useful to maintain a sorted linked list so that the top priority items appear at the beginning and removing them from the queue is an easy operation. This is not exactly sorting the list; rather, whenever an item is inserted, it is placed at its correct position based on its priority. This is similar to insertion sort on an array.
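As a rough sketch of that sort-then-scan idea, assuming the data lives in a std::list<T> and that T supports < and ==. The name dedupBySorting is hypothetical.

```cpp
#include <list>

// De-duplicates by sorting first, so that equal values become neighbours,
// then dropping consecutive repeats. No hash table is needed, but the
// original order of the elements is lost.
template <class T>
void dedupBySorting(std::list<T>& l) {
    l.sort();    // O(n log n); std::list::sort relinks nodes instead of copying elements
    l.unique();  // O(n); removes each element equal to its predecessor
}
```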
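A small sketch of such a priority queue on top of std::list. The Task struct and both function names are invented for illustration; a larger priority value is assumed to mean "more urgent".

```cpp
#include <algorithm>
#include <list>

// Hypothetical payload: a higher 'priority' value is served first.
struct Task {
    int priority;
    int id;
};

// Walks the list (O(n)) to the first task with strictly lower priority and
// links the new task in front of it (O(1)). Tasks with equal priority keep
// their insertion order.
void insertByPriority(std::list<Task>& q, const Task& t) {
    auto pos = std::find_if(q.begin(), q.end(),
                            [&](const Task& x) { return x.priority < t.priority; });
    q.insert(pos, t);
}

// Removes and returns the highest-priority task in O(1).
// Precondition: q is not empty.
Task popHighest(std::list<Task>& q) {
    Task t = q.front();
    q.pop_front();
    return t;
}
```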

trivial singly linked list complexity query

We know that lookup on a singly linked list is O(n) given a head pointer. Let us say I maintain a pointer at half the linked list at all times. Would I be improving any lookup times?
Yes, it can reduce the complexity by a constant factor of 2, provided you have some way of determining whether to start from the beginning or middle of the list (typically, but not necessarily, the list being sorted). This is, however, a constant factor, so in terms of big-O complexity, it's irrelevant.
To be relevant to big-O complexity, you need more than a constant factor change. If, for example, you had a pointer to bisect each half, and again each half of that, and so on, you'd end up with logarithmic complexity instead of linear -- and you'd have transformed your "linked list" into an (already well known) threaded tree.
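To make the constant-factor point concrete, here is a sketch of a lookup that uses an extra middle pointer. It assumes the list is sorted in ascending order and that the caller keeps mid pointing at some node near the middle; Node and findSorted are hypothetical names.

```cpp
// Hypothetical singly linked node holding an int key.
struct Node {
    int key;
    Node* next;
};

// Looks up 'key' in an ascending sorted list. If the middle node's key is not
// greater than the target, the search can skip the first half. The scan is
// still linear, so this saves at most a constant factor.
const Node* findSorted(const Node* head, const Node* mid, int key) {
    const Node* start = (mid != nullptr && mid->key <= key) ? mid : head;
    for (const Node* n = start; n != nullptr && n->key <= key; n = n->next) {
        if (n->key == key) {
            return n;
        }
    }
    return nullptr;  // not present
}
```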
Nice thought, but this still does not improve the search operation. No matter how many pointers you have at different portions of the list, you still have to examine each element in the list. However, you could use two threads to search each half of the list, making the operation twice as fast in theory.
Only if your linked list's data is sorted. Otherwise it does not help, as already said in the other reply.
It would, but asymptotically it would still be the same. However, there is a data structure that uses this idea; it is called a skip list. A skip list is a linked list where some nodes have additional pointers that point, in some sense, to the middle of the rest of the list. The idea is well illustrated on this image. This structure usually has logarithmic insert, find, and delete.

Heap Sort a Linked List

I'm trying to create a sort function in C++ that sorts a linked list object using heap sort, but I'm not sure how to get started. Can anyone give me an idea of how to do it? I'm not even sure how I would sort a linked list.
Heapsort works by building a heap out of the data. A heap is only efficient to build when you have random-access to each element.
The first step is going to be creating an array of pointers to your list objects, so you can perform the usual heap sort on the array.
The last step will be converting your array of pointers back into a linked list.
A better sorting method for a linked list is an insertion sort -- not least because you can perform the sort as part of your linked list implementation's insert() function.
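The array-of-pointers route described above might look like the following sketch. It assumes a hand-written singly linked Node with an int value; heapSortList is a made-up name, and std::make_heap followed by std::sort_heap stands in for a hand-coded heap sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical singly linked node.
struct Node {
    int value;
    Node* next;
};

// Sorts the list in ascending order and returns the new head.
// Step 1: collect node pointers into an array (random access).
// Step 2: heap sort the array (make_heap + sort_heap).
// Step 3: relink the nodes in array order.
Node* heapSortList(Node* head) {
    std::vector<Node*> nodes;
    for (Node* n = head; n != nullptr; n = n->next) {
        nodes.push_back(n);
    }
    if (nodes.empty()) {
        return nullptr;
    }
    auto byValue = [](const Node* a, const Node* b) { return a->value < b->value; };
    std::make_heap(nodes.begin(), nodes.end(), byValue);  // O(n)
    std::sort_heap(nodes.begin(), nodes.end(), byValue);  // O(n log n)
    for (std::size_t i = 0; i + 1 < nodes.size(); ++i) {
        nodes[i]->next = nodes[i + 1];
    }
    nodes.back()->next = nullptr;
    return nodes.front();
}
```

The extra vector costs O(n) space, so the in-place property of heap sort is lost in this form.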
I have to agree with sarnold's answer. It is extremely inefficient to heap sort a linked list for a number of reasons, the first being that the elements should have been sorted upon initial placement. That said, if I were going to try, I would create an ArrayList<T> and load all the links into it. Then you can build a heap from that and sort it. Once you're finished, just reload your linked list starting with the head.
Heap sort is good for 2 reasons:
1- It is an in-place algorithm.
2- It has a time complexity of O(n log n).
The O(n log n) bound relies on the random access nature of an array. If you use a linked list, you do not get that random access advantage.
Hence the time complexity becomes O(n^2), which is not good for sorting.
I recommend using the merge sort algorithm for a linked list.
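For comparison, a sketch of merge sort working directly on the links, with no random access needed. Node, mergeSorted, and mergeSortList are hypothetical names; the list is assumed to be singly linked with int values.

```cpp
// Hypothetical singly linked node.
struct Node {
    int value;
    Node* next;
};

// Merges two ascending lists into one ascending list (stable).
Node* mergeSorted(Node* a, Node* b) {
    Node dummy{0, nullptr};
    Node* tail = &dummy;
    while (a != nullptr && b != nullptr) {
        if (b->value < a->value) {
            tail->next = b;
            b = b->next;
        } else {
            tail->next = a;
            a = a->next;
        }
        tail = tail->next;
    }
    tail->next = (a != nullptr) ? a : b;  // append whatever is left
    return dummy.next;
}

// Sorts the list in O(n log n) time and returns the new head.
Node* mergeSortList(Node* head) {
    if (head == nullptr || head->next == nullptr) {
        return head;  // zero or one node is already sorted
    }
    // Find the middle with slow/fast pointers and cut the list in two.
    Node* slow = head;
    Node* fast = head->next;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
    }
    Node* second = slow->next;
    slow->next = nullptr;
    return mergeSorted(mergeSortList(head), mergeSortList(second));
}
```

If the data is already in a std::list, its sort() member does the same job in O(n log n) without invalidating iterators.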

diff between ADT list and linked list

What is the (real | significant) difference(s) between an ADT list implementation and a linked list implementation
with respect to a queue?
Moreover,
can you suggest any website with visual examples of these types of lists?
It is REALLY hard to understand this question, but after trying to work out what is actually being asked, I believe I have figured it out. So my assumption is that the question is: "What is the difference between std::list and std::queue?" #fatai: Please correct me if I am wrong.
The std::list is a doubly-linked list. Each element of the list "knows" the next and previous element. And the list "knows" its beginning and end. Look here: http://www.cplusplus.com/reference/stl/list/
The std::queue is a container adapter with a restricted interface, built on top of another container (std::deque by default; std::list also works). It only lets you insert elements at the back and remove elements from the front. Have a look here:
http://www.cplusplus.com/reference/stl/queue/
If you want to have minimal functionality, I'd use queue. The queue is optimized for its purpose. It also prevents you from accidentally doing things wrong (such as removing an element from the middle).
I hope that answers your (confusing) question. ;-)
In an array-based list ADT, erasing from or inserting into the middle of the list has O(n) complexity, even through an iterator, because in the background all the following elements have to be shifted. (It uses a special model of the vector ADT, but you can't even access a list element by index.)
In a linked list, erasing and inserting have O(1) complexity once you are at the position, because no elements need to be shifted. Searching for an element in a linked list is still O(n), just as in the list ADT.

Is time complexity for insertion/deletion in a doubly linked list of order O(n)?

To insert or delete a node with a particular value in a DLL (doubly linked list), the entire list needs to be traversed to find the location, hence these operations should be O(n).
If that's the case, then how come the STL list (most likely implemented as a DLL) is able to provide these operations in constant time?
Thanks everyone for making it clear to me.
Insertion and deletion at a known position is O(1). However, finding that position is O(n), unless it is the head or tail of the list.
When we talk about insertion and deletion complexity, we generally assume we already know where that's going to occur.
It's not. The STL methods take an iterator to the position where insertion is to happen, so strictly speaking, they ARE O(1), because you're giving them the position. You still have to find the position yourself in O(n) however.
Deleting an arbitrary value (rather than a node) will indeed be O(n) as it will need to find the value. Deleting a node (i.e. when you start off knowing the node) is O(1).
Inserting based on the value (e.g. inserting into a sorted list) will be O(n). Inserting after or before an existing known node is O(1).
Inserting to the head or tail of the list will always be O(1) - because those are just special cases of the above.
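The split between "finding the position" and "modifying at the position" can be seen in a short sketch with std::list<int>. The helper names eraseFirstByValue and insertBefore are made up for illustration; std::find, insert, and erase are the standard calls.

```cpp
#include <algorithm>
#include <list>

// Deleting by value: the search is O(n), the unlinking itself is O(1).
// Returns true if a node was removed.
bool eraseFirstByValue(std::list<int>& l, int value) {
    auto it = std::find(l.begin(), l.end(), value);  // O(n) walk
    if (it == l.end()) {
        return false;
    }
    l.erase(it);  // O(1): only the neighbours' links change
    return true;
}

// Inserting at a known position: the caller already holds the iterator,
// so the whole operation is O(1).
void insertBefore(std::list<int>& l, std::list<int>::iterator pos, int value) {
    l.insert(pos, value);
}
```

Calling insertBefore with l.begin() or l.end() gives the O(1) head and tail cases.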