Heap Sort a Linked List - list

I'm trying to create a sort function in c++ that sorts a linked list object using Heap sort but I'm not sure how to get started. Can anyone give me any idea on how to do it ? I'm not even sure how I would sort a Linked List

Heapsort works by building a heap out of the data. A heap is only efficient to build when you have random-access to each element.
The first step is going to be creating an array of pointers to your list objects, so you can perform the usual heap sort on the array.
The last step will be converting your array of pointers back into a linked list.
A better sorting method for a linked list is an insertion sort -- not least because you can perform the sort as part of your linked list implementation's insert() function.

I have to agree with sarnolds answer. It is extremely inefficient to heap set a linked list for a number of reasons but the first being that they should have been sorted upon initial placement. That said, if I were going to try I would create an ArrayList<T> links the load all the links into it. Then you can grab that in heaps and sort them. Once you're finished just reload your linked list starting with thr head.

HeapSort is good for 2 reasons -
1- It is an In place algorithm.
2- Time complexity of O(nlogn)
The O(nlogn) is because of random access nature of array, But if you use linked list then you would not get random access advantage of array.
Hence the time complexity will become O(n^2). That is not good for sorting.
I will recommend you to use merge sort algo for linked list.

Related

What is the purpose of sorting a linked list?

I am wondering what is the purpose of sorting a linked list. Because if you need to find an element in an unsorted linked list and a sorted linked list, you have to do O(n).
Please forgive if my question is stupid
The purpose of sorting isn't always to search in logarithmic time. There are lots of other applications of sorted data obviously.
Suppose, you have to de-duplicate(remove the duplicate elements) from a large linked list and you don't have enough space to load the list items into hashtable as the list is very big. In this case, you can sort the list and remove consecutive elements if they are same and thus de-duplicate the list.
If you want to insert an element into it's appropriate position in a sorted container, sorted linked list is very handy which will guarantee linear time and constant space complexity. But for array, you need to use a temporary array and move all the elements afterwards one by one. Infact LRU cache is a doubly-linked list under the hood and keep sorted based on the recent hit on items. Newly used item and old item which is recently being accessed again, are inserted in front to keep the already sorted list sorted. If an array like structure would be used here, LRU cache can't offer of constant complexity
This is just some classic applications. You can find a lot of other applications.
Let us think a linked list is used to implement a priority queue. We can add elements of different priorities at random, but we want to process the elements of the queue according to priority, it would be useful to maintain a sorted linked list so that the top priority items appear at the beginning, and removing them from the queue is an easy operation. This not exactly sorting the list, but as and when an item is inserted, it would be placed in it's correct position based on the priority. This is similar to insertion sort of an array.

Dynamic array VS linked list in C++ [duplicate]

This question already has answers here:
Linked List vs Vector
(5 answers)
Closed 7 years ago.
Why we need a linked list when we have dynamic array list?
I have studied static list and linked list. I have knowledge of dynamic array list. but I couldn't find out the exact difference between that
Anyone please help me to answer this
Dynamic array is an array that resizes itself up or down depending on the number of content.
Advantage:
accessing and assignment by index is very fast O(1) process, since internally index access is just [address of first member] + [offset].
appending object (inserting at the end of array) is relatively fast amortized O(1). Same performance characteristic as removing objects at the end of the array. Note: appending and removing objects near the end of array is also known as push and pop.
Disadvantage:
inserting or removing objects in a random position in a dynamic array is very slow O(n/2), as it must shift (on average) half of the array every time. Especially poor is insertion and removal near the start of the array, as it must copy the whole array.
Unpredictable performance when insertion or removal requires resizing
There is a bit of unused space, since dynamic array implementation usually allocates more memory than necessary (since resize is a very slow operation)
Linked List is an object that have a general structure of [head, [tail]], head is the data, and tail is another Linked List. There are many versions of linked list: singular LL, double LL, circular LL, etc.
Advantage:
fast O(1) insertion and removal at any position in the list, as insertion in linked list is only breaking the list, inserting, and repairing it back together (no need to copy the tails)
Linked list is a persistent data structure, rather hard to explain in short sentence, see: wiki-link . This advantage allow tail sharing between two linked list. Tail sharing makes it easy to use linked list as copy-on-write data structure.
Disadvantage:
Slow O(n) index access (random access), since accessing linked list by index means you have to recursively loop over the list.
poor locality, the memory used for linked list is scattered around in a mess. In contrast with, arrays which uses a contiguous addresses in memory. Arrays (slightly) benefits from processor caching since they are all near each other
Others:
Due to the nature of linked list, you have to think recursively. Programmers that are not used to recursive functions may have some difficulties in writing algorithms for linked list (or worse they may try to use indexing).
Simply put, when you want to use algorithms that requires random access, forget linked list. When you want to use algorithms that requires heavy insertion and removal, forget arrays.
This is taken from the best answer of this question
I am convinced by this answer.
Vector aka Dynamic Array: Like regular array. The continuous memory location is used for storing vector. Whenever you need to allocate more memory, and memory is not available in the current location, the entire array is copied to another location and extra memory is allocated.
List: Maintain a pointer in each element and that pointer points to the next element.
What are the complexity guarantees of the standard containers?
Look at this link for more information.

Best data structure/ container in C++ for insertion and deletion

I am looking for the best data structure for C++ in which insertion and deletion can take place very efficiently and fast.
Traversal should also be very easy for this data structure. Which one should i go with?
What about SET in C++??
A linked list provides efficient insertion and deletion of arbitrary elements. Deletion here is deletion by iterator, not by value. Traversal is quite fast.
A dequeue provides efficient insertion and deletion only at the ends, but those are faster than for a linked list, and traversal is faster as well.
A set only makes sense if you want to find elements by their value, e.g. to remove them. Otherwise the overhead of checking for duplicate as well as that of keeping things sorted will be wasted.
It depends on what you want to put into this data structure. If the items are unordered or you care about their order, list<> could be used. If you want them in a sorted order, set<> or multiset<> (the later allows multiple identical elements) could be an alternative.
list<> is typically a double-linked list, so insertion and deletion can be done in constant time, provided you know the position. traversal over all elements is also fast, but accessing a specified element (either by value or by position) could become slow.
set<> and its family are typically binary trees, so insertion, deletion and searching for elements are mostly in logarithmic time (when you know where to insert/delete, it's constant time). Traversal over all elements is also fast.
(Note: boost and C++11 both have data structures based on hash-tables, which could also be an option)
I would say a linked list depending on whether or not you're deletions are specific and often. Iterator about it.
It occurs to me, that you need a tree.
I'm not sure about the exact structure (since you didnt provide in-detail info), but if you can put your data into a binary tree, you can achieve decent speed at searching, deleting and inserting elements ( O(logn) average and O(n) worst case).
Note that I'm talking about the data structure here, you can implement it in different ways.

trivial singly linked list complexity query

We know that lookup on a singly linked list is O(n) given a head pointer. Let us say I maintain a pointer at half the linked list at all times. Would I be improving any lookup times?
Yes, it can reduce the complexity by a constant factor of 2, provided you have some way of determining whether to start from the beginning or middle of the list (typically, but not necessarily, the list being sorted). This is, however, a constant factor, so in terms of big-O complexity, it's irrelevant.
To be relevant to big-O complexity, you need more than a constant factor change. If, for example, you had a pointer to bisect each half, and again each half of that, and so on, you'd end up with logarithmic complexity instead of linear -- and you'd have transformed your "linked list" into an (already well known) threaded tree.
Nice thought, but this still does not improve the search operation. No matter how many pointers you have at different portions of the list, you still have to analyze each element in the list. However, you -could- two threads to search each half of the list making the operation twice as fast in theory.
Only if your linked list's data is sorted. Otherwise, as already said in the other reply.
It would, but asymptotically it would be still the same. However, there is a data structure that uses this idea, it is called skip list. Skip list is a linked list where some nodes have more pointers that are pointing in some sense to the middle of the rest of list. The idea is well illustrated on this image. This structure usually has logarithmic insert find and delete.

Is the linked list only of limited use?

I was having a nice look at my STL options today. Then I thought of something.
It seems a linked list (a std::list) is only of limited use. Namely, it only really seems
useful if
The sequential order of elements in my container matters, and
I need to erase or insert elements in the middle.
That is, if I just want a lot of data and don't care about its order, I'm better off using an std::set (a balanced tree) if I want O(log n) lookup or a std::unordered_map (a hash map) if I want O(1) expected lookup or a std::vector (a contiguous array) for better locality of reference, or a std::deque (a double-ended queue) if I need to insert in the front AND back.
OTOH, if the order does matter, I am better off using a std::vector for better locality of reference and less overhead or a std::deque if a lot of resizing needs to occur.
So, am I missing something? Or is a linked list just not that great? With the exception of middle insertion/erasure, why on earth would someone want to use one?
Any sort of insertion/deletion is O(1). Even std::vector isn't O(1) for appends, it approaches O(1) because most of the time it is, but sometimes you are going to have to grow that array.
It's also very good at handling bulk insertion, deletion. If you have 1 million records and want to append 1 million records from another list (concat) it's O(1). Every other structure (assuming stadard/naive implementations) are at least O(n) (where n is the number of elements added).
Order is important very often. When it is, linked lists are good. If you have a growing collection, you have the option of linked lists, array lists (vector in C++) and double-ended queues (deque). Use linked lists if you want to modify (add, delete) elements anywhere in the list often. Use array lists if fast retrieval is important. Use double-ended queues if you want to add stuff to both ends of the data structure and fast retrieval is important. For the deque vs vector question: use vector unless inserting/removing things from the beginning is important, in which case use deque. See here for an in-depth look at this.
If order isn't important, linked lists aren't normally ideal.
std::list is notable for its splice() method, which allows you to move one more more elements from one list to another in constant time, without copying or allocating any elements or list nodes.
This question reminds me of this infamous one. Read it for the parallels as to why such simple data structures are important.
Linked List is a fundamental data structure. Other data structures, like hash maps, may use linked lists internally.
Two different algorithms may have O(1) time complexity, for a look up, but that doesn't mean they have the same performance. For example the first one may be 10 or 100 times faster than the second.
Whenever you need to store, iterate and do something with a bunch of data, the normal (and fast) data stucture for that task is the Linked List. More complex data structures are for special cases, ie Set is suitable when you don't want repeated values.
std::list has the following properties:
Sequence
Front Seuqence
Back Seuqence
Forward Container
Reverse Container
Of these properties std::vector does not have (Back Seuqence)
While std::set does not support any sequence properties or (Reverse Container)
So what does this mean?
Will a back sequence supports O(1) for rend() and rbegin() etc
For full information see:
What are the complexity guarantees of the standard containers?
Linked lists are immutable and recursive datastructures whereas arrays are mutable and imperative (=change-based). In functional programming, there are usually no arrays - You don't change elements but transform lists into new lists. While linked lists don't even need additional memory here, this isn't possible efficiently with arrays.
You can easily build or decompose lists without having to change any value.
double [] = []
double (head:rest) = (2 * head):(double rest)
In C++, which is an imperative language, you won't use lists that often. One example could be a list of spaceships in a game from which you can easily remove all spaceships that have been destroyed since the previous frame.