doubly linked list implementation - c++

Which one would be more efficient?
I want to keep a list of items, but I am required to be able to sort the list
by id,
by name,
by course credits,
as chosen by the user.
Would it be best to insert items in id order and then sort by the other keys on demand, or to add items unordered and sort in whichever order the user asks for, whenever it is needed?

If you're really required to keep the list sorted -- as opposed to using other data structures to give sorted access to the list -- then you could simply make a list whose elements have different pointers for different sort criteria.
In other words, instead of keeping just previous and next pointers, have previousById, nextById, previousByName, nextByName, previousByCredits and nextByCredits. Likewise, you would have three head and/or tail pointers, instead of just one.
Please note that this approach has the drawback of being inflexible when it comes to implementing additional sort criteria. I'm assuming that you're trying to solve a homework-type problem, which is why I tried to tailor the answer to what seem to be the homework requirements.
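A minimal sketch of such a node follows. The Course type, its fields, and the linkInOrder helper are hypothetical names chosen for illustration; only the idea of one previous/next pair per sort criterion comes from the answer above.

```cpp
#include <string>
#include <utility>

// Hypothetical record type: one prev/next pair per sort order.
struct Course {
    int id;
    std::string name;
    int credits;
    Course *prevById = nullptr,      *nextById = nullptr;
    Course *prevByName = nullptr,    *nextByName = nullptr;
    Course *prevByCredits = nullptr, *nextByCredits = nullptr;

    Course(int i, std::string n, int c)
        : id(i), name(std::move(n)), credits(c) {}
};

// Links c into one of the three orderings. The prev/next members to use
// are passed as pointers to members, so the walk is written only once.
template <class Less>
void linkInOrder(Course *&head, Course *Course::*prev, Course *Course::*next,
                 Course *c, Less less) {
    Course *before = nullptr;
    Course *after = head;
    while (after != nullptr && !less(*c, *after)) {
        before = after;
        after = after->*next;
    }
    c->*prev = before;
    c->*next = after;
    if (before != nullptr) before->*next = c; else head = c;
    if (after != nullptr) after->*prev = c;
}

struct CourseList {
    Course *headById = nullptr;
    Course *headByName = nullptr;
    Course *headByCredits = nullptr;

    // The list does not own the nodes in this sketch; the caller does.
    void insert(Course *c) {
        linkInOrder(headById, &Course::prevById, &Course::nextById, c,
                    [](const Course &a, const Course &b) { return a.id < b.id; });
        linkInOrder(headByName, &Course::prevByName, &Course::nextByName, c,
                    [](const Course &a, const Course &b) { return a.name < b.name; });
        linkInOrder(headByCredits, &Course::prevByCredits, &Course::nextByCredits, c,
                    [](const Course &a, const Course &b) { return a.credits < b.credits; });
    }
};
```

Each insertion walks all three orderings, so it costs O(n) per criterion, but printing in any of the three orders is then a plain traversal from the matching head pointer.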

You can use three maps (or hashmaps):
One mapping the id to the item, one mapping name to an item reference (or pointer) and one mapping course credits to item reference again.
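As a rough sketch of the three-map idea, assuming ids are unique while names and credits may repeat: the Item and Catalog names are made up for this example. Note that only the ordered std::map and std::multimap give sorted traversal; a hash map would give fast lookup but no ordering.

```cpp
#include <map>
#include <string>

// Hypothetical item type for illustration.
struct Item {
    int id;
    std::string name;
    int credits;
};

struct Catalog {
    std::map<int, Item> byId;                        // owns the items
    std::multimap<std::string, const Item *> byName; // points into byId
    std::multimap<int, const Item *> byCredits;      // points into byId

    // Adds an item to all three maps; an item with a duplicate id is ignored.
    void add(const Item &item) {
        auto res = byId.emplace(item.id, item);
        if (!res.second) return;
        // Pointers to std::map elements stay valid until that element is erased.
        const Item *p = &res.first->second;
        byName.emplace(p->name, p);
        byCredits.emplace(p->credits, p);
    }
};
```

Iterating byId, byName or byCredits from begin() to end() then visits the items in the corresponding order without any explicit sort.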

It would be more efficient to keep the list sorted in whichever order you expect to need most often. For example, if you know you will mostly retrieve by id, keep it sorted by id; otherwise pick one of the others. Id would be the easiest if it is just an integer field.
To do that, on insert you walk the list to find the position where newid is less than nextid but greater than previousid, then allocate a new node with new and set the pointers appropriately.
Keeping the linked list sorted in some way is better than keeping it unsorted. You add some time to each insertion (a walk along the list), but that is small compared with how long it would take to sort the whole list that way every time it is needed.
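The insertion step described above might look like the sketch below. The Node layout and the insertSorted name are assumptions, and items with equal ids are placed after the existing ones.

```cpp
// Hypothetical node type holding only the id.
struct Node {
    int id;
    Node *prev;
    Node *next;
};

// Inserts a new node so that the list stays ordered by id.
// Returns the head of the list, which changes if the new id is the smallest.
Node *insertSorted(Node *head, int newId) {
    Node *node = new Node{newId, nullptr, nullptr};
    Node *before = nullptr;
    Node *after = head;
    // Walk until 'after' is the first node with a larger id.
    while (after != nullptr && after->id <= newId) {
        before = after;
        after = after->next;
    }
    node->prev = before;
    node->next = after;
    if (after != nullptr) after->prev = node;
    if (before != nullptr) {
        before->next = node;
        return head;
    }
    return node; // inserted at the front
}
```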

The most efficient approach would be to store the nodes as they are and keep 4 different indexes up to date. This way, when one order is required, you just pick up the right index and that's all. The cost is O(log N) per index for each insertion, and O(1) per step of a traversal.
Of course, keeping 4 indexes at once, with perhaps different requirements on uniqueness, and in the face of possible exceptions, is relatively difficult, but then, there's a Boost library for this: Boost MultiIndex.
One example is a set that can be traversed in order either by ID or by name.
Since you can add as many indexes as you wish, it should get you going :)

Keep your linked list objects in the linked list, in any order. To sort the list by any key, use something like this:
#include <algorithm>
#include <cstddef>
#include <string>

struct LinkedList {
    std::string name;
    LinkedList *prev;
    LinkedList *next;
};

// Creates an array of pointers to every LinkedList object.
// The caller owns the array and must delete[] it.
void FillArray(LinkedList *first, LinkedList **&output, size_t &size) {
    // check how many objects there are in the linked list
    size = 0;
    for (LinkedList *now = first; now != NULL; now = now->next) {
        size++;
    }
    // if the linked list is empty
    if (size == 0) {
        output = NULL;
        return;
    }
    // create the array (of pointers, not of nodes)
    output = new LinkedList*[size];
    // fill the array
    size_t i = 0;
    for (LinkedList *now = first; now != NULL; now = now->next) {
        output[i++] = now;
    }
}

bool NameLess(const LinkedList *a, const LinkedList *b) {
    return a->name < b->name;
}

void SortByName(LinkedList **arrayOfPointers, size_t size) {
    // your function to sort by name here; std::sort is one option
    std::sort(arrayOfPointers, arrayOfPointers + size, NameLess);
}

void TemporarySort(LinkedList *first, LinkedList **&output, size_t &size) {
    // This function creates the array of pointers to your linked list,
    // sorts this array, and returns the sorted array. The linked list
    // itself stays as it is. This is useful, for example, when your linked
    // list is sorted by ID but you need to print it sorted by name only once.
    FillArray(first, output, size);
    SortByName(output, size);
}

// Sorts the linked list and saves the new order permanently.
// Returns the new first node.
LinkedList *PermanentSort(LinkedList *first) {
    LinkedList **sorted;
    size_t size;
    TemporarySort(first, sorted, size);
    if (size == 0) {
        return NULL;
    }
    sorted[0]->prev = NULL;
    for (size_t i = 1; i < size; i++) {
        sorted[i-1]->next = sorted[i];
        sorted[i]->prev = sorted[i-1];
    }
    sorted[size-1]->next = NULL;
    LinkedList *newFirst = sorted[0];
    delete[] sorted;
    return newFirst;
}
I hope this helps. If you don't understand any line of the code, simply leave a comment on this answer.

Related

How to return information from the kth element of a linked list C++

Let's say I have a linked list that consists of {12,25,46,27,57} and I choose k to be 4. I'm hoping for a way to iterate through the list and only print 27 as my output.
I was also wondering how you would go about deleting a variable in the same fashion. So if I chose variable 3 from the list I would want the linked list to print as {12,25,27,57}.
For the first part of your question, implement something like this:
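Node *n = head;
// advance k-1 times, stopping early if the list is too short
for (int i = 1; n != nullptr && i < k; i++) {
    n = n->next;
}
return n; // nullptr if the list has fewer than k nodes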
For the second part, implement this:
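if (k < 1) {
    return; // positions are counted from 1
}
Node *n = head;
Node *leftPatch = nullptr; // node just before n
for (int i = 1; n != nullptr && i < k; i++) {
    leftPatch = n;
    n = n->next;
}
if (n == nullptr) {
    return; // empty list, or fewer than k nodes
}
Node *rightPatch = n->next; // node just after n
if (leftPatch == nullptr) {
    head = rightPatch; // deleting the first node
} else {
    leftPatch->next = rightPatch;
}
if (rightPatch == nullptr) {
    tail = leftPatch; // deleting the last node
}
delete n;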
The delete function is extremely common for students to get wrong and end up with segfaults, so make sure you understand it well. It's not too complicated, and it eventually becomes second nature. Also, make sure your functions handle edge cases like deleting from an empty list. Professors really want to see students take those into consideration when they code.
If you haven't yet implemented the list, then you could use std::list, which provides a member erase() operation to erase an element. The link to std::list shows some examples and also shows how the std::list makes much use of the concept of an iterator to access elements in the list.
If you use std::list then you can use the std algorithm library to work with the list. For example, std::find_if will find the first element that satisfies the predicate you supply (actually, you can also use std::find for your example). To find the other matching elements in the list, just keep on iterating...
If you need to implement your own list, then looking at std::list is a great place to start to see a possible way to do it.
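For the std::list route, a small sketch could look like this. The kth and eraseKth helper names are made up for the example, and positions are counted from 1 as in the question; std::next and std::list::erase are the standard calls doing the work.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Returns an iterator to the k-th element (1-based),
// or lst.end() if k is out of range.
std::list<int>::iterator kth(std::list<int> &lst, std::size_t k) {
    if (k == 0 || k > lst.size()) return lst.end();
    return std::next(lst.begin(), static_cast<std::ptrdiff_t>(k - 1));
}

// Erases the k-th element; returns false if there is no such element.
bool eraseKth(std::list<int> &lst, std::size_t k) {
    std::list<int>::iterator it = kth(lst, k);
    if (it == lst.end()) return false;
    lst.erase(it);
    return true;
}
```

With the list {12,25,46,27,57}, kth(lst, 4) points at 27, and eraseKth(lst, 3) leaves {12,25,27,57}.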

Hashing with separate chaining

Let's say that for two different inputs ("tomas", "peter") a hash function yields the same key (3).
Please correct my assumption about how separate chaining works under the hood:
In a hash table, index 3 contains a pointer to a linked list header. The list contains two nodes implemented for example like this:
struct node{
std::string value_name; // the key, e.g. "peter"
int value;
node* ptr_to_next_node;
};
The search mechanism remembers the input name ("peter") and compares it with value_name in each node. When a node's value_name equals "peter", the mechanism returns that node's value.
Is this correct? I've learned that an ordinary linked list doesn't contain the name of the node, so I didn't know how I could find the corresponding value for different names ("tomas", "peter") in a list with nodes like this:
struct node{
int value;
node* ptr_to_next_node;
};
Yes, it's correct that this is a possible implementation of part of a hash table.
When you say an "ordinary linked list doesn't contain the name of the node", I'd expect a linked list to be generic if the implementation language allows that. In C++ it would be a template.
It would be a linked list of a certain type, and each element would have an instance of (or a handle to) that type and a pointer to the next element, as your second code snippet shows, except with int replaced by the type.
In this case the type would most likely be a key-value pair.
So the linked list in that case doesn't directly contain the name; it contains an object that contains the name (and the value).
This is just one possible implementation, though. There are other options.
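To make the generic version concrete, here is a minimal sketch of a templated node whose payload is a key-value pair, plus a search over one bucket's chain. ListNode, Entry and findInChain are illustrative names, not part of any library.

```cpp
#include <string>
#include <utility>

// A generic singly linked list node: the payload type is a template parameter.
template <typename T>
struct ListNode {
    T value;
    ListNode *next;
};

// For the hash table, the payload is a key-value pair (name, value).
using Entry = std::pair<std::string, int>;

// Walks one bucket's chain and compares the stored key with the one searched for.
// Returns a pointer to the stored value, or nullptr if the key is not in the chain.
const int *findInChain(const ListNode<Entry> *head, const std::string &key) {
    for (const ListNode<Entry> *n = head; n != nullptr; n = n->next) {
        if (n->value.first == key) return &n->value.second;
    }
    return nullptr;
}
```

If "tomas" and "peter" both hash to index 3, both entries sit in the chain at that index, and the key comparison is what tells them apart.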
Yes, basically: a table of structures, with the hash function limited to returning a value between 0 and table size - 1, which is used as an index into the table.
This gets you to the top of the list of chained elements, which may hold zero or more elements.
To save time and space, usually the table element is itself an element of the chain. Since you specified strings as the data being stored in the hash table, usually the structure would be more like:
struct hash_table_element {
unsigned int length;
char *data_string;
struct hash_table_element *next;
};
and the string space is allocated dynamically, maybe with one of the standard library functions. There are a number of ways to manage string tables that may optimize your particular use case, of course, and checking the length first can often speed up the search if you rely on short-circuit evaluation:
if (length == element->length &&
    memcmp(string, element->data_string, length) == 0)
{
    found = TRUE;
}
This will not waste time comparing the strings unless they are the same length.

separate chaining in hashing

I am reading about hashing in Robert Sedgewick's book Algorithms in C++:
"We might be using a header node to streamline the code for insertion into an ordered list, but we might not want to use M header nodes for individual lists in separate chaining. Indeed, we could even eliminate the M links to the lists by having the first nodes in the lists comprise the table."
class ST
{
    struct node
    {
        Item item;
        node* next;
        node(Item x, node* t)
        { item = x; next = t; }
    };
    typedef node *link;
private:
    link* heads;
    int N, M;
    Item searchR(link t, Key v)
    {
        if (t == 0) return nullItem;
        if (t->item.key() == v) return t->item;
        return searchR(t->next, v);
    }
public:
    ST(int maxN)
    {
        N = 0; M = maxN/5;
        heads = new link[M];
        for (int i = 0; i < M; i++) heads[i] = 0;
    }
    Item search(Key v)
    { return searchR(heads[hash(v, M)], v); }
    void insert(Item item)
    { int i = hash(item.key(), M);
      heads[i] = new node(item, heads[i]); N++; }
};
I have two questions about the text above:
What does the author mean by "We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table"? How can we modify the code above to do this?
What does the statement "we might not want to use M header nodes for individual lists in separate chaining" mean?
"We could even eliminate the M links to the lists by having the first nodes in the lists comprise the table."
Consider Node* x[n] vs Node x[n]. The former needs an extra pointer per bucket, a memory allocation on insertion for the head Node of every non-empty bucket, and an extra indirection for every hash table operation. The latter eliminates the n pointers but requires that unused elements can be put in some discernible not-in-use state (tracking which may or may not require extra memory), and if sizeof(Node) is greater than sizeof(Node*), it may waste more memory anyway. The difference in memory use can also affect cache efficiency: if the table has a high element-to-bucket ratio, then a Node[] packs the Node data into fewer contiguous memory pages, and iterating (in unsorted order) is very cache efficient, whereas a Node*[] will jump to separate memory allocations that might be all over the place (or, on the other hand, might actually be quite close together in some useful way, e.g. if both access patterns and dynamic memory allocation addresses correlate with the chronological time of object creation).
How can we modify above code for this?
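First, a note on your existing code: heads[i] = new node(item, heads[i]); does not lose anything that is already in the bucket. The old head is passed to the constructor and becomes the new node's next, so the new node is prepended to the list. That line cannot stay as it is after the design change, though, because the table will then hold nodes rather than pointers, and you have to check whether the slot is in use before deciding where the item goes.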
The design change discussed needs:
link* heads;
...changed to...
node* head;
You'd initialise it like this:
head = new node[M];
Which needs an extra node constructor (if item has an equivalent default constructor, you can leave out its initialisation below)
node() : item(nullItem), next(nullptr) { }
Then there are some knock-on changes to the rest of your code that are easy to work through. Basically, you're getting rid of a layer of pointers.
"we might not want to use M header nodes for individual lists in separate chaining." What does this statement mean.
I didn't write it so can't say authoritatively, but it appears to be saying that when designing the list code, a decision might have been made to have an initial Node even in an empty list, as this simplifies the code for several list operations. While the extra data-less Node might seem a reasonable price when contemplating "usual" uses of a list, hash tables are unusual in that you want most of the lists chained off the buckets to have 0 or 1 elements, with exponentially fewer lists at each greater length. So such a list implementation is poorly suited to use in a hash table.
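A rough sketch of a table whose slots are themselves the first nodes of the chains is given below. It is not the book's code: the Node layout, the used flag as the "not in use" state, and the Table class are all assumptions, and std::hash stands in for the book's hash function. The bucket count is assumed to be greater than zero, and duplicate keys are not merged.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

struct Node {
    std::string key;
    int value = 0;
    bool used = false;    // marks whether this slot currently holds an element
    Node *next = nullptr; // overflow chain for colliding keys
};

class Table {
public:
    explicit Table(std::size_t buckets) : slots_(buckets) {}
    Table(const Table &) = delete;
    Table &operator=(const Table &) = delete;
    ~Table() {
        // Only overflow nodes live on the heap; the first nodes are in the vector.
        for (Node &slot : slots_) {
            Node *n = slot.next;
            while (n != nullptr) { Node *dead = n; n = n->next; delete dead; }
        }
    }

    void insert(const std::string &key, int value) {
        Node &first = slots_[index(key)];
        if (!first.used) { // empty slot: store in place, no allocation
            first.key = key; first.value = value; first.used = true;
            return;
        }
        Node *n = new Node; // slot taken: chain right behind the first node
        n->key = key; n->value = value; n->used = true;
        n->next = first.next;
        first.next = n;
    }

    const int *find(const std::string &key) const {
        const Node *n = &slots_[index(key)];
        if (!n->used) return nullptr;
        for (; n != nullptr; n = n->next)
            if (n->key == key) return &n->value;
        return nullptr;
    }

private:
    std::size_t index(const std::string &key) const {
        return std::hash<std::string>()(key) % slots_.size();
    }
    std::vector<Node> slots_; // Node[M] instead of Node*[M]
};
```

Compared with an array of pointers, a lookup that hits the first node needs no pointer dereference into a separate allocation, at the price of one full Node per bucket even when the bucket is empty.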

How to do fast sorting in sorted list when only one element is changed

I need a list of elements that is always sorted. The operations involved are quite simple. For example, if the list is sorted from high to low, I only need three operations in a loop:
while true do {
list.sort() //sort the list that has hundreds of elements
val = list[0] //get the first/maximum value in the list
list.pop_front() //remove the first/maximum element
...//do some work here
list.push_back(new_elem)//insert a new element
list.sort()
}
However, since I only add one element at a time and I am concerned about speed, I don't want the sort to go through all the elements, as bubble sort would. So I wonder whether there is a function to insert an element in order, or whether list::sort() is smart enough to do less work when only one element has been added or modified.
Or should I use a deque for better performance, if the operations above are all that is needed?
Thanks a lot!
As mentioned in the comments, if you aren't locked into std::list then you should try std::set or std::multiset.
The std::list::insert method takes an iterator which specifies where to add the new item. You can use std::lower_bound to find the correct insertion point; it's not optimal without random access iterators (it still has to step through the list, which is O(n) iterator increments), but it only does O(log n) comparisons.
P.S. don't use variable names that collide with built-in classes like list.
lst.sort(std::greater<T>()); //sort the list once, from high to low
while (true) {
    val = lst.front(); //get the first/maximum value in the list
    lst.pop_front(); //remove the first/maximum element
    ...//do some work here
    std::list<T>::iterator it = std::lower_bound(lst.begin(), lst.end(), new_elem, std::greater<T>());
    lst.insert(it, new_elem); //insert the new element at its sorted position
    // lst is still sorted
}
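For the std::multiset alternative mentioned at the start of this answer, a minimal sketch might look like the following. The popMax helper is a made-up name, and int stands in for the element type. The container keeps itself ordered, so no explicit sort call is needed.

```cpp
#include <functional>
#include <set>

// Removes and returns the largest element.
// Precondition: the set is not empty.
int popMax(std::multiset<int, std::greater<int>> &s) {
    auto it = s.begin(); // with std::greater, begin() is the maximum
    int v = *it;
    s.erase(it);         // erases only this one element, not all equal ones
    return v;
}
```

Each insert costs O(log n), and taking the maximum is a begin() followed by a single-element erase, which suits a loop that removes the largest element and adds one new element per iteration.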

Hashing to Calculate Frequencies can be improved?

I'm currently building a hash table to calculate word frequencies, and I chose the data structure based on its running time: O(1) insertion, O(n) worst-case lookup time, etc.
I've asked a few people about the difference between std::map and a hash table, and I received this answer:
"std::map adds the element as a binary tree thus causes O(log n) where with the hash table you implement it will be O(n)."
So I've decided to implement a hash table as an array of linked lists (separate chaining). In the code below each node holds two values: the key (the word) and the value (its frequency). It works like this: when a word is added and the list at its index is empty, the word is inserted directly as the first element of the linked list with a frequency of 0. If the word is already in the list (which unfortunately takes O(n) time to search), its frequency is incremented by 1. If it is not found, it is simply added to the beginning of the list.
I know there are a lot of flaws in the implementation, so I would like to ask the experienced people here: how can this implementation be improved in order to calculate frequencies efficiently?
The code I've written so far:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
struct Node {
string word;
int frequency;
Node *next;
};
class linkedList
{
private:
friend class hashTable;
Node *firstPtr;
Node *lastPtr;
int size;
public:
linkedList()
{
firstPtr=lastPtr=NULL;
size=0;
}
void insert(string word,int frequency)
{
Node* newNode=new Node;
newNode->word=word;
newNode->frequency=frequency;
newNode->next=firstPtr; //NULL when the list is empty
if(firstPtr==NULL)
lastPtr=newNode;
firstPtr=newNode;
size++;
}
int sizeOfList()
{
return size;
}
void print()
{
if(firstPtr!=NULL)
{
Node *temp=firstPtr;
while(temp!=NULL)
{
cout<<temp->word<<" "<<temp->frequency<<endl;
temp=temp->next;
}
}
else
printf("%s","List is empty");
}
};
class hashTable
{
private:
linkedList* arr;
int index,sizeOfTable;
public:
hashTable(int size) //Forced initalizer
{
sizeOfTable=size;
arr=new linkedList[sizeOfTable];
}
int hash(string key)
{
unsigned int hashVal=0; //unsigned, so overflow wraps around instead of being undefined
for(size_t i=0;i<key.length();i++)
hashVal=37*hashVal+static_cast<unsigned char>(key[i]);
return hashVal%sizeOfTable;
}
void insert(string key)
{
index=hash(key);
if(arr[index].sizeOfList()<1)
arr[index].insert(key, 0);
else {
//Search for the index throughout the linked list.
//If found, increment its value +1
//else if not found, add the node to the beginning
}
}
};
Do you care about the worst case? If not, use a std::unordered_map (it handles collisions, and you don't want a multimap) or a trie/critbit tree (depending on the keys, it may be more compact than a hash table, which may lead to better caching behavior). If you do, use a std::map or a trie.
If you want, e.g., online top-k statistics, keep a priority queue in addition to the dictionary. Each dictionary value contains the number of occurrences and whether the word belongs to the queue. The queue duplicates the top-k frequency/word pairs but keyed by frequency. Whenever you scan another word, check whether it's both (1) not already in the queue and (2) more frequent than the least element in the queue. If so, extract the least queue element and insert the one you just scanned.
You can implement your own data structures if you like, but the programmers who work on STL implementations tend to be pretty sharp. I would make sure that's where the bottleneck is first.
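As a baseline to compare a hand-written table against, a word count with std::unordered_map can be sketched in a few lines. The countWords name and the whitespace-based splitting are assumptions made for this example.

```cpp
#include <sstream>
#include <string>
#include <unordered_map>

// Counts how often each whitespace-separated word occurs in text.
std::unordered_map<std::string, int> countWords(const std::string &text) {
    std::unordered_map<std::string, int> freq;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++freq[word]; // operator[] inserts the word with count 0 on first use
    }
    return freq;
}
```

Measuring this version first shows whether the dictionary is really the bottleneck before any custom data structure is written.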
1- The search complexity for std::map and std::set is O(log n). The average complexity for std::unordered_map and std::unordered_set is O(1), with O(n) in the worst case. However, the constant cost of hashing can be large, and for small n it can exceed the cost of an O(log n) lookup. I always take this into account.
2- If you want to use std::unordered_map, you need to make sure that std::hash is defined for your type. Otherwise you have to define it yourself.
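For a key type of your own, defining the hash could look roughly like this. WordKey is a hypothetical key type, and the way the two hashes are combined is just one common mixing formula, not the only valid choice. The key type also needs operator== so the map can compare keys that land in the same bucket.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical composite key: a word together with a year.
struct WordKey {
    std::string word;
    int year;
};

bool operator==(const WordKey &a, const WordKey &b) {
    return a.word == b.word && a.year == b.year;
}

namespace std {
// Specialization of std::hash for the user-defined key type.
template <>
struct hash<WordKey> {
    size_t operator()(const WordKey &k) const {
        size_t h1 = hash<string>()(k.word);
        size_t h2 = hash<int>()(k.year);
        // Mix the two hashes so that both members influence the result.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};
} // namespace std
```

With this in place, std::unordered_map<WordKey, int> can be used like any other map, for example ++counts[WordKey{"list", 2020}].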