Copy Construction For hashMap in C++ - c++

In a recent assignment, we were asked to implement a hash map in C++ without using the facilities provided by the STL.
I'm stuck on one of the functions: the copy constructor. After searching Google, I found a valid solution in this question:
Writing a valid copy constructor for a hash map in C++
But I can't totally understand it. Could anyone please help explain
1. why we need to use a pointer-to-pointer Node** p = &hashTable[i]; ?
2. what is the logic in the while loop?
3. especially, what does this code p=&c->next; mean?

Firstly, there are many different types of hash table implementations, so any specific one you find online may or may not yield insights into what you'll need to do for your own implementation. That said...
1. p initially points at the head pointer for the bucket, which is the Node* at [this->]hashTable[i], and is first used to set that pointer to NULL. As you're dealing with Node*s, a Node** is a natural way to keep track of their locations.
2. Each iteration of the while loop duplicates the next Node in bucket [i] of hm; the duplicate is created in new memory at c, and *p (which tracks the linked list position being filled in for the *this object under construction) is updated to point to it.
3. p=&c->next; means p is set to the address of the next member of the newly created Node (at address c): that next pointer must be initialised by the Node(const Node&) constructor to nullptr/NULL/0, or the linked lists created wouldn't terminate properly. Only if there are more elements in the linked list of colliding elements to be copied will the next iteration overwrite *p, and therefore the next member of the previously added Node, with the next value of c.
In summary, you're looking at a loop that copies amountOfBuckets linked lists. If you're not familiar with linked list operations, you'd be better off writing a linked list class first and getting that working, then using it to help implement the hash table.
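To make the pointer-to-pointer walk concrete, here is a minimal sketch of such a copy constructor. It is not the code from the linked question: the Node layout, the fixed bucket count of 8, and the insert/find helpers are assumptions made only so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical types for illustration; not the asker's actual classes.
struct Node {
    std::string key;
    int value;
    Node* next;

    Node(const std::string& k, int v) : key(k), value(v), next(nullptr) {}
    // Copy key and value only; next is reset so every copied chain terminates.
    Node(const Node& other) : key(other.key), value(other.value), next(nullptr) {}
};

class HashMap {
public:
    static constexpr std::size_t amountOfBuckets = 8;  // assumed fixed size

    HashMap() {
        for (std::size_t i = 0; i < amountOfBuckets; ++i) hashTable[i] = nullptr;
    }

    HashMap(const HashMap& hm) {
        for (std::size_t i = 0; i < amountOfBuckets; ++i) {
            Node** p = &hashTable[i];  // location where the next copied Node* goes
            *p = nullptr;              // the bucket starts out empty
            for (const Node* n = hm.hashTable[i]; n != nullptr; n = n->next) {
                Node* c = new Node(*n);  // duplicate the current source node
                *p = c;                  // link it into the chain being built
                p = &c->next;            // the next copy will be stored here
            }
            assert(*p == nullptr);  // the copied chain ends in a null pointer
        }
    }

    HashMap& operator=(const HashMap&) = delete;  // left out to keep the sketch short

    ~HashMap() {
        for (std::size_t i = 0; i < amountOfBuckets; ++i) {
            while (hashTable[i] != nullptr) {
                Node* dead = hashTable[i];
                hashTable[i] = dead->next;
                delete dead;
            }
        }
    }

    // Updates the value if the key exists, otherwise appends a new node.
    void insert(const std::string& key, int value) {
        Node** p = &hashTable[bucketOf(key)];
        while (*p != nullptr) {
            if ((*p)->key == key) { (*p)->value = value; return; }
            p = &(*p)->next;
        }
        *p = new Node(key, value);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        for (const Node* n = hashTable[bucketOf(key)]; n != nullptr; n = n->next)
            if (n->key == key) return &n->value;
        return nullptr;
    }

private:
    static std::size_t bucketOf(const std::string& key) {
        return std::hash<std::string>()(key) % amountOfBuckets;
    }

    Node* hashTable[amountOfBuckets];
};
```

Because p always holds the address of the pointer to be filled in next, the first node of a bucket and every later node are handled by the same two statements, with no special case for an empty chain.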

Copy elements from vector based on condition C++

I'm using C++ to create Hopcroft's algorithm for DFA Minimization.
Part of Hopcroft's algorithm is to - initially - divide two sets (P with accept and non-accept states and Q with non-accept states only). I already have group P, and from P I'm trying to extract Q. I'm using the following code to do it:
for(int i=0; i<groupP.size(); i++)
if(groupP[i]->final)
groupQ.push_back(groupP[i]);
in which groupP and groupQ are:
vector<node*> groupQ;
vector<node*> groupP;
and node is a structure that I've created to represent a node of my automata. It's guaranteed that the boolean attribute "final" is already correctly set (false for non-final states, true for final states).
Finally, my question is: is it correct to copy one element from a vector to another by doing what I've done? If I modify the content of a copied element from groupP, will this same element be modified in groupQ as well?
Right now, you have vectors of pointers. When you copy from one vector to another, you're copying the pointers, not the elements themselves.
Since you have two pointers referring to the same node, any modification made to a node through one group will be visible through the other, i.e., if you make a change to groupP[i]->foo, then the same change will be visible in groupQ[j]->foo (provided that groupP[i] is one of the elements you copied from groupP to groupQ).
If you don't want that, you have a couple of choices. One would be to leave groupP and groupQ in the same vector, but partition the vector based on the state of an element's final member:
auto P_end = std::partition(groupP.begin(), groupP.end(),
[](node *n) { return n->final;});
Then [groupP.begin(), P_end) is groupP (i.e., final==true) and [P_end, groupP.end()) is groupQ (i.e., final==false).
This moves the pointers around (and gives you an iterator so you know the dividing line between the two) so you have exactly one pointer to each element, but they're separated into the two relevant groups.
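As a rough, self-contained sketch of the partition approach: the node struct below is a stand-in with an assumed id field, and partitionByFinal is a hypothetical helper name, not part of the asker's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the asker's structure; the id field is an assumption.
struct node {
    int id;
    bool final;
};

// Reorders the pointers so that final nodes come first and returns the index
// of the first non-final node, i.e. the dividing line between the two groups.
std::size_t partitionByFinal(std::vector<node*>& groupP) {
    auto isFinal = [](const node* n) { return n->final; };
    auto P_end = std::partition(groupP.begin(), groupP.end(), isFinal);
    assert(std::is_partitioned(groupP.begin(), groupP.end(), isFinal));
    return static_cast<std::size_t>(P_end - groupP.begin());
}
```

Note that std::partition does not preserve relative order within each group; std::stable_partition does, if that matters.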
As a final possibility, you might want to actually copy elements from groupP to groupQ, creating a new element in the process, so that each item you copy then exists in two places: one element in groupP and one in groupQ. The two are separate, so either can be modified, and a modification to one has no effect on the other.
The most obvious way to achieve that would be to just use vectors of nodes:
vector<node> groupQ;
vector<node> groupP;
This way, when you copy from one group to the other, you're copying the nodes themselves rather than pointers to nodes, so each copy creates a new, independent node with the same value as an existing node.
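A small sketch of the value-based variant, assuming a stand-in node struct with an id field and a hypothetical helper named copyFinal:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Stand-in for the asker's structure; the id field is an assumption.
struct node {
    int id;
    bool final;
};

// Returns independent copies of every final node in groupP.
std::vector<node> copyFinal(const std::vector<node>& groupP) {
    std::vector<node> groupQ;
    std::copy_if(groupP.begin(), groupP.end(), std::back_inserter(groupQ),
                 [](const node& n) { return n.final; });
    assert(groupQ.size() <= groupP.size());  // only a subset is ever copied
    return groupQ;
}
```

Changing groupQ[0].id after the call leaves the corresponding element of groupP untouched, which is exactly the behaviour the pointer version does not have.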
You could use std::copy_if which does the same thing:
std::copy_if(groupP.cbegin(), groupP.cend(),
std::back_inserter(groupQ),
[](node* n){ return n->final; });
Since you are manipulating pointers, the elements themselves are shared, so a modification made to a node through one container is visible through the other.
Note that manipulating raw pointers like you are doing is very error prone, and you may want to use shared pointers for instance.
Edit: Adding missing std::back_inserter.

Insertion/Deletion of Doubly Linked List in constant time O(1)

a) I have seen a lot of examples of this question on Stack Overflow, but I still have trouble understanding how the list knows which node is meant when calling functions like insertAfter(Node n, Object o). If we say insert after node 2, how does the linked list know which node is node 2?
b) In previous posts on Stack Overflow, it is said that there is a pointer to the node you want to insert after or before, and that is why we get constant-time operations. Does that mean that, just as we have head and tail pointers in a linked list, we also have a pointer to each node?
I would really appreciate help in understanding this.
If deletion is done by key, then your point is valid: we don't know the position of the element to be deleted, so finding the key in the list makes the time linear in the length of the list. But where deletion is described as constant time, it means deletion by address: given the node's address, one can go straight to it and delete the node in constant time.
Note: this isn't possible with a singly linked list.
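A minimal sketch of what "by address" means in practice. DNode and DList are hypothetical names; the point is that the caller keeps the pointer returned when a node is created, so the list never has to search for it.

```cpp
#include <cassert>
#include <vector>

struct DNode {
    int value;
    DNode* prev;
    DNode* next;
};

struct DList {
    DNode* head = nullptr;
    DNode* tail = nullptr;

    DList() = default;
    DList(const DList&) = delete;
    DList& operator=(const DList&) = delete;
    ~DList() { while (head != nullptr) erase(head); }

    // O(1): appends and hands the node's address back to the caller.
    DNode* pushBack(int v) {
        DNode* n = new DNode{v, tail, nullptr};
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
        return n;
    }

    // O(1): only pos and its immediate neighbour are touched.
    DNode* insertAfter(DNode* pos, int v) {
        assert(pos != nullptr);
        DNode* n = new DNode{v, pos, pos->next};
        if (pos->next != nullptr) pos->next->prev = n; else tail = n;
        pos->next = n;
        return n;
    }

    // O(1): prev and next give direct access to both neighbours.
    void erase(DNode* n) {
        assert(n != nullptr);
        if (n->prev != nullptr) n->prev->next = n->next; else head = n->next;
        if (n->next != nullptr) n->next->prev = n->prev; else tail = n->prev;
        delete n;
    }

    // O(n) helper used only to inspect the contents.
    std::vector<int> toVector() const {
        std::vector<int> out;
        for (const DNode* n = head; n != nullptr; n = n->next) out.push_back(n->value);
        return out;
    }
};
```

If the caller only knows a value or a position such as "node 2", it first has to walk the list to find the node, and that walk is the linear part.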

Does appending a list to another list in F# incur copying of underlying objects or just the pointers?

I've always thought that appending a list to another one meant copying the objects from the first list and then pointing to the appended list as described for example here.
However, in this blog post and in its comment, it says that it is only the pointers that are copied and not the underlying objects.
So what is correct?
Drawing from Snowbear's answer, a more accurate image of combining two lists (than the one presented in the first referred article in the question) would be as shown below.
let FIRST = [1;2;3]
let SECOND = [4;5;6]
let COMBINED = FIRST @ SECOND
In the functional world, lists are immutable. This means that node sharing is possible because the original lists will never change. Because the first list ends with the empty list, its nodes must be copied in order to point its last node to the second list.
If you mean this statement, then the answer seems to be pretty simple. The author of the first article is talking about list node elements when he says nodes. A node element is not the same as the list item itself. Take a look at the pictures in the first article. There are arrows going from every element to the next node. These arrows are pointers. But the integer type (which is put into the list) has no such pointers. There is probably some list node type which wraps those integers and stores the pointers. When the author says that nodes must be copied, he is talking about these wrappers being copied. The underlying objects (if they were not value types, as they are in this case) would not be cloned; the new wrappers would point to the same objects as before.
F# lists hold references (not to be confused with F#'s ref) to their elements; list operations copy those references (pointers), but not the elements themselves.
There are two ways you might append items to an existing list, which is why there seems to be a discrepancy between the articles (though they both look to be correct):
Cons operator (::): The cons operator prepends a single item to an F# list, producing a new list. It's very fast (O(1)), since it only needs to call a very simple constructor to produce the new list.
Append operator (@): The append operator appends two F# lists together, producing a new list. It's not as fast (O(n)) because in order for the elements of the combined list to be ordered correctly, it needs to traverse the entire list on the left-hand side of the operator (so copying can start at the first element of that list). You'll still see this used in production if the list on the left-hand side is known to be very small, but in general you'll get much better performance from using ::.
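The difference between copying nodes and copying elements can be modelled outside F#. The following C++ sketch is only an analogy, not how the F# runtime is implemented: Cell, Elem, cons and append are hypothetical names, and shared_ptr stands in for garbage-collected references.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Cell;
using Elem = std::shared_ptr<const int>;   // reference to an element
using List = std::shared_ptr<const Cell>;  // nullptr plays the role of []

// One immutable list node: a reference to an element plus the rest of the list.
struct Cell {
    Elem head;
    List tail;
};

// Models ::, allocating exactly one new node in O(1).
List cons(Elem x, List tail) {
    assert(x != nullptr);
    return std::make_shared<Cell>(Cell{std::move(x), std::move(tail)});
}

// Models @, allocating a new node for every node of the left list while
// sharing the element references and the entire right list.
List append(const List& left, const List& right) {
    if (left == nullptr) return right;
    return cons(left->head, append(left->tail, right));
}
```

After append, the combined list's first node is a fresh allocation, but its head refers to the same element object as the left list's first node, and its tail eventually reaches the very same nodes as the right list.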

Finding corruption in a linked list

I had an interview today for a developer position and was asked an interesting technical question that I did not know the answer to. I will ask it here to see if anyone can provide me with a solution, for my curiosity. It is a multi-part question:
1) You are given a singly linked list with 100 elements (integer and a pointer to next node), find a way to detect if there is a break or corruption halfway through the linked list? You may do anything with the linked list. Note that you must do this in the list as it is iterating and this is verification before you realise that the list has any issues with it.
Assuming that the break in the linked list is at the 50th element, the integer or even the pointer to the next node (51st element) may be pointing to a garbage value which is not necessarily an invalid address.
2) Note that if there is a corruption in the linked list, how would you minimize data loss?
To test for a "corrupted" integer, you would need to know what the range of valid values is. Otherwise, there is no way to determine that the value in any given (signed) integer is invalid. So, assuming you have a validity test for the int, you would always check that value before iterating to the next element.
Testing for a corrupted pointer is trickier. For a start, check the value of the pointer to the next element before you attempt to dereference it, and ensure it is a valid heap address; that will avoid a segmentation fault. The next thing is to validate that what the pointer points at is in fact a valid linked list node, which is trickier still. Perhaps dereference the pointer into a list element class/struct and test the validity of its int and "next" pointer; if they are also good, then you can be fairly sure the previous node was good too.
On 2), having discovered a corrupted node: if the next pointer is corrupted, you should immediately set the "next" pointer of the previous node to NULL, marking it as the end of the list, and log your error, etc. If the corruption was only to the integer value and not to the "next" pointer, then you should remove that element from the list and link the previous and following nodes together instead, as there is no need to throw the rest of the list away in that case.
For the first part: overload operator new. Whenever a new node is allocated, allocate some additional space before and after the node and put some known values there. During traversal, each node can be checked to see whether it still sits between the known values.
If you at design time know that corruption may become a critical issue, you could add a "magic value" as a field into the node data structure which allows you to identify whether some data is likely to be a node or not. Or even to run through memory searching for nodes.
Or double some link information, i.e. store the address of the node after the next node in each node, such that you can recover if one link is broken.
The only problem I see is that you have to avoid segmentation faults.
If you can do anything to the linked list, what you can do is to calculate the checksum of each element and store it on the element itself. This way you will be able to detect corruption even if it's a single bit error on the element.
To minimize data loss, perhaps you can consider also storing each element's nextPtr in the previous element; that way, if the current element is corrupted, you can always find the location of the next element from the previous one.
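A rough sketch of the per-node checksum idea. CNode, seal, isIntact and intactPrefixLength are hypothetical names, and the FNV-1a-style mixing function is just one possible choice. The sketch assumes the head pointer itself can be trusted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CNode {
    int value;
    CNode* next;
    std::uint64_t check;  // checksum over value and next
};

// Mixes the value and the bits of the next pointer into a single number.
std::uint64_t checksumOf(int value, const CNode* next) {
    std::uint64_t h = 1469598103934665603ULL;
    h = (h ^ static_cast<std::uint32_t>(value)) * 1099511628211ULL;
    h = (h ^ static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(next))) *
        1099511628211ULL;
    return h;
}

// Must be called after every legitimate change to value or next.
void seal(CNode& n) {
    n.check = checksumOf(n.value, n.next);
}

bool isIntact(const CNode& n) {
    return n.check == checksumOf(n.value, n.next);
}

// Counts nodes from head up to, but not including, the first damaged one.
// A node's next pointer is followed only after that node has passed the
// check, so a garbage pointer stored in a damaged node is never dereferenced.
std::size_t intactPrefixLength(const CNode* head) {
    std::size_t count = 0;
    for (const CNode* n = head; n != nullptr; n = n->next) {
        if (!isIntact(*n)) break;
        ++count;
    }
    return count;
}
```

This detects accidental damage to a node's own fields; it says nothing about where the rest of the list lives once a next pointer is lost, which is what the redundant-link suggestions address.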
This is an easy question, and there are several possible answers. Each trades off robustness with efficiency. Since increased robustness is a prerequisite of the question being asked, there are solutions available which sacrifice both time (list traversal speed, as well as speed of insertion and speed of deletion of nodes) or alternately space (extra info stored with each node). Now the problem has been stated that this is a fixed list of length 100, in which case the data structure of a linked list is most inappropriate. Why not make the puzzle a little more challenging and say that the size of the list is not known a priori?
Since the number of elements (100) is known, the 100th node must contain a null pointer. If it does, the list is valid with reasonably good probability (this cannot be guaranteed if, for example, the 99th node is corrupt and points to some memory location containing all zeros). Otherwise, there is some problem (this can be reported as a fact).
upd: Also, it might be possible to look, at every step, at some of the structures delete would use if given the pointer, but since using delete itself is not safe in any sense, this is going to be implementation-specific.

copy linked list with random link in each node, each node has a variable,which randomly points to another node in the list

An interview question:
copy linked list with random link in each node, each node has a variable,which randomly
points to another node in the list.
My ideas:
Iterate the list, copy each node and its pointed nodes by its variable and add a sentinel at the end and then do the same thing for each node.
In the new list, for each node i, separate each list ended with sentinel and use i's variable points to it.
It is not efficient in space. It is O(n^2) in time and space.
Better ideas?
I think you can pinch ideas from e.g. Java Serialisation, which recognises when pointers point to nodes already serialised, so that it can serialise (and then deserialise) arbitrary structures reasonably efficiently. The spec, which you can download via a link at http://docs.oracle.com/javase/1.4.2/docs/guide/serialization/index.html, says that this is done but doesn't say exactly how - I suspect hash tables.
I think copying is a lot like this - you don't even need to know that some of the pointers make up a linked list. You could use a depth first search to traverse the graph formed by the nodes and their pointers, putting the location of each node in a hash table as you go, with the value the copied node. If the node is already present you don't need to do anything except make the pointer in the copied node point to the copy of the node pointed to as given by the hash table. If the node is not already present, create the copy, put the node in the hash table with the address of the copy as its value, and recursively copy the information in the node, and its pointers, into the newly made copy.
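Here is a short sketch of the hash table idea in C++. It uses two passes along the next pointers rather than the recursive depth-first search described above, which is simpler when the structure is known to be a list. RNode and copyList are hypothetical names, and the sketch assumes every random pointer is either null or points at a node of the same list.

```cpp
#include <cassert>
#include <unordered_map>

struct RNode {
    int value;
    RNode* next = nullptr;
    RNode* random = nullptr;
    explicit RNode(int v) : value(v) {}
};

// Returns a deep copy of the list in O(n) expected time and O(n) extra space.
// The caller owns the returned nodes.
RNode* copyList(const RNode* head) {
    std::unordered_map<const RNode*, RNode*> copyOf;  // original -> its copy

    // Pass 1: create one copy per original node and remember the pairing.
    for (const RNode* n = head; n != nullptr; n = n->next) {
        copyOf[n] = new RNode(n->value);
    }

    // Maps an original node to its copy; null or unknown pointers map to nullptr.
    auto lookup = [&copyOf](const RNode* original) -> RNode* {
        auto it = copyOf.find(original);
        return it == copyOf.end() ? nullptr : it->second;
    };

    // Pass 2: translate both pointers of every node through the table.
    for (const RNode* n = head; n != nullptr; n = n->next) {
        RNode* c = lookup(n);
        assert(c != nullptr);  // every node visited in pass 1 has a copy
        c->next = lookup(n->next);
        c->random = lookup(n->random);
    }
    return lookup(head);
}
```

Compared with the O(n^2) approach in the question, each node is copied exactly once and each pointer is resolved with a single table lookup.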
This is a typical interview question. You can find many answers by using Google. This is a link I think is good for understanding. But please read the comments too, there are some errors in the main body: Copy a linked list with next and arbit pointer