I need to hold a list of nodes (strings) and have a "current" node (or more generally current nodes).
I would use a vector with the current nodes specified by indexes, but sometimes a node is removed from the list, which silently invalidates the indexes, and that is not acceptable.
So, I want to represent it as a reference counted digraph: the list is a path in the digraph (with only a single parent per node). When I add a new current node, I add a new parent and reference a node in the list increasing its reference count.
Now I am unsure which data structure to use in Rust:
pub struct DigraphNode<T> {
pub next: DigraphNodeRef<T>, // I made it `pub` to be able to use `item.next.next()` to remove an item from the middle.
data: T,
}
pub struct DigraphNodeRef<T> {
rc: Rc<DigraphNode<T>>,
}
This seems like the right solution at first, but how do I then represent the empty digraph (an empty list of strings)? (An Rc can't hold zero strong references.)
Let's assume that for two different inputs ("tomas", "peter") a hash function yields the same key (3).
Please correct my assumption about how separate chaining works under the hood:
In a hash table, index 3 contains a pointer to the head of a linked list. The list contains two nodes, implemented for example like this:
struct node{
const char* value_name; // the key, e.g. "peter"
int value;
node* ptr_to_next_node;
};
The search mechanism remembers the input name ("peter") and compares it with value_name in each node. When a node's value_name equals "peter", the mechanism returns that node's value.
Is this correct? I've learned that an ordinary linked list doesn't contain the name of the node, so I don't see how I could find the corresponding value for different names ("tomas", "peter") in a list with nodes like this:
struct node{
int value;
node* ptr_to_next_node;
};
Yes, it's correct that this is a possible implementation of part of a hash table.
When you say an "ordinary linked list doesn't contain the name of the node", I'd expect a linked list to be generic if the implementation language allows that. In C++ it would be a template.
It would be a linked list of a certain type, and each element would hold an instance of (or a handle to) that type plus a pointer to the next element, as your second code snippet shows, except with int replaced by the type.
In this case the type would most likely be a key-value pair.
So the linked list in that case doesn't directly contain the name; it contains an object that contains the name (and the value).
This is just one possible implementation, though. There are other options.
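As a rough illustration of the key-value-pair idea, here is a minimal sketch of separate chaining in C++17. The class name ChainedTable and its put/get members are hypothetical, invented for this example, and the table never resizes. Each bucket is a std::list of (name, value) pairs, and lookup compares the stored name with the searched name, as the question describes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal chained hash table: string keys, int values.
class ChainedTable {
public:
    explicit ChainedTable(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Inserts the pair, or overwrites the value if the key is already present.
    void put(const std::string& key, int value) {
        auto& chain = buckets_[index_of(key)];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.emplace_back(key, value);
    }

    // Walks the chain, comparing each stored key with the searched key.
    std::optional<int> get(const std::string& key) const {
        const auto& chain = buckets_[index_of(key)];
        for (const auto& entry : chain) {
            if (entry.first == key) { return entry.second; }
        }
        return std::nullopt;
    }

private:
    // The hash is reduced to a bucket index in [0, bucket count).
    std::size_t index_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
};
```

Constructing the table with a single bucket forces "tomas" and "peter" into the same chain, which makes it easy to see that the stored key is what tells the two entries apart.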
Yes, basically: a table of structures, with the hash function limited to return a value between 0 and table size - 1, and that is used as an index into the table.
This gets you to the head of the chain of elements, which may hold zero or more elements.
To save time and space, the table entry is usually itself the first element of the chain. Since you specified strings as the data being stored in the hash table, the structure would usually be more like:
struct hash_table_element {
unsigned int length;
char *data_string;
struct hash_table_element *next;
};
with the string space allocated dynamically, maybe with one of the standard library functions. There are a number of ways to manage string tables that may optimize your particular use case, of course, and checking the length first can often speed up the search if you use short-circuit evaluation:
if (length == element->length &&
    memcmp(string, element->data_string, length) == 0) /* memcmp returns 0 on a match */
{
    found = TRUE;
}
This will not waste time comparing the strings unless they are the same length.
Context: I am implementing the Push-Relabel algorithm for MaxFlow in a network and want to keep track of the labels of all nodes. For each possible label (there are 2*V-1 of them) I want to have a doubly-linked list containing the nodes with that label.
So I have a vector where each entry is a list. Now I need to delete an element from one list and move it into another list in another vector entry.
In order to do so, I use a vector (whose size is equal to the number of elements) where each entry is an iterator, so I always know the position of each element.
Before implementing it on a bigger scale I wanted to check whether it works at all. So I create the two vectors, add one element to a list, store the iterator in the other vector, and try to delete that element again.
But the std::list::erase() call always gives me a segfault. Did I miss something?
int V=50;
int i=0, v=42;
vector<list<int> > B(2*V-1);
vector<list<int>::iterator> itstorage(V) ;
B[i].push_back(v);
itstorage[v]=B[i].end();
B[i].erase(itstorage[v]);
B[i].end() does not refer to the last item you pushed; it refers to the position one past it, and passing it to erase() is undefined behavior.
What you want is:
std::list<int>::iterator p = B[i].end();
--p;
Alternatively, instead of using push_back, you could use the insert member function which returns an iterator to the newly inserted item.
itstorage[v] = B[i].insert(B[i].end(), v);
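Building on the insert() suggestion, the following is a minimal sketch of the whole bucket structure. The class LabelBuckets and its members are hypothetical names, and every node is assumed to start with label 0. It uses std::list::splice rather than erase followed by insert. Since C++11, splice relinks the node without invalidating its iterator, so the stored iterator stays usable after the move.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: nodes 0..node_count-1 grouped into buckets by label.
class LabelBuckets {
public:
    LabelBuckets(std::size_t node_count, std::size_t label_count)
        : buckets_(label_count), where_(node_count), label_(node_count, 0) {
        assert(label_count > 0);
        for (std::size_t v = 0; v < node_count; ++v) {
            // insert() returns an iterator to the new element, unlike end().
            where_[v] = buckets_[0].insert(buckets_[0].end(), static_cast<int>(v));
        }
    }

    // Copying would leave the stored iterators pointing into the old lists.
    LabelBuckets(const LabelBuckets&) = delete;
    LabelBuckets& operator=(const LabelBuckets&) = delete;

    // Moves node v into the bucket for new_label in O(1).
    void relabel(std::size_t v, std::size_t new_label) {
        assert(v < where_.size() && new_label < buckets_.size());
        std::list<int>& from = buckets_[label_[v]];
        std::list<int>& to = buckets_[new_label];
        // splice relinks the node; where_[v] stays valid and now refers into `to`.
        to.splice(to.end(), from, where_[v]);
        label_[v] = new_label;
    }

    const std::list<int>& bucket(std::size_t label) const { return buckets_[label]; }

private:
    std::vector<std::list<int>> buckets_;
    std::vector<std::list<int>::iterator> where_;
    std::vector<std::size_t> label_;
};
```

For example, after LabelBuckets b(3, 5) and b.relabel(1, 4), bucket 0 holds nodes 0 and 2 while bucket 4 holds node 1.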
Why do the list/ring types in golang use the extra structs Element/Ring for the individual items and not interface{} ? I am assuming there is some benefit but I cannot see it.
Edit: I meant to ask about the API and NOT about the use of Element/Ring in the implementation. The implementation could still use a non-exported type but have the API give and take an interface{}, so why make the users go in and out of Element/Ring?
Edit2: As an example the list Back() function could be like
func (l *List) Back() interface{} {
if l.len == 0 {
return nil
}
return l.root.prev.Value
}
Where the list still uses Element internally but it would be just element (unexported) since it wouldn't return it but only return the value.
container/list is a linked list, so it's beneficial to have a List struct that can operate on the list as a whole and keep track of the beginning and end of the list.
Since it's a linked list, you want to be able to link items together and navigate from one item to the next or the previous item. That requires a struct that holds pointers to the next and the previous item and lets you navigate to those items (with the Next() and Prev() functions). The Element struct serves that purpose: it contains pointers to the next/previous item, and the actual value.
Here's how the structs are defined; they have various member functions as well.
type List struct {
root Element // sentinel list element, only &root, root.prev, and root.next are used
len int // current list length excluding (this) sentinel element
}
type Element struct {
// Next and previous pointers in the doubly-linked list of elements.
// To simplify the implementation, internally a list l is implemented
// as a ring, such that &l.root is both the next element of the last
// list element (l.Back()) and the previous element of the first list
// element (l.Front()).
next, prev *Element
// The list to which this element belongs.
list *List
// The value stored with this element.
Value interface{}
}
container/ring does not have an "extra" struct as you imply. There's only the Ring struct which links one item to the next/previous item and also holds the value. There's no start/end of a Ring, so there's no need to have a struct that operates on a ring as a whole or keeps track of the start.
type Ring struct {
next, prev *Ring
Value interface{} // for use by client; untouched by this library
}
They contain filtered or unexported fields.
Package list
File list.go:
// Package list implements a doubly linked list.
// Element is an element of a linked list.
type Element struct {
// Next and previous pointers in the doubly-linked list of elements.
// To simplify the implementation, internally a list l is implemented
// as a ring, such that &l.root is both the next element of the last
// list element (l.Back()) and the previous element of the first list
// element (l.Front()).
next, prev *Element
// The list to which this element belongs.
list *List
// The value stored with this element.
Value interface{}
}
Package ring
File ring.go:
// Package ring implements operations on circular lists.
// A Ring is an element of a circular list, or ring.
// Rings do not have a beginning or end; a pointer to any ring element
// serves as reference to the entire ring. Empty rings are represented
// as nil Ring pointers. The zero value for a Ring is a one-element
// ring with a nil Value.
//
type Ring struct {
next, prev *Ring
Value interface{} // for use by client; untouched by this library
}
Obviously, the type of Element and Ring can't be interface{} because it would make no sense. You can't have a method on an interface type.
The Go Programming Language Specification
Method declarations
A method is a function with a receiver. A method declaration binds an
identifier, the method name, to a method, and associates the method
with the receiver's base type.
MethodDecl = "func" Receiver MethodName ( Function | Signature ) .
Receiver = "(" [ identifier ] [ "*" ] BaseTypeName ")" .
BaseTypeName = identifier .
The receiver type must be of the form T or *T where T is a type name.
The type denoted by T is called the receiver base type; it must not be
a pointer or interface type and it must be declared in the same
package as the method. The method is said to be bound to the base type
and the method name is visible only within selectors for that type.
I want to maintain two related binary search trees the nodes of which hold two items of data - a page number and a time of creation/reference (the details are not that important - essentially it is two 64 bit numbers).
The first tree is ordered by page number, the second by time of creation - essentially the use case is:
Find if the page exists (is in the tree), searching by page number
If the page exists, update its reference time to now
If the page does not exist - add the page to both trees with a creation time of now
But, in the case above, if the tree has reached maximum capacity delete a page with the oldest reference time
The way I tried to do this was to search the first tree by page number - thus we get back a node that has also a record of the creation/reference time, then search the second tree for a node that has both that reference time and is that page.
The difficulty is that the reference time may not be unique (this is an absolute barrier): is there an algorithm I can implement in the node that allows me to search through the tree to find the correct node without breaking the tree code...
This is the tree code now:
template <typename NODE> NODE* redblacktree<NODE>::locatenode(NODE* v,
NODE* node) const
{
if (node == NULL)
return node;
if (v->equals(node))
return node;
if (v->lessthan(node))
return locatenode(v, node->left);
else
return locatenode(v, node->right);
}
And here is a simple (working) single search piece of code at the node end for a single indexing value:
bool PageRecord::operator==(PageRecord& pRecord) const
{
return (pageNumber == pRecord.getPageNumber());
}
bool PageRecord::operator<(PageRecord& pRecord) const
{
return (pageNumber < pRecord.getPageNumber());
}
Change the NODE data structure to allow more than one page in the node.
typedef struct node
{
int choice; // if 1 then page tree, if 0 then reference tree
vector<Page> pg; // if choice=0
Reference ref; // if choice=1
struct node* left;
struct node* right;
}NODE;
You can modify the equals and lessthan functions accordingly. The locatenode function remains the same.
I'll add something about the data structures used here. You actually don't need a tree to maintain the references. References are required only when:
If the page exists, update its reference time to now
But, in the case above, if the tree has reached maximum capacity delete a page with the oldest reference time
So this can be done using a min-heap ordered by reference time as well. The advantage is that the insert operation then costs only O(1) time.
For the first point, if the reference time has to be updated, the node moves downward in the heap. So maintain a link from the page tree to the reference heap, and keep swapping the node with its smaller child in the reference heap until the heap order is restored. O(log n) time.
For the second point, delete the first node of the heap. O(log n) time here.
And as for "If the page does not exist - add the page to both trees with a creation time of now": add the new node at the end of the heap. Its time is the newest, so it is already in a valid position. O(1) time.
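The heap idea above could look roughly like the following sketch. All names here (ReferenceHeap, add, touch, evict_oldest) are hypothetical. As a simplifying assumption, a std::unordered_map from page number to heap slot stands in for the link from the page tree into the heap. Ties in reference time are harmless because the heap never needs to search by time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical min-heap keyed on reference time, with a page -> slot index.
class ReferenceHeap {
public:
    bool contains(std::uint64_t page) const { return slot_.count(page) != 0; }
    std::size_t size() const { return heap_.size(); }

    // Adds a new page. If `time` is the newest, the sift-up loop exits at once.
    void add(std::uint64_t page, std::uint64_t time) {
        assert(!contains(page));
        heap_.push_back({time, page});
        slot_[page] = heap_.size() - 1;
        sift_up(heap_.size() - 1);
    }

    // Sets the page's reference time to a later value and sinks it: O(log n).
    void touch(std::uint64_t page, std::uint64_t time) {
        std::size_t i = slot_.at(page);
        assert(time >= heap_[i].time);
        heap_[i].time = time;
        sift_down(i);
    }

    // Removes and returns the page with the oldest reference time: O(log n).
    std::uint64_t evict_oldest() {
        assert(!heap_.empty());
        std::uint64_t page = heap_.front().page;
        swap_slots(0, heap_.size() - 1);
        heap_.pop_back();
        slot_.erase(page);
        if (!heap_.empty()) { sift_down(0); }
        return page;
    }

private:
    struct Entry { std::uint64_t time; std::uint64_t page; };

    // Swaps two heap entries and keeps the page -> slot index in sync.
    void swap_slots(std::size_t a, std::size_t b) {
        std::swap(heap_[a], heap_[b]);
        slot_[heap_[a].page] = a;
        slot_[heap_[b].page] = b;
    }
    void sift_up(std::size_t i) {
        while (i > 0 && heap_[i].time < heap_[(i - 1) / 2].time) {
            swap_slots(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }
    void sift_down(std::size_t i) {
        for (;;) {
            std::size_t smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap_.size() && heap_[l].time < heap_[smallest].time) { smallest = l; }
            if (r < heap_.size() && heap_[r].time < heap_[smallest].time) { smallest = r; }
            if (smallest == i) { return; }
            swap_slots(i, smallest);
            i = smallest;
        }
    }

    std::vector<Entry> heap_;
    std::unordered_map<std::uint64_t, std::size_t> slot_;
};
```

With pages 10, 20 and 30 added at times 1, 2 and 3, calling touch(10, 4) makes page 20 the next one evicted.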
In this question I'm not asking how to do it but HOW IS IT DONE.
I'm trying (as an exercise) to implement a simple map, and although I do not have problems with implementing links and their behavior (how to find the next place to insert a new link etc.), I'm stuck on the problem of how to implement iterating over a map. When you think about it and look at std::map, this map is able to return begin and end iterators. How? Especially end?
If a map is a tree, how can you say which branch of this map is the end? I just do not understand it. And how do you iterate over a map? Starting from the top of the tree, and then what? Go and list everything on the left? But those nodes on the left also have links to the right. I really don't know. I would be really glad if someone could explain it to me or give me a link so I could read about it.
A map is implemented using a binary search tree. To meet the complexity requirements it has to be a self-balancing tree, so a red-black tree is usually used, but that doesn't affect how you iterate over the tree.
To read the elements out of a binary search tree in order from least to greatest, you need to perform an in-order traversal of the tree. The recursive implementation is quite simple but isn't really practical for use in an iterator (the iterator would have to maintain a stack internally, which would make it relatively expensive to copy).
You can implement an iterative in-order traversal. This is an implementation taken from a library of tree containers I wrote a while ago. NodePointerT is a pointer to a node, where the node has left_, right_, and parent_ pointers of type NodePointerT.
// Gets the next node in an in-order traversal of the tree; returns null
// when the in-order traversal has ended
template <typename NodePointerT>
NodePointerT next_inorder_node(NodePointerT n)
{
if (!n) { return n; }
// If the node has a right child, we traverse the link to that child
// then traverse as far to the left as we can:
if (n->right_)
{
n = n->right_;
while (n->left_) { n = n->left_; }
}
// If the node is the left node of its parent, the next node is its
// parent node:
else if (n->parent_ && n == n->parent_->left_)
{
n = n->parent_;
}
// Otherwise, this node is the furthest right in its subtree; we
// traverse up through its parents until we find a parent that was a
// left child of a node. The next node is that node's parent. If
// we have reached the end, this will set node to null:
else
{
while (n->parent_ && n == n->parent_->right_) { n = n->parent_; }
n = n->parent_;
}
return n;
}
To find the first node for the begin iterator, you need to find the leftmost node in the tree. Starting at the root node, follow the left child pointer until you encounter a node that has no left child: this is the first node.
For an end iterator, you can set the node pointer to point to the root node or to the last node in the tree and then keep a flag in the iterator indicating that it is an end iterator (is_end_ or something like that).
The representation of your map's iterator is totally up to you. I think it should suffice to use a single wrapped pointer to a node. E.g.:
template <typename T>
struct mymapiterator
{
typename mymap<T>::node * n;
};
Or something similar. Now, mymap::begin() could return an iterator instance whose n points to the leftmost node. mymap::end() could return an instance with n pointing to the root, or to some other special node from which it is still possible to get back to the rightmost node, so that decrementing the end iterator works for bidirectional iteration.
The operations for moving between nodes (operator++(), operator--(), etc.) traverse the tree from smaller to larger values or vice versa. You have probably already implemented a similar traversal as part of insertion.
For sorting purposes, a map behaves like a sorted key/value container (a.k.a. a dictionary); you can think of it as a sorted collection of key/value pairs, and this is exactly what you get when you query for an iterator. Observe:
map<string, int> my_map;
my_map["Hello"] = 1;
my_map["world"] = 2;
for (map<string, int>::const_iterator i = my_map.begin(); i != my_map.end(); ++i)
cout << i->first << ": " << i->second << endl;
Just like any other iterator type, the map iterator behaves like a pointer to a collection element, and for map, this is a std::pair, where first maps to the key and second maps to the value.
std::map uses a binary search internally when you call its find() method or use operator[], but you shouldn't ever need to access the tree representation directly.
One big trick you may be missing is that the end() iterator does not need to point to anything. It can be NULL or any other special value.
The ++ operator sets an iterator to the same special value when it goes past the end of the map. Then everything works.
To implement ++ you might need to keep next/prev pointers in each node, or you could walk back up the tree to find the next node by comparing the node you just left to the parent's right-most node to see if you need to walk to that parent's node, etc.
Don't forget that the iterators to a map should stay valid during insert/erase operations (as long as you didn't erase the iterator's current node).
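To make the "end() is just a special value" idea concrete, here is a minimal sketch of a forward iterator over an unbalanced binary search tree with parent links, written for C++14. IntTree and its members are hypothetical names, and no rebalancing is done. A real map would use a self-balancing tree, but the iteration logic would be the same. The end iterator holds a null pointer, and operator++ walks back up through the parents to find the in-order successor.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical unbalanced BST of ints with parent links (no rebalancing).
class IntTree {
    struct Node {
        int key;
        Node* parent;
        std::unique_ptr<Node> left, right;
    };

public:
    class iterator {
    public:
        explicit iterator(const Node* n) : n_(n) {}
        int operator*() const { return n_->key; }
        bool operator==(const iterator& o) const { return n_ == o.n_; }
        bool operator!=(const iterator& o) const { return n_ != o.n_; }

        // In-order successor; becomes null (the end value) after the last node.
        iterator& operator++() {
            if (n_->right) {
                // Leftmost node of the right subtree.
                n_ = n_->right.get();
                while (n_->left) { n_ = n_->left.get(); }
            } else {
                // Climb until we arrive at a parent from its left child.
                const Node* child = n_;
                n_ = n_->parent;
                while (n_ && child == n_->right.get()) {
                    child = n_;
                    n_ = n_->parent;
                }
            }
            return *this;
        }

    private:
        const Node* n_;
    };

    void insert(int key) {
        Node* parent = nullptr;
        std::unique_ptr<Node>* link = &root_;
        while (*link) {
            if (key == (*link)->key) { return; }  // keys are unique, as in std::map
            parent = link->get();
            link = key < parent->key ? &parent->left : &parent->right;
        }
        *link = std::make_unique<Node>(Node{key, parent, nullptr, nullptr});
    }

    // begin() is the leftmost node; for an empty tree it equals end().
    iterator begin() const {
        const Node* n = root_.get();
        while (n && n->left) { n = n->left.get(); }
        return iterator(n);
    }
    iterator end() const { return iterator(nullptr); }

private:
    std::unique_ptr<Node> root_;
};
```

Because begin() and end() are provided, a range-based for loop over an IntTree visits the keys in ascending order. The trade-off of a null end value is that the end iterator cannot be decremented. Supporting that would need a flag or a sentinel node, as described in the other answers.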