generic "out of bounds", "past end" iterator

generic "out of bounds", "past end" iterator - c++

In my application I have a (unbalanced) tree datastructure. This tree is simply made of "std::list of std::lists" - node holds an arbitrary "list" of sub-nodes. Using this instead of a single list made the rest of the application a lot easier. (The program is about changing moving nodes from one tree to another tree / another part in the tree / to it's own tree).
Now an obvious task is to find a subtree inside a "tree". For non-recursive searches it is simple enough:
subtree_iterator find_subtree(const N& n) {
auto iter(subtrees.begin());
auto e(subtrees.end());
while (iter != e) {
if ((*iter)->name == n) {
return iter;
}
++iter;
}
return e;
}
Which returns an iterator to the subtree position. The problem however starts when I try to implement a multi-level search. Ie, I wish to search for hello.world.test where the dots mark a new level.
Searching worked alright
subtree_iterator find_subtree(const pTree_type& otree, std::string identify) const {
pTree_type tree(otree);
boost::char_separator<char> sep(".");
boost::tokenizer<boost::char_separator<char> > tokens(identify, sep);
auto token_iter(tokens.begin());
auto token_end(tokens.end());
subtree_iterator subtree_iter;
for (auto token_iter(tokens.begin()); token_iter != token_end; ++token_iter) {
std::string subtree_string(*token_iter);
subtree_iter = tree->find_subtree_if(subtree_string);
if (subtree_iter == tree->subtree_end()) {
return otree->subtree_end()
} else {
tree = *subtree_iter;
}
}
return subtree_iter;
}
On first glace it seemed to work "correct", however when I try to use it, it fails. Using it would be like
auto tIn(find_subtree(ProjectTree, "hello.world.test"));
if (tIn != ProjectTree->subtree_end()) {
//rest
}
however that gives a debug assertion error "list iterators not compatible". This isn't too weird: I'm comparing a iterators from different lists to each other. However I could I implement such a thing? My "backup" option would be to return a std::pair<bool,iterator> where the boolean part determines if the tree actually exists. Is there another method, short of making the whole tree single list?

You should not work on iterators internaly. Use nodes instead.
template <typename T>
struct Node {
T item;
Node<T>* next;
};
Then encapsulate your Node in an iterator facade like this :
template<typename T>
class iterator {
private:
Node<T>* node;
public:
...
};
Then use a generic invalid node (when node is nullptr) that is returned whenever end() is reached or returned.
Note that what i suggest is a single linked list (not double linked list as the standard one). this is because you can't go back from an invalid generic end() iterator that point to an invalid null node.
If you don't use iterator operator--() in your algorithms this should be fine.

std::vector<list_iterator> stack to traverse? Where the .back() of the stack is the only one allowed to be equal to end() of the previous one, and .front() is an iterator to the root list?

Related

The proper way to increment my binary tree

Here is the class I've created
#include <memory>
template <typename T>
class binary_tree {
private:
T t_data;
std::unique_ptr<binary_tree<T>> t_left, t_right;
class binary_tree_iterator { // -----------------------
private:
T data;
public:
binary_tree_iterator(T d) : data(d) {} // Iterator class
T& operator*() {return data;}
binary_tree_iterator& operator++() {} // <--------- ??????
};
// ------------------------
public:
binary_tree(T d) : t_data(d), t_left(nullptr), t_right(nullptr)
{}
void insert(T data) {
if(data <= t_data) {
if(t_left == nullptr) {
t_left = std::unique_ptr<binary_tree<T>>(new binary_tree<T>(data));
} else {
t_left->insert(data);
}
} else {
if(t_right == nullptr)
t_right = std::unique_ptr<binary_tree<T>>(new binary_tree<T>(data));
else
t_right->insert(data);
}
}
const T data() const {
return t_data;
}
const std::unique_ptr<binary_tree<T>>& left() const {
return t_left;
}
const std::unique_ptr<binary_tree<T>>& right() const {
return t_right;
}
binary_tree_iterator begin() {
if(t_left == nullptr) {
return binary_tree_iterator(t_data);
} else {
return t_left->begin();
}
}
binary_tree_iterator end() {
if(t_right == nullptr) {
return binary_tree_iterator(t_data);
} else {
return t_right->end();
}
}
};
I've declared my iterator class inside of my container class. This may have been a mistake but either way I'm not sure how to define my overloaded increment function. Once I've found begin() I've lost my way back. It seems like unique_ptr() is designed for one way pointing. Assuming I have to use unique_ptr in this fashion, is there some work around here? I've thought about giving each instance of binary_tree a head member that points back from whence it came, but each node should only be accessible from the node above it. I make some sort of index but that seems to completely defeat the purpose of this container type. I'm solving exercise so I'm restricted to using the unique_ptr.

You defined your iterator as containing the data value in your tree.
This is not what iterators are all about. Iterators do not contain the value they're referencing, but rather a reference (in the common meaning of the word, and not a C++ term) to it, typically a pointer.
Of course you can't figure out what to do with ++. For your iterator, it is natural to expect that the ++ operator will advance the iterator to the next node in your tree, but since the iterator does not contain a pointer to anything, you have nothing to advance there, and run into a mental block.
You will need to redesign your iterator so that it contains a pointer to your binary_tree; its * overload dereferences; and the ++ advances to the next element in your binary tree, which it will then be able to do, using its pointer.
At this point you will run into another mental block. Iterating through an entire binary tree requires, at some point, to back up to parent nodes in the tree. After all, after recursing into the left part of the binary tree, at some point, after iterating through the binary tree you will need to, somehow, in some way, wind up in the right part of the binary tree. However, as designed, your binary_tree has no means of navigating to any node's parent. That's another design flaw you will need to address, in some fashion.
It is possible, I suppose, to implement this entire backtracking in the iterator itself, having the iterator record each node its visited, so it can back up to it, when needed. But iterators are supposed to be lightweight objects, barely more than a pointer themselves, and not a full blown data structure that implements complicated operations.
In summary, you have several holes in the design of your binary tree that you will need to address, before you can implement an effective iterator for it.

Using Hashtables to get random access to linked list

One of major drawbacks to linked lists is that the access time to elements is linear. Hashtables, on the other hand, have access in constant time. However, linked lists have constant insertion and deletion time given an adjacent node in the list. I am trying to construct a FIFO datastructure with constant access time, and constant insertion/deletion time. I came up with the following code:
unordered_map<string key, T*> hashTable;
list<T> linkedList;
T Foo;
linkedList.push_front(Foo);
hashTable.insert(pair<string, T*>("A", &Foo);
T Bar;
linkedList.push_front(Bar);
hashTable.insert(pair<string, T*>("B", &Bar);
However, this code feels like it is really dangerous. The idea was that I could use the hashtable to access any given element in constant time, and since insertion always occurs at the start of the list, and deletion from the end of the list, I can insert and delete elements in constant time. Is there anything inherently poor about the above code? If I wanted to instead store pointers to the nodes in the linkedList to have constant insertion/deletion time from any node would I just store list< T >::iterator*?

If you build a lookup table you will end up with 2 data structures holding the same data which does not make any sense.
The best thing you can do is making your linked list ordered and build a sparse lookup table in some way to select the start node to search in order to amortize some run time.

It is unclear how you want to access the list elements. From the code you provided, I assume you want the first pushed node as "A", the second pushed node as "B", etc. Then my question is what happens when we delete the first pushed node? Does the second pushed node become "A"?
If the node identifiers don't change when updating the list, your approach seems alright.
If the node identifiers has to change on list update, then here is a lightweight and limited approach:
const int MAX_ELEMENTS = 3;
vector<int> arr(MAX_ELEMENTS + 1);
int head = 0, tail = 0;
int get_size() {
if (head > tail)
return tail + ((int)arr.size() - head);
return tail - head;
}
void push(int value) {
if (get_size() == MAX_ELEMENTS) {
// TODO: handle push to full queue
cout << "FULL QUEUE\n";
return;
}
arr[tail++] = value;
if (tail == (int)arr.size()) {
tail = 0;
}
}
int pop() {
if (get_size() == 0) {
// TODO: handle access to empty queue
cout << "EMPTY QUEUE\n";
return -1;
}
int result = arr[head++];
if (head == (int)arr.size()) {
head = 0;
}
return result;
}
int get_item_at(int id) {
if (id >= get_size()) {
// TODO: index out of range
cout << "INDEX OUT OF RANGE\n";
return -1;
}
int actual_id = head + id;
if (actual_id >= (int)arr.size()) {
actual_id -= (int)arr.size();
}
return arr[actual_id];
}
The above approach will keep indices up-to-date (eg. get_item_at(0) will always return the first node in the queue). You can map ids to any suitable id you want like "A" -> 0, "B" -> 1, etc. The limitation of this solution is that you won't be able to store more than MAX_ELEMENTS in the queue.
If I wanted to instead store pointers to the nodes in the linkedList to have constant insertion/deletion time from any node would I just store list< T >::iterator*?
If identifiers must change with insertion/deletion, then it is going to take O(n) time anyways.

This idea makes sense, especially for large lists where order of insertion matters but things might be removed in the middle. What you want to do is create your own data structure:
template<typename T> class constantAccessLinkedList {
public:
void insert(const T& val) {
mData.push_back(val); //insert into rear of list, O(1).
mLookupTable.insert({val, mData.back()}); //insert iterator into map. O(1).
}
std::list<T>::iterator find(const T& val) {
return mLookupTable[val]; //O(1) lookup time for list member.
}
void delete(const T& val) {
auto iter = mLookupTable.find(val); //O(1)get list iterator
mLookupTable.erase(val); //O(1)remove from LUT.
mData.erase(iter); //O(1) erase from list.
}
private:
std::list<T> mData;
std::unordered_map<T, std::list<T>::iterator> mLookupTable;
};
The reason you can do this with list and unordered_map is because list::iterators are not invalidated on modifying the underlying container.
This will keep your data in order in the list, but will provide constant access time to iterators, and allow you to remove elements in constant time.

Implementing an iterator in a doubly linked circular list / ring (problem with .end())

I am trying to implement the Iterator class inside my BiRing<Key, Info> class.
What I am wondering is how can I implement an iterator which goes through all elements of a ring?
I have searched for solutions on the internet, but I have not been able to find any relevant ones.
Main difficulty that I am having is how to deal with .end() method. I have read that it's implemented by returning a pseudo-next element of the last element, but I have not seen any guide on how to do that on any custom container, let alone a doubly linked ring.
Can anyone briefly explain how those pseudo-last elements would be created in a general case and in my case and is there some other way to implement the iterator so that I am able to go through all elements of the ring?
My class (simplified):
template<typename Key, typename Info>
class BiRing
{
private:
struct Node
{
// key and info, previous and next, constructor
};
Node* any;
public:
struct Iterator
{
private:
Node* iter;
public:
Iterator(Node* any) : iter(any) {}
void operator++() {
if (iter == nullptr) throw ("nullptr iterator");
iter = iter->next;
}
// *, == and != operators
};
Iterator begin() { return Iterator(any); }
//Iterator end() { return ??? }
// other methods
};
UPDATE:
Having realised that "pseudo-last element" literally means "another node which is processed differently under different contexts", I solved the problem by adding an empty node called pseudoLast.
Rest of the class BiRing and Iterator follow up on that logic. Every Iterator object has Node* ptr and Node* toSkip which, subsequently, allows me to check if an iterator object is used alongside correct BiRing object; for instance, == and != operators will throw an exception if I were to compare a biRing begin()/end() with an iterator which does not belong to that object.
In case anyone wants to check out how I have implemented it:
Pastebin link.

Circular iteration of std::list

In my application I need the ability to traverse a doubly linked list starting from any arbitrary member of the list and continuing past the end(), wrapping around to the begin() and continue until the traversal reaches where it started.
I decided to use std::list for the underlying data structure and wrote a circulate routine to achieve this. However it's showing certain unexpected behavior when it's wrapping around from end() to begin(). Here's my implementation
template <class Container, class BiDirIterator>
void circulate(Container container, BiDirIterator cursor,
std::function<void(BiDirIterator current)> processor)
{
BiDirIterator start = cursor;
do {
processor(cursor);
cursor++;
if (cursor == container.end()) {
cursor = container.begin(); // [A]
}
} while (cursor != start);
}
// ...
typedef int T;
typedef std::list<T> TList;
typedef TList::iterator TIter;
int count = 0;
TList l;
l.push_back(42);
circulate<TList, TIter>(
l, l.begin(),
[&](TIter cur) {
std::cout << *cur << std::endl;
count++;
}
);
The output is:
42
-842150451
When I step through the code I see that the line marked [A] is never reached. The cursor is never equal to container.end(). Surprisingly, invoking ++ on that cursor, takes it to container.begin() in next pass automatically. (I suppose that's specific to this STL implementation).
How can I fix this behavior?

The issue here is that you are taking Container by value. This causes a copy so the iterators returned by container.end() and container.begin() are not that same as the iterator passed to the function. Instead if you pass Container by reference then the code works correctly.
Live Example

Stuck on a Iterator Implementation of a Trie

I have to implement a homemade Trie and I'm stuck on the Iterator part. I can't seem to figure out the increment method for the trie.
I hope someone can help me clear things out.
Here's the code for the Iterator:
template <typename T> class Trie<T>::IteratorPrefixe{
friend class Trie<T>;
public:
IteratorPrefixe() : tree(NULL), currentNode(NULL), currentKey("") {};
pair<string, T*> operator*() {return make_pair(currentKey, currentNode -> element);} ;
IteratorPrefixe operator++()throw(runtime_error);
void operator=(IteratorPrefixe iter) {tree = iter.tree; currentNode = iter.currentNode; currentKey = iter.currentKey;};
bool operator==(IteratorPrefixe iter) {return tree == iter.tree && currentNode == iter.currentNode;};
bool operator!=(IteratorPrefixe iter) {return tree != iter.tree || currentNode != iter.currentNode;};
private:
Trie<T> * tree;
Trie<T> * currentNode;
string currentKey;
};
And here's my Trie:
template <typename T> class Trie {
friend class IteratorPrefixe;
public:
// Create a Trie<T> from the alphabet of nbletters, where nbletters must be
// between 1 and NBLETTERSMAX inclusively
Trie(unsigned nbletters) throw(runtime_error);
// Add a key element of which is given in the first argument and content second argument
// The content must be defined (different from NULL pointer)
// The key is to be composed of valid letters (the letters between A + inclusive and exclusive nbletters
// Eg if nblettres is 3, a, b and c are the only characters permitted;
// If nblettres is 15, only the letters between a and o inclusive are allowed.
// Returns true if the insertion was achieved, returns false otherwise.
bool addElement(string, T*) throw(runtime_error);
// Deletes a key element of which is given as an argument and returns the contents of the node removed
// The key is to be composed of letters valid (see above)
// Can also delete at the same time the reference of the ancestors, if these ancestors are no longer used.
// Returns NULL if the item has no delete
T* removeElement(string cle) throw(runtime_error);
// Find a key element of which is given as an argument and returns the associated content
// The key is to be composed of letters valid (see above)
// Returns NULL if the key does not exist
T* searchElement(string cle) throw();
// Iterator class to browse the Trie <T> in preorder mode
class IteratorPrefixe;
// Returns an iterator pointing to the first element
IteratorPrefixe pbegin() throw(runtime_error);
// Returns an iterator pointing beyond the last item
IteratorPrefixe pend() throw();
private:
unsigned nbLetters;
T* element;
vector<Trie<T> *> childs;
Trie<T> * parent;
// This function removes a node and its ancestors if became unnecessary. It is essentially the same work
// as deleteElement that is how to designate remove a node that is changing. Moreover, unlike
// deleteElement, it does not return any information on the node removed.
void remove(Trie<T> * node) throw();
// This function is seeking a node based on a given key. It is essentially the same work
// searchElement but that returns a reference to the node found (or null if the node does not exist)
// The key is to be composed of letters valid (see above)
Trie<T>* search(string key) throw(runtime_error);
};

I'm glad to see Tries are still taught, they're an important data structure that is often neglected.
There may be a design problem in your code since you should probably have a Trie class and a Node class. The way you wrote it it looks like each node in your Trie is it's own trie, which can work, but will make some things complicated.
It's not really clear from your question what it is that you are having the problem with: figuring the order, or figuring the actual code?
From the name of the iterator, it sounds like it would have to work in prefix order. Since your trie stores words and its child nodes are organized by letters, then you are essentially expected to go over all the words in an alphabetic order. Every incrementation will bring you to the next word.
THe invariant about your iterator is that at any point (as long as it is valid), it should be pointing at a node with a "terminator character" for a valid word. Figuring that word merely involves scanning upwards through the parent chain till you find your entire string. Moving to the next word means doing a DFS search: go up once, scan for links in later "brothers", see if you find a word, if not recursively go up, etc.

You may want to see my modified trie implementations at:
jdkoftinoff's trie
Specifically, you may find the discussion I had on comp.lang.c++.moderated about implementing iterators for trie's in a STL compliant way, which is a problem since all stl containers unfortunately are forced to use std::pair<>, and the iterator therefor must contain the value instead of just a reference to the single node in the trie.

For one thing, the code shown does not actually describe a trie. Rather, it appears to be a tree containing a pair of elements in each node (T* and unsigned). You can by discipline use a tree of tuples as a trie, but it's only by convention, not enforcement. This is part of why you're having such a hard time implementing operator++.
What you need to do is have each Trie contain a left-right disjoint ADT, rather than just the raw elements. It's a layer of abstraction which is more commonly found in functional languages (e.g. Scala's Either). Unfortunately, C++'s type system isn't quite powerful enough to do something that elegant. However, there's nothing preventing you from doing this:
template <class L, class R>
class Either
{
public:
Either(L *l) : left(l), right(0)
{}
Either(R *r) : left(0), right(r)
{}
L *get_left() const
{
return left;
}
R *get_right() const
{
return right;
}
bool is_left() const
{
return left != 0;
}
bool is_right() const
{
return right != 0;
}
private:
L *left;
R *right;
};
Then your Trie's data members would be defined as follows:
private:
Either<unsigned, T*> disjoint;
vector<Trie<T> *> children; // english pluralization
Trie<T> * parent;
I'm playing fast and loose with your pointers, but you get the gist of what I'm saying. The important bit is that no given node can contain both an unsigned and a T*.
Try this, and see if that helps. I think you'll find that being able to easily determine whether you are on a leaf or a branch will help you tremendously in your attempt to iterate.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js