Implementing state-machines (ala generators) using iterators - c++

Lately I have been needing to implement small classes that generate a bunch of numbers. It would be very convenient if C++ had generators like python, but unfortunately it is not so.
So I have been thinking of how to best implement these types of objects, for easy iteration and composition. When we think of iterators over containers, they essentially only hold the index to an element, and the bulk of the information is in the container itself. This allows for several iterators to be referencing different elements within a collection at the same time.
When it come to state machines, it becomes apparent that the iterator will have to hold the whole state as several iterators need to be able to be independent. In that sense, the state machine class is more of a "builder" of these iterators which are the actual state machines.
As a toy example I have implemented the range generator (ala xrange in python), that can be used in loops:
// using range-for from c++11
for (auto i : range<int>(1, 30)) {
cout << i << endl;
}
The code can be found on my bitbucket.
That said, it is awkward to store the whole state within the iterator, since the end() iterator is created just for the sake of comparing the ending state, which wastes space if the state is a large collection of members.
Has there been anything done with simple linear state machines, and looping over them with iterators?

If you support only forward iteration, you might be able to get away with using a different type for end() then for begin(). Here's the basic idea
class iterator;
class iterator_end {
typedef ... value_type;
...
iterator& operator++ () { throw ... }
value_type operator* () { throw ... }
bool operator== (const iterator& e) const { return e == *this; }
}
class iterator {
typedef ... value_type;
..
iterator& operator++ () { ... }
value_type operator* () { ... }
bool operator== (const iterator_end& e) const { return are_we_done_yet }
}
class statemachine {
iterator begin() const { ... }
iterator_end end() const { ... }
}
I've never tries this though, so I can't guarantee that this will work. Your state-machine's iterator and const_iterator typedefs will specify a different type than end() returns, which may or may not cause trouble.
Another possibility is to go with a variation on pimpl which uses boost::optional. Put the iterator state into a separate class, and store it within a boost::optional within the iterator. Leave the state unset for the iterator returned by end(). You won't save any memory, but you avoid heap allocations (boost::optional doesn't do any, it uses placement new!) and initialization overhead.

Related

Can I define begin and end on an input iterator?

Lets say I have an input iterator type MyInputIter (which I use to traverse a tree-like structure) that satisfies the std::input_iterator concept.
Are there any reasons why I shouldn't define begin() and end() on the iterator itself?
struct MyInputIter
{
// iterator stuff omitted
auto begin() const { return *this; }
auto end() const { return MySentinel{}; }
};
Reason being that I don't have to create another type just to wrap begin and end so I can use it in a for loop:
MyInputIter iterate(TreeNode root, FilterPattern pattern)
{
return MyInputIter{ root, pattern };
}
void foo()
{
for (auto item : iterate(someRandomTreeNode, "*/*.bla"))
process(item);
}
while also being able to use it as an iterator:
std::vector<TreeNode> vec(iterate(someRandomTreeNode, "*"), MySentinel{});
Are there any reasons why I shouldn't define begin() and end() on the iterator itself?
Potential issues to consider:
Implementing those functions for the iterator may be expensive. Either because of need to traverse the structure to find them, or because of extra state stored in the iterator.
It may be confusing since it deviates from common patterns. Edit: As pointed out by 康桓瑋, there's precedent for iterators that are ranges in std::filesystem::directory_iterator, so this may not a significant issue in general. There is another consideration whether your range implementation works in an expected way.
Reason being that I don't have to create another type
As far as I can tell, you don't need to create another type. You can use:
std::ranges::subrange(MyInputIter{ root, pattern }, MySentinel{})

Should I write iterators for a class that is just a wrapper of a vector?

Should I write iterators for a class that is just a wrapper of a vector?
The only private member of my class called Record is a vector.
I want to be able to do this:
for (auto& elem : record) {
// do something with elem
}
where record is of type Record. To do this I need to implement iterators
for the Record class. But, I can also do this:
for (auto& elem : record.elems) {
// do something with elem
}
where record.elems is the vector I mentioned. But this way
I will need to make it public.
Another approach is:
for (auto& elem : record.getElems()) {
// do something with elem
}
this way I get iteration, and the member stays private.
Probably this is the best way to go.
My main question is, why would I bother to implement iterators for the Record class
when it is just a wrapper for a vector? My colleague insists I implement iterators,
and he can't even explain me why, it is just stupid.
Implementing iterators will help the encapsulation of your class. If at any point in time, you decide to replace the backing vector with another data type, you might break dependencies.
Actually, you do not need to implement the iterator. You can typedef your iterators to be the vector iterators and create inline begin() and end() methods to return the vector's iterators.
EDIT: As requested a simple example:
template <class T>
class vectorWrapper
{
typedef typename std::vector<T> backingType;
backingType v;
public:
typedef typename backingType::iterator iterator;
typedef typename backingType::const_iterator const_iterator;
iterator begin() { return v.begin(); }
const_iterator begin() const { return v.begin(); }
const_iterator cbegin() const { return v.cbegin(); } // C++11
iterator end() { return v.end(); }
const_iterator end() const { return v.end(); }
const_iterator cend() const { return v.cend(); } // C++11
};
If needed (or wanted) reverse_iterators can be added the same way. Since the methods are likely to be inlined by the compiler, there is little to no overhead in using this.
If your class should look a range, you are best off to provide access to iterators via begin() and end() members. The key advantage is that objects of the class can be used consistent with other ranges: there isn't much doubt of how the class is meant to use and the effort is small: unless you don't expect your class to be used, you should make it as easy and obvious to use. Of course, if you don't expect you class to be used why bother writing it in the first place?
Whether these need to have a different type than those of the vector depends on whether you need to impose any constraints. If the class does 't impose constraints on the elements or on their order it is sufficient to pass on the iterators. Otherwise it may be necessary to create custom iterators.

How to check if simple C++ standard library iterator can be advanced?

Suppose I have a class encapsulating std container:
class Stash
{
list<int> Data;
public:
list<int>::const_iterator GetAccess() const { return Data.begin(); }
};
It's very convenient way of forcing the user to read the data in form of the iterator. However, I can't find the way other that comparing the iterator to container.end(). So, I would like to know if there's an option to do it solely by stdlib or I have to write iterator class myself (with can_advance method, for example).
Relevant question might be this one, but it asks whether the iterator is valid, not whether it can advance. I weren't able to find any information about the latter.
You can't do this, a single iterator does not contain the information when it is at the end of the sequence its pointing into.
Normally, this is solved by either providing a range (think std::make_pair(cont.begin(), cont.end())), or providing begin() and end() methods to your class, effectively making it a range.
Iterators work in pairs: an iterator that points to the beginning of a sequence and an iterator that points past the end of the sequence. That's why all the containers have begin() and end() member functions: so you can look at the sequence of values that the container manages.
It would be far more idiomatic to change the name of GetAccess to begin and to add end. Having end() would also make it possible to apply standard algorithms to the data.
What you seem to be asking for is a "lookahead" iterator. You could write a class to "adapt" an iterator to have lookahead, where the adaptor just stays one step ahead of your code:
template<class FwdIter>
class lookahead_iterator
{
public:
lookahead_iterator(const FwdIter& begin): cur_iter(begin), next_iter(++begin) {}
operator FwdIter() const { return cur_iter; }
lookahead_iterator<FwdIter>& operator ++() { cur_iter = next_iter++; return *this; }
// Other methods as needed.
private:
FwdIter cur_iter;
FwdIter next_iter;
};
Needless to say, this gets a lot more complicated if you need more than a forward iterator.

Expose public view of privately scoped Boost.BiMap iterator

I have a privately scoped Boost.BiMap in a class, and I would like to export a public view of part of this map. I have two questions about the following code:
class Object {
typedef bimap<
unordered_set_of<Point>,
unordered_multiset_of<Value>
> PointMap;
PointMap point_map;
public:
??? GetPoints(Value v) {
...
}
The first question is if my method of iteration to get the Point's associated with a Value is correct. Below is the code I'm using to iterate over the points. My question is if I am iterating correctly because I found that I had to include the it->first == value condition, and wasn't sure if this was required given a better interface that I may not know about.
PointMap::right_const_iterator it;
it = point_map.right.find(value);
while (it != point_map.right.end() && it->first == val) {
/* do stuff */
}
The second question is what is the best way to provide a public view of the GetPoints (the ??? return type above) without exposing the bimap iterator because it seems that the caller would have to know about point_map.right.end(). Any efficient structure such as a list of references or a set would work, but I'm a bit lost on how to create the collection.
Thanks!
The first question:
Since you are using the unordered_multiset_of collection type for the right side of your bimap type, it means that it will have an interface compatible with std::unordered_multimap. std::unordered_multimap has the member function equal_range(const Key& key) which returns a std::pair of iterators, one pointing to the first element that has the desired key and one that points to one past the end of the range of elements that have the same key. Using that you can iterate over the range with the matching key without comparing the key to the value in the iteration condition.
See http://www.boost.org/doc/libs/1_41_0/libs/bimap/doc/html/boost_bimap/the_tutorial/controlling_collection_types.html and http://en.cppreference.com/w/cpp/container/unordered_multimap/equal_range for references.
The second question:
Constructing a list or other actual container of pointers or references to the elements with the matching values and returning that is inefficient since it's always going to require O(n) space, whereas just letting the user iterate over the range in the original bimap only requires returning two iterators, which only require O(1) memory.
You can either write a member function that returns the iterators directly, e.g.
typedef PointMap::right_const_iterator match_iterator;
std::pair<match_iterator, match_iterator> GetPoints(Value v) {
return point_map.right.equal_range(v);
}
or you can write a proxy class that presents a container-like interface by having begin() and end() member functions returning those two iterators, and have your GetPoints() member function return an object of that type:
class MatchList {
typedef PointMap::right_const_iterator iterator;
std::pair<iterator, iterator> m_iters;
public:
MatchList(std::pair<iterator, iterator> const& p) : m_iters(p) {}
MatchList(MatchList const&) = delete;
MatchList(MatchList&&) = delete;
MatchList& operator=(MatchList const&) = delete;
iterator begin() { return m_iters.first; }
iterator end() { return m_iters.second; }
};
It's a good idea to make it uncopyable, unmovable and unassignable (like I've done above by deleting the relevant member functions) since the user may otherwise keep a copy of the proxy class and try to access it later when the iterators could be invalidated.
The first way means writing less code, the second means presenting a more common interface to the user (and allows for hiding more stuff in the proxy class if you need to modify the implementation later).

STL iterator as return value

I have class A, that contains std::vector and I would like to give an access to the vector from outside class A.
The first thing that came to my mind is to make a get function that returns iterator to the vector, but walk through the vector I will need two iterators (begin and end).
I was wondering is there any way (technique or patters) to iterate whole vector with only one iterator? Or maybe some other way to get access to vector, of course without using vector as return value :)
Why not add both a begin() and end() function to your class that simply return v.begin(); and return v.end();? (Assuming v is your vector.)
class MyVectorWrapper
{
public:
// Give yourself the freedom to change underlying types later on:
typedef vector<int>::const_iterator const_iterator;
// Since you only want to read, only provide the const versions of begin/end:
const_iterator begin() const { return v.begin(); }
const_iterator end() const { return v.end(); }
private:
// the underlying vector:
vector<int> v;
}
Then you could iterate through your class like any other:
MyVectorWrapper myV;
for (MyVectorWrapper::const_iterator iter = myV.begin(); iter != myV.end(); ++i)
{
// do something with iter:
}
It seems potentially unwise to give that much access to the vector, but if you're going to why not just return a pointer or reference to the vector? (Returning the vector itself is going to be potentially expensive.) Alternately, return a std::pair<> of iterators.
An iterator is just an indicator for one object; for example, a pointer can be a perfectly good random-access iterator. It doesn't carry information about what sort of container the object is in.
The STL introduced the idiom of using two iterators for describing a range. Since then begin() and end() are the getters. The advantage of this is that a simple pair of pointers into a C array can be used as perfect iterators. The disadvantage is that you need to carry around two objects. C++1x will, AFAIK, introduce a range concept to somewhat ease the latter.
It sounds like what you need to accomplish will be potential violation of the so called Law of Demeter. This law is often an overkill, but oftentimes it is reasonable to obey it. Although you may grant read-only access by giving out only const-iterators you are still at risk that your iterators will get invalidated (easily with vectors) without third party knowing it. As you can never predict how your program will evolve over-time it is better to avoid providing features that can be misused.
Common motto: make your class interface easy to use and hard to misuse. The second part is important for overall product quality.
More:
template <typename T,typename A = std::allocator<T>>
class yetAnotherVector
{
public:
typedef std::_Vector_iterator<T, A> iterator;
protected:
std::vector<T,A> mV;
public:
void add(const T& t)
{
if(!isExist(t))mV.push_back(t);
}
bool isExist(const T& t)
{
std::vector<T,A>::iterator itr;
for(itr = mV.begin(); itr != mV.end(); ++itr)
{
if((*itr) == t)
{
return true;
}
}
return false;
}
iterator begin(void){return mV.begin();}
iterator end(void){return mV.end();}
};