C++/STL public iterators for deeply nested private data - c++

Consider the below class definition featuring deeply nested private data.
template <typename T, typename U, typename V>
class NestedData {
private:
typedef std::vector<V> L3;
typedef std::map<U, L3> L2;
typedef std::map<T, L2> L1;
L1 inner_data;
};
Suppose we wish to add a public interface to iterate over each of the three levels. That is, we want to allow the client to build three nested loops that iterate over each value of type V, knowing the associated values of types U and T. We could add the below methods:
L1::const_iterator begin() const { return inner_data.begin(); }
L1::const_iterator end() const { return inner_data.end(); }
In this case, the results from the iterator would be pairs, with the right side being referenced to type L2. Although the reference is const, it exposes the full API of the std::map type, which is undesirable for a public interface. A cleaner solution would offer only a wrapper to makes it possible to iterate over L2. However, I have not seen any standard idioms for this kind of encapsulation of STL containers, and even the best case, implementation of a solution seems to create a high degree of code bloat.
What would be a common solution to this issue that plays well with general C++ conventions?

Just expose the map itself through const getter. Exposing the iterator still binds your users to the underlying datatype, but it's crippling them - for example, they can't use find on it - so there is a drawback without any benefit.
If you can redesign your class in such a way that inner map fully becomes an implementation detail, and clients would never need to see it - than you will remove iterators and setters, and expose some meaningful functions instead.

I would create a custom iterator that would provide the T, U, V values as you iterate. This would completely insulate your user from the implementation details of your data. If you truly need nested type logic. Then I would use a custom iterator that returned an adapter object that would return the final iterator to the list.

Related

How to hide my inner collection but allow the user iterate over him ?

I have many class that has inside list<cats> , and i want to allow users that use my class to iterate over the cats i have, so i give them access to a const iterator by writing the following function :
list<cats>::const_iterator GetIterator()
{
//return the iterator
}
Now i want to change my implementation of the class and use vector instead of list, so i need to return a const iterator of vector.
The problem is that every one that uses my class now need to change their code from list<cats>::const_iterator to vector<cat>::const_iterator.
If all the iterators where to inherit from "forward iterator" it would be very useful.
My major question is how i solve this problem, i don't want to inherit from a collection.
Another very related question is why the designers of STL chose not to inherit iterators one from another ? (for example random access iterators could have inherit from forward iterators, he have all it's functionality)
I did pretty extensive search before i asked my question but couldn't find the solution.
The closest thing i found to my question is this, but it is not exactly my question.
Give access to encapsulated container
Expose an iterator type from your class, e.g:
class a
{
vector<int> g;
public:
typdef vector<int>::const_iterator const_iterator;
const_iterator begin() const
{ return g.begin(); }
: // etc
};
If the users can recompile the code you can use
typedef list<cats> CatList;
in your include file. Then, if you want to change the container, change it to
typedef vector<cats> CatList;
and the users would use e.g.
CatList::iterator it;
However this is not a good practice; the iterators of different containers may look equal but behave differently e.g. deleting an item from a vector invalidates all iterators but doing the same on list affects only iterator of deleted item. Not mentioning the case if some day you want to use std::map<some_key,cats>.
I agree with Nim's answer and wanted to offer the style I follow which I think makes maintenance easier since now changing the underlying container type requires no work at all for users of your class and very little work for you when maintaining your class.
class A
{
// change only this to change the underlying container:
// Don't Repeat Yourself!
using container_t = std::vector<int>;
public:
// constructor from a copy of a container
A(container_t source)
: _myContainer { std::move(source) }
{}
// this derived type is part of the public interface
using const_iterator_t = container_t::const_iterator;
// following your function naming style...
const_iterator_t GetIterator() const { return /* some iterator of _myContainer */; }
const_iterator_t GetEnd() const { return std::end(_myContainer); }
private:
container_t _myContainer;
};

Polymorphic converting iterator

Consider the following (simplified) scenario:
class edgeOne {
private:
...
public:
int startNode();
int endNode();
};
class containerOne {
private:
std::vector<edgeOne> _edges;
public:
std::vector<edgeOne>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeOne>::const_iterator edgesEnd(){
return _edges.end();
};
};
class edgeTwo {
private:
...
public:
int startNode();
int endNode();
};
class containerTwo {
private:
std::vector<edgeTwo> _edges;
public:
std::vector<edgeTwo>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeTwo>::const_iterator edgesEnd(){
return _edges.end();
};
};
I.e., I have two mostly identical edge types and two mostly identical container types. I can iterate over each kind individually. So far, so fine.
But now my use case is the following: Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
So my idea is the following: I want to have an iterator with the following properties:
- Regarding its traversal behavior, it behaves either like a std::vector<edgeOne>::const_iterator or a std::vector<edgeTwo>::const_iterator, depending on how it was initialized.
- Instead of returning a const edgeOne & or const edgeTwo &, operator* should return a std::pair<int,int>, i.e., apply a conversion.
I found the Boost.Iterator Library, in particular:
iterator_facade, which helps to build a standard-conforming iterator and
transform_iterator, which could be used to transform edgeOne and edgeTwo to std::pair<int,int>,
but I am not completely sure how the complete solution should look like. If I build nearly the entire iterator myself, is there any benefit to use transform_iterator, or will it just make the solution more heavy-weight?
I guess the iterator only needs to store the following data:
A flag (bool would be sufficient for the moment, but probably an enum value is easier to extend if necessary) indicating whether the value type is edgeOne or edgeTwo.
A union with entries for both iterator types (where only the one that matches the flag will ever be accessed).
Anything else can be computed on-the-fly.
I wonder if there is an existing solution for this polymorphic behavior, i.e. an iterator implementation combining two (or more) underlying iterator implementation with the same value type. If such a thing exists, I could use it to just combine two transform_iterators.
Dispatching (i.e., deciding whether a containerOne or a containerTwo object needs to be accessed) could easily be done by a freestanding function ...
Any thoughts or suggestions regarding this issue?
How about making your edgeOne & edgeTwo polymorphic ? and using pointers in containers ?
class edge
class edgeOne : public edge
class edgeTwo : public edge
std::vector<edge*>
Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
By "code duplication", do you mean source code or object code? Is there some reason to go past a simple template? If you're concerned about the two template instantiations constituting "code duplicaton" you can move most of the processing to an out-of-line non-templated do_whatever_with(int, int) support function....
template <typename Container>
void iterator_and_process_edges(const Container& c)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
do_whatever_with(i->startNode(), i->endNode());
}
if (criteria)
iterate_and_process_edges(getContainerOne());
else
iterate_and_process_edges(getContainerTwo());
My original aim was to hide the dispatching functionality from the code that just needs to access the start and end node. The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Not sure I follow you there, but I'm trying. So, "code that just needs to access the start and end node". It's not clear whether by access you mean to get startNode and endNode for a container element, or to use those values. I'd already factored out a do_whatever_with function that used them, so by elimination your request appears to want to isolate the code extracting the nodes from an Edge - done below in a functor:
template <typename Edge>
struct Processor
{
void operator()(Edge& edge) const
{
do_whatever_with(edge.startNode(), edge.endNode());
}
};
That functor can then be applied to each node in the Container:
template <typename Container, class Processor>
void visit(const Container& c, const Processor& processor)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
processor(*i);
}
"hide the dispatching functionality from the code that just needs to access the start and end node" - seems to me there are various levels of dispatching - on the basis of criteria, then on the basis of iteration (every layer of function call is a "dispatch" in one sense) but again by elimination I assume it's the isolation of the iteration as above that you're after.
if (criteria)
visit(getContainerOne(), Process<EdgeOne>());
else
visit(getContainerTwo(), Process<EdgeTwo>());
The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Can't say I agree with you, but then it depends whether you can see any maintenance issue with my first approach (looks dirt simple to me - a layer less than this latest incarnation and considerably less fiddly), or some potential for reuse. The visit implementation above is intended to be reusable, but reusing a single for-loop (that would simplify further if you have C++11) isn't useful IMHO.
Are you more comfortable with this modularisation, or am I misunderstanding your needs entirely somehow?
template<typename T1, typename T2>
boost::variant<T1*, T2*> get_nth( boost::variant< std::vector<T1>::iterator, std::vector<T2>::iterator > begin, std::size_t n ) {
// get either a T1 or a T2 from whichever vector you actually have a pointer to
}
// implement this using boost::iterator utilities, I think fascade might be the right one
// This takes a function that takes an index, and returns the nth element. It compares for
// equality based on index, and moves position based on index:
template<typename Lambda>
struct generator_iterator {
std::size_t index;
Lambda f;
};
template<typename Lambda>
generator_iterator< typename std::decay<Lambda>::type > make_generator_iterator( Lambda&&, std::size_t index=0 );
boost::variant< it1, it2 > begin; // set this to either one of the begins
auto double_begin = make_generator_iterator( [begin](std::size_t n){return get_nth( begin, n );} );
auto double_end = double_begin + number_of_elements; // where number_of_elements is how big the container we are type-erasing
Now we have an iterator that can iterate over one, or the other, and returns a boost::variant<T1*, T2*>.
We can then write a helper function that uses a visitor to extract the two fields you want from the returned variant, and treat it like an ADT. If you dislike ADTs, you can instead write up a class that wraps the variant and provides methods, or even change the get_nth to be less generic and instead return a struct with your data already produced.
There is going to be the equivalent of a branch on each access, but there is no virtual function overhead in this plan. It does currently requires an auto typed variable, but you can write an explicit functor to replace the lambda [begin](std::size_t n){return get_nth( begin, n );} and even that issue goes away.
Easier solutions:
Write a for_each function that iterates over each of the containers, and passes in the processed data to the passed in function.
struct simple_data { int x,y; };
std::function<std::function<void(simple_data)>> for_each() const {
auto begin = _edges.begin();
auto end = _edges.end();
return [begin,end](std::function<void(simple_data)> f) {
for(auto it = begin; it != end; ++it) {
simple_data d;
d.x = it->getX();
d.y = it->getY();
f(d);
}
};
};
and similar in the other. We now can iterate over the contents without caring about the details by calling foo->foreach()( [&](simple_data d) { /*code*/ } );, and because I stuffed the foreach into a std::function return value instead of doing it directly, we can pass the concept of looping around to another function.
As mentioned in comments, other solutions can include using boost's type-erased iterator wrappers, or writing a python-style generator mutable generator that returns either a simple_data. You could also directly use the boost iterator-creation functions to create an iterator over boost::variant<T1, T2>.

Wrapping STL in order to extend

So I read that inheritance from STL is a bad idea. But what about wrapping STL classes inside other classes in order to extend them. This with the main purpose of separating levels of abstraction, among other things.
Something like:
template<class T>
class MyVector{
public:
T& last(){
return data[data.size() - 1]
}
bool include(T& element);
//etc.
private:
vector<T> data;
}
Is this a good idea? Is this something c++ programmers do often?
Thank you.
Yes, wrapping is better than inheritance, but only if you need to add state to an existing data structure. Otherwise, add non-member functions. This is explained in more detail in item 35 of C++ Coding Standards (Amazon)
To add state, prefer composition instead of inheritance. Admittedly, it's tedious to have to write passthrough functions for the member functions
you want to keep, but such an implementation is vastly better and safer than
using public or nonpublic inheritance.
template<typename T>
class MyExtendedVector
{
public:
// pass-through functions for std::vector members
void some_existing_fun() { return vec_.some_existing_fun(); }
// new functionality that uses extra state
void some_new_fun() { // your implementation here }
private:
std::vector<T> vec_;
// extra state e.g. to do logging, caching, or whatever
};
To add behavior, prefer to add nonmem-ber functions instead of member
functions.
However make sure to make your algorithms as generic as possible:
// if you CAN: write algorithm that uses iterators only
// (and delegates to iterator category that container supports)
template<typename Iterator, typename T>
bool generic_contains(Iterator first, Iterator last, T const& elem)
{
// implement using iterators + STL algorithms only
return std::find(first, last, elem) != last;
}
// if you MUST: write algorithm that uses specific interface of container
template<typename T>
void vector_contains(std::vector<T> const& v, T const& elem)
{
// implement using public interface of Container + STL algorithms
return generic_contains(v.begin(), v.end(), elem);
}
I can only speak for myself but I haven't done this, and I probably wouldn't suggest it in general.
In almost every case an iterator-based algorithm is simpler to implement and separates the algorithm from the container. For example, assuming that your include method is simply to determine if an element is in the vector you would use either find, binary_search, or lower_bound depending on your container's contents and search needs.
Occasionally I have implemented a class that looks like a container to the outside world by providing begin/end methods. In this case it sometimes does have a standard container underlying, but you only implement a minimal public interface to represent what your class actually models.
I would avoid doing this, as the last person replied. I understand where you are going where you want your container to wrap some of the more complex operations into simplier methods and are thinking that you could change the underlying container at some point without having to change your interfaces. However, having said that, your object should model your business requirements and then the implementation of those objects would use whatever data container and access patterns are best. I guess what I am saying is that you won't end up re-using this new vector class as the data access of your business requirements will be different each time and you will use a some standard generic container like std::vector again and the iterator based algos to access the data.
Now if there is some algo that doesn't exist, you can write that iterator based algo for that specific project and then keep that algo code which you may be able to re-use. Below shows a set grep algo that I wrote that was based on a set intersection but wasn't doing exactly what I wanted. I could reuse this algo another time.
#include <utility>
#include <algorithm>
// this is a set grep meaning any items that are in set one
// will be pulled out if they match anything in set 2 based on operator pred
template<typename _InputIterator1, typename _InputIterator2,
typename _OutputIterator, typename _Compare>
_OutputIterator
setGrep(_InputIterator1 __first1, _InputIterator1 __last1,
_InputIterator2 __first2, _InputIterator2 __last2,
_OutputIterator __result, _Compare __comp)
{
while (__first1 != __last1 && __first2 != __last2)
if (__comp(*__first1, *__first2))
++__first1;
else if (__comp(*__first2, *__first1))
++__first2;
else
{
*__result = *__first1;
++__first1;
++__result;
}
return __result;
}

Advantages of an empty class in C++

What could be the possible advantages/uses of having an empty class?
P.S:
This question might sound trivial to some of you but it is just for learning purpose and has no practical significance. FYI googling didn't help.
One use would be in template (meta-)programming: for instance, iterator tags are implemented as empty classes. The only purpose here is to pass around information at compilation time so you can check, if an iterator passed to e.g. a template function meets specific requirements.
EXAMPLE:
This is really simplified, just to ge an idea. Here the purpose of the tag class is to decide, which implementation of an algorithm to use:
class forward_iterator_tag {};
class random_access_iterator_tag {};
class MySimpleForwardIterator {
public:
typedef typename forward_iterator_tag tag;
// ...
};
class MySimpleRandomIterator {
public:
typedef typename random_access_iterator_tag tag;
// ...
};
template<class iterator, class tag>
void myfunc_int(iterator it, tag t) {
// general implementation of myfunc
}
template<class iterator>
void myfunc_int<iterator, forward_iterator_tag>(iterator it) {
// Implementation for forward iterators
}
template<class iterator>
void myfunc_int<iterator, random_access_iterator_tag>(iterator it) {
// Implementation for random access iterators
}
template<class iterator>
void myfunc(iterator it) {
myfunc_int<iterator, typename iterator::tag>(it);
}
(I hope I got this right, it's been a while since I used this ...)
With this code, you can call myfunc on an arbitrary iterator, and let the compiler choose the correct implementation depending on the iterator type (i.e. tag).
The following can be used to have a boost::variant which can hold an (SQL) NULL value for example.
class Null { };
typedef boost::variant<Null, std::string, int> Value;
To make it more useful things like operator== and operator<< are handy. For example:
std::ostream& operator<<(std::ostream &lhs, const Null &rhs)
{
lhs << "*NULL*";
return lhs;
}
int main()
{
Variant v("hello");
std::cout << v << std::endl;
v = Null();
std::cout << v << std::endl;
...
}
Will give:
hello
*NULL*
In the STL, Standard Template Library of the C++, for example you have
template<class _Arg,
class _Result>
struct unary_function
{ // base class for unary functions
typedef _Arg argument_type;
typedef _Result result_type;
};
When defining a functor, you can inherit unary_function, and then you have the typedef defined automatically at your disposal.
An empty class could be used as a "token" defining something unique; in certain patterns, you want an implementation-agnostic representation of a unique instance, which has no value to the developer other than its uniqueness. One example is Unit of Work; you may not care one bit about what's going on inside your performer, but you want to tell that performer that the tasks you're telling it to perform are part of an atomic set. An empty class representing the Unit of Work to the outside world may be perfect in this case; almost anything a Unit of Work object could store or do (encapsulating a DB transaction, exposing Commit/Rollback behaviors) would start tying you to a particular implementation, but an object reference is useful to provide a unique but copyable and passable reference to the atomic set of tasks.
You can use it like a placeholder for checking purpose or as enabler to special functionality. For example in Java exist the "empty" interface Serializable used to specify if a class is serializable.
"empty" classes means classes which have no data members?
They typically declare typedefs or member functions, and you can extend them with your own classes.
Here is an interesting link with answers to why its allowed. You might find this helpful to find situations where it might be useful.
As others have said, often an empty class (or struct) is used a placeholder, a differentiator, a token, etc.
For example, a lot of people are unaware that there are "nothrow" versions of operator new. The syntax to invoke nothrow new is:
p = new(std::nothrow) Bar;
and std::nothrow is defined simply as
struct nothrow_t {}; //defined in namespace std
The answer by MartinStettner is fine though just to highlight an important point here: The concept of iterator tags or for that matter any tags in C++, is not strictly dependent on empty classes. The C++ tags, if stl writers would have wanted to, could well have been non-empty classes; that should work but then it won't add any additional value; at least for compile time acrobatics that it is usually reserved for.
For the sake of typeid
Suppose we have comparable interface Id. We need fill some container with unique instances of this interface. How to guarantee the uniqueness of Id instances produced by independent software parts? «Independent parts» means some different dynamic libraries, compiled by different programmers from different locations
One of decisions is to compare typeid of some type first. If typeid matches, convert and compare other implementation specific properties. C++ language guarantees uniqueness of any type within process memory. Which type should be used for this purpose? Any type with minimum resource consumption — empty one

Returning an 'any kind of input iterator' instead of a vector::iterator or a list::iterator [duplicate]

This question already has answers here:
Generic iterator
(3 answers)
Closed 8 years ago.
Suppose I want to implement in C++ a data-structure to store oriented graphs. Arcs will be stored in Nodes thanks to STL containers. I'd like users to be able to iterate over the arcs of a node, in an STL-like way.
The issue I have is that I don't want to expose in the Node class (that will actually be an abstract base class) which STL container I will actually use in the concrete class. I therefore don't want to have my methods return std::list::iterator or std::vector::iterator...
I tried this:
class Arc;
typedef std::iterator<std::random_access_iterator_tag, Arc*> ArcIterator; // Wrong!
class Node {
public:
ArcIterator incomingArcsBegin() const {
return _incomingArcs.begin();
}
private:
std::vector<Arc*> _incomingArcs;
};
But this is not correct because a vector::const_iterator can't be used to create an ArcIterator. So what can be this ArcIterator?
I found this paper about Custom Iterators for the STL but it did not help. I must be a bit heavy today... ;)
Try this:
class Arc;
class Node {
private:
std::vector<Arc*> incoming_;
public:
typedef std::vector<Arc*>::iterator iterator;
iterator incoming_arcs_begin()
{ return incoming_.begin(); }
};
And use Node::iterator in the rest of the code. When/if you change the container, you have to change the typedef in a single place. (You could take this one step further with additional typedef for the storage, in this case vector.)
As for the const issue, either define vector's const_iterator to be your iterator, or define double iterator types (const and non-const version) as vector does.
Have a look at Adobe's any_iterator: this class uses a technique called type erase by which the underyling iterator type is hidden behind an abstract interface. Beware: the use of any_iterator incurs a runtime penalty due to virtual dispatching.
I want to think there should be a way to do this through straight STL, similar to what you are trying to do.
If not, you may want to look into using boost's iterator facades and adaptors where you can define your own iterators or adapt other objects into iterators.
To hide the fact that your iterators are based on std::vector<Arc*>::iterator you need an iterator class that delegates to std::vector<Arc*>::iterator. std::iterator does not do this.
If you look at the header files in your compiler's C++ standard library, you may find that std::iterator isn't very useful on its own, unless all you need is a class that defines typedefs for iterator_category, value_type, etc.
As Doug T. mentioned in his answer, the boost library has classes that make it easier to write iterators. In particular, boost::indirect_iterator might be helpful if you want your iterators to return an Arc when dereferenced instead of an Arc*.
Consider using the Visitor Pattern and inverting the relationship: instead of asking the graph structure for a container of data, you give the graph a functor and let the graph apply that functor to its data.
The visitor pattern is a commonly used pattern on graphs, check out boost's graph library documentation on visitors concepts.
If you really don't want the client's of that class to know that it uses a vector underneath, but still want them to be able to somehow iterate over it, you most likely will need to create a class that forwards all its methods to std::vector::iterator.
An alternative would be to templatize Node based on the type of container it should use underneath. Then the clients specifically know what type of container it is using because they told them to use it.
Personally I don't think it usually makes sense to encapsulate the vector away from the user, but still provide most (or even some) of its interface. Its too thin of an encapsulation layer to really provide any benefit.
I looked in the header file VECTOR.
vector<Arc*>::const_iterator
is a typedef for
allocator<Arc*>::const_pointer
Could that be your ArcIterator? Like:
typedef allocator<Arc*>::const_pointer ArcIterator;
You could templatize the Node class, and typedef both iterator and const_iterator in it.
For example:
class Arc {};
template<
template<class T, class U> class Container = std::vector,
class Allocator = std::allocator<Arc*>
>
class Node
{
public:
typedef typename Container<Arc*, Allocator>::iterator ArcIterator;
typedef typename Container<Arc*, Allocator>::Const_iterator constArcIterator;
constArcIterator incomingArcsBegin() const {
return _incomingArcs.begin();
}
ArcIterator incomingArcsBegin() {
return _incomingArcs.begin();
}
private:
Container<Arc*, Allocator> _incomingArcs;
};
I haven't tried this code, but it gives you the idea. However, you have to notice that using a ConstArcIterator will just disallow the modification of the pointer to the Arc, not the modification of the Arc itself (through non-const methods for example).
C++0x will allow you do this with automatic type determination.
In the new standard, this
for (vector::const_iterator itr = myvec.begin(); itr != myvec.end(); ++itr
can be replaced with this
for (auto itr = myvec.begin(); itr != myvec.end(); ++itr)
By the same token, you will be able to return whatever iterator is appropriate, and store it in an auto variable.
Until the new standard kicks in, you would have to either templatize your class, or provide an abstract interface to access the elements of your list/vector. For instance, you can do that by storing an iterator in member variable, and provide member functions, like begin() and next(). This, of course, would mean that only one loop at a time can safely iterate over your elements.
Well because std::vector is guaranteed to have contiguous storage, it should be perfect fine to do this:
class Arc;
typedef Arc* ArcIterator;
class Node {
public:
ArcIterator incomingArcsBegin() const {
return &_incomingArcs[0]
}
ArcIterator incomingArcsEnd() const {
return &_incomingArcs[_incomingArcs.size()]
}
private:
std::vector<Arc*> _incomingArcs;
};
Basically, pointers function enough like random access iterators that they are a sufficient replacement.