iterator with special case implementation - c++

Is there a way to build an iterator class that has two implementations : a general implementation for a container containing any number of elements and a special case (very fast) implementation when the container contains a single element wihtout using virtual functions and dynamic polymorphism ?
For the moment, I have :
struct Container {
struct FastIterator;
struct SlowIterator;
void add(...) { ... }
SlowIterator begin_slow() { ... }
FastIterator begin_fast() { ... }
};
instead I would like to have :
struct Container {
struct Iterator;
void add(...) { ... }
Iterator begin() { // select between fast and slow based on the contents of the container }
};
so that :
void f() {
Container c;
c.add(...);
Container::Iterator it = c.begin(); // uses FastIterator hidden by the Iterator type
}
void f2() {
Container c;
c.add(...);
c.add(...);
Container::Iterator it = c.begin(); // use SlowIterator hidden by the iterator type
}
Of course, the obvious way would be to use virtual function or a delegate in the Iterator implementation to switch from one case to the other, however I tested that this slows down a lot the iteration compared to directly using the Slow/Fast iterators.
Since all the information to decide which implementation to use is available during the call to begin(), I would think there is a way to use some kind of compile time polymorphism/trick to avoid any kind of indirection.
Also, I really don't want the user to have to decide if it should call begin_fast() or begin_slow(), this should be automatically handled and hidden by the Iterator class.
Is there a way ?
Thanks

Sure.
Your container becomes a std::variant of two different states, the "single element" state and the "many element" state (and maybe the "zero element" state).
The member function add can convert the zero or single-element container into a single or multi-element function. Similarly, a remove might do the opposite in some cases.
The variant itself doesn't have a begin or end. Instead, users must std::visit it with a function object that can accept either.
template<class T>
struct Container:
std::variant<std::array<T,0>, std::array<T,1>, std::vector<T>>
{
void add(T t) {
std::visit(
overload(
[&](std::array<T,0>& self) {
*this = std::array<T,1>{{std::move(t)}};
},
[&](std::array<T,1>& self) {
std::array<T,1> tmp = std::move(self);
*this = std::vector<T>{tmp[0], std::move(t)};
},
[&](std::vector<T>& self) {
self.push_back( std::move(t) );
}
),
*this
);
}
};
boost has a variant that works similarly. overload is merely
struct tag {};
template<class...Fs>
struct overload_t {overload_t(tag){}};
template<class F0, class F1, class...Fs>
struct overload_t: overload_t<F0>, overload_t<F1, Fs...> {
using overload_t<F0>::operator();
using overload_t<F1, Fs...>::operator();
template<class A0, class A1, class...Args>
overload_t( tag, A0&&a0, A1&&a1, Args&&...args ):
overload_t<F0>( tag{}, std::forward<A0>(a0)),
overload_t<F1, Fs...>(tag{}, std::forward<A1>(a1), std::forward<Args>(args)...)
{}
};
template<class F>
struct overload_t:F {
using F::operator();
template<class A>
overload_t( tag, A&& a ):F(std::forward<A>(a)){}
};
template<class...Fs>
overload_t<std::decay_t<Fs>...> overload(Fs&&...fs) {
return {tag{}, std::forward<Fs>(fs)...};
}
overload is ridiculously easier in c++17:
template<class...Fs>
struct overload:Fs{
using Fs::operator();
};
template<class...Fs>
overload->overload<Fs...>;
and use {} instead of ().
Use of this in c++14 looks like:
Container<int> bob = get_container();
std::visit( [](auto&& bob){
for (int x:bob) {
std::cout << x << "\n";
}
}, bob );
and for the 0 and 1 case, the size of the loop will be known exactly to the compiler.
In c++11 you'll have to write an external template function object instead of an inline lambda.
You could move the variant part out of the Container and into what begin returns (inside the iterator), but that would require a complex branching iterator implementation or for callers to visit on the iterator. And as the begin/end iterator types are probably tied, you'd want to return a range anyhow so the visit makes sense. And that gets you half way back to the Container solution anyhow.
You could also implement this outside of variant, but as a general rule earlier operations on a variable cannot change the later type in the same scope of code. It can be used to dispatch on a callable object passed in "continuation passing style", where both implementations will be compiled but one chosen at runtime (via branch). It may be possible for a compiler to realize which branch the visit will go down and dead-code eliminate the other, but the other branch still needs to be valid code.
If you want fully dynamicly typed objects, you are going to lose a factor of 2 to 10 speed at least (which is what languages who support this do), which is hard to recover by iteration efficiency on one element loops. That would be related to storing the variant-equivalent (maybe a virtual interface or whatever) in the iterator returned and making it complexly handle the branch at runtime. As your goal is performance, this isn't practical.
In theory, C++ could have the ability to change the type of variables based on operations on them. Ie, a theoretical language in which
Container c;
is of type "empty container", then:
c.add(foo);
now c changes static type to "single element container", then
c.add(foo);
and c changes static type to "multi-element container".
But that isn't the C++ type model. You can emulate it like above (at runtime), but it isn't the same.

Related

Type erasure: Retrieving value - type check at compile time

I have a limited set of very different types, from which I want to store instances in a single collection, specifically a map. To this end, I use the type erasure idiom, ie. I have a non-templated base class from which the templated, type specific class inherits:
struct concept
{
virtual std::unique_ptr<concept> copy() = 0; // example member function
};
template <typename T>
struct model : concept
{
T value;
std::unique_ptr<concept> copy() override { ... }
}
I then store unique_ptrs to concept in my map. To retrieve the value, I have a templated function which does a dynamic cast to the specified type.
template <typename T>
void get(concept& c, T& out) {
auto model = dynamic_cast<model<T>>(&c);
if (model == nullptr) throw "error, wrong type";
out = model->value;
}
What I don't like about this solution is, that specifying a wrong T is only detected at runtime. I'd really really like this to be done at compile time.
My options are as I see the following, but I don't think they can help here:
Using ad hoc polymorphism by specifying free functions with each type as an overload, or a template function, but I do not know where to store the result.
Using CRTP won't work, because then the base class would need to be templated.
Conceptually I would need a virtual function which takes an instance of a class where the result will be stored. However since my types are fundamentally different, this class would need to be templated, which does not work with virtual.
Anyways, I'm not even sure if this is logically possible, but I would be very glad if there was a way to do this.
For a limited set of types, your best option is variant. You can operate on a variant most easily by specifying what action you would take for every single variant, and then it can operate on a variant correctly. Something along these lines:
std::unordered_map<std::string, std::variant<Foo, Bar>> m;
m["a_foo"] = Foo{};
m["a_bar"] = Bar{};
for (auto& e : m) {
std::visit(overloaded([] (Foo&) { std::cerr << "a foo\n"; }
[] (Bar&) { std::cerr << "a bar\n"; },
e.second);
}
std::variant is c++17 but is often available in the experimental namespace beforehand, you can also use the version from boost. See here for the definition of overloaded: http://en.cppreference.com/w/cpp/utility/variant/visit (just a small utility the standard library unfortunately doesn't provide).
Of course, if you are expecting that a certain key maps to a particular type, and want to throw an error if it doesn't, well, there is no way to handle that at compile time still. But this does let you write visitors that do the thing you want for each type in the variant, similar to a virtual in a sense but without needing to actually have a common interface or base class.
You cannot do compile-time type checking for an erased type. That goes against the whole point of type erasure in the first place.
However, you can get an equivalent level of safety by providing an invariant guarantee that the erased type will match the expected type.
Obviously, wether that's feasible or not depends on your design at a higher level.
Here's an example:
class concept {
public:
virtual ~concept() {}
};
template<typename T>
struct model : public concept {
T value;
};
class Holder {
public:
template<typename T>
void addModel() {
map.emplace(std::type_index(typeid(T)), std::make_unique<model<T><());
}
template<typename T>
T getValue() {
auto found = types.find(std::type_index(typeid(T)));
if(found == types.end()) {
throw std::runtime_error("type not found");
}
// no need to dynamic cast here. The invariant is covering us.
return static_cast<model<T>*>(found->second.get())->value;
}
private:
// invariant: map[type] is always a model<type>
std::map<std::type_index, std::unique_ptr<concept>> types;
};
The strong encapsulation here provides a level of safety almost equivalent to a compile-time check, since map insertions are aggressively protected to maintain the invariant.
Again, this might not work with your design, but it's a way of handling that situation.
Your runtime check occurs at the point where you exit type erasure.
If you want to compile time check the operation, move it within the type erased boundaries, or export enough information to type erase later.
So enumerate the types, like std variant. Or enumerate the algorithms, like you did copy. You can even mix it, like a variant of various type erased sub-algorithms for the various kinds of type stored.
This does not support any algorithm on any type polymorphism; one of the two must be enumerated for things to resolve at compile time and not have a runtime check.

a dictionary with compile time keys, or types as keys

Take the following "type" dictionary -- its keys are supposed to be types, and values an instance of that type:
class TypeDictionary {
public:
template<class T> void insert(T t);
template<class T> T& get();
// implementation be here -- and the question is about this
};
struct Foo;
struct Bar;
void userOfTypeDictionary() {
TypeDictionary td;
td.insert( Foo() );
td.insert( Bar() );
td.insert( double(3.14) );
// and other unknown (to TypeDictionary)
// list of types attached
// later on, in a different scope perhaps
Foo& f = td.get<Foo>() ;
Bar& f = td.get<Bar>();
double pi = tg.get<double>();
// ...
}
This particular TypeDictionary has a growing set of types "registered", but of course, the facility should allow an arbitrary set of types, and potentially a different set for each instance of this class.
As for a realistic motivating use case, think of a plugin manager. At any given time of its life an arbitrary heterogeneous set of objects are attached to it, and the it is the manager's job to manage the life time of the attached objects and return (a reference) to them when queried in a type-safe manner.
Any ideas on whether such a thing is possible? It is fine if a strategy involves using a library, like Boost.Fusion or similar.
A minimal but probably sub-optimal way of doing this is using the type-wise versions of std::get that are only available in C++14. This way your dictionary would simply derive or wrap a tuple:
template<typename... T>
struct dict
{
std::tuple<T...> t;
template<typename E>
void insert(E e) { std::get<E>(t) = e; }
template<typename E>
E& get() { return std::get<E>(t); }
};
struct Foo {};
struct Bar {};
using TypeDictionary = dict<Foo, Bar, double>;
The above gives the desired behavior with clang -std=c++1y but fails with g++ up to 4.8.2 at least.
Anyhow, the idea is that you have a predefined list of type-keys given up-front, and your dictionary behaves like a fixed-size flat set, meaning that it includes size for its elements (at least non-empty ones) even if you don't call insert(), in which case these elements are default-constructed.
I have paid no particular attention here to giving const/non-const versions of methods, or move operations. The above is only to get the idea. Of course, this is so simple that you could just use std::tuple and std::get() directly.
To have the dictionary with a fixed type regardless of its element types is quite harder; it's a matter of type erasure and in this respect it resembles std::function. To get a really dynamic type-safe heterogeneous container (with no casts, virtual functions etc.) is not possible with the given syntax I think. What is possible is to get a new dictionary of a new (enlarged) type after each insert() operation, e.g.
auto new_d = d.insert(3.14);
This would simply concatenate the existing tuple<T...> with the new element to give a new tuple<T...,double>.

Ad hoc polymorphism and heterogeneous containers with value semantics

I have a number of unrelated types that all support the same operations through overloaded free functions (ad hoc polymorphism):
struct A {};
void use(int x) { std::cout << "int = " << x << std::endl; }
void use(const std::string& x) { std::cout << "string = " << x << std::endl; }
void use(const A&) { std::cout << "class A" << std::endl; }
As the title of the question implies, I want to store instances of those types in an heterogeneous container so that I can use() them no matter what concrete type they are. The container must have value semantics (ie. an assignment between two containers copies the data, it doesn't share it).
std::vector<???> items;
items.emplace_back(3);
items.emplace_back(std::string{ "hello" });
items.emplace_back(A{});
for (const auto& item: items)
use(item);
// or better yet
use(items);
And of course this must be fully extensible. Think of a library API that takes a vector<???>, and client code that adds its own types to the already known ones.
The usual solution is to store (smart) pointers to an (abstract) interface (eg. vector<unique_ptr<IUsable>>) but this has a number of drawbacks -- from the top of my head:
I have to migrate my current ad hoc polymorphic model to a class hierarchy where every single class inherits from the common interface. Oh snap! Now I have to write wrappers for int and string and what not... Not to mention the decreased reusability/composability due to the free member functions becoming intimately tied to the interface (virtual member functions).
The container loses its value semantics: a simple assignment vec1 = vec2 is impossible if we use unique_ptr (forcing me to manually perform deep copies), or both containers end up with shared state if we use shared_ptr (which has its advantages and disadvantages -- but since I want value semantics on the container, again I am forced to manually perform deep copies).
To be able to perform deep copies, the interface must support a virtual clone() function which has to be implemented in every single derived class. Can you seriously think of something more boring than that?
To sum it up: this adds a lot of unnecessary coupling and requires tons of (arguably useless) boilerplate code. This is definitely not satisfactory but so far this is the only practical solution I know of.
I have been searching for a viable alternative to subtype polymorphism (aka. interface inheritance) for ages. I play a lot with ad hoc polymorphism (aka. overloaded free functions) but I always hit the same hard wall: containers have to be homogeneous, so I always grudgingly go back to inheritance and smart pointers, with all the drawbacks already listed above (and probably more).
Ideally, I'd like to have a mere vector<IUsable> with proper value semantics, without changing anything to my current (absence of) type hierarchy, and keep ad hoc polymorphism instead of requiring subtype polymorphism.
Is this possible? If so, how?
Different alternatives
It is possible. There are several alternative approaches to your problem. Each one has different advantages and drawbacks (I will explain each one):
Create an interface and have a template class which implements this interface for different types. It should support cloning.
Use boost::variant and visitation.
Blending static and dynamic polymorphism
For the first alternative you need to create an interface like this:
class UsableInterface
{
public:
virtual ~UsableInterface() {}
virtual void use() = 0;
virtual std::unique_ptr<UsableInterface> clone() const = 0;
};
Obviously, you don't want to implement this interface by hand everytime you have a new type having the use() function. Therefore, let's have a template class which does that for you.
template <typename T> class UsableImpl : public UsableInterface
{
public:
template <typename ...Ts> UsableImpl( Ts&&...ts )
: t( std::forward<Ts>(ts)... ) {}
virtual void use() override { use( t ); }
virtual std::unique_ptr<UsableInterface> clone() const override
{
return std::make_unique<UsableImpl<T>>( t ); // This is C++14
// This is the C++11 way to do it:
// return std::unique_ptr<UsableImpl<T> >( new UsableImpl<T>(t) );
}
private:
T t;
};
Now you can actually already do everything you need with it. You can put these things in a vector:
std::vector<std::unique_ptr<UsableInterface>> usables;
// fill it
And you can copy that vector preserving the underlying types:
std::vector<std::unique_ptr<UsableInterface>> copies;
std::transform( begin(usables), end(usables), back_inserter(copies),
[]( const std::unique_ptr<UsableInterface> & p )
{ return p->clone(); } );
You probably don't want to litter your code with stuff like this. What you want to write is
copies = usables;
Well, you can get that convenience by wrapping the std::unique_ptr into a class which supports copying.
class Usable
{
public:
template <typename T> Usable( T t )
: p( std::make_unique<UsableImpl<T>>( std::move(t) ) ) {}
Usable( const Usable & other )
: p( other.clone() ) {}
Usable( Usable && other ) noexcept
: p( std::move(other.p) ) {}
void swap( Usable & other ) noexcept
{ p.swap(other.p); }
Usable & operator=( Usable other )
{ swap(other); }
void use()
{ p->use(); }
private:
std::unique_ptr<UsableInterface> p;
};
Because of the nice templated contructor you can now write stuff like
Usable u1 = 5;
Usable u2 = std::string("Hello usable!");
And you can assign values with proper value semantics:
u1 = u2;
And you can put Usables in an std::vector
std::vector<Usable> usables;
usables.emplace_back( std::string("Hello!") );
usables.emplace_back( 42 );
and copy that vector
const auto copies = usables;
You can find this idea in Sean Parents talk Value Semantics and Concepts-based Polymorphism. He also gave a very brief version of this talk at Going Native 2013, but I think this is to fast to follow.
Moreover, you can take a more generic approach than writing your own Usable class and forwarding all the member functions (if you want to add other later). The idea is to replace the class Usable with a template class. This template class will not provide a member function use() but an operator T&() and operator const T&() const. This gives you the same functionality, but you don't need to write an extra value class every time you facilitate this pattern.
A safe, generic, stack-based discriminated union container
The template class boost::variant is exactly that and provides something like a C style union but safe and with proper value semantics. The way to use it is this:
using Usable = boost::variant<int,std::string,A>;
Usable usable;
You can assign from objects of any of these types to a Usable.
usable = 1;
usable = "Hello variant!";
usable = A();
If all template types have value semantics, then boost::variant also has value semantics and can be put into STL containers. You can write a use() function for such an object by a pattern that is called the visitor pattern. It calls the correct use() function for the contained object depending on the internal type.
class UseVisitor : public boost::static_visitor<void>
{
public:
template <typename T>
void operator()( T && t )
{
use( std::forward<T>(t) );
}
}
void use( const Usable & u )
{
boost::apply_visitor( UseVisitor(), u );
}
Now you can write
Usable u = "Hello";
use( u );
And, as I already mentioned, you can put these thingies into STL containers.
std::vector<Usable> usables;
usables.emplace_back( 5 );
usables.emplace_back( "Hello world!" );
const auto copies = usables;
The trade-offs
You can grow the functionality in two dimensions:
Add new classes which satisfy the static interface.
Add new functions which the classes must implement.
In the first approach I presented it is easier to add new classes. The second approach makes it easier to add new functionality.
In the first approach it it impossible (or at least hard) for client code to add new functions. In the second approach it is impossible (or at least hard) for client code to add new classes to the mix. A way out is the so-called acyclic visitor pattern which makes it possible for clients to extend a class hierarchy with new classes and new functionality. The drawback here is that you have to sacrifice a certain amount of static checking at compile-time. Here's a link which describes the visitor pattern including the acyclic visitor pattern along with some other alternatives. If you have questions about this stuff, I'm willing to answer.
Both approaches are super type-safe. There is not trade-off to be made there.
The run-time-costs of the first approach can be much higher, since there is a heap allocation involved for each element you create. The boost::variant approach is stack based and therefore is probably faster. If performance is a problem with the first approach consider to switch to the second.
Credit where it's due: When I watched Sean Parent's Going Native 2013 "Inheritance Is The Base Class of Evil" talk, I realized how simple it actually was, in hindsight, to solve this problem. I can only advise you to watch it (there's much more interesting stuff packed in just 20 minutes, this Q/A barely scratches the surface of the whole talk), as well as the other Going Native 2013 talks.
Actually it's so simple it hardly needs any explanation at all, the code speaks for itself:
struct IUsable {
template<typename T>
IUsable(T value) : m_intf{ new Impl<T>(std::move(value)) } {}
IUsable(IUsable&&) noexcept = default;
IUsable(const IUsable& other) : m_intf{ other.m_intf->clone() } {}
IUsable& operator =(IUsable&&) noexcept = default;
IUsable& operator =(const IUsable& other) { m_intf = other.m_intf->clone(); return *this; }
// actual interface
friend void use(const IUsable&);
private:
struct Intf {
virtual ~Intf() = default;
virtual std::unique_ptr<Intf> clone() const = 0;
// actual interface
virtual void intf_use() const = 0;
};
template<typename T>
struct Impl : Intf {
Impl(T&& value) : m_value(std::move(value)) {}
virtual std::unique_ptr<Intf> clone() const override { return std::unique_ptr<Intf>{ new Impl<T>(*this) }; }
// actual interface
void intf_use() const override { use(m_value); }
private:
T m_value;
};
std::unique_ptr<Intf> m_intf;
};
// ad hoc polymorphic interface
void use(const IUsable& intf) { intf.m_intf->intf_use(); }
// could be further generalized for any container but, hey, you get the drift
template<typename... Args>
void use(const std::vector<IUsable, Args...>& c) {
std::cout << "vector<IUsable>" << std::endl;
for (const auto& i: c) use(i);
std::cout << "End of vector" << std::endl;
}
int main() {
std::vector<IUsable> items;
items.emplace_back(3);
items.emplace_back(std::string{ "world" });
items.emplace_back(items); // copy "items" in its current state
items[0] = std::string{ "hello" };
items[1] = 42;
items.emplace_back(A{});
use(items);
}
// vector<IUsable>
// string = hello
// int = 42
// vector<IUsable>
// int = 3
// string = world
// End of vector
// class A
// End of vector
As you can see, this is a rather simple wrapper around a unique_ptr<Interface>, with a templated constructor that instantiates a derived Implementation<T>. All the (not quite) gory details are private, the public interface couldn't be any cleaner: the wrapper itself has no member functions except construction/copy/move, the interface is provided as a free use() function that overloads the existing ones.
Obviously, the choice of unique_ptr means that we need to implement a private clone() function that is called whenever we want to make a copy of an IUsable object (which in turn requires a heap allocation). Admittedly one heap allocation per copy is quite suboptimal, but this is a requirement if any function of the public interface can mutate the underlying object (ie. if use() took non-const references and modified them): this way we ensure that every object is unique and thus can freely be mutated.
Now if, as in the question, the objects are completely immutable (not only through the exposed interface, mind you, I really mean the whole objects are always and completely immutable) then we can introduce shared state without nefarious side effects. The most straightforward way to do this is to use a shared_ptr-to-const instead of a unique_ptr:
struct IUsableImmutable {
template<typename T>
IUsableImmutable(T value) : m_intf(std::make_shared<const Impl<T>>(std::move(value))) {}
IUsableImmutable(IUsableImmutable&&) noexcept = default;
IUsableImmutable(const IUsableImmutable&) noexcept = default;
IUsableImmutable& operator =(IUsableImmutable&&) noexcept = default;
IUsableImmutable& operator =(const IUsableImmutable&) noexcept = default;
// actual interface
friend void use(const IUsableImmutable&);
private:
struct Intf {
virtual ~Intf() = default;
// actual interface
virtual void intf_use() const = 0;
};
template<typename T>
struct Impl : Intf {
Impl(T&& value) : m_value(std::move(value)) {}
// actual interface
void intf_use() const override { use(m_value); }
private:
const T m_value;
};
std::shared_ptr<const Intf> m_intf;
};
// ad hoc polymorphic interface
void use(const IUsableImmutable& intf) { intf.m_intf->intf_use(); }
// could be further generalized for any container but, hey, you get the drift
template<typename... Args>
void use(const std::vector<IUsableImmutable, Args...>& c) {
std::cout << "vector<IUsableImmutable>" << std::endl;
for (const auto& i: c) use(i);
std::cout << "End of vector" << std::endl;
}
Notice how the clone() function has disappeared (we don't need it any more, we just share the underlying object and it's no bother since it's immutable), and how copy is now noexcept thanks to shared_ptr guarantees.
The fun part is, the underlying objects have to be immutable, but you can still mutate their IUsableImmutable wrapper so it's still perfectly OK to do this:
std::vector<IUsableImmutable> items;
items.emplace_back(3);
items[0] = std::string{ "hello" };
(only the shared_ptr is mutated, not the underlying object itself so it doesn't affect the other shared references)
Maybe boost::variant?
#include <iostream>
#include <string>
#include <vector>
#include "boost/variant.hpp"
struct A {};
void use(int x) { std::cout << "int = " << x << std::endl; }
void use(const std::string& x) { std::cout << "string = " << x << std::endl; }
void use(const A&) { std::cout << "class A" << std::endl; }
typedef boost::variant<int,std::string,A> m_types;
class use_func : public boost::static_visitor<>
{
public:
template <typename T>
void operator()( T & operand ) const
{
use(operand);
}
};
int main()
{
std::vector<m_types> vec;
vec.push_back(1);
vec.push_back(2);
vec.push_back(std::string("hello"));
vec.push_back(A());
for (int i=0;i<4;++i)
boost::apply_visitor( use_func(), vec[i] );
return 0;
}
Live example: http://coliru.stacked-crooked.com/a/e4f4ccf6d7e6d9d8
The other answers earlier (use vtabled interface base class, use boost::variant, use virtual base class inheritance tricks) are all perfectly good and valid solutions for this problem, each with a difference balance of compile time versus run time costs. I would suggest though that instead of boost::variant, on C++ 11 and later use eggs::variant instead which is a reimplementation of boost::variant using C++ 11/14 and it is enormously superior on design, performance, ease of use, power of abstraction and it even provides a fairly full feature subset on VS2013 (and a full feature set on VS2015). It's also written and maintained by a lead Boost author.
If you are able to redefine the problem a bit though - specifically, that you can lose the type erasing std::vector in favour of something much more powerful - you could use heterogenous type containers instead. These work by returning a new container type for each modification of the container, so the pattern must be:
newtype newcontainer=oldcontainer.push_back(newitem);
These were a pain to use in C++ 03, though Boost.Fusion makes a fair fist of making them potentially useful. Actually useful usability is only possible from C++ 11 onwards, and especially so from C++ 14 onwards thanks to generic lambdas which make working with these heterogenous collections very straightforward to program using constexpr functional programming, and probably the current leading toolkit library for that right now is proposed Boost.Hana which ideally requires clang 3.6 or GCC 5.0.
Heterogeneous type containers are pretty much the 99% compile time 1% run time cost solution. You'll see a lot of compiler optimiser face plants with current compiler technology e.g. I once saw clang 3.5 generate 2500 opcodes for code which should have generated two opcodes, and for the same code GCC 4.9 spat out 15 opcodes 12 of which didn't actually do anything (they loaded memory into registers and did nothing with those registers). All that said, in a few years time you will be able to achieve optimal code generation for heterogeneous type containers, at which point I would expect they'll become the next gen form of C++ metaprogramming where instead of arsing around with templates we'll be able to functionally program the C++ compiler using actual functions!!!
Heres an idea I got recently from std::function implementation in libstdc++:
Create a Handler<T> template class with a static member function that knows how to copy, delete and perform other operations on T.
Then store a function pointer to that static functon in the constructor of your Any class. Your Any class doesn't need to know about T then, it just needs this function pointer to dispatch the T-specific operations. Notice that the signature of the function is independant of T.
Roughly like so:
struct Foo { ... }
struct Bar { ... }
struct Baz { ... }
template<class T>
struct Handler
{
static void action(Ptr data, EActions eAction)
{
switch (eAction)
{
case COPY:
call T::T(...);
case DELETE:
call T::~T();
case OTHER:
call T::whatever();
}
}
}
struct Any
{
Ptr handler;
Ptr data;
template<class T>
Any(T t)
: handler(Handler<T>::action)
, data(handler(t, COPY))
{}
Any(const Any& that)
: handler(that.handler)
, data(handler(that.data, COPY))
{}
~Any()
{
handler(data, DELETE);
}
};
int main()
{
vector<Any> V;
Foo foo; Bar bar; Baz baz;
v.push_back(foo);
v.push_back(bar);
v.push_back(baz);
}
This gives you type erasure while still maintaining value semantics, and does not require modification of the contained classes (Foo, Bar, Baz), and doesn't use dynamic polymorphism at all. It's pretty cool stuff.

Polymorphic converting iterator

Consider the following (simplified) scenario:
class edgeOne {
private:
...
public:
int startNode();
int endNode();
};
class containerOne {
private:
std::vector<edgeOne> _edges;
public:
std::vector<edgeOne>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeOne>::const_iterator edgesEnd(){
return _edges.end();
};
};
class edgeTwo {
private:
...
public:
int startNode();
int endNode();
};
class containerTwo {
private:
std::vector<edgeTwo> _edges;
public:
std::vector<edgeTwo>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeTwo>::const_iterator edgesEnd(){
return _edges.end();
};
};
I.e., I have two mostly identical edge types and two mostly identical container types. I can iterate over each kind individually. So far, so fine.
But now my use case is the following: Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
So my idea is the following: I want to have an iterator with the following properties:
- Regarding its traversal behavior, it behaves either like a std::vector<edgeOne>::const_iterator or a std::vector<edgeTwo>::const_iterator, depending on how it was initialized.
- Instead of returning a const edgeOne & or const edgeTwo &, operator* should return a std::pair<int,int>, i.e., apply a conversion.
I found the Boost.Iterator Library, in particular:
iterator_facade, which helps to build a standard-conforming iterator and
transform_iterator, which could be used to transform edgeOne and edgeTwo to std::pair<int,int>,
but I am not completely sure how the complete solution should look like. If I build nearly the entire iterator myself, is there any benefit to use transform_iterator, or will it just make the solution more heavy-weight?
I guess the iterator only needs to store the following data:
A flag (bool would be sufficient for the moment, but probably an enum value is easier to extend if necessary) indicating whether the value type is edgeOne or edgeTwo.
A union with entries for both iterator types (where only the one that matches the flag will ever be accessed).
Anything else can be computed on-the-fly.
I wonder if there is an existing solution for this polymorphic behavior, i.e. an iterator implementation combining two (or more) underlying iterator implementation with the same value type. If such a thing exists, I could use it to just combine two transform_iterators.
Dispatching (i.e., deciding whether a containerOne or a containerTwo object needs to be accessed) could easily be done by a freestanding function ...
Any thoughts or suggestions regarding this issue?
How about making your edgeOne & edgeTwo polymorphic ? and using pointers in containers ?
class edge
class edgeOne : public edge
class edgeTwo : public edge
std::vector<edge*>
Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
By "code duplication", do you mean source code or object code? Is there some reason to go past a simple template? If you're concerned about the two template instantiations constituting "code duplicaton" you can move most of the processing to an out-of-line non-templated do_whatever_with(int, int) support function....
template <typename Container>
void iterator_and_process_edges(const Container& c)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
do_whatever_with(i->startNode(), i->endNode());
}
if (criteria)
iterate_and_process_edges(getContainerOne());
else
iterate_and_process_edges(getContainerTwo());
My original aim was to hide the dispatching functionality from the code that just needs to access the start and end node. The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Not sure I follow you there, but I'm trying. So, "code that just needs to access the start and end node". It's not clear whether by access you mean to get startNode and endNode for a container element, or to use those values. I'd already factored out a do_whatever_with function that used them, so by elimination your request appears to want to isolate the code extracting the nodes from an Edge - done below in a functor:
template <typename Edge>
struct Processor
{
void operator()(Edge& edge) const
{
do_whatever_with(edge.startNode(), edge.endNode());
}
};
That functor can then be applied to each node in the Container:
template <typename Container, class Processor>
void visit(const Container& c, const Processor& processor)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
processor(*i);
}
"hide the dispatching functionality from the code that just needs to access the start and end node" - seems to me there are various levels of dispatching - on the basis of criteria, then on the basis of iteration (every layer of function call is a "dispatch" in one sense) but again by elimination I assume it's the isolation of the iteration as above that you're after.
if (criteria)
visit(getContainerOne(), Process<EdgeOne>());
else
visit(getContainerTwo(), Process<EdgeTwo>());
The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Can't say I agree with you, but then it depends whether you can see any maintenance issue with my first approach (looks dirt simple to me - a layer less than this latest incarnation and considerably less fiddly), or some potential for reuse. The visit implementation above is intended to be reusable, but reusing a single for-loop (that would simplify further if you have C++11) isn't useful IMHO.
Are you more comfortable with this modularisation, or am I misunderstanding your needs entirely somehow?
template<typename T1, typename T2>
boost::variant<T1*, T2*> get_nth( boost::variant< std::vector<T1>::iterator, std::vector<T2>::iterator > begin, std::size_t n ) {
// get either a T1 or a T2 from whichever vector you actually have a pointer to
}
// implement this using boost::iterator utilities, I think fascade might be the right one
// This takes a function that takes an index, and returns the nth element. It compares for
// equality based on index, and moves position based on index:
template<typename Lambda>
struct generator_iterator {
std::size_t index;
Lambda f;
};
template<typename Lambda>
generator_iterator< typename std::decay<Lambda>::type > make_generator_iterator( Lambda&&, std::size_t index=0 );
boost::variant< it1, it2 > begin; // set this to either one of the begins
auto double_begin = make_generator_iterator( [begin](std::size_t n){return get_nth( begin, n );} );
auto double_end = double_begin + number_of_elements; // where number_of_elements is how big the container we are type-erasing
Now we have an iterator that can iterate over one, or the other, and returns a boost::variant<T1*, T2*>.
We can then write a helper function that uses a visitor to extract the two fields you want from the returned variant, and treat it like an ADT. If you dislike ADTs, you can instead write up a class that wraps the variant and provides methods, or even change the get_nth to be less generic and instead return a struct with your data already produced.
There is going to be the equivalent of a branch on each access, but there is no virtual function overhead in this plan. It does currently requires an auto typed variable, but you can write an explicit functor to replace the lambda [begin](std::size_t n){return get_nth( begin, n );} and even that issue goes away.
Easier solutions:
Write a for_each function that iterates over each of the containers, and passes in the processed data to the passed in function.
struct simple_data { int x,y; };
std::function<std::function<void(simple_data)>> for_each() const {
auto begin = _edges.begin();
auto end = _edges.end();
return [begin,end](std::function<void(simple_data)> f) {
for(auto it = begin; it != end; ++it) {
simple_data d;
d.x = it->getX();
d.y = it->getY();
f(d);
}
};
};
and similar in the other. We now can iterate over the contents without caring about the details by calling foo->foreach()( [&](simple_data d) { /*code*/ } );, and because I stuffed the foreach into a std::function return value instead of doing it directly, we can pass the concept of looping around to another function.
As mentioned in comments, other solutions can include using boost's type-erased iterator wrappers, or writing a python-style generator mutable generator that returns either a simple_data. You could also directly use the boost iterator-creation functions to create an iterator over boost::variant<T1, T2>.

C++ : Using different iterator types in subclasses without breaking the inheritance mechanism

I'm trying to achieve the following: Given an abstract class MemoryObject, that every class can inherit from, I have two subclasses: A Buffer and a BigBuffer:
template <typename T>
class MemoryObject
{
public:
typedef typename std::vector<T>::iterator iterator;
typedef typename std::vector<T>::const_iterator const_iterator;
[...] //Lot of stuff
virtual iterator begin() = 0;
virtual iterator end() = 0;
};
A Buffer:
template <typename T>
class Buffer: public MemoryObject<T>
{
public:
typedef typename std::vector<T>::iterator iterator;
iterator begin() { return buffer_.begin(); }
iterator end() { return buffer_.end(); };
[...] //Lot of stuff
private:
std::vector<T> buffer_;
};
And finally:
template <typename T>
class BigBuffer: public MemoryObject<T>
{
public:
[...] //Omitted, for now
private:
std::vector<Buffer<T>*> chunks_;
};
As you can see, a BigBuffer holds a std::vector of Buffer<T>*, so you can view a BigBuffer as an aggregation of Buffer(s). Futhermore, I have a bunch of functions that must work of every MemoryObject, so this is a real signature:
template <class KernelType, typename T>
void fill(CommandQueue<KernelType>& queue, MemoryObject<T>& obj, const T& value)
{
//Do something with obj
}
What's the point? - You may ask. The point is that I must implement iterators over these classes. I've already implemented them for Buffer, and is exactly what I need: be able to iterate over a Buffer, and access to ranges (for example b.begin(), b.begin() + 50).
Obviously I can't do the same for BigBuffer, because the real data (that is inside each Buffer' private variable buffer_) is scattered accross the memory. So I need a new class, let's call it BigBufferIterator, which can overload operator like * or +, allowing me to "jump" from a memory chunk to another without incurring in in segmentation fault.
The problems are two:
The iterator type of MemoryObject is different from the iterator
type of BigBuffer: the former is a std::vector<T>::iterator, the
latter a BigBufferIterator. My compiler obviously complains
I want be able to preserve the genericity of my functions signatures
passing to them only a MemoryObject<T>&, not specializing them for
each class type.
I've tried to solve the first problem adding a template parameter classed Iterator, and giving it a default argument to each class, with a model loosely based to Alexandrescu's policy-based model. This solution solved the first issue, but not the second: my compiled still complains, telling me: "Not known conversion from BigBuffer to MemoryObject", when I try to pass a BigBuffer to a function (for example, the fill() ). This is because Iterator types are different..
I'm really sorry for this poem, but It was the only way to proper present my problem to you. I don't know why someone would even bother in reading all this stuff, but I'll take pot luck.
Thanks in advance, only just for having read till this point.
Humbly,
Alfredo
They way to go is to use the most general definition as the iterator type of the base. That is, you want to treat the data in a Buffer as just one segment while the BigBuffer is a sequence of the corresponding segments. This is a bit unfortunate because it means that you treat your iterator for the single buffer in Buffer as if it may be multiple buffers, i.e. you have a segmented structure with just one segment. However, compared to the alternative (i.e. a hierarchy of iterators with virtual functions wrapped by a handle giving value semantics to this mess) you are actually not paying to bad a cost.
I encountered the same problem in a different context; let me restate it.
You have a Base class (which could be abstract), which is iterable via its BaseIterator.
You have a Child subclass, which differs in implementation slightly, and for which you have a different specialized ChildIterator.
You have custom functions that work with Base, and rely on its iterability.
It is not feasible to generate a template specialization of the custom functions for each subclass of Base. Possible reasons may be:
huge code duplication;
you distribute this code as a library and other developers are going to subclass Base;
other classes will take some reference or pointer to Base and apply those custom functions, or more generically:
Base implements some logic that is going to be uses in contexts where do not know any of the subclasses (and writing completely templated code is too cumbersome).
Edit: Another possibility would be using type erasure. You would hide the real iterator that you're using behind a class of a fixed type. You would have to pay the cost of the virtual functions call though. Here is an implementation of a any_iterator class which implements exactly iterator type erasure, and some more background on it.
The only effective solution I could find was to write a multi-purpose iterator that can iterate over all possible containers, possibly exploiting their internals (to avoid copy-pasting the iterator code for every subclass of Base):
// A bigger unit of memory
struct Buffer;
class Iterator {
// This allows you to know which set of methods you need to call
enum {
small_chunks,
big_chunks
} const _granularity;
// Merge the data into a union to save memory
union {
// Data you need to know to iterate over ints
struct {
std::vector<int> const *v;
std::vector<int>::const_iterator it;
} _small_chunks;
// Data you need to know to iterate over buffer chunks
struct {
std::vector<Buffer *> const *v;
std::vector<Buffer *>::const_iterator it;
} _big_chunks;
};
// Every method will have to choose what to do
void increment() {
switch (_granularity) {
case small_chunks:
_small_chunks.it++;
break;
case big_chunks:
_big_chunks.it++;
break;
}
}
public:
// Cctors are almost identical, but different overloads choose
// different granularity
Iterator(std::vector<int> const &container)
: _granularity(small_chunks) // SMALL
{
_small_chunks.v = &container;
_small_chunks.it = container.begin();
}
Iterator(std::vector<Buffer *> const &container)
: _granularity(big_chunks) // BIG
{
_big_chunks.v = &container;
_big_chunks.it = container.begin();
}
// ... Implement all your methods
};
You can use the same class for both Base and Child, but you need to initialize it differently. At this point you can make begin and end virtual and return an Iterator constructed differently in each subclass:
class Base {
public:
virtual Iterator begin() const = 0;
};
class IntChild : public Base {
std::vector<int> _small_mem;
public:
virtual Iterator begin() const {
// Created with granularity 'small_chunks'
return Iterator(_small_mem);
}
// ...
};
class BufferChild : public Base {
std::vector<Buffer *> _big_mem;
public:
virtual Iterator begin() const {
// Created with granularity 'big_chunks'
return Iterator(_big_mem);
}
// ...
};
You will pay a small price in performance (because of the switch statements) and in code duplication, but you will be able to use any generic algorithm from <algorithm>, to use range-loop, to use Base only in every function, and it's not stretching the inheritance mechanism.
// Anywhere, just by knowing Base:
void count_free_chunks(Base &mem) {
// Thanks to polymorphism, this code will always work
// with the right chunk size
for (auto const &item : mem) {
// ...this works
}
// This also works:
return std::count(mem.begin(), mem.end(), 0);
}
Note that the key point here is that begin() and end() must return the same type. The only exception would be if Child's methods would shadow Base's method (which is in general not a good practice).
One possible idea would be to declare an abstract iterator, but this is not so good. You would have to use all the time a reference to the iterator. Iterator though are supposed to be carried around as lightweight types, to be assignable and easily constructible, so it would make coding a minefield.