I have a question regarding partial specialisation of templated member functions.
Background: The goal is to compute descriptive statistics of large datasets which are too large to be held in memory at once. Therefore I have accumulator classes for the variance and the covariance where I can push in the datasets piece by piece (either one value at a time or in larger chunks). A rather simplified version computing the arithmetic mean only is:
class Mean
{
private:
std::size_t _size;
double _mean;
public:
Mean() : _size(0), _mean(0)
{
}
double mean() const
{
return _mean;
}
template <class T> void push(const T value)
{
_mean += (value - _mean) / ++_size;
}
template <class InputIt> void push(InputIt first, InputIt last)
{
for (; first != last; ++first)
{
_mean += (*first - _mean) / ++_size;
}
}
};
One particular advantage of this kind of accumulator class is the possibility to push values of different datatypes into the same accumulator class.
Problem: This works fine for all integral datatypes. However the accumulator classes should be able to handle complex numbers as well by first calculating the absolute value |z| and then pushing it to the accumulator. For pushing single values it's easy to provide an overloaded method
template <class T> void push(const std::complex<T> z)
{
T a = std::real(z);
T b = std::imag(z);
push(std::sqrt(a * a + b * b));
}
For pushing chunks of data via iterators, however, the case is not quite as simple. In order to overload correctly a partial specialisation is required, since we need to know the actual (fully specialised) complex number type. The usual way would be to delegate the actual code to an internal struct and specialise it accordingly:
// default version for all integral types
template <class InputIt, class T>
struct push_impl
{
static void push(InputIt first, InputIt last)
{
for (; first != last; ++first)
{
_mean += (*first - _mean) / ++_size;
}
}
};
// specialised version for complex numbers of any type
template <class InputIt, class T>
struct push_impl<InputIt, std::complex<T>>
{
static void push(InputIt first, InputIt last)
{
for (; first != last; ++first)
{
T a = std::real(*first);
T b = std::imag(*first);
_mean += (std::sqrt(a * a + b * b) - _mean) / ++_size;
}
}
};
In the accumulator class the templated methods of the delegation struct are then called by
template <class InputIt>
void push(InputIt first, InputIt last)
{
push_impl<InputIt, typename std::iterator_traits<InputIt>::value_type>::push(first, last);
}
However there is one problem with this technique which is how to access the private members of the accumulator class. Since they are different classes no direct access is possible and furthermore the methods of push_impl need to be static and cannot access the non-static members of the accumulator.
I can think of the following four solutions to the problem which all have their own advantages and disadvantages:
Create an instance of push_impl in each call to push with (possible) decrease in performance due to the extra copy.
Have an instance of push_impl as member variable of the accumulator class, which would prevent me from pushing different datatypes into the accumulator since the instance would have to be fully specialised.
Make all members of the accumulator class public and pass *this to push_impl::push() calls. This is a particularly bad solution due to the break in encapsulation.
Implement the iterator version in terms of the single value version, i.e. call the push() method for each element with (possible) decrease in performance due to the extra function call.
Note that the mentioned decreases in performance are theoretical in nature and might be no problem at all due to clever inlining by the compiler; however, the actual push methods might be much more complex than the example above.
Is one solution preferable to the others, or am I missing something?
Best regards and many thanks.
As commented, you don't need to use partial specialization for this at all, indeed partial specialization is usually pretty easy to avoid, and preferred to avoid.
private:
template <class T>
struct tag{}; // trivial nested struct
template <class I, class T>
void push_impl(I first, I last, tag<T>) { ... } // generic implementation
template <class I, class T>
void push_impl(I first, I last, tag<std::complex<T>>) { ... } // complex implementation
public:
template <class InputIt>
void push(InputIt first, InputIt last)
{
push_impl(first, last,
tag<typename std::iterator_traits<InputIt>::value_type> {});
}
Since push_impl is a (private) member function you don't need to do anything special any more.
Compared to your proposed solutions, this has no extra performance cost. It's the same number of function calls, the only difference is passing a stateless type by value, which is a wholly trivial optimization for the compiler. And there's no sacrifice in encapsulation either. And slightly less boilerplate.
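To make the outline above concrete, here is a minimal, self-contained sketch of the same tag-dispatch idea applied to the simplified Mean class from the question. Using std::abs for |z| and the helper example() are my own choices for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <iterator>
#include <vector>

class Mean
{
private:
    std::size_t _size = 0;
    double _mean = 0.0;

    // Empty tag type: carries the element type into overload resolution.
    template <class T>
    struct tag {};

    // Generic implementation, chosen for any non-complex value type.
    template <class I, class T>
    void push_impl(I first, I last, tag<T>)
    {
        for (; first != last; ++first)
            _mean += (*first - _mean) / ++_size;
    }

    // More specialised overload, chosen when the value type is std::complex<T>.
    template <class I, class T>
    void push_impl(I first, I last, tag<std::complex<T>>)
    {
        for (; first != last; ++first)
            _mean += (std::abs(*first) - _mean) / ++_size;
    }

public:
    double mean() const { return _mean; }

    template <class InputIt>
    void push(InputIt first, InputIt last)
    {
        push_impl(first, last,
                  tag<typename std::iterator_traits<InputIt>::value_type>{});
    }
};

// Illustrative usage (hypothetical helper, not part of the original class).
void example()
{
    Mean m;
    std::vector<int> ints{1, 2, 3};
    std::vector<std::complex<double>> zs{std::complex<double>(3.0, 4.0)};
    m.push(ints.begin(), ints.end());   // generic overload
    m.push(zs.begin(), zs.end());       // complex overload, |3+4i| = 5
    assert(std::abs(m.mean() - 2.75) < 1e-12);   // mean of 1, 2, 3, 5
}
```

Both overloads are ordinary private members, so they read and write _mean and _size directly. Calling example() from main() runs the check.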
push_impl can be made either an inner class template (if you use c++11) or a friend class template of your accumulator class (this seems like a good case for using friend declarations, since push_impl is essentially an integral part of your accumulator class implementation, separated purely for language reasons). Then you can use your option #3 (passing this to static methods of push_impl), but without making accumulator members public.
Option #4 doesn't seem too bad either (since it avoids code duplication), but as you mentioned the performance impacts would need to be measured.
Personally I'd be likely to choose your option 4, after all the only part of the iterator version that actually varies with type is the logic in the "single value version"
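A minimal sketch of what option 4 could look like, assuming the single-value overloads from the question and std::abs for |z|. The example() helper is hypothetical and only there to show the calls.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

class Mean
{
    std::size_t _size = 0;
    double _mean = 0.0;

public:
    double mean() const { return _mean; }

    // Single-value version: the only place that updates the state.
    template <class T>
    void push(const T value)
    {
        _mean += (value - _mean) / ++_size;
    }

    // Complex overload: reduce to |z|, then reuse the version above.
    template <class T>
    void push(const std::complex<T> z)
    {
        push(std::abs(z));
    }

    // Iterator version: overload resolution on *first picks the matching
    // single-value push for the element type.
    template <class InputIt>
    void push(InputIt first, InputIt last)
    {
        for (; first != last; ++first)
            push(*first);
    }
};

// Illustrative usage (hypothetical helper).
void example()
{
    Mean m;
    std::vector<double> reals{1.0, 3.0};
    std::vector<std::complex<double>> zs{std::complex<double>(0.0, 2.0)};
    m.push(reals.begin(), reals.end());
    m.push(zs.begin(), zs.end());        // |2i| = 2
    assert(std::abs(m.mean() - 2.0) < 1e-12);   // mean of 1, 3, 2
}
```

Whether the extra call per element costs anything would still have to be measured; for small inline member templates like these a compiler will typically inline it, but that is not guaranteed.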
However another option is to write your iterator versions to receive the mean and size by reference, mean and size can then be updated without them having to be made public.
This will also help with testing as it allows push_impl to be tested separately (although with this approach you might consider that's no longer the best name for the function)
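As a rough sketch of the by-reference idea, assuming a free function that I have called update_mean here (the name and the example() helper are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function: updates the running mean through references,
// so it needs no access to the accumulator's private members.
template <class InputIt>
void update_mean(InputIt first, InputIt last, double& mean, std::size_t& size)
{
    for (; first != last; ++first)
        mean += (*first - mean) / ++size;
}

class Mean
{
    std::size_t _size = 0;
    double _mean = 0.0;

public:
    double mean() const { return _mean; }

    template <class InputIt>
    void push(InputIt first, InputIt last)
    {
        update_mean(first, last, _mean, _size);   // members stay private
    }
};

// Illustrative usage (hypothetical helper).
void example()
{
    std::vector<double> v{2.0, 4.0, 6.0};

    // The helper can be tested without constructing a Mean at all.
    double mean = 0.0;
    std::size_t size = 0;
    update_mean(v.begin(), v.end(), mean, size);
    assert(mean == 4.0 && size == 3);

    // The accumulator gives the same result through its public interface.
    Mean m;
    m.push(v.begin(), v.end());
    assert(m.mean() == 4.0);
}
```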
As an aside, it would be better for your push_impl to be templated only on the iterator type. You can deduce the value type inside push_impl in the same way that you currently do in your calling example, but with only the iterator type as a parameter there's no chance of accidentally calling it with the wrong value type (which might not always cause a compilation error if the value type can be converted to the type you pass as "T").
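One way this could look, as a sketch only: the range function takes just the iterator type, and the value type is looked up inside it. The names to_real and push_range are hypothetical, and the running state is passed by reference purely to keep the example short.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: converts one element to the real value being averaged.
template <class T>
struct to_real
{
    static double get(const T& x) { return static_cast<double>(x); }
};

// Partial specialisation for complex numbers of any type: use |z|.
template <class T>
struct to_real<std::complex<T>>
{
    static double get(const std::complex<T>& z) { return std::abs(z); }
};

// Templated on the iterator only; the value type is deduced inside,
// so a caller cannot pass a mismatching T by accident.
template <class InputIt>
void push_range(InputIt first, InputIt last, double& mean, std::size_t& size)
{
    typedef typename std::iterator_traits<InputIt>::value_type value_type;
    for (; first != last; ++first)
        mean += (to_real<value_type>::get(*first) - mean) / ++size;
}

// Illustrative usage (hypothetical helper).
void example()
{
    double mean = 0.0;
    std::size_t size = 0;
    std::vector<int> ints{1, 3};
    std::vector<std::complex<float>> zs{std::complex<float>(3.0f, 4.0f)};
    push_range(ints.begin(), ints.end(), mean, size);
    push_range(zs.begin(), zs.end(), mean, size);   // |3+4i| = 5
    assert(size == 3);
    assert(std::abs(mean - 3.0) < 1e-6);            // mean of 1, 3, 5
}
```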
Is there a way to build an iterator class that has two implementations: a general implementation for a container containing any number of elements and a special-case (very fast) implementation when the container contains a single element, without using virtual functions and dynamic polymorphism?
For the moment, I have :
struct Container {
struct FastIterator;
struct SlowIterator;
void add(...) { ... }
SlowIterator begin_slow() { ... }
FastIterator begin_fast() { ... }
};
instead I would like to have :
struct Container {
struct Iterator;
void add(...) { ... }
Iterator begin() { // select between fast and slow based on the contents of the container }
};
so that :
void f() {
Container c;
c.add(...);
Container::Iterator it = c.begin(); // uses FastIterator hidden by the Iterator type
}
void f2() {
Container c;
c.add(...);
c.add(...);
Container::Iterator it = c.begin(); // use SlowIterator hidden by the iterator type
}
Of course, the obvious way would be to use a virtual function or a delegate in the Iterator implementation to switch from one case to the other; however, I tested that this slows the iteration down a lot compared to directly using the Slow/Fast iterators.
Since all the information to decide which implementation to use is available during the call to begin(), I would think there is a way to use some kind of compile time polymorphism/trick to avoid any kind of indirection.
Also, I really don't want the user to have to decide if it should call begin_fast() or begin_slow(), this should be automatically handled and hidden by the Iterator class.
Is there a way ?
Thanks
Sure.
Your container becomes a std::variant of two different states, the "single element" state and the "many element" state (and maybe the "zero element" state).
The member function add can convert the zero or single-element container into a single or multi-element container. Similarly, a remove might do the opposite in some cases.
The variant itself doesn't have a begin or end. Instead, users must std::visit it with a function object that can accept either.
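template<class T>
struct Container:
std::variant<std::array<T,0>, std::array<T,1>, std::vector<T>>
{
using base = std::variant<std::array<T,0>, std::array<T,1>, std::vector<T>>;
using base::base;
using base::operator=; // lets "*this = ..." use the variant's converting assignment
void add(T t) {
std::visit(
overload(
[&](std::array<T,0>&) {
*this = std::array<T,1>{{std::move(t)}};
},
[&](std::array<T,1>& self) {
std::array<T,1> tmp = std::move(self);
*this = std::vector<T>{tmp[0], std::move(t)};
},
[&](std::vector<T>& self) {
self.push_back( std::move(t) );
}
),
static_cast<base&>(*this) // visit the variant base explicitly
);
}
};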
boost has a variant that works similarly. overload is merely
struct tag {};
template<class...Fs>
struct overload_t {overload_t(tag){}};
template<class F0, class F1, class...Fs>
struct overload_t<F0, F1, Fs...>: overload_t<F0>, overload_t<F1, Fs...> {
using overload_t<F0>::operator();
using overload_t<F1, Fs...>::operator();
template<class A0, class A1, class...Args>
overload_t( tag, A0&&a0, A1&&a1, Args&&...args ):
overload_t<F0>( tag{}, std::forward<A0>(a0)),
overload_t<F1, Fs...>(tag{}, std::forward<A1>(a1), std::forward<Args>(args)...)
{}
};
template<class F>
struct overload_t<F>:F {
using F::operator();
template<class A>
overload_t( tag, A&& a ):F(std::forward<A>(a)){}
};
template<class...Fs>
overload_t<std::decay_t<Fs>...> overload(Fs&&...fs) {
return {tag{}, std::forward<Fs>(fs)...};
}
overload is ridiculously easier in c++17:
template<class...Fs>
struct overload:Fs...{
using Fs::operator()...;
};
template<class...Fs>
overload(Fs...)->overload<Fs...>;
and use {} instead of ().
Use of this in c++14 looks like:
Container<int> bob = get_container();
std::visit( [](auto&& bob){
for (int x:bob) {
std::cout << x << "\n";
}
}, bob );
and for the 0 and 1 case, the size of the loop will be known exactly to the compiler.
In c++11 you'll have to write an external template function object instead of an inline lambda.
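For reference, here is a compact C++17 sketch of the whole idea in one piece. It is only an illustration under a few assumptions of mine: the variant is held as a private member rather than a base class, and the visit() member and the sum() helper are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <variant>
#include <vector>

template <class... Fs>
struct overload : Fs... { using Fs::operator()...; };
template <class... Fs>
overload(Fs...) -> overload<Fs...>;

template <class T>
class Container
{
    using storage =
        std::variant<std::array<T, 0>, std::array<T, 1>, std::vector<T>>;
    storage data_;   // default state: the empty std::array<T, 0>

public:
    void add(T t)
    {
        std::visit(
            overload{
                [&](std::array<T, 0>&) {
                    data_ = std::array<T, 1>{{std::move(t)}};
                },
                [&](std::array<T, 1>& one) {
                    // The vector is built before the assignment replaces `one`.
                    data_ = std::vector<T>{std::move(one[0]), std::move(t)};
                },
                [&](std::vector<T>& many) { many.push_back(std::move(t)); }},
            data_);
    }

    // Calls f with whichever alternative is currently stored.
    template <class F>
    decltype(auto) visit(F&& f)
    {
        return std::visit(std::forward<F>(f), data_);
    }
};

// The generic lambda is instantiated once per alternative, so the loop over
// std::array<int, 0> or std::array<int, 1> has a compile-time-known length.
int sum(Container<int>& c)
{
    return c.visit([](auto& range) {
        int total = 0;
        for (int x : range)
            total += x;
        return total;
    });
}

// Illustrative usage (hypothetical helper).
void example()
{
    Container<int> c;
    assert(sum(c) == 0);    // zero-element state
    c.add(5);
    assert(sum(c) == 5);    // single-element state
    c.add(7);
    c.add(1);
    assert(sum(c) == 13);   // multi-element state
}
```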
You could move the variant part out of the Container and into what begin returns (inside the iterator), but that would require a complex branching iterator implementation or for callers to visit on the iterator. And as the begin/end iterator types are probably tied, you'd want to return a range anyhow so the visit makes sense. And that gets you half way back to the Container solution anyhow.
You could also implement this outside of variant, but as a general rule earlier operations on a variable cannot change the later type in the same scope of code. It can be used to dispatch on a callable object passed in "continuation passing style", where both implementations will be compiled but one chosen at runtime (via branch). It may be possible for a compiler to realize which branch the visit will go down and dead-code eliminate the other, but the other branch still needs to be valid code.
If you want fully dynamically typed objects, you are going to lose a factor of 2 to 10 in speed at least (which is what languages that support this do), which is hard to recover through iteration efficiency on one-element loops. That would be related to storing the variant-equivalent (maybe a virtual interface or whatever) in the iterator returned and making it handle the branch at runtime in a complex way. As your goal is performance, this isn't practical.
In theory, C++ could have the ability to change the type of variables based on operations on them. Ie, a theoretical language in which
Container c;
is of type "empty container", then:
c.add(foo);
now c changes static type to "single element container", then
c.add(foo);
and c changes static type to "multi-element container".
But that isn't the C++ type model. You can emulate it like above (at runtime), but it isn't the same.
I have a template function that I want to store a pointer to inside a std::vector.
The function looks like this:
template<typename T> void funcName(T& aT, std::vector<std::string>& fileName){...}
Now I want to store multiple pointers to functions of this kind inside a std::vector. For non-template functions I would do it like this:
typedef std::vector<std::string> string_vt;
typedef void func_t(T&, string_vt&);
typedef func_t* funcPointer;
typedef std::vector<funcPointer> funcPointer_vt;
But what is the correct syntax for template functions? How can I store them?
EDIT: First of all, thank you for your fast response. This was my first Question on Stack Overflow, so I am sorry for not providing enough information.
The set of T is finite, it can either be of type ClassA or type classB. In these function templates I want to do changes to T (so either ClassA or ClassB) with some hard coded data. I have 8 of these functions, which basically initiate a default constructed T with data specific to the function. In my program, I want to initiate 2*8 default constructed T's (8 ClassA and 8 ClassB). Therefore I run a for loop, calling one function after the other, to initiate my T objects with the function's body data.
for(int i = 0; i < initT.size(); ++i){
init_T[i]<T>(someT, fileName);
}
The for loop has as many iterations as there are function pointers inside the vector. At every iteration the function is called with some previously default constructed T and some other parameter. At the end the goal is to have 8 initiated T's with data specific to the function.
EDIT2: In case it helps, here is some actual source code. Inside the following function template I want to access my vector of function pointers in order to call the respective function.
template<typename T_Relation, typename T_Relation_Vec, bool row>
void bulk_load(initRelation_vt& aInitFunctions, T_Relation_Vec& aRel_Vec, const bool aMeasure, const uint aRuns, const char* aPath)
{
for(size_t i = 0; i < aRuns; ++i)
{
MemoryManager::freeAll();
aRel_Vec.clear();
string_vt fileNames;
for(size_t j = 0; j < aInitFunctions.size(); ++j)
{
aRel_Vec.emplace_back(T_Relation());
aInitFunctions[j]<T_Relation>(aRel_Vec[j], fileNames);
BulkLoader bl(fileNames[j].c_str(), tuples, aRel_Vec[j], delimiter, seperator);
Measure lMeasure;
if(aMeasure)
{
lMeasure.start();
}
try
{
bl.bulk_load();
if(row)
{
BulkInsertSP bi;
bi.bulk_insert(bl, aRel_Vec[j]);
}
else
{
BulkInsertPAX bi;
bi.bulk_insert(bl, aRel_Vec[j]);
}
}
catch(std::exception& ex)
{
std::cerr << "ERROR: " << ex.what() << std::endl;
}
lMeasure.stop();
if(aMeasure)
{
std::ofstream file;
file.open (aPath, std::ios::out | std::ios::app);
//print_result(file, flag, lMeasure.mTotalTime());
file.close();
}
}
}
}
This line is where the vector of function template pointers is accessed.
aInitFunctions[j]<T_Relation>(aRel_Vec[j], fileNames);
Templates are an advanced technique for static polymorphism. In a typed language, like C++, without static polymorphism you would have to separately define every entity used and precisely indicate every entity referred to.
The mechanisms of static polymorphism in C++ automate this. Overloading lets the compiler work out at build time which function or method is being referred to. Templates let you define multiple entities that share some characteristics at once and defer the definition of particular specializations until build time, where they are inferred from use.
(Notice that in various scenarios, static polymorphism allows separate code, so that changes to use and to definition are independent, which is very useful.)
The important implication of this mechanism is that every specialization of your template may be of different type. It is unclear, as of when I'm responding, whether you want to store pointers to a single or multiple types of specialization in one type of container. The possibilities depend also on parameter and result types of the function template.
A function in C++ has a type that is a combination of the list of its parameter types and its return type. In other words, two functions that take and return the same types are of the same type. If your function template neither took nor returned the template parameter type (i.e. T) or a templated type (e.g. std::vector<T>), every specialization of this function template would be taking and returning the same types and would therefore be a function of the same type.
template <typename T>
int func() { ... }
This (arguably useless) function template takes no arguments and returns int, whatever T is used to specialize the template. Therefore a pointer to it could be used wherever the parameter is defined as int (*f)(). In this case you could keep pointer to any specialization in one vector.
typedef std::vector<std::string> string_vt;
typedef int func_t();
typedef func_t* funcPointer;
typedef std::vector<funcPointer> funcPointer_vt;
funcPointer x = &func<int>;
funcPointer y = &func<float>;
As can be seen, every specialization of your function template is of the same type and both pointers fit in the same container.
Next case - what if function header depends on a template parameter? Every specialization would have a different signature, that is a different function type. The pointers to all of them would be of different types - so it wouldn't be possible to even typedef this pointer once.
template <typename T>
void func(std::vector<T> param) { ... }
In this case function template specialization is of different type depending on T used to specialize.
typedef void func_t_int(std::vector<int>);
typedef func_t_int* funcPointerInt;
typedef std::vector<funcPointerInt> funcPointerInt_vt;
typedef void func_t_float(std::vector<float>);
typedef func_t_float* funcPointerFloat;
typedef std::vector<funcPointerFloat> funcPointerFloat_vt;
funcPointerInt x = &func<int>;
funcPointerFloat y = &func<float>;
Specializations are of different types, because they take different type of vectors. Pointers do not fit in the same container.
It's mention-worthy at this point, that in this case it's not necessary to define every pointer type separately. They could be a template type:
template <typename T>
using funcPointer = void (*)(std::vector<T>);
Which now allows funcPointer<int> to be used as a type qualifier, in place of earlier funcPointerInt.
funcPointer<float> y = &func<float>;
In more complicated situations a template could be created, whose every specialization is of a different type, and then would use a single instance of concrete vector to store various pointers to functions of type of only one of the specializations of your template. Although a simple template like in the example can only produce a single function per type, because every specialization yields one type of function and one function of that type, it's not impossible to conceive a scenario where various pointers to functions are obtained, both to specializations and usual functions, perhaps from various sources. So the technique could be useful.
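Applied to the function shape from the question, this could look roughly like the sketch below: for one fixed T, the specializations of several different function templates all have the same type and fit in one vector. ClassA, ClassB, initOne, initTwo and makeInitFunctions are hypothetical stand-ins, not names from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<std::string> string_vt;

// Hypothetical stand-ins for the two classes mentioned in the question.
struct ClassA { int value = 0; };
struct ClassB { int value = 0; };

// Two hypothetical initialisation function templates with the same shape.
template <typename T>
void initOne(T& aT, string_vt& fileNames)
{
    aT.value = 1;
    fileNames.push_back("one");
}

template <typename T>
void initTwo(T& aT, string_vt& fileNames)
{
    aT.value = 2;
    fileNames.push_back("two");
}

// One pointer type per T; every specialisation for that T shares it.
template <typename T>
using initPointer = void (*)(T&, string_vt&);

// Builds the vector for a given T; the template arguments are fixed here,
// so no "<T>" is needed (or allowed) at the call site later.
template <typename T>
std::vector<initPointer<T>> makeInitFunctions()
{
    return { &initOne<T>, &initTwo<T> };
}

// Runs every stored function on a freshly constructed T.
template <typename T>
std::vector<T> runAll(string_vt& fileNames)
{
    std::vector<initPointer<T>> fns = makeInitFunctions<T>();
    std::vector<T> objects;
    for (std::size_t i = 0; i < fns.size(); ++i)
    {
        objects.emplace_back();
        fns[i](objects[i], fileNames);   // plain call through the pointer
    }
    return objects;
}

// Illustrative usage (hypothetical helper).
void example()
{
    string_vt fileNames;
    std::vector<ClassA> as = runAll<ClassA>(fileNames);
    std::vector<ClassB> bs = runAll<ClassB>(fileNames);
    assert(as[0].value == 1 && as[1].value == 2);
    assert(bs[0].value == 1 && bs[1].value == 2);
    assert(fileNames.size() == 4);
}
```

The design choice here is to build one vector per T inside templated code instead of trying to store "uninstantiated" templates, which have no address.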
But yet another scenario is that despite every specialization of the template being of a different type, there's a need to store pointers to various specializations in a single std::vector. In this case dynamic polymorphism will be helpful. Storing values of different types, e.g. pointers to functions of different types, in one type of variable requires inheritance. It is possible to store any subclass in a field defined as the superclass. Note however, that this is unlikely to accomplish anything really and is probably not what you're really looking for.
I see two general possibilities now. Either use a class template with a method, which inherits from a non-template class.
template <typename T>
class MyClass : public BaseClass
{
public:
T operator()(const T& param, int value);
};
MyClass<int> a;
MyClass<float> b;
BaseClass* ptr = &a;
ptr = &b;
While every specialization of this class may be of a different type, they all share the superclass BaseClass, so a pointer to a BaseClass can actually point to any of them, and a std::vector<BaseClass*> can be used to store them. By overloading operator() we have created an object that mimics a function. The interesting property of such a class is that it can have multiple instances created with parameterized constructors. So effectively the class template produces specializations of multiple types, and in turn every specialized class can produce instances of varying parametrization.
template <typename T>
class MyClass : public BaseClass
{
int functor_param;
public:
MyClass(int functor_param);
T operator()(const T& param, int value);
};
This version allows creation of instances that work differently:
MyClass<int> a(1);
MyClass<int> b(2);
MyClass<float> c(4);
BaseClass* ptr = &a;
ptr = &b;
ptr = &c;
I am no expert on functors, just wanted to present the general idea. If it seems interesting, I suggest researching it now.
But technically we're not storing function pointers, just regular object pointers. Well, as stated before, we need inheritance to use one type of variable to store values of various types. So if we're not using inheritance to exchange our procedural functions for something dynamically polymorphic, we must do the same to pointers.
template <typename T>
T func(std::pair < T, char>) {}
template <typename T>
using funcPointer = T(*)(std::pair<T, char>);
template <typename T>
class MyPointer : public BasePointer
{
funcPointer<T> ptr;
public:
MyPointer(funcPointer<T> ptr);
// forward the call to the wrapped function pointer
T operator()(std::pair<T, char> pair) const
{
return ptr(pair);
}
};
This, again, allows creation of a single std::vector<BasePointer*> to store all possible pseudo-function-pointers.
Now the very important bit. How would You go about calling those, in either scenario? Since in both cases they are stored in a single std::vector<>, they are treated as if they were of the base type. A specific function call needs parameters of specific type and returns a specific type. If there was anything that all subclasses can do in the same way, it could be exposed by defining such a method in base class (in either scenario using functors or pointer..ors?), but a specific specialized function call is not that kind of thing. Every function call that You would want to perform in the end, after all this struggle, would be of a different type, requiring different type of parameters and/or returning different type of value. So they could never all fit into the same place in usual, not templated code, the same circumstances in execution. If they did, then dynamic polymorphism wouldn't be necessary to solve this problem in the first place.
One thing that could be done - which is greatly discouraged and probably defeats the purpose of dynamic polymorphism - is to detect subclass type at runtime and proceed accordingly. Research that, if you're convinced you have a good case for using this. Most likely though, it's probably a big anti-pattern.
But technically, anything you may want to do is possible somehow.
If I have correctly understood you, I may have a really simple and efficient solution:
template<class...Ts>
struct functor{
//something like a dynamic vtable
std::tuple<void(*)(Ts&,std::vector<std::string>&)...> instantiated_func_ptr;
template<class T>
void operator ()(T& aT,std::vector<std::string>& fileName){
get<void(*)(T&,std::vector<std::string>&)>(instantiated_func_ptr)
(aT,fileName);
}
};
Voilà!!
If your standard library does not provide get<typename> for tuples (lookup by type was added to std::get in c++14), we have to define it ourselves (before the definition of the template functor above):
template<class T,class...Ts>
struct find_type{
//always fail if instantiated
static_assert(sizeof...(Ts)!=0,"type not found");
};
template<class T,class U,class...Ts>
struct find_type<T,U,Ts...>:std::integral_constant<size_t,
find_type<T,Ts...>::value+1>{};
template<class T,class...Ts>
struct find_type<T,T,Ts...>:std::integral_constant<size_t,0>{};
template<class T,class...Ts>
constexpr decltype(auto) get(const std::tuple<Ts...>& t){
return get<find_type<T,Ts...>::value>(t);
}
And an example to show how to use it:
struct A{
void show() const{
std::cout << "A" << "\n";
}
};
struct B{
void show() const{
std::cout << "B" << "\n";
}
};
template<class T>
void func1(T& aT,std::vector<std::string>& fileName){
std::cout << "func1: ";
aT.show();
}
template<class T>
void func2(T& aT,std::vector<std::string>& fileName){
std::cout << "func2: ";
aT.show();
}
template<class T>
void func3(T& aT,std::vector<std::string>& fileName){
std::cout << "func3: ";
aT.show();
}
using functorAB = functor<A,B>;
int main(){
auto functor1=functorAB{{func1,func1}};//equivalent to functorAB{{func1<A>,func1<B>}}
auto functor2=functorAB{{func2,func2}};
auto functor3=functorAB{{func3,func3}};
auto v=std::vector<functorAB>{functor1,functor2,functor3};
auto a=A{};
auto b=B{};
auto fileNames = std::vector<std::string>{"file1","file2"};
for(auto& tf:v)
tf(a,fileNames);
for(auto& tf:v)
tf(b,fileNames);
}
In practice it is just a reproduction of the virtual call mechanism,
the tuple in functor is kind of virtual table. This code is not
more efficient than if you had written an abstract functor with virtual
operator() for each of your class A and B and then implemented it for each of
your functions... but it is much more concise, easier to maintain and may produce less binary code.
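For comparison, the hand-written virtual alternative mentioned above might look like the following sketch. To keep it short and checkable I have made some assumptions: the functions return a string instead of printing, the fileName parameter is dropped, and the names AbstractFunctor, Functor1 and Functor2 are made up.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct A { std::string name() const { return "A"; } };
struct B { std::string name() const { return "B"; } };

template <class T>
std::string func1(T& t) { return "func1: " + t.name(); }

template <class T>
std::string func2(T& t) { return "func2: " + t.name(); }

// Abstract functor: one virtual operator() per supported class.
struct AbstractFunctor
{
    virtual ~AbstractFunctor() = default;
    virtual std::string operator()(A& a) const = 0;
    virtual std::string operator()(B& b) const = 0;
};

// One hand-written subclass per function template: this is the boilerplate
// that the tuple-based functor avoids.
struct Functor1 : AbstractFunctor
{
    std::string operator()(A& a) const override { return func1(a); }
    std::string operator()(B& b) const override { return func1(b); }
};

struct Functor2 : AbstractFunctor
{
    std::string operator()(A& a) const override { return func2(a); }
    std::string operator()(B& b) const override { return func2(b); }
};

// Illustrative usage (hypothetical helper).
std::vector<std::string> example()
{
    std::vector<std::unique_ptr<AbstractFunctor>> v;
    v.push_back(std::unique_ptr<AbstractFunctor>(new Functor1));
    v.push_back(std::unique_ptr<AbstractFunctor>(new Functor2));

    A a;
    B b;
    std::vector<std::string> out;
    for (const auto& f : v)
        out.push_back((*f)(a));
    for (const auto& f : v)
        out.push_back((*f)(b));
    return out;
}
```

Every additional function template needs another subclass with one override per class, which is where the extra source code comes from.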
Consider the following (simplified) scenario:
class edgeOne {
private:
...
public:
int startNode();
int endNode();
};
class containerOne {
private:
std::vector<edgeOne> _edges;
public:
std::vector<edgeOne>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeOne>::const_iterator edgesEnd(){
return _edges.end();
};
};
class edgeTwo {
private:
...
public:
int startNode();
int endNode();
};
class containerTwo {
private:
std::vector<edgeTwo> _edges;
public:
std::vector<edgeTwo>::const_iterator edgesBegin(){
return _edges.begin();
};
std::vector<edgeTwo>::const_iterator edgesEnd(){
return _edges.end();
};
};
I.e., I have two mostly identical edge types and two mostly identical container types. I can iterate over each kind individually. So far, so fine.
But now my use case is the following: Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
So my idea is the following: I want to have an iterator with the following properties:
- Regarding its traversal behavior, it behaves either like a std::vector<edgeOne>::const_iterator or a std::vector<edgeTwo>::const_iterator, depending on how it was initialized.
- Instead of returning a const edgeOne & or const edgeTwo &, operator* should return a std::pair<int,int>, i.e., apply a conversion.
I found the Boost.Iterator Library, in particular:
iterator_facade, which helps to build a standard-conforming iterator and
transform_iterator, which could be used to transform edgeOne and edgeTwo to std::pair<int,int>,
but I am not completely sure what the complete solution should look like. If I build nearly the entire iterator myself, is there any benefit to using transform_iterator, or will it just make the solution more heavy-weight?
I guess the iterator only needs to store the following data:
A flag (bool would be sufficient for the moment, but probably an enum value is easier to extend if necessary) indicating whether the value type is edgeOne or edgeTwo.
A union with entries for both iterator types (where only the one that matches the flag will ever be accessed).
Anything else can be computed on-the-fly.
I wonder if there is an existing solution for this polymorphic behavior, i.e. an iterator implementation combining two (or more) underlying iterator implementation with the same value type. If such a thing exists, I could use it to just combine two transform_iterators.
Dispatching (i.e., deciding whether a containerOne or a containerTwo object needs to be accessed) could easily be done by a freestanding function ...
Any thoughts or suggestions regarding this issue?
How about making your edgeOne and edgeTwo polymorphic, and using pointers in the containers?
class edge
class edgeOne : public edge
class edgeTwo : public edge
std::vector<edge*>
Based on some criteria, I get either a containerOne or a containerTwo object. I need to iterate over the edges. But because the types are different, I cannot easily do so without code duplication.
By "code duplication", do you mean source code or object code? Is there some reason to go past a simple template? If you're concerned about the two template instantiations constituting "code duplication" you can move most of the processing to an out-of-line non-templated do_whatever_with(int, int) support function....
template <typename Container>
void iterate_and_process_edges(const Container& c)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
do_whatever_with(i->startNode(), i->endNode());
}
if (criteria)
iterate_and_process_edges(getContainerOne());
else
iterate_and_process_edges(getContainerTwo());
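Spelled out as a complete sketch, with a few assumptions of my own so that it compiles in isolation: the edge classes get constructors and const-qualified accessors, the containers get an add() member and const iterator accessors, and do_whatever_with simply records the node pairs.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified stand-ins for the question's classes (details are assumptions).
class edgeOne
{
    int _start, _end;
public:
    edgeOne(int s, int e) : _start(s), _end(e) {}
    int startNode() const { return _start; }
    int endNode() const { return _end; }
};

class edgeTwo
{
    int _start, _end;
public:
    edgeTwo(int s, int e) : _start(s), _end(e) {}
    int startNode() const { return _start; }
    int endNode() const { return _end; }
};

class containerOne
{
    std::vector<edgeOne> _edges;
public:
    void add(edgeOne e) { _edges.push_back(e); }
    std::vector<edgeOne>::const_iterator edgesBegin() const { return _edges.begin(); }
    std::vector<edgeOne>::const_iterator edgesEnd() const { return _edges.end(); }
};

class containerTwo
{
    std::vector<edgeTwo> _edges;
public:
    void add(edgeTwo e) { _edges.push_back(e); }
    std::vector<edgeTwo>::const_iterator edgesBegin() const { return _edges.begin(); }
    std::vector<edgeTwo>::const_iterator edgesEnd() const { return _edges.end(); }
};

typedef std::vector<std::pair<int, int> > node_pairs;

// Non-templated worker: compiled once and shared by both instantiations.
void do_whatever_with(int start, int end, node_pairs& out)
{
    out.push_back(std::make_pair(start, end));
}

template <typename Container>
void iterate_and_process_edges(const Container& c, node_pairs& out)
{
    for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
        do_whatever_with(i->startNode(), i->endNode(), out);
}

// The runtime decision is made once, outside the loop.
node_pairs collect(bool criteria, const containerOne& one, const containerTwo& two)
{
    node_pairs out;
    if (criteria)
        iterate_and_process_edges(one, out);
    else
        iterate_and_process_edges(two, out);
    return out;
}
```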
My original aim was to hide the dispatching functionality from the code that just needs to access the start and end node. The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Not sure I follow you there, but I'm trying. So, "code that just needs to access the start and end node". It's not clear whether by access you mean to get startNode and endNode for a container element, or to use those values. I'd already factored out a do_whatever_with function that used them, so by elimination your request appears to want to isolate the code extracting the nodes from an Edge - done below in a functor:
template <typename Edge>
struct Processor
{
void operator()(const Edge& edge) const
{
do_whatever_with(edge.startNode(), edge.endNode());
}
};
That functor can then be applied to each node in the Container:
template <typename Container, class Processor>
void visit(const Container& c, const Processor& processor)
{
for (auto i = c.edgesBegin(); i != c.edgesEnd(); ++i)
processor(*i);
}
"hide the dispatching functionality from the code that just needs to access the start and end node" - seems to me there are various levels of dispatching - on the basis of criteria, then on the basis of iteration (every layer of function call is a "dispatch" in one sense) but again by elimination I assume it's the isolation of the iteration as above that you're after.
if (criteria)
visit(getContainerOne(), Processor<edgeOne>());
else
visit(getContainerTwo(), Processor<edgeTwo>());
The dispatching is generic, whereas what happens inside the loop is not, so IMO this is a good reason to separate both.
Can't say I agree with you, but then it depends whether you can see any maintenance issue with my first approach (looks dirt simple to me - a layer less than this latest incarnation and considerably less fiddly), or some potential for reuse. The visit implementation above is intended to be reusable, but reusing a single for-loop (that would simplify further if you have C++11) isn't useful IMHO.
Are you more comfortable with this modularisation, or am I misunderstanding your needs entirely somehow?
template<typename T1, typename T2>
boost::variant<T1*, T2*> get_nth( boost::variant< typename std::vector<T1>::iterator, typename std::vector<T2>::iterator > begin, std::size_t n ) {
// get either a T1 or a T2 from whichever vector you actually have a pointer to
}
// implement this using boost::iterator utilities, I think facade might be the right one
// This takes a function that takes an index, and returns the nth element. It compares for
// equality based on index, and moves position based on index:
template<typename Lambda>
struct generator_iterator {
std::size_t index;
Lambda f;
};
template<typename Lambda>
generator_iterator< typename std::decay<Lambda>::type > make_generator_iterator( Lambda&&, std::size_t index=0 );
boost::variant< it1, it2 > begin; // set this to either one of the begins
auto double_begin = make_generator_iterator( [begin](std::size_t n){return get_nth( begin, n );} );
auto double_end = double_begin + number_of_elements; // where number_of_elements is how big the container we are type-erasing
Now we have an iterator that can iterate over one, or the other, and returns a boost::variant<T1*, T2*>.
We can then write a helper function that uses a visitor to extract the two fields you want from the returned variant, and treat it like an ADT. If you dislike ADTs, you can instead write up a class that wraps the variant and provides methods, or even change the get_nth to be less generic and instead return a struct with your data already produced.
There is going to be the equivalent of a branch on each access, but there is no virtual function overhead in this plan. It currently requires an auto-typed variable, but if you write an explicit functor to replace the lambda [begin](std::size_t n){return get_nth( begin, n );}, even that issue goes away.
Easier solutions:
Write a for_each function that iterates over each of the containers, and passes in the processed data to the passed in function.
struct simple_data { int x,y; };
std::function<void(std::function<void(simple_data)>)> for_each() const {
auto begin = _edges.begin();
auto end = _edges.end();
return [begin,end](std::function<void(simple_data)> f) {
for(auto it = begin; it != end; ++it) {
simple_data d;
d.x = it->getX();
d.y = it->getY();
f(d);
}
};
};
and similarly in the other class. We can now iterate over the contents without caring about the details by calling foo->for_each()( [&](simple_data d) { /*code*/ } );, and because I stuffed the loop into a std::function return value instead of running it directly, we can pass the concept of looping around to another function.
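Put together as a compilable sketch, it might look like this. The Edge and Graph classes and the add method are hypothetical stand-ins for whichever class owns _edges; only simple_data, for_each, getX and getY come from the snippet above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct simple_data { int x, y; };

// Hypothetical edge type exposing the two accessors used above.
class Edge {
public:
    Edge(int x, int y) : _x(x), _y(y) {}
    int getX() const { return _x; }
    int getY() const { return _y; }
private:
    int _x, _y;
};

// Hypothetical owner of the container.
class Graph {
public:
    typedef std::function<void(simple_data)> callback;

    void add(int x, int y) { _edges.push_back(Edge(x, y)); }

    // Returns "the loop" as a value: a function that takes the per-element
    // callback and applies it to every edge.
    std::function<void(callback)> for_each() const {
        auto begin = _edges.begin();
        auto end = _edges.end();
        return [begin, end](callback f) {
            for (auto it = begin; it != end; ++it) {
                simple_data d;
                d.x = it->getX();
                d.y = it->getY();
                f(d);
            }
        };
    }
private:
    std::vector<Edge> _edges;
};
```

One caveat with this design: the returned function holds iterators into _edges, so it must not be called after the container has been modified or destroyed.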
As mentioned in the comments, other solutions include using boost's type-erased iterator wrappers, or writing a Python-style mutable generator that returns a simple_data. You could also directly use the boost iterator-creation functions to create an iterator over boost::variant<T1, T2>.
I have a std::vector<T> of some type that's part of a class, and I need to iterate through it in a lot of different places in my code. So I thought I'd be smart and create a function IterateAttributes and pass it a boost::function object that I can call in the loop with a single element; that way I can pass any function to do work on the elements.
This seems a good idea until you have to implement it. Then the problem arises of what the passed-in function returns and whether it needs other arguments. It seems like I either have to find a way to do this more generically, for example using templates, or I have to create overloads with function objects taking different args.
I think the first (more generic) option is probably better, but how would I go about that?
Below is an attempt that doesn't work. What if I wanted to have a number of args, with all but the Attribute (a struct) arg mandatory? How should I go about it?
template <typename T> template <typename arg>
void ElementNode::IterateAttributes(boost::function<T (arg, Attribute)> func_)
{
std::vector<Attribute>::iterator it = v_attributes.begin();
for (; it != v_attributes.end(); it++)
{
func_(arg, *it);
}
}
Is that what you mean:
template <typename T, typename arg>
void ElementNode::IterateAttributes(boost::function<T (arg, Attribute)> func_, arg a)
{
std::vector<Attribute>::iterator it = v_attributes.begin();
for (; it != v_attributes.end(); it++)
{
func_(a, *it);
}
}
That allows only one extra parameter of any type; if you want, you can also introduce versions taking more parameters.
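As an aside, a different way to get the "more generic" version is to make the callable itself the template parameter, the way std::for_each does. This is not what the code above does; it is a sketch under assumptions: the Attribute fields, AddAttribute and CountLongValues are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical attribute struct; the real one is not shown in the question.
struct Attribute {
    std::string name;
    std::string value;
};

class ElementNode {
public:
    void AddAttribute(const std::string& name, const std::string& value)
    {
        Attribute a;
        a.name = name;
        a.value = value;
        v_attributes.push_back(a);
    }

    // Accepts any callable taking an Attribute. Extra arguments and results
    // live inside the callable, which is returned like std::for_each does.
    template <typename Func>
    Func IterateAttributes(Func func_)
    {
        for (std::vector<Attribute>::iterator it = v_attributes.begin();
             it != v_attributes.end(); ++it)
        {
            func_(*it);
        }
        return func_;
    }

private:
    std::vector<Attribute> v_attributes;
};

// Example functor carrying one extra argument (minLength) and one result (count).
struct CountLongValues {
    std::size_t minLength;
    std::size_t count;
    void operator()(const Attribute& a)
    {
        if (a.value.size() >= minLength)
            ++count;
    }
};
```

Because the functor owns its extra arguments and its result, IterateAttributes needs neither overloads per argument count nor a fixed return type.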
About the return value: what to do with it depends on what the value actually is. The generic (and probably unnecessary) solution would be to return std::list<T>, but I guess that would create more problems than it would solve. If the return type varies (not only in type but also in meaning), then I suggest modifying the templated function so that the callback takes a reference/pointer to the overall result and updates it accordingly:
template <typename T, typename arg>
T ElementNode::IterateAttributes(boost::function<void (arg, Attribute, T&)> func_, arg a)
{
std::vector<Attribute>::iterator it = v_attributes.begin();
T result = T(); // value-initialise the overall result
for (; it != v_attributes.end(); it++)
{
func_(a, *it, result); // the callback updates result in place
}
return result;
}
That's a quick workaround. It works, but it's ugly, error-prone, and a pain to debug.
If you want a variable number of parameters, then you would have to create several templates of the above function. I just tested whether that's possible:
template <typename T>
T boo(T t){
return t;
}
template <typename T, typename TT>
TT boo(T, TT tt){
return tt;
}
void test()
{
int i;
i = boo<int>(0);
i = boo<int,double>(0,0.0);
}
You must remember that functions passed to IterateAttributes must exactly match the parameters given to the Iterate function. That also means that you cannot use default values in its prototype. You will probably have to define several overloaded versions like
void func_(Attribute,arg1, arg2,arg3){...}
void func_(Attribute A,arg1 a1,arg2 a2){func_(A,a1, a2,default3);}
void func_(Attribute A,arg1 a1){func_(A,a1, default2,default3);}
void func_(Attribute A){func_(A,default1, default2,default3);}
a) You want to iterate over the array and do something with each element there: in this case, you want functions that all take an array element and return void. Simple.
b) You want to partially apply functions with more arguments on each element: Write a custom functor around your function which stores the additional, pre-assigned arguments, or use boost::bind to effectively do the same.
Example:
vector<string> myStrings; // assign some values
// define a function with an additional argument
void myFunc(string message, string value)
{
cout << message << value << endl;
}
// allow partial application, i.e. "currying"
struct local_function
{
static string message;
static void myFunc_Curried(string value)
{
myFunc(message, value);
}
};
string local_function::message = "the array contains: ";
// apply the curried function on all elements in the array
for_each(myStrings.begin(), myStrings.end(), local_function::myFunc_Curried);
The functor operates statically only for demonstration purposes. If message is bound to an instance of the struct, you will need something like boost::bind anyway to bind the instance pointer this in order to actually call the curried function. However, if the function I want to apply is used only locally, I prefer following the more readable static approach.
What you are trying to accomplish makes very good sense, and is also built directly into functional languages (for example F#). It is possible to achieve in C++, but requires some workarounds in the aforementioned case b. Please note if writing your own functor, as in my example, that it is common to place the arguments you want to curry away always at the beginning, and to "fill in" the arguments from the beginning to the end when partially applying.
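For comparison, here is a sketch of case b using std::bind, the C++11 standard-library counterpart of boost::bind, instead of a hand-written functor. To keep the result checkable, this variant of myFunc writes to a stream passed as a further leading argument, and the describe helper is a name invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Function with additional leading arguments; only the last one
// is the array element.
void myFunc(std::ostream& out, const std::string& message, const std::string& value)
{
    out << message << value << '\n';
}

// Hypothetical helper: applies myFunc to every element with the first
// two arguments bound in advance, and returns everything that was written.
std::string describe(const std::vector<std::string>& values, const std::string& message)
{
    std::ostringstream out;
    using namespace std::placeholders;
    // std::ref keeps the stream from being copied; _1 is the slot that
    // for_each fills with each element.
    std::for_each(values.begin(), values.end(),
                  std::bind(myFunc, std::ref(out), message, _1));
    return out.str();
}
```

As in the hand-written version, the arguments to curry away sit at the front of the parameter list and the element comes last.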
Summarizing the comments and more thoughts:
Use bind to bind the other arguments, then use for_each on the resulting functor.
To handle return values, you need to think about what the return values mean. If you need to use the values in some way (say, perform a reduction, or use them to influence whether or not to continue performing the operation, etc), then you can use another functor to wrap the original to perform the thing you want.
You could do the same or more using BOOST_FOREACH or the C++0x range-based for loop. That would take even less code to write.
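The last two points can be sketched together: a wrapper functor that turns the return values into a reduction, driven by a range-based for loop. Everything here (weight, Summing, totalWeight, and the use of int as the element type) is a made-up illustration rather than code from the question.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-element function whose return value matters.
int weight(int attribute) { return attribute * 2; }

// Wrapper functor: calls the wrapped function and folds each result
// into a running total, so the loop itself can ignore return values.
template <typename F>
struct Summing {
    F f;
    int* total;
    void operator()(int attribute) const { *total += f(attribute); }
};

int totalWeight(const std::vector<int>& attributes)
{
    int total = 0;
    Summing<int (*)(int)> wrapped = { weight, &total };
    for (int a : attributes)   // C++11 range-based for
        wrapped(a);
    return total;
}
```

The same wrapper idea extends to other meanings of the return value, for example a functor that records whether iteration should stop.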
I have a dilemma. Suppose I have a template class:
template <typename ValueT>
class Array
{
public:
typedef ValueT ValueType;
ValueType& GetValue()
{
...
}
};
Now I want to define a function that receives a reference to the class and calls the function GetValue(). I usually consider the following two ways:
Method 1:
template <typename ValueType>
void DoGetValue(Array<ValueType>& arr)
{
ValueType value = arr.GetValue();
...
}
Method 2:
template <typename ArrayType>
void DoGetValue(ArrayType& arr)
{
typename ArrayType::ValueType value = arr.GetValue();
...
}
There is almost no difference between the two methods. Even calling both functions will look exactly the same:
int main()
{
Array<int> arr;
DoGetValue(arr);
}
Now, which of the two is the best? I can think of some cons and pros:
Method 1 pros:
The parameter is a real class not a template, so it is easier for the user to understand the interface - it is very explicit that the parameter has to be Array. In method 2 you can guess it only from the name. We use ValueType in the function so it is more clear this way than when it is hidden inside Array and must be accessed using the scope operator.
In addition the typename keyword might be confusing for many non template savvy programmers.
Method 2 pros:
This function is more "true" to its purpose. When I think of it, I don't really need the class to be Array. What I really need is a class that has a method GetValue and a type ValueType. That's all. That is, this method is more generic.
This method is also less dependent on the changes in Array class. What if the template parameters of Array are changed? Why should it affect DoGetValue? It doesn't really care how Array is defined.
Every time I have this situation I'm not sure what to choose. What is your choice?
The second one is better. In your "pros" for the first one, you say, "it is very explicit that the parameter has to be Array". But saying that the parameter has to be an Array is an unnecessary limitation. In the second example, any class with a suitable GetValue function will do. Since it's an unnecessary limitation, it's better to remove it (second one) than to make it explicit (first one). You'll write more flexible templates, which is useful in future when you want to get a value from something that isn't an Array.
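A small sketch of what that flexibility buys, assuming a filled-in Array and a hypothetical, unrelated Counter class. DoGetValue is changed here to return the value so the effect is visible; the bodies elided with ... in the question are not known.

```cpp
#include <cassert>

template <typename ValueT>
class Array {
public:
    typedef ValueT ValueType;
    explicit Array(ValueT v) : value_(v) {}
    ValueType& GetValue() { return value_; }
private:
    ValueT value_;
};

// Hypothetical class that is not an Array but offers the same interface.
class Counter {
public:
    typedef long ValueType;
    Counter() : n_(7) {}
    ValueType& GetValue() { return n_; }
private:
    long n_;
};

// Method 2: only requires a nested ValueType and a GetValue() member.
template <typename ArrayType>
typename ArrayType::ValueType DoGetValue(ArrayType& arr)
{
    typename ArrayType::ValueType value = arr.GetValue();
    return value;
}
```

With Method 1's signature, the call on Counter would not compile, because Counter is not a specialisation of Array.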
If your function is very specific to ArrayType, and no other template will satisfy its interface requirements, use #1 as it's both shorter and more specific: the casual reader is informed that it operates on an ArrayType.
If there's a possibility that other templates will be compatible with DoGetValue, use #2 as it's more generic.
But no use obsessing, since it's easy enough to convert between them.
My friend proposed two more, somewhat more extreme, methods:
Method 3: gives you the ability of using types that don't have a ::ValueType.
template <typename ArrayType, typename ValueType = typename ArrayType::ValueType>
void DoGetValue(ArrayType& arr)
{
ValueType value = arr.GetValue();
...
}
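To illustrate the claim for Method 3, here is a sketch with two hypothetical classes, Plain (no nested ValueType) and Tagged (with one). It needs C++11, since default template arguments on function templates are not allowed before that, and DoGetValue returns the value here only to make the result observable.

```cpp
#include <cassert>

// Hypothetical class with GetValue() but no nested ValueType.
class Plain {
public:
    Plain() : v_(3.5) {}
    double& GetValue() { return v_; }
private:
    double v_;
};

// Hypothetical class that does provide ValueType.
class Tagged {
public:
    typedef int ValueType;
    Tagged() : v_(5) {}
    ValueType& GetValue() { return v_; }
private:
    int v_;
};

// Method 3: ValueType defaults to ArrayType::ValueType, but the caller
// may name it explicitly, in which case the default is never looked up.
template <typename ArrayType, typename ValueType = typename ArrayType::ValueType>
ValueType DoGetValue(ArrayType& arr)
{
    ValueType value = arr.GetValue();
    return value;
}
```

For Tagged a plain DoGetValue(t) works as before, while for Plain the caller has to write DoGetValue<Plain, double>(p), so the extra flexibility comes at the cost of a more verbose call.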
Method 4: a cool way of forcing the array to be a class that has one template parameter.
template <template <typename> class ArrayType, typename ValueType>
void DoGetValue(ArrayType<ValueType>& arr)
{
ValueType value = arr.GetValue();
...
}