BOOST_PP_ITERATION for variable length arguments - c++

I want to incorporate luabind into one of my projects. To do so I need to provide a function which behaves similarly to call_function (see below). This function uses some template magic (courtesy of Boost) that I'd appreciate some help with. This is the first time I've really come across template metaprogramming (is that what it's called?) and so I'm a little lost. Here are a few snippets I'd appreciate help with.
#define LUABIND_TUPLE_PARAMS(z, n, data) const A##n *
#define LUABIND_OPERATOR_PARAMS(z, n, data) const A##n & a##n
I'm not really sure what this preprocessor bit is up to, I don't even know what it's called so searching is a little difficult. A is a template type. If I remember correctly #a would insert the literal text of a, but what do the multiple # do? After this preprocessor stuff comes this.
template<class Ret BOOST_PP_COMMA_IF(BOOST_PP_ITERATION()) BOOST_PP_ENUM_PARAMS(BOOST_PP_ITERATION(), class A)>
typename boost::mpl::if_<boost::is_void<Ret>
, luabind::detail::proxy_function_void_caller<boost::tuples::tuple<BOOST_PP_ENUM(BOOST_PP_ITERATION(), LUABIND_TUPLE_PARAMS, _)> >
, luabind::detail::proxy_function_caller<Ret, boost::tuples::tuple<BOOST_PP_ENUM(BOOST_PP_ITERATION(), LUABIND_TUPLE_PARAMS, _)> > >::type
call_function(lua_State* L, const char* name BOOST_PP_COMMA_IF(BOOST_PP_ITERATION()) BOOST_PP_ENUM(BOOST_PP_ITERATION(), LUABIND_OPERATOR_PARAMS, _) )
{
typedef boost::tuples::tuple<BOOST_PP_ENUM(BOOST_PP_ITERATION(), LUABIND_TUPLE_PARAMS, _)> tuple_t;
#if BOOST_PP_ITERATION() == 0
tuple_t args;
#else
tuple_t args(BOOST_PP_ENUM_PARAMS(BOOST_PP_ITERATION(), &a));
#endif
}
As you can see it makes heavy use of Boost. I've googled BOOST_PP_ITERATION but still can't really make out what it's doing. Could someone please explain to me, preferably in the context of this code, what the BOOST_PP stuff is doing, and how it manages to get the arguments into args.
My end goal is to define a call_function within my own code that will generate args which I can pass to an overload of call_function which I'll define. This means I can use the same calling convention, but can also apply some preprocessing before invoking luabind.
This question is quite specific in the way I've worded it, but I hope the concepts are general enough for it to be OK on here.

BOOST_PP_* is not related to template metaprogramming; it's a preprocessor library. As the name says, it works with preprocessor magic, doing some really brain-twisting things to generate a bunch of similar templates. In your case, that would be the following:
//preprocessor iteration 0
template<class Ret>
typename boost::mpl::if_<boost::is_void<Ret>
, luabind::detail::proxy_function_void_caller<boost::tuples::tuple<> >
, luabind::detail::proxy_function_caller<Ret, boost::tuples::tuple<> > >::type
call_function(lua_State* L, const char* name )
{
typedef boost::tuples::tuple<> tuple_t;
tuple_t args;
}
//preprocessor iteration 1
template<class Ret , class A0>
typename boost::mpl::if_<boost::is_void<Ret>
, luabind::detail::proxy_function_void_caller<boost::tuples::tuple<const A0 *> >
, luabind::detail::proxy_function_caller<Ret, boost::tuples::tuple<const A0 *> > >::type
call_function(lua_State* L, const char* name , const A0 & a0 )
{
typedef boost::tuples::tuple<const A0 *> tuple_t;
tuple_t args(&a0);
}
and so on, up to some maximum defined elsewhere (e.g. A0, A1, A2, A3... A9 if the maximum is 10)
The ## is the preprocessor's token concatenation operator, in this case concatenating A (or a) with whatever value n has (=> A0, A1, A2, ...). The whole code sits inside a preprocessing loop.
BOOST_PP_ITERATION() gives the current loop index (0, 1, 2...)
BOOST_PP_COMMA_IF(X) gives a comma, if the argument is not 0, e.g. the comma before "class A0" in iteration 1 in the template parameter list
BOOST_PP_ENUM(n,B,C) gives a comma separated list of B(?, N, C), where N runs from 0..(n-1), i.e. the macro B gets executed n times, so calling BOOST_PP_ENUM(3, LUABIND_TUPLE_PARAMS, _) gives const A0 *, const A1 *, const A2 *
BOOST_PP_ENUM_PARAMS(n, X) gives a comma separated list of X##n, e.g. &a0, &a1, &a2 for BOOST_PP_ENUM_PARAMS(3, &a)
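To see the token pasting in isolation, here is a minimal sketch that does not use Boost. TUPLE_PARAM, OPERATOR_PARAM and make_args are hypothetical names introduced for illustration; the two-argument function is written out by hand to mirror what BOOST_PP_ENUM would generate for n == 2.

```cpp
#include <tuple>
#include <type_traits>

// Same shape as the luabind macros, minus the unused z and data parameters.
// A##n pastes the token A onto the value of n, giving A0, A1, ...
#define TUPLE_PARAM(n) const A##n *
#define OPERATOR_PARAM(n) const A##n & a##n

// Hand-written stand-in for one iteration (n == 2) of the generated code.
// make_args is a hypothetical name, not part of luabind.
template <class A0, class A1>
std::tuple<TUPLE_PARAM(0), TUPLE_PARAM(1)>
make_args(OPERATOR_PARAM(0), OPERATOR_PARAM(1))
{
    // Expands to: std::tuple<const A0 *, const A1 *>(&a0, &a1)
    return std::tuple<TUPLE_PARAM(0), TUPLE_PARAM(1)>(&a0, &a1);
}
```

The tuple stores pointers to the caller's arguments, which is how the original code gets the arguments into args without copying them.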
Many of the use cases for that Preprocessor magic can be done with variadic templates these days, so if you are lucky you will not come across that stuff again ;) It's not easy to grasp at first sight, because preprocessing does not work like other known C++ features and has some limitations that one has to work around, making it even less easy to understand.
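As a rough illustration of the variadic alternative, the argument-packing part of the generated overloads could be written once like this. pack_args is a hypothetical name, and the sketch only covers building the tuple of pointers, not the luabind proxy types.

```cpp
#include <tuple>
#include <type_traits>

// One variadic template instead of N preprocessor-generated overloads.
// pack_args is a hypothetical helper, not a luabind function.
template <class... As>
std::tuple<const As*...> pack_args(const As&... as)
{
    // Pack expansion plays the role of BOOST_PP_ENUM_PARAMS(n, &a):
    // it yields &a0, &a1, ... for however many arguments were passed.
    return std::tuple<const As*...>(&as...);
}
```

With zero arguments the pack is empty and the result is an empty std::tuple<>, which corresponds to the special case handled by #if BOOST_PP_ITERATION() == 0 in the original.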

Related

Compile-Time Creation of Array of Templated Objects in High Level Synthesis

I'm trying to accomplish this with HLS, not with "normal" C++, so most libraries (STL, boost, etc.) won't work as they can't be synthesized (manual memory management is not allowed). I think this should be possible with template metaprogramming, but I'm a little stuck.
I want to create an array of shift registers, each with a variable depth. I have N inputs, and I want to create N shift registers, with depths 1 to N, where N is known at compile time. My shift register class basically looks like
template<int DEPTH>
class shift_register{
int registers[DEPTH];
...
};
I tried following this and adapting it: Programmatically create static arrays at compile time in C++ , however, the issue is with the last line. Each templated shift register is going to be a different type, and so can't be put together in an array. But I do need an array, as there wouldn't be a way to access each shift register.
Any help would be appreciated!
Just to clarify, my problem was the following: generate N shift_registers, templated from 1 to N, where N is a compile time constant.
For example, if I had N=4, I could easily write this as:
shift_register<1> sr1;
shift_register<2> sr2;
shift_register<3> sr3;
shift_register<4> sr4;
But this wouldn't be easy to change, if I wanted a different value for N in the future.
I ended up using the preprocessor and took the solution from here: How do I write a recursive for-loop "repeat" macro to generate C code with the CPP preprocessor?
I used the macros from that solution like this:
#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__
#define BODY(i) shift_register<i> CAT(sr,i)
REPEAT_ADD_ONE(BODY, N, 1);
And then something similar to that in order to access the shift registers, in a sort of array fashion.
This let me achieve the compile time generation that I was looking for, and get the array type access I needed.
Your question was somewhat difficult to understand but I'll do my best...
template <typename ... Args>
constexpr auto make_array(Args && ... pArgs)
{
using type = std::common_type_t<std::decay_t<Args>...>;
return std::array<type, sizeof...(Args)>{ (type)pArgs ... };
}
Then use it like this:
auto constexpr var_array_of_arrays = std::make_tuple
(
make_array(1, 2, 3, 3),
make_array(2, 3, 4),
make_array(1, 2, 3 ,4 ,3, 5)
);
To get the M'th element you access it like this; M has to be a compile-time constant:
std::get<M>(var_array_of_arrays);
To access the Nth element in the Mth array:
auto constexpr value = std::get<M>(var_array_of_arrays)[N];
And to improve the interface:
template <size_t M, size_t N, typename T >
constexpr decltype(auto) get_element(T && pInput)
{
return std::get<M>(std::forward<T>(pInput))[N];
}
Used like this:
auto constexpr element0_1 = get_element<0, 1>(var_array_of_arrays);
This will allow you to use an array of variable length arrays, or at least something that behaves like that and is identical to that in memory.
Whenever I hear "compile time number sequence" I think std::index_sequence
namespace detail {
template <typename>
struct shift_registers;
template <std::size_t ... Is> // 0, 1, ... N-1
struct shift_registers<std::index_sequence<Is...> > {
using type = std::tuple<shift_register<Is + 1>...>;
};
template <typename T>
using shift_registers_t = typename shift_registers<T>::type;
}
template <std::size_t N>
using shift_registers = detail::shift_registers_t<std::make_index_sequence<N>>;
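A usage sketch for the tuple approach follows. The shift_register body and the shift_in helper are assumptions made for illustration (the question elides the real class), and whether std::tuple is synthesizable depends on the HLS tool.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Stand-in for the question's class; the real one has more members.
template <int DEPTH>
struct shift_register {
    static constexpr int depth = DEPTH;
    int registers[DEPTH] = {};
};

namespace detail {
template <typename>
struct shift_registers;
template <std::size_t... Is>  // 0, 1, ... N-1
struct shift_registers<std::index_sequence<Is...>> {
    using type = std::tuple<shift_register<Is + 1>...>;
};
}  // namespace detail

template <std::size_t N>
using shift_registers =
    typename detail::shift_registers<std::make_index_sequence<N>>::type;

// Hypothetical helper: push value into register I (zero-based) and
// return the value that falls out of the far end.
template <std::size_t I, typename Regs>
int shift_in(Regs& regs, int value)
{
    constexpr int depth = std::tuple_element_t<I, Regs>::depth;
    auto& r = std::get<I>(regs).registers;
    const int out = r[depth - 1];
    for (int i = depth - 1; i > 0; --i) r[i] = r[i - 1];
    r[0] = value;
    return out;
}
```

Access is by compile-time index, for example shift_in<1>(regs, 7) on a shift_registers<3> object, so this gives tuple-style rather than runtime array-style indexing.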

Defining macro improving syntax of specific function

I've created a function declared as:
template <typename Container, typename Task>
void parallel_for_each(Container &container, Task task,
unsigned number_of_threads = std::thread::hardware_concurrency())
It's not difficult to guess what it is supposed to do. I'd like to create a macro simplifying the syntax of this function and making its syntax "loop-like". I've come up with an idea:
#define in ,
#define pforeach(Z,X,Y) parallel_for_each(X,[](Z)->void{Y;})
Where usage as:
pforeach(double &element, vec,
{
element *= 2;
});
works as expected, but this one:
pforeach(double &element in vec,
{
element *= 2;
element /= 2;
});
gives an error
macro "pforeach" requires 3 arguments, but only 2 given
Do you have any idea how to write a macro allowing even "nicer" syntax? Why "in" doesn't stand for comma in my code?
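The reason that in does not act as a separator is that the arguments of a function-like macro are identified before any macro replacement takes place inside them, so pforeach only ever sees two arguments. For the comma produced by in to count, the argument list has to be fully expanded first and then passed on to another macro, and the outer macro has to be variadic so that it accepts the two-argument form: Try
#define in ,
#define pforeach_(Z,X,Y) parallel_for_each(X,[](Z)->void{Y;})
#define pforeach(...) pforeach_(__VA_ARGS__)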
Note: Defining in as , is not gonna end well!
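A minimal sketch of the forwarding trick, using a variadic outer macro so that the two-argument call is accepted and the comma produced by in is only counted when the inner macro is invoked. The sequential parallel_for_each and the doubled function are stand-ins assumed for illustration; the real function would distribute the work over threads.

```cpp
#include <vector>

// Sequential stand-in for the question's parallel_for_each (assumption).
template <typename Container, typename Task>
void parallel_for_each(Container& container, Task task)
{
    for (auto& element : container) task(element);
}

#define in ,
#define pforeach_(Z, X, Y) parallel_for_each(X, [](Z) -> void { Y; })
// __VA_ARGS__ is fully expanded (in becomes a comma) before pforeach_
// is invoked, so pforeach_ sees three arguments.
#define pforeach(...) pforeach_(__VA_ARGS__)

// Hypothetical example function using the loop-like syntax.
inline std::vector<double> doubled(std::vector<double> vec)
{
    pforeach(double &element in vec, {
        element *= 2;
    });
    return vec;
}

// Limit the damage: in is far too common an identifier to leave defined.
#undef in
```

A comma inside the loop body that is not enclosed in parentheses would still split the arguments, which is one more reason to be careful with this approach.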
An idea to add "nicer" syntax:
template <typename Container>
struct Helper {
Container&& c;
template <typename Arg>
void operator=(Arg&& arg) {
parallel_for_each(std::forward<Container>(c), std::forward<Arg>(arg));
}
};
#define CONCAT_(a,b) a##b
#define CONCAT(a,b) CONCAT_(a,b)
// Easier with Boost.PP
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC(i) CONCAT(DEC_,i)
#define pforeach(Z, ...) \
Helper<decltype((__VA_ARGS__))> CONCAT(_unused_obj, __COUNTER__){__VA_ARGS__}; \
CONCAT(_unused_obj, DEC(__COUNTER__))=[](Z)
Usable as
int a[] = {1, 2, 3};
pforeach(int i, a) {
std::cout << i << ", ";
};
pforeach(int i, std::vector<int>{1, 2, 3}) {
std::cout << -i << ", ";
};
Has several disadvantages though. I'd just stick with what you've got so far.
Why "in" doesn't stand for comma in my code?
Because that replacement is performed after macro arguments are determined. Quoting standard draft N3797, § 16.3.1 Argument substitution:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. ... Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
So the preprocessor identifies pforeach(double &element in vec, {}) as a function-like macro call with two arguments:
The first consists of the tokens double, &, element, in and vec and is bound to parameter Z
The second consists of the tokens { and } and is bound to parameter X
The argument for Y is obviously missing
Do you have any idea how to write a macro allowing even "nicer" syntax?
This is hard to answer and is a matter of taste. C++ has rich capabilities for patching syntax with operator overloading, but you can't build a full DSL that way, so it is better to use the default syntax. It is not that ugly, and it is easy to read:
parallel_for_each(vec, [](double& el){ el *= 2; })
There is no macro language. Macros are handled by the C/C++ preprocessor, and preprocessor implementations may vary.
Most preprocessors expect you to pass the exact number of parameters. I found that the GNU preprocessor checks parameters less strictly, which allows a kind of variadic list. But in general a macro won't help you with your task.
I recommend writing the short statement in a function instead of a macro. An inline function is as fast and short as a macro, but type safe.
Furthermore, a function allows default parameter values, so you can omit some arguments.
Trying to improve the idea of @Columbo:
template <typename Container>
struct __pforeach__helper {
Container &&c;
template <typename Arg>
void operator=(Arg&& arg) {
parallel_for_each(std::forward<Container>(c), std::forward<Arg>(arg));
}
};
//additional helper function
template <typename Container>
__pforeach__helper<Container> __create__pforeach__helper(Container &&c)
{
return __pforeach__helper<Container>{std::forward<Container>(c)};
}
#define pforeach(Z,C) \
__create__pforeach__helper(C)=[](Z)
It doesn't rely on __COUNTER__ and doesn't require defining DEC_x macros. Any feedback is most welcome!

How can I take a decision based on a type within a macro?

In my company, I'm working on providing a faster SSE path for some hot code. I'm using the intrinsic approach which keeps to C++ and really shows impressive results.
All code only has to work on float and double, so I created a templated SSE operations class that I specialized for both. What I really don't like is that these two classes look almost identical except for the number type (float/double), the SSE type used (__m128/__m128d) and the intrinsics suffix (_ps/_pd), like so:
template<>
struct SseOperations<float> : public Sse<float>
{
typedef __m128 vector;
vector load(float const * const from) const
{
return _mm_loadu_ps(from);
}
vector add(vector const & a, vector const & b) const
{
return _mm_add_ps(a, b);
}
// etc.
};
and
template<>
struct SseOperations<double> : public Sse<double>
{
typedef __m128d vector;
vector load(double const * const from) const
{
return _mm_loadu_pd(from);
}
vector add(vector const & a, vector const & b) const
{
return _mm_add_pd(a, b);
}
// etc.
};
I wouldn't know how to unify this using template magic, because of the different intrinsics suffix.
Then the ## capability of macros came to my mind, which would lend itself for that purpose. So I managed to put the complete specialized class into a macro that I could use to generate both classes with:
SSE_OPERATIONS(float, __m128, _ps);
SSE_OPERATIONS(double, __m128d, _pd);
I know macros are evil and all, but at least in this case I don't see any of the typical dangers and it gets the job done.
What bothers me now is that the second and third macro parameter are redundant; they could be deduced from the first one, only that I have absolutely no idea how. #if and its friends aren't supposed to work because sizeof() doesn't work during pre-processing.
Searching for solutions is unexpectedly hard, because of #if topics polluting the results heavily. Can anyone tell me how to do a macro level decision for this problem?
PS: I heard of Boost Preprocessor but I'm not allowed to use it.
Update: Although I'm asking for a macro solution, I would also accept a nice template solution. For that, know that I'm encapsulating at least 7 intrinsics—just in case that would bloat template code.
You can get rid of the second parameter by a trait:
template <class Scalar>
struct Vector;
template <>
struct Vector<float>
{
typedef __m128 type;
};
template <>
struct Vector<double>
{
typedef __m128d type;
};
As for the third one, you can do a really ugly special preprocessor hack trick:
#define SUFFIX_float ps
#define SUFFIX_double pd
and use ## on SUFFIX_ and the outermost macro parameter to arrive at the correct version. Of course, it would require some levels of indirection to get the macros to expand at the correct time. Using Boost.Preprocessor, in particular BOOST_PP_CAT and possibly BOOST_PP_EXPAND, might make this slightly easier.
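The levels of indirection could look roughly like the following sketch. To keep it portable, fake_add_ps and fake_add_pd are made-up stand-ins for the real intrinsics, and Ops, CAT, SUFFIXED and DEFINE_OPS are hypothetical names; in real code the pasted prefix would be something like _mm_add_.

```cpp
// Suffix lookup table keyed on the scalar type name.
#define SUFFIX_float  ps
#define SUFFIX_double pd

// Two-level concatenation: CAT expands its arguments first, CAT_ pastes.
#define CAT_(a, b) a##b
#define CAT(a, b) CAT_(a, b)
// SUFFIXED(fake_add_, float) -> fake_add_ps
#define SUFFIXED(name, scalar) CAT(name, CAT(SUFFIX_, scalar))

// Stand-ins for intrinsics such as _mm_add_ps / _mm_add_pd (assumption).
constexpr float  fake_add_ps(float a, float b)   { return a + b; }
constexpr double fake_add_pd(double a, double b) { return a + b; }

template <class Scalar>
struct Ops;

// Only the scalar type is passed; the suffix is derived from it.
#define DEFINE_OPS(scalar)                                  \
    template <>                                             \
    struct Ops<scalar> {                                    \
        static constexpr scalar add(scalar a, scalar b)     \
        {                                                   \
            return SUFFIXED(fake_add_, scalar)(a, b);       \
        }                                                   \
    }

DEFINE_OPS(float);
DEFINE_OPS(double);
```

The extra CAT layer matters because ## suppresses expansion of its operands; pasting directly would produce fake_add_SUFFIX_float instead of fake_add_ps.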

map/fold operators (in c++)

I am writing library which can do map/fold operations on ranges. I need to do these with operators. I am not very familiar with functional programming and I've tentatively selected * for map and || for fold. So to find (brute force algorithm) maximum of cos(x) in interval: 8 < x < 9:
double maximum = ro::range(8, 9, 0.01) * std::cos || std::max;
In above, ro::range can be replaced with any STL container.
I don't want to be different if there is any convention for map/fold operators. My question is: is there a math notation or does any language uses operators for map/fold?
** EDIT **
For those who asked, below is small demo of what RO currently can do. scc is small utility which can evaluate C++ snippets.
// Can print ranges, container, tuples, etc directly (vint is vector<int>) :
scc 'vint V{1,2,3}; V'
{1,2,3}
// Classic pipe. Algorithms are from std::
scc 'vint{3,1,2,3} | sort | unique | reverse'
{3, 2, 1}
// Assign 42 to [2..5)
scc 'vint V=range(0,9); range(V/2, V/5) = 42; V'
{0, 1, 42, 42, 42, 5, 6, 7, 8, 9}
// concatenate vector of strings ('add' is shortcut for std::plus<T>()):
scc 'vstr V{"aaa", "bb", "cccc"}; V || add'
aaabbcccc
// Total length of strings in vector of strings
scc 'vstr V{"aaa", "bb", "cccc"}; V * size || (_1+_2)'
9
// Assign to c-string, then append `"XYZ"` and then remove `"bc"` substring :
scc 'char s[99]; range(s) = "abc"; (range(s) << "XYZ") - "bc"'
aXYZ
// Remove non alpha-num characters and convert to upper case
scc '(range("abc-123, xyz/") | isalnum) * toupper'
ABC123XYZ
// Hide phone number:
scc "str S=\"John Q Public (650)1234567\"; S|isdigit='X'; S"
John Q Public (XXX)XXXXXXX
This is really more a comment than a true answer, but it's too long to fit in a comment.
At least if my memory for the terminology serves correctly, map is essentially std::transform, and fold is std::accumulate. Assuming that's correct, I think trying to write your own would be ill-advised at best.
If you want to use map/fold style semantics, you could do something like this:
std::transform(std::begin(sto), std::end(sto), std::begin(sto), ::cos);
double maximum = *std::max_element(std::begin(sto), std::end(sto));
Although std::accumulate is more like a general-purpose fold, std::max_element is basically a fold(..., max); If you prefer a single operation, you could do something like:
double maximum = std::cos(*std::max_element(std::begin(sto), std::end(sto),
[](double a, double b) { return cos(a) < cos(b); }));
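For comparison, the same computation written as an explicit fold with std::accumulate might look like the sketch below, with the map step (std::cos) applied inside the folding function. sample and max_of_cos are hypothetical helper names, and sample stands in for the question's ro::range.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical stand-in for ro::range: values first, first+step, ... < last.
inline std::vector<double> sample(double first, double last, double step)
{
    std::vector<double> xs;
    for (int i = 0; first + i * step < last; ++i) xs.push_back(first + i * step);
    return xs;
}

// Fold with max over cos(x); the neutral element is minus infinity.
inline double max_of_cos(const std::vector<double>& xs)
{
    return std::accumulate(
        xs.begin(), xs.end(), -std::numeric_limits<double>::infinity(),
        [](double best, double x) { return std::max(best, std::cos(x)); });
}
```

Calling max_of_cos(sample(8, 9, 0.01)) corresponds to the one-liner in the question, without overloading any operators.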
I urge you to reconsider overloading operators for this purpose. Either example I've given above should be clear to almost any reasonable C++ programmer. The example you've given will be utterly opaque to most.
On a more general level, I'd urge extreme caution when overloading operators. Operator overloading is great when used correctly -- being able to overload operators for things like arbitrary precision integers, matrices, complex numbers, etc., renders code using those types much more readable and understandable than code without overloaded operators.
Unfortunately, when you use operators in unexpected ways, precisely the opposite is true -- and these uses are certainly extremely unexpected -- in fact, well into the range of "quite surprising". There might be question (but at least a little justification) if these operators were well understood in specific areas, but contrary to other uses in C++. In this case, however, you seem to be inventing a notation "out of whole cloth" -- I'm not aware of anybody using any operator C++ supports overloading to mean either fold or map (nor anything visually similar or analogous in any other way). In short, using overloading this way is a poor and unjustified idea.
Of the languages I know, there is no standard way for folding. Scala uses operators /: and :\ as well as method names, Lisp has reduce, Haskell has foldl.
map on the other hand is more common to find simply as map in all the languages I know.
Below is an implementation of fold in quasi-human-readable infix C++ syntax. Note that the code is not very robust and only serves to demonstrate the point. It is made to support the more usual 3-argument fold operators (the range, the binary operation, and the neutral element).
This is easily the funniest way to abuse (have you just said "rape"?) operator overloading, and one of the best ways to shoot yourself in the foot with a 900-pound artillery shell.
enum fold_t { fold };
template <typename Op>
struct fold_intermediate_1
{
Op op;
fold_intermediate_1 (Op op) : op(op) {}
};
template <typename Cont, typename Op, bool>
struct fold_intermediate_2
{
const Cont& cont;
Op op;
fold_intermediate_2 (const Cont& cont, Op op) : cont(cont), op(op) {}
};
template <typename Op>
fold_intermediate_1<Op> operator/(fold_t, Op op)
{
return fold_intermediate_1<Op>(op);
}
template <typename Cont, typename Op>
fold_intermediate_2<Cont, Op, true> operator<(const Cont& cont, fold_intermediate_1<Op> f)
{
return fold_intermediate_2<Cont, Op, true>(cont, f.op);
}
template <typename Cont, typename Op, typename Init>
Init operator< (fold_intermediate_2<Cont, Op, true> f, Init init)
{
return foldl_func(f.op, init, std::begin(f.cont), std::end(f.cont));
}
template <typename Cont, typename Op>
fold_intermediate_2<Cont, Op, false> operator>(const Cont& cont, fold_intermediate_1<Op> f)
{
return fold_intermediate_2<Cont, Op, false>(cont, f.op);
}
template <typename Cont, typename Op, typename Init>
Init operator> (fold_intermediate_2<Cont, Op, false> f, Init init)
{
return foldr_func(f.op, init, std::begin(f.cont), std::end(f.cont));
}
foldr_func and foldl_func (the actual algorithms of left and right folds) are defined elsewhere.
Use it like this:
foo myfunc(foo, foo);
container<foo> cont;
foo zero, acc;
acc = cont >fold/myfunc> zero; // right fold
acc = cont <fold/myfunc< zero; // left fold
The word fold is used as a kind of poor man's new reserved word here. One can define several variations of this syntax, including
<<fold/myfunc<< >>fold/myfunc>>
<foldl/myfunc> <foldr/myfunc>
|fold<myfunc| |fold>myfunc|
The inner operator must have the same or greater precedence as the outer one(s). This is a limitation of the C++ grammar.
For map, only one intermediate is needed and the syntax could be e.g.
mapped = cont |map| myfunc;
Implementing it is a simple exercise.
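One possible way to do that exercise is sketched below. The namespace sketch and the type map_intermediate are hypothetical names, the tag lives in a namespace to avoid clashing with std::map, and the result is assumed to be a container of the same type as the input that supports push_back.

```cpp
#include <algorithm>
#include <iterator>

namespace sketch {

enum map_t { map };  // tag used as the "keyword" between the two | operators

// Holds the container between the first and second operator|.
template <typename Cont>
struct map_intermediate {
    const Cont& cont;
};

// cont | map  ->  intermediate
template <typename Cont>
map_intermediate<Cont> operator|(const Cont& cont, map_t)
{
    return map_intermediate<Cont>{cont};
}

// intermediate | op  ->  new container with op applied to every element
template <typename Cont, typename Op>
Cont operator|(map_intermediate<Cont> m, Op op)
{
    Cont result;
    std::transform(std::begin(m.cont), std::end(m.cont),
                   std::back_inserter(result), op);
    return result;
}

}  // namespace sketch
```

Used as mapped = cont | sketch::map | myfunc; both operators are found through argument-dependent lookup. The same warning about production code applies.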
Oh, and please don't use this syntax in production, unless you know very well what you are doing, and probably even if you do ;)

Generating permutations via templates

I'd like a function, or function object, that can generate a permutation of its inputs with the permutation specified at compile time. To be clear, I am not looking to generate all of the permutations, only a specific one. For instance, permute<1,4,3,2>( a, b, c, d ) would return (a,d,c,b). Obviously, it is straightforward to do this with a permutation of a specific length, e.g. 2, like this
#include <boost/tuple/tuple.hpp>
template< unsigned a, unsigned b>
struct permute {
template< class T >
boost::tuple< T, T > operator()( T ta, T tb ) {
boost::tuple< T, T > init = boost::make_tuple( ta, tb );
return boost::make_tuple( init.template get< a >(), init.template get< b >() );
}
};
But how would I go about doing this for an arbitrary-length permutation? Also, is there a cleaner way of writing the above code? Yes, the above code is not restricted to making permutations as permute<2,2>(a,b) is allowed, but I don't see that as a flaw. However, can it be restricted to only allowing actual permutations?
C++0x provides variadic templates, which you should be able to use to handle an arbitrary length permutation. They were added specifically because the current version of C++ doesn't have a clean way of dealing with this kind of problem.