When to Overload the Comma Operator? - c++

I see questions on SO every so often about overloading the comma operator in C++ (mainly unrelated to the overloading itself, but things like the notion of sequence points), and it makes me wonder:
When should you overload the comma? What are some examples of its practical uses?
I just can't think of any examples off the top of my head where I've seen or needed to do something like
foo, bar;
in real-world code, so I'm curious as to when (if ever) this is actually used.

I have used the comma operator in order to index maps with multiple indices.
enum Place {new_york, washington, ...};
pair<Place, Place> operator , (Place p1, Place p2)
{
return make_pair(p1, p2);
}
map< pair<Place, Place>, double> distance;
distance[new_york, washington] = 100;
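A self-contained sketch of the same idea follows. Two details are additions rather than part of the answer above: the map is called `dist` (a hypothetical name, chosen so it cannot clash with `std::distance`), and the comma expression gets its own parentheses. C++20 deprecated an unparenthesized comma expression inside `[]`, and C++23 parses `a[b, c]` as a subscript with two arguments, so the parenthesized form is the one that keeps its meaning across standards.

```cpp
#include <map>
#include <utility>

enum Place { new_york, washington, boston };

// Overloaded comma: combines two Places into a map key. This is allowed
// because at least one operand has an enumeration type.
inline std::pair<Place, Place> operator,(Place p1, Place p2) {
    return std::make_pair(p1, p2);
}

// Hypothetical helper that stores one distance and reads it back.
inline double stored_distance() {
    std::map<std::pair<Place, Place>, double> dist;
    dist[(new_york, washington)] = 100.0;  // extra parentheses: see the note above
    return dist.at(std::make_pair(new_york, washington));
}
```

Calling `stored_distance()` should give back the value that was stored through the comma-built key.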

Let's change the emphasis a bit to:
When should you overload the comma?
The answer: Never.
The exception: If you're doing template metaprogramming, operator, has a special place at the very bottom of the operator precedence list, which can come in handy for constructing SFINAE-guards, etc.
The only two practical uses I've seen of overloading operator, are both in Boost:
Boost.Assign
Boost.Phoenix – it's fundamental here in that it allows Phoenix lambdas to support multiple statements

Boost.Assign uses it, to let you do things like:
vector<int> v;
v += 1,2,3,4,5,6,7,8,9;
And I've seen it used for quirky language hacks, I'll see if I can find some.
Aha, I do remember one of those quirky uses: collecting multiple expressions. (Warning, dark magic.)

The comma has an interesting property in that it can take a parameter of type void. If it is the case, then the built-in comma operator is used.
This is handy when you want to determine if an expression has type void:
namespace detail_
{
template <typename T>
struct tag
{
static T get();
};
template <typename T, typename U>
tag<char(&)[2]> operator,(T, tag<U>);
template <typename T, typename U>
tag<U> operator,(tag<T>, tag<U>);
}
#define HAS_VOID_TYPE(expr) \
(sizeof((::detail_::tag<int>(), \
(expr), \
::detail_::tag<char>()).get()) == 1)
I leave it to the reader as an exercise to figure out what is going on. Remember that operator, associates left to right.
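One possible reading of the exercise, as a commented sketch. The two function declarations at the end are hypothetical and exist only to give the checks something to test. The commas group left to right. If `expr` is void, both commas are the built-in one, the result is `tag<char>`, and `get()` has size 1. If `expr` is not void, the first comma is still built-in (no overload accepts a non-tag on the right), but the second comma picks the first overload, whose `get()` returns a reference to a 2-byte array.

```cpp
namespace detail_ {
    template <typename T>
    struct tag {
        static T get();  // declared only; it is used inside sizeof and never called
    };

    // Selected when the left operand is an ordinary (non-void) value.
    template <typename T, typename U>
    tag<char(&)[2]> operator,(T, tag<U>);

    // Selected when both operands are tags; passes the right-hand tag through.
    template <typename T, typename U>
    tag<U> operator,(tag<T>, tag<U>);
}

#define HAS_VOID_TYPE(expr) \
    (sizeof((::detail_::tag<int>(), (expr), ::detail_::tag<char>()).get()) == 1)

void returns_nothing();  // hypothetical, declared only
int returns_int();       // hypothetical, declared only

static_assert(HAS_VOID_TYPE(returns_nothing()), "a void expression is detected");
static_assert(!HAS_VOID_TYPE(returns_int()), "an int expression is not void");
```

Because everything happens inside `sizeof`, none of the declared functions needs a definition.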

Similar to @GMan's Boost.Assign example, Blitz++ overloads the comma operator to provide a convenient syntax for working with multidimensional arrays. For example:
Array<double,2> y(4,4); // A 4x4 array of double
y = 1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1;

In SOCI - The C++ Database Access Library it is used for the implementation of the inbound part of the interface:
sql << "select name, salary from persons where id = " << id,
into(name), into(salary);
From the rationale FAQ:
Q: Overloaded comma operator is just obfuscation, I don't like it.
Well, consider the following:
"Send the query X to the server Y and put result into variable Z."
Above, the "and" plays a role of the comma. Even if overloading the comma operator is not a very popular practice in C++, some libraries do this, achieving terse and easy to learn syntax. We are pretty sure that in SOCI the comma operator was overloaded with a good effect.

I use the comma operator for printing log output. It actually is very similar to ostream::operator<< but I find the comma operator actually better for the task.
So I have:
template <typename T>
MyLogType& MyLogType::operator,(const T& data) { /* do the same thing as with ostream::operator<< */ return *this; }
It has these nice properties:
The comma operator has the lowest precedence. So if you want to stream an expression, things do not get messed up if you forget the parentheses. Compare:
myLog << "The mask result is: " << x&y; //operator precedence would mess this one up
myLog, "The result is: ", x&y;
You can even mix comparison operators inside without a problem, e.g.
myLog, "a==b: ", a==b;
The comma operator is visually small. It does not get in the way of reading when gluing many things together:
myLog, "Coords=", g, ':', s, ':', p;
It aligns with the meaning of the comma operator, i.e. "print this" and then "print that".
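A minimal sketch of such a logger, assuming it collects its output in a string. The class name `CommaLog`, the string buffer and the helper `log_mask` are all made up for illustration and are not the author's actual `MyLogType`.

```cpp
#include <sstream>
#include <string>

// Hypothetical logger that appends every comma operand to a buffer.
class CommaLog {
public:
    template <typename T>
    CommaLog& operator,(const T& data) {
        out_ << data;   // the same work an operator<< overload would do
        return *this;   // returning *this lets the next comma chain
    }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;
};

// & and == bind tighter than the comma, so no extra parentheses are needed.
inline std::string log_mask(int x, int y) {
    CommaLog logger;
    logger, "mask=", x & y, " equal=", x == y;
    return logger.str();
}
```

With `x = 6` and `y = 3` the buffer should read `mask=2 equal=0`, since the `bool` is streamed as `0` by default.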

One possibility is the Boost Assign library (though I'm pretty sure some people would consider this abuse rather than a good use).
Boost Spirit probably overloads the comma operator as well (it overloads almost everything else...)

Along the same lines, I was sent a GitHub pull request with a comma operator overload. It looked something like the following:
class Mylogger {
public:
template <typename T>
Mylogger & operator,(const T & val) {
std::cout << val;
return * this;
}
};
#define Log(level,args...) \
do { Mylogger logv; logv,level, ":", ##args; } while (0)
then in my code I can do:
Log(2, "INFO: setting variable \"", 1, "\"\n");
Can someone explain why this is a good or bad usage case?

One practical use is in combination with variable arguments in a macro. By the way, variadic macros were earlier an extension in GCC and are now part of the C++11 standard.
Suppose we have a class X to which objects of type A can be added, i.e.
class X {
public: X& operator+= (const A&);
};
What if we want to add one or more objects of type A to an X object named buffer?
For example,
#define ADD(buffer, ...) buffer += __VA_ARGS__
If the above macro is used as:
ADD(buffer, objA1, objA2, objA3);
then it will expand to:
buffer += objA1, objA2, objA3;
Hence, this is a perfect use case for the comma operator, because the variable arguments expand into a comma-separated list.
To make this work, we overload the comma operator and have it forward to += as below:
X& X::operator, (const A& a) { // declared inside `class X`
*this += a; // calls `operator+=`
return *this; // lets the next comma chain
}
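Put together, the pieces might look like the sketch below. The payload type `A`, the `items()` accessor and the helper `add_three` are assumptions for illustration. The macro also parenthesizes its parameter, which is an addition to the macro shown above.

```cpp
#include <vector>

struct A { int id; };  // hypothetical payload type

class X {
public:
    X& operator+=(const A& a) { items_.push_back(a.id); return *this; }
    // Forward each comma operand to operator+= and keep the chain going.
    X& operator,(const A& a) { return *this += a; }
    const std::vector<int>& items() const { return items_; }
private:
    std::vector<int> items_;
};

#define ADD(buffer, ...) ((buffer) += __VA_ARGS__)

inline std::vector<int> add_three() {
    X buffer;
    A a1 = {1}, a2 = {2}, a3 = {3};
    ADD(buffer, a1, a2, a3);  // expands to ((buffer) += a1, a2, a3)
    return buffer.items();
}
```

Because `+=` binds tighter than the comma, the expansion groups as `(((buffer) += a1), a2), a3`, so each later argument reaches `X::operator,`.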

Here is an example from OpenCV documentation (http://docs.opencv.org/modules/core/doc/basic_structures.html#mat). The comma operator is used for cv::Mat initialization:
// create a 3x3 double-precision identity matrix
Mat M = (Mat_<double>(3,3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);

Related

How to overload >> operator to take a comma separated variable argument list

--Quick Before
So before anyone says this question has been answered on another post it hasn't... It was a homework question in the other post and the original question was never answered only told they were wrong.
--Question
I am trying to overload the >> operator to be able to pass in n-number of variables separated by commas into an object like so...
Mat M = (Mat_<double>(3,3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);
I am trying to reuse their usage of the comma separated argument list but I can't seem to get it to work.
When I overload the << operator like so
void operator<< (const double& is)
{
std::cout << "hiya " << is << std::endl;
}
and attempt to use it like so
mat << 1.0, 2.0;
only the first value is passed to the operator... The second value is never 'used', as I believe that << has a higher precedence than ,
So my question is what are they doing in libraries like Eigen and OpenCV to be able to have this functionality. I have looked through their code to attempt to understand it, but it appears to require a deeper understanding of how C++ works than I have, and I was hoping someone here could shed some light on it.
Thanks in advance for any advice.
You'll have to overload the insertion operator (<<) and the comma operator (,) such that
mat << 1.0, 2.0;
is translated as:
mat.operator<<(1.0).operator,(2.0);
or
operator,(operator<<(mat, 1.0), 2.0);
Here's a demonstrative program that illustrates the idea without doing anything useful.
#include <iostream>
struct Foo
{
};
Foo& operator<<(Foo& f, double)
{
std::cout << "In operator<<(Foo& f, double)\n";
return f;
}
Foo& operator,(Foo& f, double)
{
std::cout << "In operator,(Foo& f, double)\n";
return f;
}
int main()
{
Foo f;
f << 10, 20, 30;
}
and its output
In operator<<(Foo& f, double)
In operator,(Foo& f, double)
In operator,(Foo& f, double)
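To go one step further and actually store the values, `operator<<` can return a small proxy whose `operator,` fills the next slot. The sketch below is one way to do that under assumed names (`Matrix2x2`, `Filler`, `at`). It is not how OpenCV or Eigen implement their initializers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-size container, for illustration only.
class Matrix2x2 {
public:
    // Proxy returned by operator<<; every comma writes the next element.
    class Filler {
    public:
        Filler(Matrix2x2& m, std::size_t next) : m_(m), next_(next) {}
        Filler& operator,(double v) {
            if (next_ < m_.data_.size()) m_.data_[next_++] = v;  // extras are ignored
            return *this;
        }
    private:
        Matrix2x2& m_;
        std::size_t next_;
    };

    Matrix2x2() : data_(4, 0.0) {}

    // Stores the first value and hands the remaining slots to the proxy.
    Filler operator<<(double first) {
        data_[0] = first;
        return Filler(*this, 1);
    }

    double at(std::size_t i) const { return data_[i]; }

private:
    std::vector<double> data_;
};
```

A statement such as `m << 1.0, 2.0, 3.0, 4.0;` then parses as `(((m << 1.0), 2.0), 3.0), 4.0`, and the proxy temporary lives until the end of that full expression.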
You would have to create a temporary from the first argument and surround the entire comma list in parentheses:
myObj >> (some_temporary(3), 1, ...);
which would require that some_temporary be either a type or helper-function-returning-object that overloads the comma operator and your >> would need to be able to take that type. Otherwise the precedence of >> would "win" and therefore be evaluated before the comma expression is seen.
An uglier alternative would be to have your >> return a type that overloads operator ,() but I believe the first is preferable (actually, I would say this entire scheme is un-preferable).

Class to represent reciprocals

I have code where I do a lot of things like
a = 1/((1/b)+(1/c))
I can't help it, I cringe at the use of so many divisions, when I could do it like this
oneOverA = oneOverB + oneOverC
In fact I can do pretty much everything with reciprocals. For instance with comparisons
A < B iff oneOverB < oneOverA
again without need to divide. And so on.
However, this would be extremely error-prone. Imagine if I pass the reversed version to a function which expects the "straight" one, or forget to reverse the order while comparing, etc.
But, C++ exists to make magic happen, right?
And if I implement a very simple header-only class, pretty much everything should be optimized away, leaving it just as I would write it if I were as infallible as the compiler.
So I tried a class like
template<typename T = float> // T must be a type for which 1/T is representable as another T such that 1/(1/T) == T, e.g. float
class ReversibleNum{
private:
const T reverse;
public:
ReversibleNum(T orig): reverse(1/orig) {}
operator T () { return 1/reverse; }
bool operator < (const ReversibleNum<T>& oth) { return oth.reverse < reverse; }
friend T operator / (const T one, const ReversibleNum<T>& two);
}
inline T operator / (const T one, const ReversibleNum<T>& two) { return one * two.reverse; }
However, when I try
ReversibleNum<float> a(5);
1.0/a
it uses "operator float" to convert and then divide, rather than "operator /" which is what I want.
I could probably make "operator T" explicit. However, I like the idea of being able to use it seamlessly and have the conversion just happen, but only when there really is no alternative.
Is there something I can do to make my code work?
Or maybe some different, more advanced magic I could use to achieve my aim?

map/fold operators (in c++)

I am writing a library which can do map/fold operations on ranges. I need to do these with operators. I am not very familiar with functional programming and I've tentatively selected * for map and || for fold. So to find (with a brute-force algorithm) the maximum of cos(x) in the interval 8 < x < 9:
double maximum = ro::range(8, 9, 0.01) * std::cos || std::max;
In above, ro::range can be replaced with any STL container.
I don't want to be different if there is any convention for map/fold operators. My question is: is there a math notation, or does any language use operators, for map/fold?
** EDIT **
For those who asked, below is small demo of what RO currently can do. scc is small utility which can evaluate C++ snippets.
// Can print ranges, container, tuples, etc directly (vint is vector<int>) :
scc 'vint V{1,2,3}; V'
{1,2,3}
// Classic pipe. Algorithms are from std::
scc 'vint{3,1,2,3} | sort | unique | reverse'
{3, 2, 1}
// Assign 42 to [2..5)
scc 'vint V=range(0,9); range(V/2, V/5) = 42; V'
{0, 1, 42, 42, 42, 5, 6, 7, 8, 9}
// concatenate vector of strings ('add' is shortcut for std::plus<T>()):
scc 'vstr V{"aaa", "bb", "cccc"}; V || add'
aaabbcccc
// Total length of strings in vector of strings
scc 'vstr V{"aaa", "bb", "cccc"}; V * size || (_1+_2)'
9
// Assign to c-string, then append `"XYZ"` and then remove `"bc"` substring :
scc 'char s[99]; range(s) = "abc"; (range(s) << "XYZ") - "bc"'
aXYZ
// Remove non alpha-num characters and convert to upper case
scc '(range("abc-123, xyz/") | isalnum) * toupper'
ABC123XYZ
// Hide phone number:
scc "str S=\"John Q Public (650)1234567\"; S|isdigit='X'; S"
John Q Public (XXX)XXXXXXX
This is really more a comment than a true answer, but it's too long to fit in a comment.
At least if my memory for the terminology serves correctly, map is essentially std::transform, and fold is std::accumulate. Assuming that's correct, I think trying to write your own would be ill-advised at best.
If you want to use map/fold style semantics, you could do something like this:
std::transform(std::begin(sto), std::end(sto), std::begin(sto), ::cos);
double maximum = *std::max_element(std::begin(sto), std::end(sto));
Although std::accumulate is more like a general-purpose fold, std::max_element is basically a fold(..., max); If you prefer a single operation, you could do something like:
double maximum = cos(*std::max_element(std::begin(sto), std::end(sto),
[](double a, double b) { return cos(a) < cos(b); }));
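As a self-contained sketch of the map-then-fold approach: the helper names `sample` and `max_cos` are hypothetical, and the sampling includes both endpoints, unlike the open interval in the question. `std::transform` plays the role of map and `std::accumulate` the role of fold.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: samples [lo, hi] in steps of `step`, endpoints included.
inline std::vector<double> sample(double lo, double hi, double step) {
    std::vector<double> xs;
    const int n = static_cast<int>(std::floor((hi - lo) / step + 0.5));
    for (int i = 0; i <= n; ++i) xs.push_back(lo + i * step);
    return xs;
}

// Map every sample through cos, then fold the results with max.
inline double max_cos(double lo, double hi, double step) {
    std::vector<double> ys = sample(lo, hi, step);
    std::transform(ys.begin(), ys.end(), ys.begin(),
                   [](double x) { return std::cos(x); });                   // map
    return std::accumulate(ys.begin() + 1, ys.end(), ys.front(),
                           [](double a, double b) { return std::max(a, b); });  // fold
}
```

Since cos is decreasing between 2π and 3π, `max_cos(8.0, 9.0, 0.01)` should land on the value at the left endpoint, cos(8).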
I urge you to reconsider overloading operators for this purpose. Either example I've given above should be clear to almost any reasonable C++ programmer. The example you've given will be utterly opaque to most.
On a more general level, I'd urge extreme caution when overloading operators. Operator overloading is great when used correctly -- being able to overload operators for things like arbitrary precision integers, matrices, complex numbers, etc., renders code using those types much more readable and understandable than code without overloaded operators.
Unfortunately, when you use operators in unexpected ways, precisely the opposite is true -- and these uses are certainly extremely unexpected -- in fact, well into the range of "quite surprising". There might be question (but at least a little justification) if these operators were well understood in specific areas, but contrary to other uses in C++. In this case, however, you seem to be inventing a notation "out of whole cloth" -- I'm not aware of anybody using any operator C++ supports overloading to mean either fold or map (nor anything visually similar or analogous in any other way). In short, using overloading this way is a poor and unjustified idea.
Of the languages I know, there is no standard notation for folding. Scala uses the operators /: and :\ as well as method names, Lisp has reduce, Haskell has foldl.
map on the other hand is more common to find simply as map in all the languages I know.
Below is an implementation of fold in quasi-human-readable infix C++ syntax. Note that the code is not very robust and only serves to demonstrate the point. It is made to support the more usual 3-argument fold operators (the range, the binary operation, and the neutral element).
This is easily the funniest way to abuse operator overloading, and one of the best ways to shoot yourself in the foot with a 900 pound artillery shell.
enum fold_t { fold };
template <typename Op>
struct fold_intermediate_1
{
Op op;
fold_intermediate_1 (Op op) : op(op) {}
};
template <typename Cont, typename Op, bool>
struct fold_intermediate_2
{
const Cont& cont;
Op op;
fold_intermediate_2 (const Cont& cont, Op op) : cont(cont), op(op) {}
};
template <typename Op>
fold_intermediate_1<Op> operator/(fold_t, Op op)
{
return fold_intermediate_1<Op>(op);
}
template <typename Cont, typename Op>
fold_intermediate_2<Cont, Op, true> operator<(const Cont& cont, fold_intermediate_1<Op> f)
{
return fold_intermediate_2<Cont, Op, true>(cont, f.op);
}
template <typename Cont, typename Op, typename Init>
Init operator< (fold_intermediate_2<Cont, Op, true> f, Init init)
{
return foldl_func(f.op, init, std::begin(f.cont), std::end(f.cont));
}
template <typename Cont, typename Op>
fold_intermediate_2<Cont, Op, false> operator>(const Cont& cont, fold_intermediate_1<Op> f)
{
return fold_intermediate_2<Cont, Op, false>(cont, f.op);
}
template <typename Cont, typename Op, typename Init>
Init operator> (fold_intermediate_2<Cont, Op, false> f, Init init)
{
return foldr_func(f.op, init, std::begin(f.cont), std::end(f.cont));
}
foldr_func and foldl_func (the actual algorithms of left and right folds) are defined elsewhere.
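Since the two helpers are not shown, here is one way they could look, matching the argument order used in the calls above (operation, initial value, begin, end). This is a sketch and not the author's code. The `subtract` function is a hypothetical test operation, chosen because it makes left and right folds give different results.

```cpp
#include <iterator>
#include <vector>

// Left fold: op(op(op(init, x0), x1), x2) ...
template <typename Op, typename Init, typename It>
Init foldl_func(Op op, Init acc, It first, It last) {
    for (; first != last; ++first) acc = op(acc, *first);
    return acc;
}

// Right fold: op(x0, op(x1, op(x2, init))) ...; needs bidirectional iterators.
template <typename Op, typename Init, typename It>
Init foldr_func(Op op, Init acc, It first, It last) {
    while (last != first) {
        --last;
        acc = op(*last, acc);
    }
    return acc;
}

// Hypothetical non-associative operation for trying the two folds.
inline int subtract(int a, int b) { return a - b; }
```

Over the elements 1, 2, 3 with initial value 0, the left fold computes ((0 - 1) - 2) - 3 and the right fold computes 1 - (2 - (3 - 0)).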
Use it like this:
foo myfunc(foo, foo);
container<foo> cont;
foo zero, acc;
acc = cont >fold/myfunc> zero; // right fold
acc = cont <fold/myfunc< zero; // left fold
The word fold is used as a kind of poor man's new reserved word here. One can define several variations of this syntax, including
<<fold/myfunc<< >>fold/myfunc>>
<foldl/myfunc> <foldr/myfunc>
|fold<myfunc| |fold>myfunc|
The inner operator must have the same or greater precedence as the outer one(s). It's the limitation of C++ grammar.
For map, only one intermediate is needed and the syntax could be e.g.
mapped = cont |map| myfunc;
Implementing it is a simple exercise.
Oh, and please don't use this syntax in production, unless you know very well what you are doing, and probably even if you do ;)
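For completeness, the `|map|` exercise mentioned above could be sketched as follows. The namespace `infix` and the type names are assumptions. The result is built with `push_back`, so the sketch only covers containers that support it and functions that return the element type.

```cpp
#include <vector>

namespace infix {
    struct map_tag {};
    const map_tag map = map_tag();  // plays the role of the "keyword"

    // Remembers the container between the two | operators.
    template <typename Cont>
    struct map_intermediate {
        const Cont& cont;
    };

    // cont | map: wrap the container.
    template <typename Cont>
    map_intermediate<Cont> operator|(const Cont& cont, map_tag) {
        map_intermediate<Cont> m = { cont };
        return m;
    }

    // (cont | map) | f: apply f to every element and return a new container.
    template <typename Cont, typename F>
    Cont operator|(map_intermediate<Cont> m, F f) {
        Cont result;
        for (const auto& x : m.cont) result.push_back(f(x));
        return result;
    }
}
```

With a `std::vector<int> v`, the expression `v |infix::map| f` groups as `(v | infix::map) | f`, and both operators are found through argument-dependent lookup.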

'Freezing' an expression

I have a C++ expression that I wish to 'freeze'. By this, I mean I have syntax like the following:
take x*x with x in container ...
where the ... indicates further syntax that is not relevant to this problem. However, when I attempt to compile this, no matter what preprocessor translations I've used to make 'take' an 'operator' (in inverted commas because it's technically not an operator; the translation phase turns it into a class with, say, operator* available to it), the compiler still tries to work out where x*x is coming from. Since x has not been declared at that point (it is only declared later, at the 'in' stage), the compiler can't find it and reports a compile error.
My current idea essentially involves attempting to place the expression inside a lambda (and since we can deduce the type of the container, we can declare x with the right type as, say, [](decltype(*begin(container)) x) { return x*x } -- thus, when the compiler looks at this statement, it's valid and no error is thrown), however, I'm running into errors actually achieving this.
Thus, my question is:
Is there a way / what's the best way to 'freeze' the x*x part of my expression?
EDIT:
In an attempt to clarify my question, take the following. Assume that the operator- is defined in a sane way so that the following attempts to achieve what the above take ... syntax does:
MyTakeClass() - x*x - MyWithClass() - x - MyInClass() - container ...
When this statement is compiled, the compiler will throw an error; x is not declared so x*x makes no sense (nor does x - MyInClass(), etc, etc). What I'm trying to achieve is to find a way to make the above expression compile, using any voodoo magic available, without knowing the type of x (or, in fact, that it will be named x; it could viably be named 'somestupidvariablename') in advance.
I came up with an almost-solution, based on expression templates (note: these are not expression templates, they are merely based on them). Unfortunately, I could not come up with a way that does not require you to predeclare x, but I did come up with a way to delay the type, so you only have to declare x once globally, and can use it for different types over and over in the same program/file/scope. Here is the expression type that works the magic. I designed it to be very flexible, so you should be able to easily add operations and uses at will. It is used exactly as you described, except for the predeclaration of x.
Downsides I'm aware of: it does require T*T, T+T, and T(long) be compilable.
expression x(0, true); //x will be the 0th parameter. Sorry: required :(
int main() {
std::vector<int> container;
container.push_back(-3);
container.push_back(0);
container.push_back(7);
take x*x with x in container; //here's the magic line
for(unsigned i=0; i<container.size(); ++i)
std::cout << container[i] << ' ';
std::cout << '\n';
std::vector<float> container2;
container2.push_back(-2.3);
container2.push_back(0);
container2.push_back(7.1);
take 1+x with x in container2; //here's the magic line
for(unsigned i=0; i<container2.size(); ++i)
std::cout << container2[i] << ' ';
return 0;
}
and here's the class and defines that makes it all work:
class expression {
//addition and constants are unused, and merely shown for extendibility
enum exprtype{parameter_type, constant_type, multiplication_type, addition_type} type;
long long value; //for value types, and parameter number
std::unique_ptr<expression> left; //for unary and binary functions
std::unique_ptr<expression> right; //for binary functions
public:
//constructors
expression(long long val, bool is_variable=false)
:type(is_variable?parameter_type:constant_type), value(val)
{}
expression(const expression& rhs)
: type(rhs.type)
, value(rhs.value)
, left(rhs.left.get() ? std::unique_ptr<expression>(new expression(*rhs.left)) : std::unique_ptr<expression>(NULL))
, right(rhs.right.get() ? std::unique_ptr<expression>(new expression(*rhs.right)) : std::unique_ptr<expression>(NULL))
{}
expression(expression&& rhs)
:type(rhs.type), value(rhs.value), left(std::move(rhs.left)), right(std::move(rhs.right))
{}
//assignment operator
expression& operator=(expression rhs) {
type = rhs.type;
value = rhs.value;
left = std::move(rhs.left);
right = std::move(rhs.right);
return *this;
}
//operators
friend expression operator*(expression lhs, expression rhs) {
expression ret(0);
ret.type = multiplication_type;
ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
return ret;
}
friend expression operator+(expression lhs, expression rhs) {
expression ret(0);
ret.type = addition_type;
ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
return ret;
}
//skip the parameter list, don't care. Ignore it entirely
expression& operator<<(const expression&) {return *this;}
expression& operator,(const expression&) {return *this;}
template<class container>
void operator>>(container& rhs) {
for(auto it=rhs.begin(); it!=rhs.end(); ++it)
*it = execute(*it);
}
private:
//execution
template<class T>
T execute(const T& p0) {
switch(type) {
case parameter_type :
switch(value) {
case 0: return p0; //only one variable
default: throw std::runtime_error("Invalid parameter ID");
}
case constant_type:
return ((T)(value));
case multiplication_type:
return left->execute(p0) * right->execute(p0);
case addition_type:
return left->execute(p0) + right->execute(p0);
default:
throw std::runtime_error("Invalid expression type");
}
}
//This is also unused, and merely shown as extrapolation
template<class T>
T execute(const T& p0, const T& p1) {
switch(type) {
case parameter_type :
switch(value) {
case 0: return p0;
case 1: return p1; //this version has two variables
default: throw std::runtime_error("Invalid parameter ID");
}
case constant_type:
return value;
case multiplication_type:
return left->execute(p0, p1) * right->execute(p0, p1);
case addition_type:
return left->execute(p0, p1) + right->execute(p0, p1);
default:
throw std::runtime_error("Invalid expression type");
}
}
};
#define take
#define with <<
#define in >>
Compiles and runs with correct output at http://ideone.com/Dnb50
You may notice that since x must be predeclared, the with section is ignored entirely. There's almost no macro magic here; the macros effectively turn it into "x*x << x >> container", where the << x does absolutely nothing at all. So the expression is effectively "x*x >> container".
Also note that this method is slow, because this is an interpreter, with almost all the slowdown that implies. However, it has the bonus that it is serializable, you could save the function to a file, load it later, and execute it then.
R.MartinhoFernandes has observed that the definition of x can be simplified to merely be expression x;, and it can deduce the order of parameters from the with section, but it would require a lot of rethinking of the design and would be more complicated. I might come back and add that functionality later, but in the meantime, know that it is definitely possible.
If you can modify the expression to `take(x*x with x in container)`, then that would remove the need to predeclare `x`, with something far, far simpler than expression templates.
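#include <algorithm>
#include <iostream>
#include <vector>
#define with ,
#define in ,
// The extra level lets `with` and `in` expand to commas before the
// three arguments are separated.
#define take(args) take_impl(args)
#define take_impl(expr, var, con) \
std::transform(con.begin(), con.end(), con.begin(), \
[](const decltype(con)::value_type& var) -> decltype(con)::value_type \
{return expr;});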
int main() {
std::vector<int> container;
container.push_back(-3);
container.push_back(0);
container.push_back(7);
take(x*x with x in container); //here's the magic line
for(unsigned i=0; i<container.size(); ++i)
std::cout << container[i] << ' ';
}
I made an answer very similar to my previous answer, but using actual expression templates, which should be much faster. Unfortunately, MSVC10 crashes when it attempts to compile this, but MSVC11, GCC 4.7.0 and Clang 3.2 all compile and run it just fine. (All other versions untested)
Here's the usage of the templates. Implementation code is here.
#define take
#define with ,
#define in >>=
//function call for containers
template<class lhsexpr, class container>
lhsexpr operator>>=(lhsexpr lhs, container& rhs)
{
for(auto it=rhs.begin(); it!=rhs.end(); ++it)
*it = lhs(*it);
return lhs;
}
int main() {
std::vector<int> container0;
container0.push_back(-4);
container0.push_back(0);
container0.push_back(3);
take x*x with x in container0; //here's the magic line
for(auto it=container0.begin(); it!=container0.end(); ++it)
std::cout << *it << ' ';
std::cout << '\n';
auto a = x+x*x+'a'*x;
auto b = a; //make sure copies work
b in container0;
b in container1;
std::cout << sizeof(b);
return 0;
}
As you can see, this is used exactly like my previous code, except now all the functions are decided at compile time, which means this will have exactly the same speed as a lambda. In fact, C++11 lambdas were preceded by boost::lambda which works on very similar concepts.
This is a separate answer, because the code is far different, and far more complicated/intimidating. That's also why the implementation is not in the answer itself.
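The linked implementation is not reproduced here. As a much smaller sketch of the general expression-template idea (not the author's code), the following builds a type such as `Add<Mul<Var, Var>, Var>` at compile time and evaluates it per element. All names in namespace `et` are made up, and only expressions built from the placeholder with `*` and `+` are covered; constants such as `1 + x` would need an extra wrapper node.

```cpp
#include <vector>

namespace et {
    // The placeholder: evaluating it just returns the current element.
    struct Var {
        template <typename T>
        T operator()(const T& v) const { return v; }
    };

    // Node for lhs * rhs; the operand types are part of the node's type.
    template <typename L, typename R>
    struct Mul {
        L lhs;
        R rhs;
        Mul(L l, R r) : lhs(l), rhs(r) {}
        template <typename T>
        T operator()(const T& v) const { return lhs(v) * rhs(v); }
    };

    // Node for lhs + rhs.
    template <typename L, typename R>
    struct Add {
        L lhs;
        R rhs;
        Add(L l, R r) : lhs(l), rhs(r) {}
        template <typename T>
        T operator()(const T& v) const { return lhs(v) + rhs(v); }
    };

    // Found through argument-dependent lookup when an operand lives in `et`.
    template <typename L, typename R>
    Mul<L, R> operator*(L l, R r) { return Mul<L, R>(l, r); }

    template <typename L, typename R>
    Add<L, R> operator+(L l, R r) { return Add<L, R>(l, r); }

    // Replaces every element of the container with expr(element).
    template <typename Expr, typename Cont>
    void apply(Expr expr, Cont& cont) {
        for (auto& v : cont) v = expr(v);
    }
}
```

Given `et::Var x;`, a call like `et::apply(x * x + x, container)` has no run-time expression tree to interpret. The whole computation is encoded in the argument's type, which is what lets the compiler inline it.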
I don't think it is possible to get this "list comprehension" (not quite, but it is doing the same thing) à la Haskell using the preprocessor. The preprocessor just does simple search and replace with the possibility of arguments, so it cannot perform arbitrary replacements. In particular, changing the order of the parts of an expression is not possible.
I cannot see a way to do this, without changing the order, since you always need x somehow to appear before x*x to define this variable. Using a lambda will not help, since you still need x in front of the x*x part, even if it is just as an argument. This makes this syntax not possible.
There are some ways around this:
Use a different preprocessor. There are preprocessors based on the ideas of Lisp-macros, which can be made syntax aware and hence can do arbitrary transformation of one syntax tree into another. One example is Camlp4/Camlp5 developed for the OCaml language. There are some very good tutorials on how to use this for arbitrary syntax transformation. I used to have an explanation on how to use Camlp4 to transform makefiles into C code, but I cannot find it anymore. There are some other tutorials on how to do such things.
Change the syntax slightly. Such a list comprehension is essentially just a syntactic simplification of the usage of a Monad. With the arrival of C++11, Monads have become possible in C++. However, the syntactic sugar may not be. If you decide to wrap the stuff you are trying to do in a Monad, many things will still be possible; you will just have to change the syntax slightly. Implementing Monads in C++ is anything but fun though (although I first expected otherwise). Have a look here for some examples of how to get Monads in C++.
The best approach is to parse it using the preprocessor. I do believe the preprocessor can be a very powerful tool for building EDSLs (embedded domain-specific languages), but you must first understand the limitations of the preprocessor when parsing things. The preprocessor can only parse out predefined tokens. So the syntax must be changed slightly by placing parentheses around the expressions, and a FREEZE macro must also surround it (I just picked FREEZE; it could be called anything):
FREEZE(take(x*x) with(x, container))
Using this syntax you can convert it to a preprocessor sequence (using the Boost.Preprocessor library, of course). Once you have it as a preprocessor sequence you can apply lots of algorithms to it to transform it however you like. A similar approach is used by the Linq library for C++, where you can write this:
LINQ(from(x, numbers) where(x > 2) select(x * x))
Now, to convert to a pp sequence first you need to define the keywords to be parsed, like this:
#define KEYWORD(x) BOOST_PP_CAT(KEYWORD_, x)
#define KEYWORD_take (take)
#define KEYWORD_with (with)
So the way this will work is when you call KEYWORD(take(x*x) with(x, container)) it will expand to (take)(x*x) with(x, container), which is the first step towards converting it to a pp sequence. Now to keep going we need to use a while construct from the Boost.Preprocessor library, but first we need to define some little macros to help us along the way:
// Detects if the first token is parenthesis
#define IS_PAREN(x) IS_PAREN_CHECK(IS_PAREN_PROBE x)
#define IS_PAREN_CHECK(...) IS_PAREN_CHECK_N(__VA_ARGS__,0)
#define IS_PAREN_PROBE(...) ~, 1,
#define IS_PAREN_CHECK_N(x, n, ...) n
// Detect if the parameter is empty, works even if parenthesis are given
#define IS_EMPTY(x) BOOST_PP_CAT(IS_EMPTY_, IS_PAREN(x))(x)
#define IS_EMPTY_0(x) BOOST_PP_IS_EMPTY(x)
#define IS_EMPTY_1(x) 0
// Retrieves the first element of the sequence
// Example:
// HEAD((1)(2)(3)) // Expands to (1)
#define HEAD(x) PICK_HEAD(MARK x)
#define MARK(...) (__VA_ARGS__),
#define PICK_HEAD(...) PICK_HEAD_I(__VA_ARGS__,)
#define PICK_HEAD_I(x, ...) x
// Retrieves the tail of the sequence
// Example:
// TAIL((1)(2)(3)) // Expands to (2)(3)
#define TAIL(x) EAT x
#define EAT(...)
This provides better detection of parentheses and emptiness. It also provides HEAD and TAIL macros, which work slightly differently from BOOST_PP_SEQ_HEAD. (Boost.Preprocessor can't handle sequences that have variadic parameters.) Now here's how we can define a TO_SEQ macro which uses the while construct:
#define TO_SEQ(x) TO_SEQ_WHILE_M \
( \
BOOST_PP_WHILE(TO_SEQ_WHILE_P, TO_SEQ_WHILE_O, (,x)) \
)
#define TO_SEQ_WHILE_P(r, state) TO_SEQ_P state
#define TO_SEQ_WHILE_O(r, state) TO_SEQ_O state
#define TO_SEQ_WHILE_M(state) TO_SEQ_M state
#define TO_SEQ_P(prev, tail) BOOST_PP_NOT(IS_EMPTY(tail))
#define TO_SEQ_O(prev, tail) \
BOOST_PP_IF(IS_PAREN(tail), \
TO_SEQ_PAREN, \
TO_SEQ_KEYWORD \
)(prev, tail)
#define TO_SEQ_PAREN(prev, tail) \
(prev (HEAD(tail)), TAIL(tail))
#define TO_SEQ_KEYWORD(prev, tail) \
TO_SEQ_REPLACE(prev, KEYWORD(tail))
#define TO_SEQ_REPLACE(prev, tail) \
(prev HEAD(tail), TAIL(tail))
#define TO_SEQ_M(prev, tail) prev
Now when you call TO_SEQ(take(x*x) with(x, container)) you should get a sequence (take)((x*x))(with)((x, container)).
Now, this sequence is much easier to work with (because of the Boost.Preprocessor library). You can now reverse it, transform it, filter it, fold over it, etc. This is extremely powerful, and is much more flexible than having them defined as macros. For example, in the Linq library the query from(x, numbers) where(x > 2) select(x * x) gets transformed into these macros:
LINQ_WHERE(x, numbers)(x > 2) LINQ_SELECT(x, numbers)(x * x)
With these macros, it will then generate the lambda for the list comprehension, but they have much more to work with when generating the lambda. The same can be done in your library too; take(x*x) with(x, container) could be transformed into something like this:
FREEZE_TAKE(x, container, x*x)
Plus, you aren't defining macros like take which invade the global space.
Note: These macros require a C99 preprocessor and thus won't work in MSVC. (There are workarounds, though.)

Custom C++ Preprocessor / Typeful Macros

Having seen the advantages of metaprogramming in Ruby and Python, but being bound to lower-level languages like C++ and C for actual work, I'm thinking of manners by which to combine the two. One instance comes in the simple problem for sorting lists of arbitrary structures/classes. For instance:
struct s{
int a;
int b;
};
vector<s> vec;
for(int x=0;x<10;x++){
s inst;
inst.a = x;
inst.b = x+10;
vec.push_back(inst);
}
Ultimately, I'd like to be able to sort vec by arbitrary fields with a minimal amount of boilerplate code. The easiest way I can see to do this is to make use of STL's sort:
sort(vec.begin(),vec.end());
Yet this requires me to write a method that can compare "struct s"s. What I'd rather do is:
sort(vec,a ASC,b DESC);
Which is very clearly not valid C++.
What is the best way to accomplish my dream? If I had some sort of typeful macro, that would reveal to me what the type of a vector's elements were, then it would be trivial to write C preprocessor macros to create the function required to do the sorting.
The alternative seems to be to write my own preprocessor. This works well, up until the point where I have to deduce the type of "vec" again. Is there an easy way to do this?
Context: Less code = less bugs, programming competitions.
For the above, you can use Boost.Lambda to write your comparison function inline, just like a Python lambda:
using namespace boost::lambda;
std::sort(vec.begin(), vec.end(), (_1 ->* &s::a) < (_2 ->* &s::a));
This of course assumes that you are sorting by a.
If the expressions you are looking for are far more complex, you are better off writing a separate function; even in languages like Python and Ruby with native support for closures, complex closures become quite unreadable anyway.
Warning: The code above is untested.
Hope this helps!
I would stick with writing a comparison operator for the struct. The bonus of having a comparison operator defined is that you don't end up with multiple lambda comparisons scattered all over the place. Chances are that you will need a comparison operator more than just once, so why not define it once in the logical place (along with the type)?
Personally, I prefer writing code once and keeping it some place that is particularly easy to find. I also favor writing code that is idiomatic with respect to the language that I am writing in. In C++, I expect constructors, destructors, less-than operators, and the like. You are better off writing a less-than operator and then letting std::sort(vec.begin(), vec.end()) do its proper job. If you really want to make your code clear, then do something like:
struct S {
int a, b;
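// const, so it can be called through the const references used by operator<
bool less_than(S const& other) const {...}
};
bool operator<(S const& left, S const& right) {
return left.less_than(right);
}
If you define a member function to do the comparison and then provide the operator at the namespace-level, life is much easier when you have to negate the comparison. For example:
void foo(std::vector<S>& svec) {
// std::not1 expects a unary functor; a binary member function has to be wrapped
// with std::mem_fun_ref and negated with std::not2 (both are gone from newer standards).
std::sort(svec.begin(), svec.end(), std::not2(std::mem_fun_ref(&S::less_than)));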
}
This code is untested but you get the idea.
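One caveat on negation: the logical negation of a less-than is greater-or-equal, which is not a strict weak ordering, so std::sort can misbehave when elements compare equal. A minimal sketch of a descending sort that swaps the arguments instead. The body of less_than (a first, then b) and the names greater_than, sort_descending and example_descending are assumptions for illustration, since the original elides them:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct S {
    int a, b;
    // Assumed ordering (the original elides the body): a first, then b.
    bool less_than(S const& other) const {
        return a != other.a ? a < other.a : b < other.b;
    }
};

// "Greater than" is "less than" with the arguments swapped. Unlike
// !less_than (which means >=), this is still a strict weak ordering.
bool greater_than(S const& left, S const& right) {
    return right.less_than(left);
}

void sort_descending(std::vector<S>& svec) {
    std::sort(svec.begin(), svec.end(), greater_than);
}

// Small usage check.
void example_descending() {
    std::vector<S> svec = {{1, 2}, {3, 0}, {1, 7}};
    sort_descending(svec);
    assert(svec[0].a == 3);
    assert(svec[1].b == 7 && svec[2].b == 2);
}
```

The comparison logic still lives in one place (the member function); the descending variant only reorders its operands.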
If you are using C++11, you can use Linq to sort it like this:
auto q = LINQ(from(x, vec) orderby(ascending x.a, descending x.b));
Or if you don't like the query syntax, you can use the extension methods as well:
auto q = vec | linq::order_by([](s x) { return x.a; })
| linq::then_by_descending([](s x) { return x.b; });
Both are functionally equivalent.
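For readers without the Linq library, the same ordering (a ascending, then b descending) can probably be sketched with just the standard library in C++11. This is not how Linq implements it; the function names sort_a_asc_b_desc and example_asc_desc are made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

struct s {
    int a;
    int b;
};

// Order by a ascending, then by b descending.
// std::tie builds tuples of references that compare lexicographically;
// swapping the b operands flips the direction for that field only.
void sort_a_asc_b_desc(std::vector<s>& vec) {
    std::sort(vec.begin(), vec.end(), [](const s& x, const s& y) {
        return std::tie(x.a, y.b) < std::tie(y.a, x.b);
    });
}

// Small usage check.
void example_asc_desc() {
    std::vector<s> vec = {{2, 1}, {1, 1}, {1, 5}, {2, 9}};
    sort_a_asc_b_desc(vec);
    assert(vec[0].a == 1 && vec[0].b == 5);
    assert(vec[3].a == 2 && vec[3].b == 1);
}
```

Unlike the Linq versions, this sorts vec in place rather than producing a new range q.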
For c++ the standard library offers the <algorithm> header, which contains many useful functions that work on various containers. An example for your purposes would be:
bool sCompare(const s & s1, const s & s2) {
// 1000.0 forces floating-point division so b acts as a tie-breaker (assumes |b| < 1000)
return s1.a + s1.b/1000.0 < s2.a + s2.b/1000.0;
}
vector<s> vec;
...
std::sort(vec.begin(), vec.end(), sCompare);
sort has a prototype that looks something like:
template<class Iter, class Op>
void sort(Iter start, Iter stop, Op op);
Most of these algorithms should work for any of the standard containers (some are specific to sorted containers, some associative, etc). I believe sort (and others) will even work with arrays (Iterators, the foundation of algorithms, are built to emulate pointers to array elements as closely as possible.)
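To illustrate the point about arrays, here is a minimal sketch (C++11 for std::begin/std::end) that sorts a plain array of s. The comparator by_a and the function example_array_sort are names invented for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

struct s {
    int a;
    int b;
};

// Compare two elements by their a field only.
bool by_a(const s& s1, const s& s2) {
    return s1.a < s2.a;
}

void example_array_sort() {
    s arr[] = {{3, 13}, {1, 11}, {2, 12}};
    // Pointers to array elements are valid random-access iterators,
    // so std::sort accepts them just like vector iterators.
    std::sort(std::begin(arr), std::end(arr), by_a);
    assert(arr[0].a == 1 && arr[1].a == 2 && arr[2].a == 3);
}
```

Before C++11 the same call can be written with raw pointers, e.g. std::sort(arr, arr + 3, by_a).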
In short, using modern c++ you will not need a special preprocessor to achieve what you're trying to do.
BTW, if you've declared that you're using std or std::sort, then sort(vec.begin(),vec.end()) is valid c++, provided s has an operator<.