I have a C++ expression that I wish to 'freeze'. By this, I mean I have syntax like the following:
take x*x with x in container ...
where the ... indicates further syntax that is irrelevant to this problem. However, when I attempt to compile this, no matter which preprocessor translations I use to make 'take' an 'operator' (in inverted commas because it is technically not an operator; the translation phase turns it into a class with, say, operator* available to it), the compiler still tries to work out where the x*x comes from. Since x has not been declared at that point (it is only declared later, at the 'in' stage), the lookup fails and the compiler reports an error.
My current idea essentially involves placing the expression inside a lambda. Since we can deduce the type of the container, we can declare x with the right type as, say, [](decltype(*begin(container)) x) { return x*x; }, so that when the compiler looks at this statement it is valid and no error is thrown. However, I'm running into errors actually achieving this.
Thus, my question is:
Is there a way / what's the best way to 'freeze' the x*x part of my expression?
EDIT:
In an attempt to clarify my question, take the following. Assume that the operator- is defined in a sane way so that the following attempts to achieve what the above take ... syntax does:
MyTakeClass() - x*x - MyWithClass() - x - MyInClass() - container ...
When this statement is compiled, the compiler will throw an error; x is not declared so x*x makes no sense (nor does x - MyInClass(), etc, etc). What I'm trying to achieve is to find a way to make the above expression compile, using any voodoo magic available, without knowing the type of x (or, in fact, that it will be named x; it could viably be named 'somestupidvariablename') in advance.
I came up with an almost-solution, based on expression templates (note: these are not expression templates, they are only based on expression templates). Unfortunately, I could not come up with a way that does not require you to predeclare x, but I did come up with a way to delay the type, so you only have to declare x once globally, and can use it for different types over and over in the same program/file/scope. Here is the expression type that works the magic. I designed it to be very flexible, so you should be able to add operations and uses easily. It is used exactly how you described, except for the predeclaration of x.
Downsides I'm aware of: it does require T*T, T+T, and T(long) be compilable.
expression x(0, true); //x will be the 0th parameter. Sorry: required :(
int main() {
std::vector<int> container;
container.push_back(-3);
container.push_back(0);
container.push_back(7);
take x*x with x in container; //here's the magic line
for(unsigned i=0; i<container.size(); ++i)
std::cout << container[i] << ' ';
std::cout << '\n';
std::vector<float> container2;
container2.push_back(-2.3);
container2.push_back(0);
container2.push_back(7.1);
take 1+x with x in container2; //here's the magic line
for(unsigned i=0; i<container2.size(); ++i)
std::cout << container2[i] << ' ';
return 0;
}
and here's the class and defines that makes it all work:
class expression {
//addition and constants are unused, and merely shown for extendibility
enum exprtype{parameter_type, constant_type, multiplication_type, addition_type} type;
long long value; //for value types, and parameter number
std::unique_ptr<expression> left; //for unary and binary functions
std::unique_ptr<expression> right; //for binary functions
public:
//constructors
expression(long long val, bool is_variable=false)
:type(is_variable?parameter_type:constant_type), value(val)
{}
expression(const expression& rhs)
: type(rhs.type)
, value(rhs.value)
, left(rhs.left.get() ? std::unique_ptr<expression>(new expression(*rhs.left)) : std::unique_ptr<expression>(NULL))
, right(rhs.right.get() ? std::unique_ptr<expression>(new expression(*rhs.right)) : std::unique_ptr<expression>(NULL))
{}
expression(expression&& rhs)
:type(rhs.type), value(rhs.value), left(std::move(rhs.left)), right(std::move(rhs.right))
{}
//assignment operator
expression& operator=(expression rhs) {
type = rhs.type;
value = rhs.value;
left = std::move(rhs.left);
right = std::move(rhs.right);
return *this;
}
//operators
friend expression operator*(expression lhs, expression rhs) {
expression ret(0);
ret.type = multiplication_type;
ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
return ret;
}
friend expression operator+(expression lhs, expression rhs) {
expression ret(0);
ret.type = addition_type;
ret.left = std::unique_ptr<expression>(new expression(std::move(lhs)));
ret.right = std::unique_ptr<expression>(new expression(std::move(rhs)));
return ret;
}
//skip the parameter list, don't care. Ignore it entirely
expression& operator<<(const expression&) {return *this;}
expression& operator,(const expression&) {return *this;}
template<class container>
void operator>>(container& rhs) {
for(auto it=rhs.begin(); it!=rhs.end(); ++it)
*it = execute(*it);
}
private:
//execution
template<class T>
T execute(const T& p0) {
switch(type) {
case parameter_type :
switch(value) {
case 0: return p0; //only one variable
default: throw std::runtime_error("Invalid parameter ID");
}
case constant_type:
return ((T)(value));
case multiplication_type:
return left->execute(p0) * right->execute(p0);
case addition_type:
return left->execute(p0) + right->execute(p0);
default:
throw std::runtime_error("Invalid expression type");
}
}
//This is also unused, and merely shown as extrapolation
template<class T>
T execute(const T& p0, const T& p1) {
switch(type) {
case parameter_type :
switch(value) {
case 0: return p0;
case 1: return p1; //this version has two variables
default: throw std::runtime_error("Invalid parameter ID");
}
case constant_type:
return value;
case multiplication_type:
return left->execute(p0, p1) * right->execute(p0, p1);
case addition_type:
return left->execute(p0, p1) + right->execute(p0, p1);
default:
throw std::runtime_error("Invalid expression type");
}
}
};
#define take
#define with <<
#define in >>
Compiles and runs with correct output at http://ideone.com/Dnb50
You may notice that since x must be predeclared, the with section is ignored entirely. There is almost no macro magic here: the macros effectively turn the line into "x*x << x >> container", where the << x does absolutely nothing at all. So the expression is effectively "x*x >> container".
Also note that this method is slow, because this is an interpreter, with almost all the slowdown that implies. However, it has the bonus that it is serializable, you could save the function to a file, load it later, and execute it then.
R.MartinhoFernandes has observed that the definition of x can be simplified to merely be expression x;, and it can deduce the order of parameters from the with section, but it would require a lot of rethinking of the design and would be more complicated. I might come back and add that functionality later, but in the meantime, know that it is definitely possible.
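If you can modify the expression to `take(x*x with x in container)`, then that would remove the need to predeclare `x`, with something far far simpler than expression templates.
#define with ,
#define in ,
// take receives everything as one argument; with and in only turn into commas
// when that argument is expanded, so it is forwarded to take_impl, which then sees three.
#define take(args) take_impl(args)
#define take_impl(expr, var, con) \
std::transform(con.begin(), con.end(), con.begin(), \
[](const decltype(con)::value_type& var) -> decltype(con)::value_type \
{return expr;})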
int main() {
std::vector<int> container;
container.push_back(-3);
container.push_back(0);
container.push_back(7);
take(x*x with x in container); //here's the magic line
for(unsigned i=0; i<container.size(); ++i)
std::cout << container[i] << ' ';
}
I made an answer very similar to my previous answer, but using actual expression templates, which should be much faster. Unfortunately, MSVC10 crashes when it attempts to compile this, but MSVC11, GCC 4.7.0 and Clang 3.2 all compile and run it just fine. (All other versions untested)
Here's the usage of the templates. Implementation code is here.
#define take
#define with ,
#define in >>=
//function call for containers
template<class lhsexpr, class container>
lhsexpr operator>>=(lhsexpr lhs, container& rhs)
{
for(auto it=rhs.begin(); it!=rhs.end(); ++it)
*it = lhs(*it);
return lhs;
}
int main() {
std::vector<int> container0;
container0.push_back(-4);
container0.push_back(0);
container0.push_back(3);
take x*x with x in container0; //here's the magic line
for(auto it=container0.begin(); it!=container0.end(); ++it)
std::cout << *it << ' ';
std::cout << '\n';
auto a = x+x*x+'a'*x;
auto b = a; //make sure copies work
b in container0;
b in container1;
std::cout << sizeof(b);
return 0;
}
As you can see, this is used exactly like my previous code, except now all the functions are decided at compile time, which means this will have exactly the same speed as a lambda. In fact, C++11 lambdas were preceded by boost::lambda, which works on very similar concepts.
This is a separate answer, because the code is far different, and far more complicated/intimidating. That's also why the implementation is not in the answer itself.
I don't think it is possible to get this "list comprehension" (not quite, but it is doing the same thing) à la Haskell using the preprocessor. The preprocessor just does simple search and replace with the possibility of arguments, so it cannot perform arbitrary replacements. In particular, changing the order of the parts of an expression is not possible.
I cannot see a way to do this, without changing the order, since you always need x somehow to appear before x*x to define this variable. Using a lambda will not help, since you still need x in front of the x*x part, even if it is just as an argument. This makes this syntax not possible.
There are some ways around this:
Use a different preprocessor. There are preprocessors based on the ideas of Lisp-macros, which can be made syntax aware and hence can do arbitrary transformation of one syntax tree into another. One example is Camlp4/Camlp5 developed for the OCaml language. There are some very good tutorials on how to use this for arbitrary syntax transformation. I used to have an explanation on how to use Camlp4 to transform makefiles into C code, but I cannot find it anymore. There are some other tutorials on how to do such things.
Change the syntax slightly. Such a list comprehension is essentially just a syntactic simplification of the usage of a Monad. With the arrival of C++11, Monads have become possible in C++. However, the syntactic sugar may not be. If you decide to wrap the stuff you are trying to do in a Monad, many things will still be possible; you will just have to change the syntax slightly. Implementing Monads in C++ is anything but fun though (although I first expected otherwise). Have a look here for some examples of how to get Monads in C++.
The best approach is to parse it using the preprocessor. I do believe the preprocessor can be a very powerful tool for building EDSLs (embedded domain specific languages), but you must first understand the limitations of parsing things with the preprocessor. The preprocessor can only parse out predefined tokens. So the syntax must be changed slightly by placing parentheses around the expressions, and a FREEZE macro must also surround it (I just picked FREEZE, it could be called anything):
FREEZE(take(x*x) with(x, container))
Using this syntax you can convert it to a preprocessor sequence (using the Boost.Preprocessor library, of course). Once you have it as a preprocessor sequence you can apply lots of algorithms to it to transform it however you like. A similar approach is used by the Linq library for C++, where you can write this:
LINQ(from(x, numbers) where(x > 2) select(x * x))
Now, to convert to a pp sequence first you need to define the keywords to be parsed, like this:
#define KEYWORD(x) BOOST_PP_CAT(KEYWORD_, x)
#define KEYWORD_take (take)
#define KEYWORD_with (with)
So the way this will work is when you call KEYWORD(take(x*x) with(x, container)) it will expand to (take)(x*x) with(x, container), which is the first step towards converting it to a pp sequence. Now to keep going we need to use a while construct from the Boost.Preprocessor library, but first we need to define some little macros to help us along the way:
// Detects if the first token is parenthesis
#define IS_PAREN(x) IS_PAREN_CHECK(IS_PAREN_PROBE x)
#define IS_PAREN_CHECK(...) IS_PAREN_CHECK_N(__VA_ARGS__,0)
#define IS_PAREN_PROBE(...) ~, 1,
#define IS_PAREN_CHECK_N(x, n, ...) n
// Detect if the parameter is empty, works even if parenthesis are given
#define IS_EMPTY(x) BOOST_PP_CAT(IS_EMPTY_, IS_PAREN(x))(x)
#define IS_EMPTY_0(x) BOOST_PP_IS_EMPTY(x)
#define IS_EMPTY_1(x) 0
// Retrieves the first element of the sequence
// Example:
// HEAD((1)(2)(3)) // Expands to (1)
#define HEAD(x) PICK_HEAD(MARK x)
#define MARK(...) (__VA_ARGS__),
#define PICK_HEAD(...) PICK_HEAD_I(__VA_ARGS__,)
#define PICK_HEAD_I(x, ...) x
// Retrieves the tail of the sequence
// Example:
// TAIL((1)(2)(3)) // Expands to (2)(3)
#define TAIL(x) EAT x
#define EAT(...)
This provides somewhat better detection of parentheses and emptiness. It also provides HEAD and TAIL macros which work slightly differently from BOOST_PP_SEQ_HEAD (Boost.Preprocessor can't handle sequences that have variadic parameters). Now here's how we can define a TO_SEQ macro which uses the while construct:
#define TO_SEQ(x) TO_SEQ_WHILE_M \
( \
BOOST_PP_WHILE(TO_SEQ_WHILE_P, TO_SEQ_WHILE_O, (,x)) \
)
#define TO_SEQ_WHILE_P(r, state) TO_SEQ_P state
#define TO_SEQ_WHILE_O(r, state) TO_SEQ_O state
#define TO_SEQ_WHILE_M(state) TO_SEQ_M state
#define TO_SEQ_P(prev, tail) BOOST_PP_NOT(IS_EMPTY(tail))
#define TO_SEQ_O(prev, tail) \
BOOST_PP_IF(IS_PAREN(tail), \
TO_SEQ_PAREN, \
TO_SEQ_KEYWORD \
)(prev, tail)
#define TO_SEQ_PAREN(prev, tail) \
(prev (HEAD(tail)), TAIL(tail))
#define TO_SEQ_KEYWORD(prev, tail) \
TO_SEQ_REPLACE(prev, KEYWORD(tail))
#define TO_SEQ_REPLACE(prev, tail) \
(prev HEAD(tail), TAIL(tail))
#define TO_SEQ_M(prev, tail) prev
Now when you call TO_SEQ(take(x*x) with(x, container)) you should get a sequence (take)((x*x))(with)((x, container)).
Now, this sequence is much easier to work with (because of the Boost.Preprocessor library). You can now reverse it, transform it, filter it, fold over it, etc. This is extremely powerful, and is much more flexible than having them defined as macros. For example, in the Linq library the query from(x, numbers) where(x > 2) select(x * x) gets transformed into these macros:
LINQ_WHERE(x, numbers)(x > 2) LINQ_SELECT(x, numbers)(x * x)
With these macros, it will then generate the lambda for the list comprehension, but they have much more to work with when generating the lambda. The same can be done in your library too: take(x*x) with(x, container) could be transformed into something like this:
FREEZE_TAKE(x, container, x*x)
Plus, you aren't defining macros like take which invade the global space.
Note: These macros here require a C99 preprocessor and thus won't work in MSVC. (There are workarounds though.)
Related
I've created a function declared as:
template <typename Container, typename Task>
void parallel_for_each(Container &container, Task task,
unsigned number_of_threads = std::thread::hardware_concurrency())
It's not difficult to guess what it is supposed to do. I'd like to create a macro simplifying the syntax of this function and making its syntax "loop-like". I've come up with an idea:
#define in ,
#define pforeach(Z,X,Y) parallel_for_each(X,[](Z)->void{Y;})
Usage like this:
pforeach(double &element, vec,
{
element *= 2;
});
works as expected, but this one:
pforeach(double &element in vec,
{
element *= 2;
element /= 2;
});
gives an error
macro "pforeach" requires 3 arguments, but only 2 given
Do you have any idea how to write a macro allowing even "nicer" syntax? Why "in" doesn't stand for comma in my code?
The reason that in is not replaced is that it appears inside an argument to your function-like macro. The arguments are identified (and counted) before any replacement happens inside them, so for in to act as a separator, the arguments have to be forwarded to another macro first. Try
#define in ,
#define pforeach_(Z,X,Y) parallel_for_each(X,[](Z)->void{Y;})
#define pforeach(...) pforeach_(__VA_ARGS__)
Note: Defining in as , is not gonna end well!
An idea to add "nicer" syntax:
template <typename Container>
struct Helper {
Container&& c;
template <typename Arg>
void operator=(Arg&& arg) {
parallel_for_each(std::forward<Container>(c), std::forward<Arg>(arg));
}
};
#define CONCAT_(a,b) a##b
#define CONCAT(a,b) CONCAT_(a,b)
// Easier with Boost.PP
#define DEC_1 0
#define DEC_2 1
#define DEC_3 2
#define DEC_4 3
#define DEC_5 4
#define DEC_6 5
#define DEC_7 6
#define DEC_8 7
#define DEC(i) CONCAT(DEC_,i)
#define pforeach(Z, ...) \
Helper<decltype((__VA_ARGS__))> CONCAT(_unused_obj, __COUNTER__){__VA_ARGS__}; \
CONCAT(_unused_obj, DEC(__COUNTER__))=[](Z)
Usable as
int a[] = {1, 2, 3};
pforeach(int i, a) {
std::cout << i << ", ";
};
pforeach(int i, std::vector<int>{1, 2, 3}) {
std::cout << -i << ", ";
};
Demo.
Has several disadvantages though. I'd just stick with what you've got so far.
Why "in" doesn't stand for comma in my code?
Because that replacement is performed after macro arguments are determined. Quoting standard draft N3797, § 16.3.1 Argument substitution:
After the arguments for the invocation of a function-like macro have been identified, argument substitution takes place. ... Before being substituted, each argument’s preprocessing tokens are completely macro replaced as if they formed the rest of the preprocessing file; no other preprocessing tokens are available.
So the preprocessor identifies pforeach(double &element in vec, {}) as a function-like macro call with two arguments:
The first consists of the tokens double, &, element, in and vec, and is bound to parameter Z
The second consists of the tokens { and }, and is bound to parameter X
The argument for Y is obviously missing.
Do you have any idea how to write a macro allowing even "nicer" syntax?
It is hard to answer and it is a matter of taste. Anyway, C++ has rich capabilities for patching syntax with operator overloading, but you can't build a DSL with that, so it is better to use the default syntax; it is not that ugly (and it is also easy to read):
parallel_for_each(vec, [](double& el){ el *= 2; })
There is no macro language. Macros are handled by the C/C++ preprocessor. The implementation of the preprocessors may vary.
Most preprocessors expect that you pass the exact number of parameters. I found that the GNU preprocessor has less strict checking of parameters, which allows a kind of variadic list. But in general a macro won't help you with your task.
I recommend to write the short statement in a function instead of a macro. An inline function is as fast and short as a macro, but type safe.
Further the function allows default parameter values. So you can skip something.
Trying to improve the idea of @Columbo:
template <typename Container>
struct __pforeach__helper {
Container &&c;
template <typename Arg>
void operator=(Arg&& arg) {
parallel_for_each(std::forward<Container>(c), std::forward<Arg>(arg));
}
};
//additional helper function
template <typename Container>
__pforeach__helper<Container> __create__pforeach__helper(Container &&c)
{
// forward, so that an rvalue container can bind to the Container&& member
return __pforeach__helper<Container>{std::forward<Container>(c)};
}
#define pforeach(Z,C) \
__create__pforeach__helper(C)=[](Z)
It doesn't rely on __COUNTER__ and doesn't require defining DEC_x macros. Any feedback is most welcome!
The short circuiting behaviour of the operators && and || is an amazing tool for programmers.
But why do they lose this behaviour when overloaded? I understand that operators are merely syntactic sugar for functions but the operators for bool have this behaviour, why should it be restricted to this single type? Is there any technical reasoning behind this?
All design processes result in compromises between mutually incompatible goals. Unfortunately, the design process for the overloaded && operator in C++ produced a confusing end result: that the very feature you want from && -- its short-circuiting behavior -- is omitted.
The details of how that design process ended up in this unfortunate place, those I don't know. It is however relevant to see how a later design process took this unpleasant outcome into account. In C#, the overloaded && operator is short circuiting. How did the designers of C# achieve that?
One of the other answers suggests "lambda lifting". That is:
A && B
could be realized as something morally equivalent to:
operator_&& ( A, ()=> B )
where the second argument uses some mechanism for lazy evaluation so that when evaluated, the side effects and value of the expression are produced. The implementation of the overloaded operator would only do the lazy evaluation when necessary.
This is not what the C# design team did. (Aside: though lambda lifting is what I did when it came time to do expression tree representation of the ?? operator, which requires certain conversion operations to be performed lazily. Describing that in detail would however be a major digression. Suffice to say: lambda lifting works but is sufficiently heavyweight that we wished to avoid it.)
Rather, the C# solution breaks the problem down into two separate problems:
should we evaluate the right-hand operand?
if the answer to the above was "yes", then how do we combine the two operands?
Therefore the problem is solved by making it illegal to overload && directly. Rather, in C# you must overload two operators, each of which answers one of those two questions.
class C
{
// Is this thing "false-ish"? If yes, we can skip computing the right
// hand side of an &&
public static bool operator false (C c) { whatever }
// If we didn't skip the RHS, how do we combine them?
public static C operator & (C left, C right) { whatever }
...
(Aside: actually, three. C# requires that if operator false is provided then operator true must also be provided, which answers the question: is this thing "true-ish?". Typically there would be no reason to provide only one such operator so C# requires both.)
Consider a statement of the form:
C cresult = cleft && cright;
The compiler generates code for this as though you had written this pseudo-C#:
C cresult;
C tempLeft = cleft;
cresult = C.false(tempLeft) ? tempLeft : C.&(tempLeft, cright);
As you can see, the left hand side is always evaluated. If it is determined to be "false-ish" then it is the result. Otherwise, the right hand side is evaluated, and the eager user-defined operator & is invoked.
The || operator is defined in the analogous way, as an invocation of operator true and the eager | operator:
cresult = C.true(tempLeft) ? tempLeft : C.|(tempLeft , cright);
By defining all four operators -- true, false, & and | -- C# allows you to not only say cleft && cright but also non-short-circuiting cleft & cright, and also if (cleft) if (cright) ..., and c ? consequence : alternative and while(c), and so on.
Now, I said that all design processes are the result of compromise. Here the C# language designers managed to get short-circuiting && and || right, but doing so requires overloading four operators instead of two, which some people find confusing. The operator true/false feature is one of the least well understood features in C#. The goal of having a sensible and straightforward language that is familiar to C++ users was opposed by the desires to have short circuiting and the desire to not implement lambda lifting or other forms of lazy evaluation. I think that was a reasonable compromise position, but it is important to realize that it is a compromise position. Just a different compromise position than the designers of C++ landed on.
If the subject of language design for such operators interests you, consider reading my series on why C# does not define these operators on nullable Booleans:
http://ericlippert.com/2012/03/26/null-is-not-false-part-one/
The point is that (within the bounds of C++98) the right-hand operand would be passed to the overloaded operator function as argument. In doing so, it would already be evaluated. There is nothing the operator||() or operator&&() code could or could not do that would avoid this.
The original operator is different, because it's not a function, but implemented at a lower level of the language.
Additional language features could have made non-evaluation of the right-hand operand syntactically possible. However, they didn't bother because there are only a select few cases where this would be semantically useful. (Just like ? :, which is not available for overloading at all.)
(It took them 16 years to get lambdas into the standard...)
As for the semantical use, consider:
objectA && objectB
This boils down to:
template< typename T >
ClassA.operator&&( T const & objectB )
Think about what exactly you'd like to do with objectB (of unknown type) here, other than calling a conversion operator to bool, and how you'd put that into words for the language definition.
And if you are calling conversion to bool, well...
objectA && objectB
does the same thing, now does it? So why overload in the first place?
A feature has to be thought of, designed, implemented, documented and shipped.
Now we thought of it, let's see why it might be easy now (and hard to do then). Also keep in mind that there's only a limited amount of resources, so adding it might have chopped something else (What would you like to forego for it?).
In theory, all operators could allow short-circuiting behavior with only one "minor" additional language-feature, as of C++11 (when lambdas were introduced, 32 years after "C with classes" started in 1979, a still respectable 13 after C++98):
C++ would just need a way to annotate an argument as lazy-evaluated - a hidden-lambda - to avoid the evaluation until necessary and allowed (pre-conditions met).
What would that theoretical feature look like (Remember that any new features should be widely usable)?
An annotation lazy, which applied to a function-argument makes the function a template expecting a functor, and makes the compiler pack the expression into a functor:
A operator&&(B b, __lazy C c) {return c;}
// And be called like
exp_b && exp_c;
// or
operator&&(exp_b, exp_c);
It would look under the cover like:
template<class Func> A operator&&(B b, Func&& f) {auto&& c = f(); return c;}
// With `f` restricted to no-argument functors returning a `C`.
// And the call:
operator&&(exp_b, [&]{return exp_c;});
Take special note that the lambda stays hidden, and will be called at most once.
There should be no performance-degradation due to this, aside from reduced chances of common-subexpression-elimination.
Beside implementation-complexity and conceptual complexity (every feature increases both, unless it sufficiently eases those complexities for some other features), let's look at another important consideration: Backwards-compatibility.
While this language-feature would not break any code, it would subtly change any API taking advantage of it, which means any use in existing libraries would be a silent breaking change.
BTW: This feature, while easier to use, is strictly stronger than the C# solution of splitting && and || into two functions each for separate definition.
With retrospective rationalization, mainly because
in order to have guaranteed short-circuiting (without introducing new syntax) the operators would have to be restricted to an actual first argument convertible to bool, and
short circuiting can be easily expressed in other ways, when needed.
For example, if a class T has associated && and || operators, then the expression
auto x = a && b || c;
where a, b and c are expressions of type T, can be expressed with short circuiting as
auto&& and_arg = a;
auto&& and_result = (and_arg? and_arg && b : and_arg);
auto x = (and_result? and_result : and_result || c);
or perhaps more clearly as
auto x = [&]() -> T_op_result
{
auto&& and_arg = a;
auto&& and_result = (and_arg? and_arg && b : and_arg);
if( and_result ) { return and_result; } else { return and_result || c; }
}();
The apparent redundancy preserves any side-effects from the operator invocations.
While the lambda rewrite is more verbose, its better encapsulation allows one to define such operators.
I’m not entirely sure of the standard-conformance of all of the following (still a bit of influenza), but it compiles cleanly with Visual C++ 12.0 (2013) and MinGW g++ 4.8.2:
#include <iostream>
using namespace std;
void say( char const* s ) { cout << s; }
struct S
{
using Op_result = S;
bool value;
auto is_true() const -> bool { say( "!! " ); return value; }
friend
auto operator&&( S const a, S const b )
-> S
{ say( "&& " ); return a.value? b : a; }
friend
auto operator||( S const a, S const b )
-> S
{ say( "|| " ); return a.value? a : b; }
friend
auto operator<<( ostream& stream, S const o )
-> ostream&
{ return stream << o.value; }
};
template< class T >
auto is_true( T const& x ) -> bool { return !!x; }
template<>
auto is_true( S const& x ) -> bool { return x.is_true(); }
#define SHORTED_AND( a, b ) \
[&]() \
{ \
auto&& and_arg = (a); \
return (is_true( and_arg )? and_arg && (b) : and_arg); \
}()
#define SHORTED_OR( a, b ) \
[&]() \
{ \
auto&& or_arg = (a); \
return (is_true( or_arg )? or_arg : or_arg || (b)); \
}()
auto main()
-> int
{
cout << boolalpha;
for( int a = 0; a <= 1; ++a )
{
for( int b = 0; b <= 1; ++b )
{
for( int c = 0; c <= 1; ++c )
{
S oa{!!a}, ob{!!b}, oc{!!c};
cout << a << b << c << " -> ";
auto x = SHORTED_OR( SHORTED_AND( oa, ob ), oc );
cout << x << endl;
}
}
}
}
Output:
000 -> !! !! || false
001 -> !! !! || true
010 -> !! !! || false
011 -> !! !! || true
100 -> !! && !! || false
101 -> !! && !! || true
110 -> !! && !! true
111 -> !! && !! true
Here each !! bang-bang shows a conversion to bool, i.e. an argument value check.
Since a compiler can easily do the same, and additionally optimize it, this is a demonstrated possible implementation and any claim of impossibility must be put in the same category as impossibility claims in general, namely, generally bollocks.
tl;dr: it is not worth the effort, due to very low demand (who would use the feature?) compared to rather high costs (special syntax needed).
The first thing that comes to mind is that operator overloading is just a fancy way to write functions, whereas the boolean versions of the operators || and && are built-in. That means that the compiler has the freedom to short-circuit them, while the expression x = y && z with nonboolean y and z has to lead to a call to a function like X operator&& (Y, Z). This would mean that y && z is just a fancy way to write operator&&(y,z), which is just a call of an oddly named function where both parameters have to be evaluated before calling the function (including anything that would deem short-circuiting appropriate).
However, one could argue that it should be possible to make the translation of && operators somewhat more sophisticated, like it is for the new operator which is translated into calling the function operator new followed by a constructor call.
Technically this would be no problem; one would have to define a language syntax specific to the precondition that enables short-circuiting. However, the use of short-circuits would be restricted to cases where Y is convertible to X, or else there would have to be additional info on how to actually do the short-circuiting (i.e. compute the result from only the first parameter). The result would have to look somewhat like this:
X operator&&(Y const& y, Z const& z)
{
if (shortcircuitCondition(y))
return shortcircuitEvaluation(y);
<"Syntax for an evaluation-Point for z here">
return actualImplementation(y,z);
}
One seldom wants to overload operator|| and operator&&, because there is seldom a case where writing a && b is actually intuitive in a nonboolean context. The only exceptions I know of are expression templates, e.g. for embedded DSLs. And only a handful of those few cases would benefit from short-circuit evaluation. Expression templates usually don't, because they are used to form expression trees that are evaluated later, so you always need both sides of the expression.
In short: neither compiler writers nor standards authors felt the need to jump through hoops and define and implement additional cumbersome syntax, just because one in a million might get the idea that it would be nice to have short-circuiting on user-defined operator&& and operator||, only to conclude that it is no less effort than writing the logic by hand.
Lambdas are not the only way to introduce laziness. Lazy evaluation is relatively straightforward using expression templates in C++. There is no need for a lazy keyword, and it can be implemented in C++98. Expression trees are already mentioned above. Expression templates are a poor (but clever) man's expression trees. The trick is to convert the expression into a tree of recursively nested instantiations of the Expr template. The tree is evaluated separately after construction.
The following code implements short-circuited && and || operators for class S as long as it provides logical_and and logical_or free functions and it is convertible to bool. The code is in C++14 but the idea is applicable in C++98 also. See live example.
#include <iostream>
struct S
{
bool val;
explicit S(int i) : val(i) {}
explicit S(bool b) : val(b) {}
template <class Expr>
S (const Expr & expr)
: val(evaluate(expr).val)
{ }
template <class Expr>
S & operator = (const Expr & expr)
{
val = evaluate(expr).val;
return *this;
}
explicit operator bool () const
{
return val;
}
};
S logical_and (const S & lhs, const S & rhs)
{
std::cout << "&& ";
return S{lhs.val && rhs.val};
}
S logical_or (const S & lhs, const S & rhs)
{
std::cout << "|| ";
return S{lhs.val || rhs.val};
}
const S & evaluate(const S &s)
{
return s;
}
template <class Expr>
S evaluate(const Expr & expr)
{
return expr.eval();
}
struct And
{
template <class LExpr, class RExpr>
S operator ()(const LExpr & l, const RExpr & r) const
{
const S & temp = evaluate(l);
return temp? logical_and(temp, evaluate(r)) : temp;
}
};
struct Or
{
template <class LExpr, class RExpr>
S operator ()(const LExpr & l, const RExpr & r) const
{
const S & temp = evaluate(l);
return temp? temp : logical_or(temp, evaluate(r));
}
};
template <class Op, class LExpr, class RExpr>
struct Expr
{
Op op;
const LExpr &lhs;
const RExpr &rhs;
Expr(const LExpr& l, const RExpr & r)
: lhs(l),
rhs(r)
{}
S eval() const
{
return op(lhs, rhs);
}
};
template <class LExpr>
auto operator && (const LExpr & lhs, const S & rhs)
{
return Expr<And, LExpr, S> (lhs, rhs);
}
template <class LExpr, class Op, class L, class R>
auto operator && (const LExpr & lhs, const Expr<Op,L,R> & rhs)
{
return Expr<And, LExpr, Expr<Op,L,R>> (lhs, rhs);
}
template <class LExpr>
auto operator || (const LExpr & lhs, const S & rhs)
{
return Expr<Or, LExpr, S> (lhs, rhs);
}
template <class LExpr, class Op, class L, class R>
auto operator || (const LExpr & lhs, const Expr<Op,L,R> & rhs)
{
return Expr<Or, LExpr, Expr<Op,L,R>> (lhs, rhs);
}
std::ostream & operator << (std::ostream & o, const S & s)
{
o << s.val;
return o;
}
S and_result(S s1, S s2, S s3)
{
return s1 && s2 && s3;
}
S or_result(S s1, S s2, S s3)
{
return s1 || s2 || s3;
}
int main(void)
{
for(int i=0; i<= 1; ++i)
for(int j=0; j<= 1; ++j)
for(int k=0; k<= 1; ++k)
std::cout << and_result(S{i}, S{j}, S{k}) << std::endl;
for(int i=0; i<= 1; ++i)
for(int j=0; j<= 1; ++j)
for(int k=0; k<= 1; ++k)
std::cout << or_result(S{i}, S{j}, S{k}) << std::endl;
return 0;
}
Short circuiting the logical operators is allowed because it is an "optimisation" in the evaluation of the associated truth tables. It is a function of the logic itself, and this logic is defined.
Is there actually a reason why overloaded && and || don't short circuit?
Custom overloaded logical operators are not obliged to follow the logic of these truth tables.
But why do they lose this behaviour when overloaded?
Hence the entire function needs to be evaluated as per normal. The compiler must treat it as a normal overloaded operator (or function) and it can still apply optimisations as it would with any other function.
People overload the logical operators for a variety of reasons. For example; they may have specific meaning in a specific domain that is not the "normal" logical ones people are accustomed to.
The short-circuiting is because of the truth table of "and" and "or". How would you know what operation the user is going to define and how would you know you won't have to evaluate the second operand?
but the operators for bool have this behaviour, why should it be restricted to this single type?
I just want to answer this one part. The reason is that the built-in && and || expressions are not implemented with functions as overloaded operators are.
Having the short-circuiting logic built-in to the compiler's understanding of specific expressions is easy. It's just like any other built-in control flow.
But operator overloading is implemented with functions instead, which have particular rules, one of which is that all the expressions used as arguments get evaluated before the function is called. Obviously different rules could be defined, but that's a bigger job.
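If you need the short-circuit behaviour for your own code today, one workaround is to pass the right operand as a callable, so that it only runs on demand. A minimal sketch, where lazy_and and lazy_trace are hypothetical names invented for this example:

```cpp
#include <string>

// Hypothetical helper: the right operand arrives as a callable, so it is
// only invoked when the left operand is true.
template <class RightThunk>
bool lazy_and(bool lhs, RightThunk rhs) {
    return lhs ? static_cast<bool>(rhs()) : false;
}

// Returns "R" if the right operand was evaluated, "" otherwise.
inline std::string lazy_trace(bool left) {
    std::string log;
    bool r = lazy_and(left, [&log]() -> bool { log += "R"; return true; });
    (void)r;
    return log;
}
```

lazy_trace(false) leaves the log empty because the lambda is never called, while lazy_trace(true) records the call. The price is the extra lambda syntax at every call site, which is exactly the "bigger job" a language-level rule would have to hide.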
I see questions on SO every so often about overloading the comma operator in C++ (mainly unrelated to the overloading itself, but things like the notion of sequence points), and it makes me wonder:
When should you overload the comma? What are some examples of its practical uses?
I just can't think of any examples off the top of my head where I've seen or needed to do something like
foo, bar;
in real-world code, so I'm curious as to when (if ever) this is actually used.
I have used the comma operator in order to index maps with multiple indices.
enum Place {new_york, washington, ...};
pair<Place, Place> operator , (Place p1, Place p2)
{
return make_pair(p1, p2);
}
map< pair<Place, Place>, double> distance;
distance[new_york, washington] = 100;
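One caveat, as far as I know: an unparenthesized comma inside [] was deprecated in C++20, and C++23 reads it as a subscript with several arguments. Wrapping the comma expression in parentheses keeps the trick working in every language version. A sketch of that variant (demo_distance is a made-up name, and the enum is cut down to two values):

```cpp
#include <map>
#include <utility>

enum Place { new_york, washington };

// Overloading operator, is allowed here because the operands are enums.
inline std::pair<Place, Place> operator,(Place p1, Place p2) {
    return std::make_pair(p1, p2);
}

inline double demo_distance() {
    std::map<std::pair<Place, Place>, double> distance;
    // The extra parentheses make this a single comma expression,
    // independent of how the compiler parses commas inside [].
    distance[(new_york, washington)] = 100;
    return distance[(new_york, washington)];
}
```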
Let's change the emphasis a bit to:
When should you overload the comma?
The answer: Never.
The exception: If you're doing template metaprogramming, operator, has a special place at the very bottom of the operator precedence list, which can come in handy for constructing SFINAE-guards, etc.
The only two practical uses I've seen of overloading operator, are both in Boost:
Boost.Assign
Boost.Phoenix – it's fundamental here in that it allows Phoenix lambdas to support multiple statements
Boost.Assign uses it, to let you do things like:
vector<int> v;
v += 1,2,3,4,5,6,7,8,9;
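The mechanism behind that syntax can be sketched without Boost. This is not Boost.Assign's actual implementation, just an assumption-laden illustration of the idea: operator+= appends the first value and returns a small proxy (here called Appender, a made-up name) whose operator, appends the rest.

```cpp
#include <vector>

// Hypothetical proxy: remembers the target vector and appends on each comma.
template <class T>
class Appender {
public:
    explicit Appender(std::vector<T>& target) : target_(target) {}

    Appender& operator,(const T& value) {
        target_.push_back(value);
        return *this;
    }

private:
    std::vector<T>& target_;
};

// `v += x` appends x and returns the proxy, so `, y, z` keeps appending.
template <class T>
Appender<T> operator+=(std::vector<T>& v, const T& value) {
    v.push_back(value);
    return Appender<T>(v);
}

inline std::vector<int> demo_append() {
    std::vector<int> v;
    v += 1, 2, 3, 4, 5;  // parsed as ((((v += 1), 2), 3), 4), 5
    return v;
}
```

It works because += binds tighter than the comma, so the proxy exists before the first comma is applied.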
And I've seen it used for quirky language hacks, I'll see if I can find some.
Aha, I do remember one of those quirky uses: collecting multiple expressions. (Warning, dark magic.)
The comma has an interesting property in that it can take a parameter of type void. If it is the case, then the built-in comma operator is used.
This is handy when you want to determine if an expression has type void:
namespace detail_
{
template <typename T>
struct tag
{
static T get();
};
template <typename T, typename U>
tag<char(&)[2]> operator,(T, tag<U>);
template <typename T, typename U>
tag<U> operator,(tag<T>, tag<U>);
}
#define HAS_VOID_TYPE(expr) \
(sizeof((::detail_::tag<int>(), \
(expr), \
::detail_::tag<char>()).get()) == 1)
I let the reader figure out as an exercise what is going on. Remember that operator, associates left to right.
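For comparison, since C++11 the same question can be asked directly with decltype and std::is_void, without any comma tricks. A small sketch (returns_nothing and returns_int are placeholder functions made up for the example):

```cpp
#include <type_traits>

inline void returns_nothing() {}
inline int returns_int() { return 0; }

// decltype inspects the type of the expression without evaluating it.
static_assert(std::is_void<decltype(returns_nothing())>::value,
              "a call to a void function has type void");
static_assert(!std::is_void<decltype(returns_int())>::value,
              "a call to an int function does not");
```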
Similar to @GMan's Boost.Assign example, Blitz++ overloads the comma operator to provide a convenient syntax for working with multidimensional arrays. For example:
Array<double,2> y(4,4); // A 4x4 array of double
y = 1, 0, 0, 0,
0, 1, 0, 0,
0, 0, 1, 0,
0, 0, 0, 1;
In SOCI - The C++ Database Access Library it is used for the implementation of the inbound part of the interface:
sql << "select name, salary from persons where id = " << id,
into(name), into(salary);
From the rationale FAQ:
Q: Overloaded comma operator is just obfuscation, I don't like it.
Well, consider the following:
"Send the query X to the server Y and put result into variable Z."
Above, the "and" plays a role of the comma. Even if overloading the comma operator is not a very popular practice in C++, some libraries do this, achieving terse and easy to learn syntax. We are pretty sure that in SOCI the comma operator was overloaded with a good effect.
I use the comma operator for printing log output. It actually is very similar to ostream::operator<< but I find the comma operator actually better for the task.
So I have:
template <typename T>
MyLogType& MyLogType::operator,(const T& data) { /* do the same thing as with ostream::operator<< */ return *this; }
It has these nice properties
The comma operator has the lowest priority. So if you want to stream an expression, things do not get messed up if you forget the parentheses. Compare:
myLog << "The mask result is: " << x&y; //operator precedence would mess this one up
myLog, "The result is: ", x&y;
you can even mix comparisons operators inside without a problem, e.g.
myLog, "a==b: ", a==b;
The comma operator is visually small. It does not mess up with reading when gluing many things together
myLog, "Coords=", g, ':', s, ':', p;
It aligns with the meaning of the comma operator, i.e. "print this" and then "print that".
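Here is a self-contained sketch of such a logger, under the assumption that it collects everything into a string rather than printing (the str() member and demo_log are additions for the example, not part of the original class):

```cpp
#include <sstream>
#include <string>

// Hypothetical logger that buffers its output in a string stream.
class MyLogType {
public:
    template <typename T>
    MyLogType& operator,(const T& data) {
        out_ << data;   // same work ostream::operator<< would do
        return *this;   // returning *this allows chaining with more commas
    }

    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

inline std::string demo_log() {
    MyLogType myLog;
    int x = 6, y = 3;
    // No parentheses needed: & and == both bind tighter than the comma.
    myLog, "mask=", x & y, " eq=", x == y;
    return myLog.str();
}
```

demo_log() yields "mask=2 eq=0": 6 & 3 is 2, and the false comparison is streamed as 0.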
One possibility is the Boost Assign library (though I'm pretty sure some people would consider this abuse rather than a good use).
Boost Spirit probably overloads the comma operator as well (it overloads almost everything else...)
Along the same lines, I was sent a GitHub pull request with a comma operator overload. It looked something like the following:
class Mylogger {
public:
template <typename T>
Mylogger & operator,(const T & val) {
std::cout << val;
return * this;
}
};
#define Log(level,args...) \
do { Mylogger logv; logv,level, ":", ##args; } while (0)
then in my code I can do:
Log(2, "INFO: setting variable \"", 1, "\"\n");
Can someone explain why this is a good or bad usage case?
One practical use is together with variable arguments in a macro. By the way, variadic macros were earlier a GCC extension and are now part of the C++11 standard.
Suppose we have a class X, which adds object of type A into it. i.e.
class X {
public: X& operator+= (const A&);
};
What if we want to add one or more objects of type A to an X named buffer?
For example,
#define ADD(buffer, ...) buffer += __VA_ARGS__
Above macro, if used as:
ADD(buffer, objA1, objA2, objA3);
then it will expand to:
buffer += objA1, objA2, objA3;
This is a natural fit for the comma operator, because the variable arguments expand with commas between them.
To make it work, we overload the comma operator and have it forward to += as below:
X& X::operator, (const A& a) { // declared inside `class X`
*this += a; // calls `operator+=`
return *this; // allows further commas to chain
}
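Put together, a complete sketch could look like this. The contents of A and X are assumptions made only so the example compiles (X just stores the ids it receives), and demo_add is a made-up driver:

```cpp
#include <cstddef>
#include <vector>

struct A {
    int id;
};

class X {
public:
    X& operator+=(const A& a) {
        ids_.push_back(a.id);
        return *this;
    }

    // The comma forwards to +=, so `x += a, b, c` adds all three.
    X& operator,(const A& a) { return *this += a; }

    std::size_t size() const { return ids_.size(); }

private:
    std::vector<int> ids_;
};

#define ADD(buffer, ...) buffer += __VA_ARGS__

inline std::size_t demo_add() {
    X buffer;
    A objA1{1}, objA2{2}, objA3{3};
    ADD(buffer, objA1, objA2, objA3);  // expands to buffer += objA1, objA2, objA3
    return buffer.size();
}
```

Without the operator, overload, only objA1 would be added and the other two would be evaluated and discarded by the built-in comma.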
Here is an example from OpenCV documentation (http://docs.opencv.org/modules/core/doc/basic_structures.html#mat). The comma operator is used for cv::Mat initialization:
// create a 3x3 double-precision identity matrix
Mat M = (Mat_<double>(3,3) << 1, 0, 0, 0, 1, 0, 0, 0, 1);
I wonder where should we use lambda expression over functor in C++. To me, these two techniques are basically the same, even functor is more elegant and cleaner than lambda. For example, if I want to reuse my predicate, I have to copy the lambda part over and over. So when does lambda really come in to place?
A lambda expression creates a nameless functor; it's syntactic sugar.
So you mainly use it if it makes your code look better. That generally would occur if either (a) you aren't going to reuse the functor, or (b) you are going to reuse it, but from code so totally unrelated to the current code that in order to share it you'd basically end up creating my_favourite_two_line_functors.h, and have disparate files depend on it.
Pretty much the same conditions under which you would type any line(s) of code, and not abstract that code block into a function.
That said, with range-for statements in C++0x, there are some places where you would have used a functor before where it might well make your code look better now to write the code as a loop body, not a functor or a lambda.
1) It's trivial and trying to share it is more work than benefit.
2) Defining a functor simply adds complexity (due to having to make a bunch of member variables and crap).
If neither of those things is true then maybe you should think about defining a functor.
Edit: it seems that you need an example of when it would be nice to use a lambda over a functor. Here you go:
typedef std::vector< std::pair<int,std::string> > whatsit_t;
int find_it(std::string value, whatsit_t const& stuff)
{
auto fit = std::find_if(stuff.begin(), stuff.end(), [value](whatsit_t::value_type const& vt) -> bool { return vt.second == value; });
if (fit == stuff.end()) throw std::wtf_error();
return fit->first;
}
Without lambdas you'd have to use something that similarly constructs a functor on the spot or write an externally linkable functor object for something that's annoyingly trivial.
BTW, I think maybe wtf_error is an extension.
Lambdas are basically just syntactic sugar that implement functors (NB: closures are not simple.) In C++0x, you can use the auto keyword to store lambdas locally, and std::function will enable you to store lambdas, or pass them around in a type-safe manner.
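A short sketch of the two storage options; demo_function_storage and the two lambdas are invented for the illustration:

```cpp
#include <functional>
#include <vector>

inline int demo_function_storage() {
    int offset = 10;

    // `auto` keeps the closure's own unnamed type.
    auto add_offset = [offset](int n) { return n + offset; };

    // std::function erases the type, so different callables with the same
    // signature can live in one container or be passed across interfaces.
    std::vector<std::function<int(int)>> steps;
    steps.push_back(add_offset);
    steps.push_back([](int n) { return n * 2; });

    int value = 1;
    for (const auto& step : steps) value = step(value);
    return value;  // (1 + 10) * 2
}
```

auto is the cheaper choice when the lambda stays local; std::function costs a little indirection but gives the lambda a nameable type.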
Check out the Wikipedia article on C++0x.
Small functions that are not repeated.
The main complaint about functors is that they are not in the same place where they are used. So you had to find and read the functor out of context from the place it was being used in (even if it is only being used in one place).
The other problem was that a functor required some wiring to get parameters into the functor object. Not complex, but all basic boilerplate code. And boilerplate is susceptible to cut-and-paste problems.
Lambdas try to fix both of these. But I would use functors if the function is repeated in multiple places or is larger than (can't think up an appropriate term as it will be context sensitive) small.
Lambdas and functors have context. A functor is a class and therefore can be more complex than a lambda. A function has no context.
#include <iostream>
#include <list>
#include <string>
#include <vector>
using namespace std;
//Functions have no context, mod is always 3
bool myFunc(int n) { return n % 3 == 0; }
//Functors have context, e.g. _v
//Functors can be more complex, e.g. additional addNum(...) method
class FunctorV
{
public:
FunctorV(int num ) : _v{num} {}
void addNum(int num) { _v.push_back(num); }
bool operator() (int num)
{
for(int i : _v) {
if( num % i == 0)
return true;
}
return false;
}
private:
vector<int> _v;
};
void print(string prefix,list<int>& l)
{
cout << prefix << "l={ ";
for(int i : l)
cout << i << " ";
cout << "}" << endl;
}
int main()
{
list<int> l={1,2,3,4,5,6,7,8,9};
print("initial for each test: ",l);
cout << endl;
//function, so no context.
l.remove_if(myFunc);
print("function mod 3: ",l);
cout << endl;
//nameless lambda, context is x
l={1,2,3,4,5,6,7,8,9};
int x = 3;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=3: ",l);
x = 4;
l.remove_if([x](int n){ return n % x == 0; });
print("lambda mod x=4: ",l);
cout << endl;
//functor has context and can be more complex
l={1,2,3,4,5,6,7,8,9};
FunctorV myFunctor(3);
myFunctor.addNum(4);
l.remove_if(myFunctor);
print("functor mod v={3,4}: ",l);
return 0;
}
Output:
initial for each test: l={ 1 2 3 4 5 6 7 8 9 }
function mod 3: l={ 1 2 4 5 7 8 }
lambda mod x=3: l={ 1 2 4 5 7 8 }
lambda mod x=4: l={ 1 2 5 7 }
functor mod v={3,4}: l={ 1 2 5 7 }
First, I would like to clear up some clutter here.
There are two different things
Lambda function
Lambda expression/functor.
A lambda expression, i.e. [] () -> return-type {}, produces a closure (i.e. a kind of functor), although a compiler may optimize the closure object away. A lambda that captures nothing can also be converted to a function pointer, and you can force that conversion by putting a + sign before [], as in +[] () -> return-type {}. This will create a function pointer.
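A small sketch of the + trick, checked with type traits (demo_function_pointer is a made-up name; the conversion is only available for lambdas that capture nothing):

```cpp
#include <type_traits>

inline int demo_function_pointer() {
    auto closure = [](int n) { return n + 1; };   // holds the closure object
    auto pointer = +[](int n) { return n + 1; };  // unary + converts to int (*)(int)

    static_assert(std::is_same<decltype(pointer), int (*)(int)>::value,
                  "unary + yields a plain function pointer");
    static_assert(!std::is_pointer<decltype(closure)>::value,
                  "without +, the variable has the closure type");

    return closure(1) + pointer(1);  // both are callable the same way
}
```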
Now, coming to your question. You can use lambda repeatedly as follows:
int main()
{
auto print = [i=0] () mutable {return i++;};
cout<<print()<<endl;
cout<<print()<<endl;
cout<<print()<<endl;
// Call as many time as you want
return 0;
}
You should use a lambda wherever it improves expressiveness and maintainability, for example in custom deleters for smart pointers and with most of the STL algorithms.
If you combine lambdas with other features like constexpr, variadic template parameter packs, or generic lambdas, you can achieve many things.
You can find more about it here
As you pointed out, it works best when you need a one-off and the coding overhead of writing it out as a function isn't worth it.
Conceptually, the decision of which to use is driven by the same criterion as using a named variable versus an in-place expression or constant...
size_t length = strlen(x) + sizeof(y) + z++ + strlen('\0');
...
allocate(length);
std::cout << length;
...here, creating a length variable encourages the programmer to consider its correctness and meaning in isolation from its later use. The name hopefully conveys enough that it can be understood intuitively and independently of its initial value. It then allows the value to be used several times without repeating the expression (while handling z being different). While here...
allocate(strlen(x) + sizeof(y) + z++ + strlen('\0'));
...the total code is reduced and the value is localised at the point it's needed. The only thing to "carry forwards" from a reading of this line is the side effects of allocation and increment (z), but there's no extra local variable with scope or later use to consider. The programmer has to mentally juggle less state while continuing their analysis of the code.
The same distinction applies to functions versus inline statements. For the purposes of answering your question, functors versus lambdas can be seen as just a particular case of this function versus inlining decision.
I tend to prefer functors over lambdas these days. Although they require more code, functors yield cleaner algorithms. The comparison below between find_id and find_id2 showcases that result. While both yield sufficiently clean code, find_id2 is slightly easier to read as the MatchName(name) definition is extracted from (and secondary to) the primary algorithm.
I would argue, however, that the Functor code should be placed inside implementation files right above the function definition where it is used to provide direct access to the function definition. Otherwise a Lambda would be better for code-locality/organization.
#include <algorithm> // find_if
#include <iostream>
#include <vector>
#include <string>
using namespace std;
struct Person {
int id;
string name;
};
typedef vector<Person> People;
int find_id(string const& name, People const& people) {
auto MatchName = [name](Person const& p) -> bool
{
return p.name == name;
};
auto found = find_if(people.begin(), people.end(), MatchName);
if (found == people.end()) return -1;
return found->id;
}
struct MatchName {
string const& name;
MatchName(string const& name) : name(name) {}
bool operator() (Person const& person)
{
return person.name == name;
}
};
int find_id2(string const& name, People const& people) {
auto found = find_if(people.begin(), people.end(), MatchName(name));
if (found == people.end()) return -1;
return found->id;
}
int main() {
People people { {0, "Jim"}, {1, "Pam"}, {2, "Dwight"} };
cout << "Pam's ID is " << find_id("Pam", people) << endl;
cout << "Dwight's ID is " << find_id2("Dwight", people) << endl;
}
The functor is self-documenting by default, but lambdas need to be stored in variables (to be self-documenting) inside more complex algorithm definitions. Hence, it is preferable not to use lambdas inline, as many people do, in order to gain the self-documenting benefit shown above in the MatchName lambda.
When a lambda is stored in a variable at the call site (or used inline), primary algorithms are slightly more difficult to read. Since lambdas are secondary in nature to the algorithms where they are used, it is preferable to clean up the primary algorithms by using self-documenting subroutines (e.g. functors). This might not matter as much in this example, but with more complex algorithms it can significantly reduce the burden of interpreting code.
Functors can be as simple (as in the example above) or as complex as they need to be. Sometimes complexity is desirable, and there are cases for dynamic polymorphism (e.g. for strategy/decorator design patterns, or their template-equivalent policy types). This is a use case lambdas cannot satisfy.
Functors require explicit declaration of capture variables without polluting primary algorithms. When more and more capture variables are required by lambdas, the tendency is to use a blanket capture like [=]. But this reduces readability greatly, as one must mentally jump between the lambda definition and all surrounding local variables, possibly member variables, and more.
I was working on my advanced calculus homework today and we're doing some iteration methods along the lines of Newton's method to find solutions to things like x^2=2. It got me thinking that I could write a function that would take two function pointers, one to the function itself and one to the derivative, and automate the process. This wouldn't be too challenging. Then I started thinking: could I have the user input a function and parse that input (yes, I can do that)? But can I then dynamically create a pointer to a one-variable function in C++? For instance, given x^2+x, can I make a function double function(double x){ return x*x+x;} during run-time? Is this remotely feasible, or is it along the lines of self-modifying code?
Edit:
So I suppose this could be done if you stored the information in an array and had a function that evaluated the information stored in this array with a given input. Then you could create a class, initialize the array inside of that class, and use the function from there. Is there a better way?
As others have said, you cannot create new C++ functions at runtime in any portable way. You can however create an expression evaluator that can evaluate things like:
(1 + 2) * 3
contained in a string, at run time. It's not difficult to expand such an evaluator to have variables and functions.
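As a rough sketch of what such an evaluator can look like, here is a small recursive-descent parser. The class name Evaluator and the free function evaluate are made up for this example, and it only handles numbers, + - * / and parentheses (no variables or functions yet, and a unary minus only directly in front of a number):

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical minimal evaluator for +, -, *, / and parentheses.
class Evaluator {
public:
    explicit Evaluator(const std::string& text) : text_(text), pos_(0) {}

    double run() {
        double value = expression();
        skip_spaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    // expression := term (('+' | '-') term)*
    double expression() {
        double value = term();
        for (;;) {
            skip_spaces();
            if (match('+')) value += term();
            else if (match('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    double term() {
        double value = factor();
        for (;;) {
            skip_spaces();
            if (match('*')) value *= factor();
            else if (match('/')) value /= factor();
            else return value;
        }
    }

    // factor := number | '(' expression ')'
    double factor() {
        skip_spaces();
        if (match('(')) {
            double value = expression();
            skip_spaces();
            if (!match(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        const char* start = text_.c_str() + pos_;
        char* end = nullptr;
        double value = std::strtod(start, &end);  // parses one numeric literal
        if (end == start) throw std::runtime_error("expected a number");
        pos_ += static_cast<std::size_t>(end - start);
        return value;
    }

    // Consumes c if it is the next character.
    bool match(char c) {
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skip_spaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    std::string text_;
    std::size_t pos_;
};

inline double evaluate(const std::string& text) { return Evaluator(text).run(); }
```

evaluate("(1 + 2) * 3") gives 9, and evaluate("2 + 3 * 4") gives 14 because term binds tighter than expression. Adding a variable would mean one more branch in factor that looks the name up in a table.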
You can't dynamically create a function in the sense that you can generate raw machine code for it, but you can quite easily create mathematical expressions using polymorphism:
struct Expr
{
virtual double eval(double x) = 0;
};
struct Sum : Expr
{
Sum(Expr* a, Expr* b):a(a), b(b) {}
virtual double eval(double x) {return a->eval(x) + b->eval(x);}
private:
Expr *a, *b;
};
struct Product : Expr
{
Product(Expr* a, Expr* b):a(a), b(b) {}
virtual double eval(double x) {return a->eval(x) * b->eval(x);}
private:
Expr *a, *b;
};
struct VarX : Expr
{
virtual double eval(double x) {return x;}
};
struct Constant : Expr
{
Constant(double c):c(c) {}
virtual double eval(double x) {return c;}
private:
double c;
};
You can then parse your expression into an Expr object at runtime. For example, x^2+x would be Expr* e = new Sum(new Product(new VarX(), new VarX()), new VarX()). You can then evaluate that for a given value of x by using e->eval(x).
Note: in the above code, I have ignored const-correctness for clarity -- you should not :)
It is along the lines of self-modifying code, and it is possible—just not in "pure" C++. You would need to know some assembly and a few implementation details. Without going down this road, you could abstractly represent operations (e.g. with functors) and build an expression tree to be evaluated.
However, for the simple situation of just one variable that you've given, you'd only need to store coefficients, and you can evaluate those for a given value easily.
#include <cassert>
#include <iostream>
#include <vector>
using namespace std;
// store coefficients as vector in "reverse" order, e.g. 1x^2 - 2x + 3
// is stored as [3, -2, 1]
typedef double Num;
typedef vector<Num> Coeffs;
Num eval(Coeffs c, Num x) {
assert(c.size()); // must not be empty
Num result = 0;
Num factor = 1;
for (Coeffs::const_iterator i = c.begin(); i != c.end(); ++i) {
result += *i * factor;
factor *= x;
}
return result;
}
int main() {
Coeffs c; // x^2 + x + 0
c.push_back(0);
c.push_back(1);
c.push_back(1);
cout << eval(c, 0) << '\n';
cout << eval(c, 1) << '\n';
cout << eval(c, 2) << '\n';
}
You don't really need self-modifying code for that. But you will be writing what comes down to an expression parser and interpreter. You write the code to parse your function into suitable data structures (e.g. trees). For a given input you now traverse the tree and calculate the result of the function. Calculation can be done through a visitor.
You don't need to know assembly. Write C++ code for the possible expressions, and then write a compiler which examines the expression and chooses the appropriate code snippets. That could be done at runtime like an interpreter usually does, or it could be a compile phase which creates code to execute by copying the instructions from each expression evaluation into allocated memory and then sets it up as a function. The latter is harder to understand and code, but will perform better. But for the development time plus execution time to be less than an interpreted implementation, the compiled code would have to be used lots (billions) of times.
As others have mentioned, writing self-modifying code isn't necessary at all and is painful in a compiled language if you want it to be portable.
The hardest part of your work is parsing the input. I recommend muParser to evaluate your expressions. It should take away a lot of pain and you would be able to focus on the important part of your project.