I have some short defines in one of my headers like this:
#define ROUND_DOWN(a,b) (a)-(a)%(b)
e.g.
ROUND_DOWN(178,32) = 160
But if I pass this to it:
ROUND_DOWN(160*2, 32);
then it gets compiled like this?
(160*2)-(160*2)%(32),
which does more work, since it evaluates 160*2 twice.
I'm wondering if inline functions behave in the same way? e.g.
inline int RoundDown(int a, int b)
{
return (a)-(a)%(b);
}
Would 160*2 get stored in "int a" as 320 and then the calculation would work, or would it behave the same as the define?
A better example is calling:
RoundDown((x+x2)*zoom, tile_width);
Do “#define” and inline behave the same?
No, they don't!
There are a number of differences between a macro and an inline function.
- Number of times arguments are evaluated
Expressions passed as arguments to inline functions are evaluated once.
In some cases, expressions passed as arguments to macros can be evaluated more than once.
Every time you use an argument in a macro, that argument is evaluated.
A Code sample:
#include <iostream>
using namespace std;

#define max(a,b) (a>b?a:b)
int main()
{
    int a = 0;
    int b = 1;
    int c = max(a++, b++);
    cout << a << endl << b << endl;
    return 0;
}
The intention probably was to print 1 and 2, but the macro expands to:
int c = a++ > b++ ? a++ : b++;
b gets incremented twice, and the program prints 1 and 3.
- Who evaluates them
Inline functions are processed by the compiler, while macros are expanded by the preprocessor before compilation.
- Type checking
Inline functions follow all the protocols of type safety enforced on normal functions.
Argument types are checked, and necessary conversions are performed correctly.
The compiler checks the return type and the function signature before putting the inline function into the symbol table.
They can be overloaded to perform the right kind of operation for the right kind of data.
Macros are more error prone compared to inline functions. The parameters are not typed (the macro works for any operands of arithmetic type).
No type checking of the arguments is done during compilation.
A Code Sample:
#include <iostream>
using namespace std;

#define MAX(a, b) ((a < b) ? b : a)
int main()
{
    cout << "Maximum of 10 and 20 is " << MAX("20", "10") << endl;
    return 0;
}
One can pass strings to a macro that does some integer arithmetic and a macro won't complain!
- Suggestion or Command?
Inline is just a suggestion to the compiler. It is the compiler’s decision whether to expand the function inline or not.
Macros will always be expanded.
- How about Debugging?
Inline functions can be debugged easily: you can put a breakpoint at the inline function definition and step through it line by line.
Macros cannot be stepped through in a debugger, as they are expanded before compilation.
First, you should pretty much assume that all constant expressions are evaluated at compile-time, so that multiplication never survives to be executed when you run the program.
Second, you can't depend on inline having any effect at all, it's just a hint to the compiler, not a requirement.
But even if the function is not inlined, the expression would not be evaluated twice since argument passing requires it to be evaluated before the body of the function runs.
#defines are simple textual substitutions, so (as you noticed) you may need to be careful with parentheses, etc. inline parameters are parsed normally.
There's a related issue with respect to conditions.
Nominally, the function argument 160*2 is evaluated exactly once, and the result is then used in the body of the function, whereas the macro evaluates 160*2 twice. If the argument expression has side-effects, then you can see this[*]: ROUND_DOWN(printf("hi!\n"), 1); vs RoundDown(printf("hi!\n"), 1);
In practice, whether the function is inlined or the macro expanded, it's just integer arithmetic in the expression, with no side-effects. An optimizing compiler can work out the result of the whole macro/function call, and just stick the answer in the emitted code. So you might find that your macro and your inline function result in exactly the same code being executed, and so int a = ROUND_DOWN(160*2, 32); and int a = RoundDown(160*2, 32); might both be the same as int a = 320;.
Where there are no side-effects, optimization can also store and re-use intermediate results. So int c = ROUND_DOWN(a*2, b); might end up emitting code that looks as though you've written:
int tmp = a*2;
int c = tmp - tmp % b;
Note that whether to actually inline a function is a decision made by the compiler based on its own optimization rules. Those rules might take account of whether the function is marked inline or not, but quite likely don't unless you're using compiler options to force inlining or whatever.
So, assuming a decent compiler there's no reason to use a macro for this - for your macro in particular you're just begging for someone to come along and write:
int a = ROUND_DOWN(321, 32) * 2;
and then waste a few minutes wondering why the result is 319.
[*] Although don't get carried away - for some expressions with side-effects, for example i++ where i is an integer, the macro has undefined behavior due to lack of sequence points.
Well in the example you've given with constants, on any reasonable compiler both versions will compute the constant at compile time.
Assuming you're actually asking about cases where variables are passed in, I would expect the compiler's optimizer to generate the same code in both cases (it wouldn't do the multiplication twice if it's more efficient to save off the result). Also, the inline function gives the compiler the option to make an actual function call if that would improve performance.
Finally, note that I wouldn't worry about micro-optimizations like this, because 99% of the time they will have no effect on the performance of your program - I/O will be your bottleneck.
Related
This is something I've always wondered: is it easier for the compiler to optimise functions where existing variables are re-used, where new (ideally const) intermediate variables are created, or where creating variables is avoided in favour of directly using expressions?
For example, consider the functions below:
// 1. Use expression as and when needed, no new variables
void MyFunction1(int a, int b)
{
SubFunction1(a + b);
SubFunction2(a + b);
SubFunction3(a + b);
}
// 2. Re-use existing function parameter variable to compute
// result once, and use result multiple times.
// (I've seen this approach most in old-school C code)
void MyFunction2(int a, int b)
{
a += b;
SubFunction1(a);
SubFunction2(a);
SubFunction3(a);
}
// 3. Use a new variable to compute result once,
// and use result multiple times.
void MyFunction3(int a, int b)
{
int sum = a + b;
SubFunction1(sum);
SubFunction2(sum);
SubFunction3(sum);
}
// 4. Use a new const variable to compute result once,
// and use result multiple times.
void MyFunction4(int a, int b)
{
const int sum = a + b;
SubFunction1(sum);
SubFunction2(sum);
SubFunction3(sum);
}
My intuition is that:
In this particular situation, function 4 is easiest to optimise because it explicitly states the intention for the use of the data. It is telling the compiler: "We are summing the two input arguments, the result of which will not be modified, and we are passing on the result in an identical way to each subsequent function call." I expect that the value of the sum variable will just be put into a register, and no actual underlying memory access will occur.
Function 1 is the next easiest to optimise, though it requires more inference on the part of the compiler. The compiler must spot that a + b is used in an identical way for each function call, and it must know that the result of a + b is identical each time that expression is used. I would still expect the result of a + b to be put into a register rather than committed to memory. However, if the input arguments were more complicated than plain ints, I can see this being more difficult to optimise (rules on temporaries would apply for C++).
Function 3 is the next easiest after that: the result is not put into a const variable, but the compiler can see that sum is not modified anywhere in the function (assuming that the subsequent functions do not take a mutable reference to it), so it can just store the value in a register similarly to before. This is less likely than in function 4's case, though.
Function 2 gives the least assistance for optimisations, since it directly modifies an incoming function argument. I'm not 100% sure what the compiler would do here: I don't think it's unreasonable to expect it to be intelligent enough to spot that a is not used anywhere else in the function (similarly to sum in function 3), but I wouldn't guarantee it. This could require modifying stack memory depending on how the function arguments are passed in (I'm not too familiar with the ins and outs of how function calls work at that level of detail).
Are my assumptions here correct? Are there more factors to take into account?
EDIT: A couple of clarifications in response to comments:
If C and C++ compilers would approach the above examples in different ways, I'd be interested to know why. I can understand that C++ would optimise things differently depending on what constraints there are on whichever objects might be inputs to these functions, but for primitive types like int I would expect them to use identical heuristics.
Yes, I could compile with optimisations and look at the assembly output, but I don't know assembly, hence I'm asking here instead.
Good modern compilers generally do not “care” about the names you use to store values. They perform lifetime analyses of the values and generate code based on that. For example, given:
int x = complicated expression 0;
... code using x
x = complicated expression 1;
... code using x
the compiler will see that complicated expression 0 is used in the first section of code and complicated expression 1 is used in the second section of code, and the name x is irrelevant. The result will be the same as if the code used different names:
int x0 = complicated expression 0;
... code using x0
int x1 = complicated expression 1;
... code using x1
So there is no point in reusing a variable for a different purpose; it will not help the compiler save memory or otherwise optimize.
Even if the code were in a loop, such as:
int x;
while (some condition)
{
x = complicated expression;
... code using x
}
the compiler will see that each value of complicated expression is born at the beginning of the loop body and dies by the end of the loop body.
What this means is you do not have to worry about what the compiler will do with the code. Instead, your decisions should be guided mostly by what is clearer to write and more likely to avoid bugs:
Avoid reusing a variable for more than one purpose. For example, if somebody is later updating your function to add a new feature, they might miss the fact you have changed the function parameter with a += b; and use a later in the code as if it still contained the original parameter.
Do freely create new variables to hold repeated expressions. int sum = a + b; is fine; it expresses the intent and makes it clearer to readers when the same expression is used in multiple places.
Limit the scope of variables (and identifiers generally). Declare them only in the innermost scope where they are needed, such as inside a loop rather than outside. This avoids a variable being used accidentally where it is no longer appropriate.
I was reading about constexpr here:
The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.
When I first read this sentence, it made perfect sense to me. However, recently I've come across some code that completely threw me off. I've reconstructed a simple example below:
#include <iostream>
void MysteryFunction(int *p);
constexpr int PlusOne(int input) {
return input + 1;
}
int main() {
int i = 0;
MysteryFunction(&i);
std::cout << PlusOne(i) << std::endl;
return 0;
}
Looking at this code, there is no way for me to say what the result of PlusOne(i) should be, however it actually compiles! (Of course linking will fail, but g++ -std=c++11 -c succeeds without error.)
What would be the correct interpretation of "possible to evaluate the value of the function at compile time"?
The quoted wording is a little misleading in a sense. If you just take PlusOne in isolation, and observe its logic, and assume that the inputs are known at compile-time, then the calculations therein can also be performed at compile-time. Slapping the constexpr keyword on it ensures that we maintain this lovely state and everything's fine.
But if the input isn't known at compile-time then it's still just a normal function and will be called at runtime.
So constexpr is a property of the function itself ("possible to evaluate at compile time" for some input, not for all input), not of your function/input combination in this specific case (so it promises nothing for this particular input).
It's a bit like how a function could take a const int& but that doesn't mean the original object had to be const. Here, similarly, constexpr adds constraints onto the function, without adding constraints onto the function's input.
Admittedly it's all a giant, confusing, nebulous mess (C++! Yay!). Just remember, your code describes the meaning of a program! It's not a direct recipe for machine instructions at different phases of compilation.
(To really enforce this you'd have the integer be a template argument.)
A constexpr function may be called within a constant expression, provided that the other requirements for the evaluation of the constant expression are met. It may also be called within an expression that is not a constant expression, in which case it behaves the same as if it had not been declared with constexpr. As the code in your question demonstrates, the result of calling a constexpr function is not automatically a constant expression.
What would be the correct interpretation of "possible to evaluate the value of the function at compile time"?
If all the arguments to the function are evaluatable at compile time, then the return value of the function can be evaluated at compile time.
However, if the values of the arguments to the function are known only at run time, then the return value of the function can only be evaluated at run time.
Hence, it is possible to evaluate the value of the function at compile time, but it is not a requirement.
All the answers are correct, but I want to give one short example that I use to explain how ridiculously unintuitive constexpr is.
#include <cstdlib>
constexpr int fun(int i){
if (i==42){
return 47;
} else {
return rand();
}
}
int main()
{
    int arr[fun(42)]; // OK: fun(42) is a constant expression with value 47
    // int bad[fun(41)]; // would NOT compile: rand() is not a constant expression
}
As a side note:
some people find this status of constexpr unsatisfying, so they proposed adding a constexpr! keyword to the language (the proposal that eventually became consteval in C++20).
How does an inline function differ from a preprocessor macro?
Preprocessor macros are just substitution patterns applied to your code. They can be used almost anywhere in your code because they are replaced with their expansions before any compilation starts.
Inline functions are actual functions whose body is directly injected into their call site. They can only be used where a function call is appropriate.
Now, as far as using macros vs. inline functions in a function-like context, be advised that:
Macros are not type safe, and can be expanded regardless of whether they are syntactically correct - the compile phase will report errors resulting from macro expansion problems.
Macros can be expanded in contexts where you don't expect them to be, resulting in problems.
Macros are more flexible, in that they can expand other macros - whereas inline functions don't necessarily do this.
Macros can result in side effects because of their expansion, since the input expressions are copied wherever they appear in the pattern.
Inline functions are not always guaranteed to be inlined - some compilers only do this in release builds, or when they are specifically configured to do so. Also, in some cases inlining may not be possible.
Inline functions can provide scope for variables (particularly static ones), preprocessor macros can only do this in code blocks {...}, and static variables will not behave exactly the same way.
First, preprocessor macros are just "copy paste" applied to the code before compilation starts. So there is no type checking, and some side effects can appear.
For example, if you want to compare 2 values:
#define max(a,b) ((a<b)?b:a)
The side effects appear if you use max(a++,b++) for example (a or b will be incremented twice).
Instead, use (for example)
inline int max( int a, int b) { return ((a<b)?b:a); }
The inline functions are expanded by the compiler, whereas the macros are expanded by the preprocessor, which performs a mere textual substitution.
Hence,
There is no type checking during macro invocation while type checking is done during function call.
Undesired results and inefficiency may occur during macro expansion due to reevaluation of arguments and order of operations. For example:
#define MAX(a,b) ((a)>(b) ? (a) : (b))
int i = 5, j = MAX(i++, 0);
would result in
int i = 5, j = ((i++)>(0) ? (i++) : (0));
The macro arguments are not evaluated before macro expansion
#include <stdio.h>
#define MUL(a, b) a*b
int main()
{
// The macro is expanded as 2 + 3 * 3 + 5, not as 5*8
printf("%d", MUL(2+3, 3+5));
return 0;
}
// Output: 16
The return keyword cannot be used in macros to return values as in the case of functions.
Inline functions can be overloaded.
The tokens passed to macros can be concatenated using operator ## called Token-Pasting operator.
Macros are generally used for code reuse, whereas inline functions are used to eliminate the time overhead of a function call (avoiding a jump to a subroutine).
The key difference is type checking. The compiler will check whether what you pass as input values is of types that can be passed into the function. That's not true with preprocessor macros - they are expanded prior to any type checking and that can cause severe and hard to detect bugs.
Here are several other less obvious points outlined.
To add another difference to those already given: you can't step through a #define in the debugger, but you can step through an inline function.
Macros ignore namespaces. And that makes them evil.
Inline functions are similar to macros (because the function code is expanded at the point of the call at compile time), but inline functions are parsed by the compiler, whereas macros are expanded by the preprocessor. As a result, there are several important differences:
Inline functions follow all the protocols of type safety enforced on normal functions.
Inline functions are specified using the same syntax as any other function except that they include the inline keyword in the function declaration.
Expressions passed as arguments to inline functions are evaluated once.
In some cases, expressions passed as arguments to macros can be evaluated more than once.
http://msdn.microsoft.com/en-us/library/bf6bf4cf.aspx
Macros are expanded at pre-compile time, so you cannot step through them when debugging, but you can step through inline functions.
-- good article:
http://www.codeguru.com/forum/showpost.php?p=1093923&postcount=1
To know the difference between macros and inline functions, first we should know what exactly they are and when we should use them.
FUNCTIONS:
#include <iostream>
using namespace std;

int Square(int x)
{
    return x * x;
}

int main()
{
    int value = 5;
    int result = Square(value);
    cout << result << endl;
}
Function calls have overhead associated with them: before the call, the return address is stored on the stack so that, after the function finishes executing, control knows where to return to. For small applications this might not be a problem, but in, say, a financial application, where thousands of transactions are happening every second, a function call might be too expensive.
MACROS:
#include <iostream>
using namespace std;

#define Square(x) x*x

int main()
{
    int value = 5;
    int result = Square(value);
    cout << result << endl;
}
Macros are applied in the preprocessing stage. During this stage, uses of the names introduced with #define are replaced by their expansion:
int result = value*value;
But macros can cause unexpected behavior.
#include <iostream>
using namespace std;

#define Square(x) x*x

int main()
{
    int val = 5;
    int result = Square(val + 1);
    cout << result << endl;
}
Here the output is 11, not 36.
INLINE FUNCTIONS:
#include <iostream>
using namespace std;

inline int Square(int x)
{
    return x * x;
}

int main()
{
    int val = 5;
    int result = Square(val + 1);
    cout << result << endl;
}
Output: 36
The inline keyword requests that the compiler replace the function call with the body of the function. Here the output is correct because the argument expression is evaluated first, and the result is then used in the body of the function. Inline expansion reduces function call overhead, as there is no need to store a return address or push function arguments onto the stack.
Comparison Between Macros and Inline Functions:
Macros work through text substitution, whereas inline functions duplicate the body of a real function at the call site.
Macros are error prone due to substitution while inline functions are safe to use.
Macros can't be assigned to function pointers; inline functions can.
Macros are difficult to use with multiple lines of code, whereas inline functions are not.
In C++, macros cannot be used as member functions, whereas inline functions can be.
CONCLUSION:
Inline functions are sometimes more useful than macros, as they are safe to use, but can also reduce function call overhead.
The inline keyword is a request to the compiler; certain functions won't be inlined, such as:
large functions
functions containing many conditional statements
recursive functions and functions containing loops, etc.
which is a good thing, because it allows the compiler to determine whether it would be better to do things another way.
An inline function will maintain value semantics, whereas a preprocessor macro just copies the syntax. You can get very subtle bugs with a preprocessor macro if you use the argument multiple times - for example if the argument contains mutation like "i++" having that execute twice is quite a surprise. An inline function will not have this problem.
An inline function behaves syntactically just like a normal function, providing type safety, a scope for function-local variables, and access to class members if it is a method.
Also, when calling inline methods you must adhere to private/protected restrictions.
In GCC (I'm not sure about others), declaring a function inline, is just a hint to the compiler. It is still up to the compiler at the end of the day to decide whether or not it includes the body of the function whenever it is called.
The difference between in-line functions and preprocessor macros is relatively large. Preprocessor macros are just text replacement at the end of the day. You give up a lot of the compiler's ability to perform type checking on the arguments and return type. Evaluation of the arguments is much different (if the expressions you pass in have side-effects you'll have a very fun time debugging). There are subtle differences about where functions and macros can be used. For example if I had:
#define MACRO_FUNC(X) ...
Where MACRO_FUNC obviously defines the body of the function. Special care needs to be taken so it runs correctly in all cases a function can be used, for example a poorly written MACRO_FUNC would cause an error in
if(MACRO_FUNC(y)) {
...body
}
A normal function could be used with no problem there.
From the perspective of coding, an inline function is like a function. Thus, the differences between an inline function and a macro are the same as the differences between a function and a macro.
From the perspective of compiling, an inline function is similar to a macro. It is injected directly into the code, not called.
In general, you should consider inline functions to be regular functions with some minor optimization mixed in. And like most optimizations, it is up to the compiler to decide if it actually cares to apply it. Often the compiler will happily ignore any attempts by the programmer to inline a function, for various reasons.
An inline function will typically fall back to a normal function call if it contains any iterative or recursive statement, so as to prevent the code from being duplicated at every call site. This helps keep the overall size of your program down.
#include<iostream>
using namespace std;
#define NUMBER 10 //macros are preprocessed while functions are not.
int number()
{
return 10;
}
/* In macros, no type checking (incompatible operands, etc.) is done, and thus the use of macros can lead to errors/side-effects in some cases.
However, this is not the case with functions.
Also, the expansion of a macro is not itself checked for errors. Consider: */
#define CUBE(b) b*b*b
int cube(int a)
{
return a*a*a;
}
int main()
{
cout<<NUMBER<<endl<<number()<<endl;
cout<<CUBE(1+3); //Unexpected output 10
cout<<endl<<cube(1+3);// As expected 64
return 0;
}
Macros are typically faster than functions as they don’t involve actual function call overhead.
Some Disadvantages of macros:
There is no type checking. They are difficult to debug, since they cause simple text replacement. Macros don't have a namespace, so a macro in one section of code can affect other sections. Macros can cause side effects, as shown in the CUBE() example above.
Macros are usually one-liners, though they can consist of more than one line. There are no such constraints on functions.
In my C++ program, I have this function:
char MostFrequentCharacter(ifstream &ifs, int &numOccurances);
and in main(), this code:
ifstream in("file.htm");
int maxOccurances = 0;
cout <<"Most freq char is "<<MostFrequentCharacter(in, maxOccurances)<<" : "<<maxOccurances;
But this is not working: though I am getting the correct char, maxOccurances remains zero.
But if I replace the above code in main with this,
ifstream in("file.htm");
int maxOccurances = 0;
char maxFreq = MostFrequentCharacter(in, maxOccurances);
cout <<"Most freq char is "<<maxFreq<<" : "<<maxOccurances;
Then, it is working correctly. My question is why is it not working in first case.
In C++,
cout << a << b
By Associativity evaluates to:
(cout << a) << b
but the compiler is free to evaluate the operands in any order.
i.e., the compiler can evaluate b first, then a, then the first << operation, and then the second << operation. This is because there is no sequence point associated with <<.
For the sake of simplicity let us consider the following code, which is equivalent:
#include<iostream>
int main()
{
int i = 0;
std::cout<<i<<i++;
return 0;
}
In the above source code:
std::cout<<i<<i++;
evaluates to the function call:
operator<<(operator<<(std::cout,i),i++);
In this function call whether operator<<(std::cout,i) or i++ gets evaluated first is Unspecified. i.e:
operator<<(std::cout,i) maybe evaluated first Or
i++ maybe evaluated first Or
Some Magic Ordering implemented by the compiler
Given the above, there is no way to rely on this ordering, and hence no single explanation of the output is possible either.
Relevant Quote from the C++03 Standard:
Section 1.9
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Because in the first case, the value of maxOccurances in the expression is being resolved before the call to MostFrequentCharacter. It doesn't have to be that way though, it is unspecified behavior.
You may experience different results with different compilers, or compiler options. If you try that same thing on VC++ for example, I believe you will see different results.
You just have to note that where you see << you are actually calling the operator<< method - so the compiler is working out the value of the arguments to pass into that function before your variable is modified.
In other words, what you have is similar to
operator<<(operator<<(cout, f(x)), x);
...and since the evaluation order of function arguments is unspecified, it depends on the compiler.
cout's operands happen to be evaluated right to left by your compiler, so the rightmost is evaluated first, then the left one. :)
So the value of the referenced variable hasn't changed yet when it is printed.
If the TEST macro is not defined, I would like to know whether there is a performance difference in these two pieces of code:
void Func1(int a) {
// ...
}
#ifdef TEST
Func1(123);
#endif
and:
void Func2(int a) {
#ifdef TEST
// ...
#endif
}
Func2(123);
With TEST not defined, Func2 would become an empty function that the compiler should not call at all, shouldn't it?
It pretty much comes down to whether that particular call to Func2 is inlined or not. If it is, then an optimizing compiler ought to be able to make an inlined call to an empty function the same as not calling it at all. If it isn't inlined, then it's called and returns immediately.
As long as the function definition is available in the TU containing the call to Func2, there's no obvious reason it won't be inlined.
This all relies on the fact that 123 is a literal, so evaluating the arguments of your call has no side-effects. The args have to be evaluated even if the function call has no effect, so:
int i = 0;
/* 'i' is incremented, even if the call is optimized out */
Func2(++i);
/* 'i' is not incremented when 'TEST' is undefined */
#ifdef TEST
Func1(++i);
#endif
Optimizations are generally compiler-specific. The language standard does not, AFAIK, make any statements about what should and what should not be optimized away. (Although I admit I haven't read the language specification itself.)
Each compiler has its own set of options by which to enable / disable specific optimization steps.
So the answer is a definite "it depends", and "you would have to check yourself to be sure".
(However, I'd be quite surprised if a halfway decent optimizer would leave such a construct unoptimized.)