Does C optimize the check portion of for loops? - c++

With the following code, how many times would the min function actually be called?
for (int i = 0; i < min(size, max_size); i++) {
//Do something cool that does not involve changing the value of size or max_size
}
Would the compiler notice that it could just calculate the minimum once and keep it in a register, or should I explicitly create a variable to hold the value before entering the loop? What kinds of languages would be able to optimize this?
As an extension if I were in an object oriented language with a similar loop except it looked more like this
for (int i = 0; i < object.coolFunc(); i++) {
//Code that may change parameters and state of object but does not change the return value of coolFunc()
}
What would be optimized?

Any good compiler will optimize the controlling expression of a for loop by evaluating visibly invariant subexpressions in it just once, provided optimization is enabled. Here, “invariant” means the value of the subexpression does not change while the loop is executing. “Visibly” means the compiler can see that the expression is invariant. There are things that can interfere with this:
Suppose, inside the loop, some function is called and the address of size is passed as an argument. Since the function has the address of size, it could change the contents of size. Maybe the function does not do this, but the compiler might not be able to see the contents of the function. Its source code could be in another file. Or the function could be so complicated the compiler cannot analyze it. Then the compiler cannot see that size does not change.
min is not a standard C function, so your program must define it somewhere. As above, if the compiler does not know what min does or if it is too complicated for the compiler to analyze (not likely in this particular case, but in general), the compiler might not be able to see that it is a pure function.
Of course, the C standard does not guarantee this optimization. However, as you become experienced in programming, your knowledge of compilers and other tools should grow; you will become familiar with what is expected of good tools, and you will also learn to beware of issues such as those above. For simple expressions, you can expect the compiler to optimize. But you need to remain alert to things that can interfere with optimization.
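If you want certainty rather than relying on the optimizer, here is a minimal sketch of hoisting the invariant yourself (assuming min behaves like std::min):
const int limit = min(size, max_size); // computed exactly once, by construction
for (int i = 0; i < limit; i++) {
    // Do something cool that does not involve changing size or max_size
}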

Related

How to instruct VC++ compiler to not inline a constant?

I have the following global constant in my C++ program:
const int K = 123456;
When I compile the program, the resulting executable contains the literal value 123456 in all the places where the value is used (dozens of times).
But, if I remove the const qualifier, the value 123456 appears only once in the entire executable (in the .data section).
This is the result I'm looking for. I want the value 123456 to appear only once so that it can be changed simply by editing the .exe file with a HEX editor.
However, I don't want to remove the const qualifier because I want the compiler to prevent me from accidentally modifying the constant in the source code.
Is it possible to instruct the compiler somehow to not inline the value of said constant?
The reason I need to do this is so that the executable is easily modifiable by students who will be tasked with "cracking" an example program to alter its behavior. The exercise must be simple enough for inexperienced people.
If you don't want K to be inlined then put this in a header file:
extern const int K;
This means "K is defined somewhere else". Then put this in a cpp file:
const int K = 123456;
In all the places where K is used, the compiler only knows that K is a const int declared externally. The compiler doesn't know the value of K, so it cannot be inlined. The linker will find the definition of K in the cpp file and put it in a data section of the executable.
Alternatively, you could define K like this:
const volatile int K = 123456;
This means "K might magically change so you better not assume its value". It has a similar effect to the previous approach as the compiler won't inline K because it can't assume that K will always be 123456. The previous approach would fail if LTO was enabled but using volatile should work in that case.
I must say, this is a really weird thing to do. If you want to make your program configurable, you should put the value of K into a text file and then read the file at startup.
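For illustration, a minimal sketch of that configuration-file approach (the file name k.txt and the function load_k are my inventions, not from the question):
#include <fstream>

int load_k() {
    int k = 0;
    std::ifstream in("k.txt"); // hypothetical text file next to the executable
    if (!(in >> k))
        k = 123456;            // fall back to the old constant if the read fails
    return k;
}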
The simplest option is probably to declare it as global without const, so the compiler can't assume that it still has the value of the static initializer.
int K = 123456;
Even link-time optimization can't know that a library function doesn't access this global, assuming you call any in your program.
If you used static int K = 123456;, the compiler could notice that no function in the compilation unit writes the value, and none of them passes or returns its address, so escape analysis for the whole compilation unit could discover that it was effectively a constant and that it could be optimized away.
(If you really wanted it to be static int K;, include a global function like void setK(int x){K=x;} that you never actually call. Without Link-Time Optimization, the compiler will have to assume that something outside this compilation unit could have called this function and changed K, and that any call to a function whose definition isn't visible might result in such a call.)
Beware that volatile const int K = 123456; can hurt optimization significantly more than making it non-const, especially if you have expressions that use K multiple times.
(But either of these can hurt a lot, depending on what optimizations were possible. Constant-propagation can be a huge win.)
The compiler is required to emit asm that loads exactly K once for each time the C abstract machine reads it. (e.g. reading K is considered a visible side-effect, like a read from an MMIO port or a location you have a hardware watchpoint on.)
If you want to let a compiler load it once per loop, and assume K is a loop invariant, then code that uses it should do int local_k = K;. It's up to you how often you want to re-read K, i.e. what scope you do / redo local_k = K at.
On x86, using a memory source operand that stays hot in L1d cache is probably not much of a performance problem, but it will prevent auto-vectorization.
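For example, a sketch of that local_k caching pattern (the surrounding function is only illustrative):
const volatile int K = 123456;

void scale(int* data, int n) {
    int local_k = K;            // exactly one volatile read of K for this loop
    for (int i = 0; i < n; ++i)
        data[i] *= local_k;     // the loop reuses the cached copy
}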
The reason I need to do this is so that the executable is easily modifiable by students who will be tasked with "cracking" an example program to alter its behavior. The exercise must be simple enough for inexperienced people.
For this use-case, yes volatile is exactly what you want. Having all uses re-read from memory on the spot makes it slightly simpler than following the value cached in a register.
Performance is essentially irrelevant here, and you won't want auto-vectorization. Probably just light optimization so the students don't have to wade through store/reload of everything after every C++ statement. Something like gcc's -Og would be ideal.
With MSVC, maybe try -O1 or -O2 and see if it does anything confusing. I don't think it has options for light but not too aggressive optimization; it's likely either a debug build (nice for single-stepping the C++ source, bad for reading asm) or fully optimized for size or speed.
Try declaring the constant as volatile. That should result in a single and changeable value that won't be inlined.

Efficiency of member function returning std::vector.size() in object oriented framework

I have a class declared in prob.h thus:
struct A_s {
    int a, b;
};
class A_c {
private:
    std::vector<A_s> vec_of_A_s;
public:
    int vec_of_A_s_size() const { return static_cast<int>(vec_of_A_s.size()); }
};
With A_c A; // A is an object of class A_c somewhere else in my implementation .cpp file, I have the following line:
for(int i = 0; i < A.vec_of_A_s_size(); i++) {...//do loop stuff}
I KNOW from my program's design that A.vec_of_A_s_size() is loop invariant. However, I really want to avoid the following (it is cumbersome):
int sz = A.vec_of_A_s_size();
for(int i = 0; i < sz; i++) {...//do loop stuff}
Can I sufficiently and consistently rely on the compiler, such that a release build with optimization turned on (-O2) will not evaluate vec_of_A_s.size() each time?
Here is what I have already tried along with my questions:
(1) (Please see the edit update below.) Even with a debug build, with options -fPIC -fno-strict-aliasing -fexceptions -g -std=c++14, looking at the disassembler output, vec_of_A_s.size() is evaluated only once. However, will the compiler do this optimization reliably and consistently? Are there any known exceptions? Part of the reason for my skepticism and need for assurance stems from question (2) that follows.
(2) I looked at a related question on SO: Performance issue for vector::size() in a loop. The question there directly evaluates a vector's size in the loop, like this:
for(int i = 0; i < vec_of_A_s.size(); i++) {...//do loop stuff}
In my case, the vector is not directly accessible. It is a private member of A_c and its size can only be accessed via the public member function A.vec_of_A_s_size(). So, there is an additional layer of indirection/redirection that has to happen within the for loop. The answers on that thread seem to suggest that the compiler will indeed optimize the loop invariant. But in the case (like above) where a vector's size is not directly and publicly available, will the compiler reliably guarantee the loop invariant optimization?
(3) In other related questions on such issues, a common answer seems to be to profile the program. If and when I do the profiling, what exactly should I be looking for to verify this specific optimization? This code is part of a larger numerical analysis code and this is definitely NOT the current bottleneck. Yet, it would be nice to know how this can be verified in a profiler. Apologies if this question (3) is too broad. I am relatively new to profiling. But do profilers allow profiling a single function, say the function that contains the for loop above? That way, I can know for sure where the bottleneck is as it pertains to this function.
Edit update:
On (1), it is NOT true that a debug build with the said compiler options optimizes the loop invariant. I was wrong. On deeper digging, it turns out that the function is indeed called twice.
If your loop calls a non-const method on the vector then I'd pretty much say with certainty that all bets are off.
If you only call const methods on the vector then you could hope for optimisations, but since the standard does not require them, you can't really blame a compiler for not making an optimisation that may appear obvious to you.
Given that you can declare more than one variable in the for loop so long as they are of the same type, bringing sz into the loop seems the obvious thing to do. Or, can you run the loop backwards?
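That is, a sketch of the two-variable form (legal because i and sz are both int):
for (int i = 0, sz = A.vec_of_A_s_size(); i < sz; ++i) {
    // ...do loop stuff; sz is evaluated once, before the first iteration
}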

Why is it impossible to build a compiler that can determine if a C++ function will change the value of a particular variable?

I read this line in a book:
It is provably impossible to build a compiler that can actually determine whether or not a C++ function will change the value of a particular variable.
The paragraph was talking about why the compiler is conservative when checking for const-ness.
Why is it impossible to build such a compiler?
The compiler can always check if a variable is reassigned, a non-const function is being invoked on it, or if it is being passed in as a non-const parameter...
Why is it impossible to build such a compiler?
For the same reason that you can't write a program that will determine whether any given program will terminate. This is known as the halting problem, and it's one of those things that's not computable.
To be clear, you can write a compiler that can determine that a function does change the variable in some cases, but you can't write one that reliably tells you that the function will or won't change the variable (or halt) for every possible function.
Here's an easy example:
int bar(); // defined elsewhere
struct S {
    int a;
    void foo() { if (bar() == 0) this->a = 1; }
};
How can a compiler determine, just from looking at that code, whether foo will ever change a? Whether it does or doesn't depends on conditions external to the function, namely the implementation of bar. There's more than that to the proof that the halting problem isn't computable, but it's already nicely explained at the linked Wikipedia article (and in every computation theory textbook), so I'll not attempt to explain it correctly here.
Imagine such a compiler exists. Let's also assume that, for convenience, it provides a library function that returns 1 if the passed function modifies a given variable and 0 when the function doesn't. Then what should this program print?
#include <stdio.h>

// modifies_variable(f, v) is the hypothetical library function described above.
int variable = 0;

void f() {
    if (modifies_variable(f, variable)) {
        /* do nothing */
    } else {
        /* modify variable */
        variable = 1;
    }
}

int main(int argc, char **argv) {
    if (modifies_variable(f, variable)) {
        printf("Modifies variable\n");
    } else {
        printf("Does not modify variable\n");
    }
    return 0;
}
Don't confuse "will or will not modify a variable given these inputs" for "has an execution path which modifies a variable."
The former is called opaque predicate determination, and is trivially impossible to decide - aside from reduction from the halting problem, you could just point out that the inputs might come from an unknown source (e.g. the user). This is true of all languages, not just C++.
The latter statement, however, can be determined by looking at the parse tree, which is something that all optimizing compilers do. The reason they do is that pure functions (and referentially transparent functions, for some definition of referentially transparent) have all sorts of nice optimizations that can be applied, like being easily inlinable or having their values determined at compile-time; but to know if a function is pure, we need to know if it can ever modify a variable.
So, what appears to be a surprising statement about C++ is actually a trivial statement about all languages.
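A hypothetical pair of functions illustrating the distinction (the names are mine, not from the question):
int count = 0;

int square(int x) { return x * x; } // no execution path writes a non-local

int logged(int x) {
    if (x < 0) ++count;             // an execution path that modifies count exists
    return x;
}
A compiler can see from the parse tree that square never modifies count and that logged might; whether logged actually will on a given run is the undecidable question.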
I think the key word in "whether or not a C++ function will change the value of a particular variable" is "will". It is certainly possible to build a compiler that checks whether or not a C++ function is allowed to change the value of a particular variable, but you cannot say with certainty that the change is going to happen:
void maybe(int& val) {
cout << "Should I change value? [Y/N] >";
string reply;
cin >> reply;
if (reply == "Y") {
val = 42;
}
}
I don't think it's necessary to invoke the halting problem to explain that you can't algorithmically know at compile time whether a given function will modify a certain variable or not.
Instead, it's sufficient to point out that a function's behavior often depends on run-time conditions, which the compiler can't know about in advance. E.g.
int y;

int main(int argc, char *argv[]) {
    if (argc > 2) y++;
}
How could the compiler predict with certainty whether y will be modified?
It can be done, and compilers do it all the time for some functions; this is, for instance, a trivial optimisation for simple inline accessors or many pure functions.
What is impossible is to know it in the general case.
Whenever there is a system call or a function call coming from another module, or a call to a potentially overridden method, anything could happen, including a hostile takeover from some hacker's use of a stack overflow to change an unrelated variable.
However, you should use const, avoid globals, prefer references to pointers, and avoid reusing variables for unrelated tasks; all of that will make the compiler's life easier when performing aggressive optimisations.
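For instance, a small sketch of that style (the names are illustrative):
#include <vector>

// The const reference means the callee cannot modify v through this
// parameter, and total is a local whose address never escapes; both are
// facts the optimizer can verify without looking at any other code.
int sum(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) total += x;
    return total;
}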
There are multiple avenues to explaining this, one of which is the Halting Problem:
In computability theory, the halting problem can be stated as follows: "Given a description of an arbitrary computer program, decide whether the program finishes running or continues to run forever". This is equivalent to the problem of deciding, given a program and an input, whether the program will eventually halt when run with that input, or will run forever.
Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.
If I write a program that looks like this:
do tons of complex stuff
if (condition on result of complex stuff)
{
    change value of x
}
else
{
    do not change value of x
}
Does the value of x change? To determine this, you would first have to determine whether the do tons of complex stuff part causes the condition to fire - or even more basic, whether it halts. That's something the compiler can't do.
Really surprised that there isn't an answer using the halting problem directly! There's a very straightforward reduction from this problem to the halting problem.
Imagine that the compiler could tell whether or not a function changed the value of a variable. Then it would certainly be able to tell whether the following function changes the value of y or not, assuming that the value of x can be tracked in all the calls throughout the rest of the program:
void foo(int x) {
    if (x)
        y = 1;
}
Now, for any program we like, let's rewrite it as:
int y;

int main() {
    int x;
    ...
    run the program normally
    ...
    foo(x);
}
Notice that our program changes the value of y if, and only if, it reaches the final call to foo() - the last thing it does before exiting. This means that deciding whether y changes amounts to deciding whether the program halts: we've solved the halting problem!
What the above reduction shows us is that the problem of determining whether a variable's value changes is at least as hard as the halting problem. The halting problem is known to be incomputable, so this one must be also.
As soon as a function calls another function that the compiler doesn't "see" the source of, it either has to assume that the variable is changed, or things may well go wrong further below. For example, say we have this in "foo.cpp":
void foo(int& x)
{
ifstream f("f.dat", ifstream::binary);
f.read((char *)&x, sizeof(x));
}
and we have this in "bar.cpp":
void bar(int& x)
{
foo(x);
}
How can the compiler "know" that x is not changing (or IS changing, more appropriately) in bar?
I'm sure we can come up with something more complex, if this isn't complex enough.
It is impossible in general for the compiler to determine if the variable will be changed, as has been pointed out.
When checking const-ness, the question of interest seems to be whether the variable can be changed by a function. Even this is hard in languages that support pointers: you can't control what other code does with a pointer; it could even be read from an external source (though that's unlikely). In languages that restrict access to memory, these types of guarantees are possible and allow for more aggressive optimization than C++ does.
To make the question more specific, I suggest the following set of constraints may have been what the author of the book had in mind:
1. Assume the compiler is examining the behavior of a specific function with respect to const-ness of a variable. For correctness, a compiler would have to assume (because of aliasing, as explained below) that if the function called another function the variable is changed, so assumption #1 only applies to code fragments that don't make function calls.
2. Assume the variable isn't modified by an asynchronous or concurrent activity.
3. Assume the compiler is only determining if the variable can be modified, not whether it will be modified. In other words, the compiler is only performing static analysis.
4. Assume the compiler is only considering correctly functioning code (not considering array overruns/underruns, bad pointers, etc.).
In the context of compiler design, I think assumptions 1, 3, and 4 make perfect sense from the view of a compiler writer concerned with code-gen correctness and/or optimization. Assumption 2 makes sense in the absence of the volatile keyword. And these assumptions also focus the question enough to make judging a proposed answer much more definitive :-)
Given those assumptions, a key reason why const-ness can't be assumed is due to variable aliasing. The compiler can't know whether another variable points to the const variable. Aliasing could be due to another function in the same compilation unit, in which case the compiler could look across functions and use a call tree to statically determine that aliasing could occur. But if the aliasing is due to a library or other foreign code, then the compiler has no way to know upon function entry whether variables are aliased.
You could argue that if a variable/argument is marked const then it shouldn't be subject to change via aliasing, but for a compiler writer that's pretty risky. It can even be risky for a human programmer to declare a variable const as part of, say a large project where he doesn't know the behavior of the whole system, or the OS, or a library, to really know a variable won't change.
Even if a variable is declared const, that doesn't mean some badly written code can't overwrite it.
// g++ -o foo foo.cc
#include <iostream>

void const_func(const int& a, int* b)
{
    b[0] = 2;
    b[1] = 2; // out-of-bounds write: clobbers whatever sits next to b on the stack
}

int main() {
    int a = 1;
    int b = 3;
    std::cout << a << std::endl;
    const_func(a, &b);
    std::cout << a << std::endl;
}
output:
1
2
To expand on my comments, that book's text is unclear, which obfuscates the issue.
As I commented, that book is trying to say, "let's get an infinite number of monkeys to write every conceivable C++ function which could ever be written. There will be cases where if we pick a variable that (some particular function the monkeys wrote) uses, we can't work out whether the function will change that variable."
Of course for some (even many) functions in any given application, this can be determined by the compiler, and very easily. But not for all (or necessarily most).
This function can be easily so analysed:
static int global;
void foo()
{
}
"foo" clearly does not modify "global". It doesn't modify anything at all, and a compiler can work this out very easily.
This function cannot be so analysed:
#include <cstdlib> // for rand()

static int global;

int foo()
{
    if ((rand() % 100) > 50)
    {
        global = 1;
    }
    return 1;
}
Since "foo"'s actions depends on a value which can change at runtime, it patently cannot be determined at compile time whether it will modify "global".
This whole concept is far simpler to understand than computer scientists make it out to be. If the function can do something different based on things that can change at runtime, then you can't work out what it'll do until it runs, and each time it runs it may do something different. Whether or not it's provably impossible, it's obviously impossible.

Compiler instruction reordering optimizations in C++ (and what inhibits them)

I've reduced my code down to the following, which is as simple as I could make it whilst retaining the compiler output that interests me.
#include <cstdint>

extern uint64_t some_global_array[]; // defined elsewhere
void bar(uint64_t*);                 // not inlined

void foo(const uint64_t used)
{
    uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = some_global_array[i];
    }
    const uint64_t mask = ar[0];
    if ((used & mask) != 0)
    {
        return;
    }
    bar(ar); // Not inlined
}
Using VC10 with /O2 and /Ob1, the generated assembly pretty much reflects the order of instructions in the above C++ code. Since the local array ar is only passed to bar() when the condition fails, and is otherwise unused, I would have expected the compiler to optimize to something like the following.
if((used & some_global_array[0]) != 0)
{
return;
}
// Now do the copying to ar and call bar(ar)...
Is the compiler not doing this because it's simply too hard for it to identify such optimizations in the general case? Or is it following some strict rule that forbids it from doing so? If so, why, and is there some way I can give it a hint that doing so wouldn't change the semantics of my program?
Note: obviously it would be trivial to obtain the optimized output by just rearranging the code, but I'm interested in why the compiler won't optimize in such cases, not how to do so in this (intentionally simplified) case.
Probably the reason why this is not getting optimized is the global array. The compiler can't know beforehand if, say, accessing some_global_array[99] will result in some kind of exception/signal being generated so it has to execute the whole loop. Things would be pretty different if the global array was statically defined in the same compilation unit.
For example, in LLVM, the following three definitions of the global array will yield wildly differing outputs of that function:
// this yields pretty much what you're seeing
uint64_t *some_global_array;
// this calls memcpy and then performs the conditional check
uint64_t some_global_array[100] = {0};
// this calls memset (not memcpy!) on the ar array and then bar directly (no
// conditional checks since the array is const and filled with 0s, so the if
// is always false)
const uint64_t some_global_array[100] = {0};
The second is pretty puzzling, but it may simply be a missed optimization (or maybe I'm missing something else).
There are no "strict rules" controlling what kind of assembly language the compiler is permitted to output. If the compiler can be certain that a block of code does not need to be executed (because it has no side effects) due to some precondition, then it is absolutely permitted to short-circuit the whole thing.
This sort of optimisation can be fairly complex in the general case, and your compiler might not go to all that effort. If this is performance critical code, then you can fine-tune your source code (as you suggest) to help the compiler generate the best assembly code. This is a trial-and-error process though, and you might have to do it again for the next version of the compiler.
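As a concrete sketch of that fine-tuning for the function above (the rearrangement the questioner called trivial):
void foo(const uint64_t used)
{
    // Do the early-out test first; copy into the local array only if bar() will run.
    if ((used & some_global_array[0]) != 0)
    {
        return;
    }
    uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = some_global_array[i];
    }
    bar(ar); // Not inlined
}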

Two questions about inline functions in C++

I have a question about compiling an inline function in C++.
Can a recursive function work with inline? If yes, then please describe how.
I am sure a loop can't work with it, but I have read somewhere that recursion would work if we pass constant values.
My friend sent me an inline recursive function with a constant parameter and told me it would work, but it does not work on my laptop: there is no error at compile time, but at run time it displays nothing and I have to terminate it by force.
inline f(int n) {
    if (n <= 1)
        return 1;
    else {
        n = n * f(n - 1);
        return n;
    }
}
How does this work?
I am using Turbo 3.2.
Also, if an inline function's code is too large, can the compiler automatically change it into a normal function?
thanks
This particular function definitely can be inlined. That is because the compiler can figure out that this particular form of recursion (convertible to tail recursion via an accumulator) can be trivially turned into a normal loop. And with a normal loop it has no problem inlining it at all.
Not only can the compiler inline it, it can even calculate the result for a compile-time constant without generating any code for the function.
With GCC 4.4
int fac = f(10);
produced this instruction:
movl $3628800, 4(%esp)
You can easily verify, when checking the assembly output, that the function is indeed inlined for input that is not known at compile time.
I suppose your friend was trying to say that if given a constant, the compiler could calculate the result entirely at compile time and just inline the answer at the call site. c++0x actually has a mechanism for this called constexpr, but there are limits to how complex the code is allowed to be. But even with the current version of c++, it is possible. It depends entirely on the compiler.
This function may be a good candidate given that it clearly only references the parameter to calculate the result. Some compilers even have non-portable attributes to help the compiler decide this. For example, gcc has pure and const attributes (listed on that page I just linked) that inform the compiler that this code only operates on the parameters and has no side effects, making it more likely to be calculated at compile time.
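For example, a sketch using that GCC-specific attribute (non-portable; here const promises the result depends only on the arguments and there are no side effects):
__attribute__((const))
inline int f(int n) {
    return n <= 1 ? 1 : n * f(n - 1);
}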
Even without this, it will still compile! The reason is that the compiler is allowed not to inline a function if it so decides. Think of the inline keyword as more of a suggestion than an instruction.
Assuming that the compiler doesn't calculate the whole thing at compile time, inlining is not completely possible without other optimizations applied (see EDIT below) since it must have an actual function to call. However, it may get partially inlined. In that case the compiler will inline the initial call, but also emit a regular version of the function which will get called during recursion.
As for your second question, yes, size is one of the factors that compilers use to decide if it is appropriate to inline something.
If running this code on your laptop takes a very long time, then it is possible that you just gave it very large values and it is simply taking a long time to calculate the answer... The code looks OK, but keep in mind that 13! already overflows a 32-bit int. What value did you attempt to pass?
The only way to know what actually happens is to compile it and look at the assembly generated.
PS: you may want to look into a more modern compiler if you are concerned with optimizations. For Windows there is MinGW and free versions of Visual C++. For *NIX there is of course g++.
EDIT: There is also a thing called Tail Recursion Optimization which allows compilers to convert certain types of recursive algorithms to iterative ones, making them better candidates for inlining (in addition to making them more stack-space efficient).
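As a sketch, here is the loop a compiler might derive from f through such a rewrite (purely illustrative; the name f_as_loop is mine, and no particular compiler is guaranteed to emit exactly this):
inline int f_as_loop(int n) {
    int result = 1;
    while (n > 1) {
        result *= n; // what each recursive level contributed
        --n;
    }
    return result;
}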
Recursive function can be inlined to certain limited depth of recursion. Some compilers have an option that lets you to specify how deep you want to go when inlining recursive functions. Basically, the compiler "flattens" several nested levels of recursion. If the execution reaches the end of "flattened" code, the code calls itself in usual recursive fashion and so on. Of course, if the depth of recursion is a run-time value, the compiler has to check the corresponding condition every time before executing each original recursive step inside the "flattened" code. In other words, there's nothing too unusual about inlining a recursive function. It is like unrolling a loop. There's no requirement for the parameters to be constant.
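A hand-written sketch of what flattening two levels of f might look like (purely illustrative; the name f_flat is mine):
int f_flat(int n) {
    if (n <= 1) return 1;               // level 1 of the recursion
    if (n - 1 <= 1) return n;           // level 2, inlined: n * f(1)
    return n * (n - 1) * f_flat(n - 2); // past the depth limit: a real call
}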
What you mean by "I am sure a loop can't work with it" is not clear. It doesn't seem to make much sense. Functions with a loop can be easily inlined and there's nothing strange about it.
What you are trying to say about your example that "displays nothing" is not clear either. There is nothing in the code that would "display" anything. No wonder it "displays nothing". On top of that, you posted invalid code: the C++ language does not allow function declarations without an explicit return type.
As for your last question, yes, the compiler is completely free to implement an inline function as a "normal" function. It has nothing to do with the function being "too large" though. It has everything to do with more-or-less complex heuristic criteria used by that specific compiler to make the decision about inlining a function. It can take the size into account. It can take other things into account.
You can inline recursive functions. The compiler normally unrolls them to a certain depth; in VS you can even have a pragma for this, and the compiler can also do partial inlining. It essentially converts them into loops. Also, as @Evan Teran said, the compiler is not forced to inline a function that you suggest at all. It might totally ignore you and that's perfectly valid.
The problem with the code is not in that inline function. The constantness or not of the argument is pretty irrelevant, I'm sure.
Also, seriously, get a new compiler. There are modern free compilers for whatever OS your laptop runs.
One thing to keep in mind: according to the standard, inline is a suggestion, not an absolute guarantee. In the case of a recursive function, the compiler would not always be able to compute the recursion limit. Modern compilers are getting extremely smart; a previous response shows the compiler evaluating a constant inline call and simply generating the result. But consider
bigint fac = factorialOf(userInput);
There's no way the compiler can figure that one out...
As a side note, most compilers tend to ignore inlines in debug builds unless specifically instructed not to do so, which makes debugging easier.
Tail recursions can be converted to loops as long as the compiler can satisfactorily rearrange the internal representation to get the recursion conditional test at the end. In this case it can do the code generation to re-express the recursive function as a simple loop.
As far as issues like tail recursion rewrites, partial expansions of recursive functions, etc., these are usually controlled by the optimization switches. All modern compilers are capable of pretty significant optimization, but sometimes things do go wrong.
Remember that the inline keyword merely sends a request, not a command, to the compiler. The compiler may ignore this request if the function definition is too long or too complicated, and compile the function as a normal function.
Some of the cases where inline expansion may not work are:
For functions returning values, if a loop, a switch or a goto exists.
For functions not returning values, if a return statement exists.
If function contains static variables.
If in line functions are recursive.
Hence, in C++, recursive functions marked inline may not actually be inlined.