Which C++ compiler evaluates the leftmost parameter first?

I know the order in which function parameters are evaluated is unspecified in C++; see below:
// The simple obvious one.
callFunc(getA(),getB());
Can be equivalent to this:
int a = getA();
int b = getB();
callFunc(a,b);
Or this:
int b = getB();
int a = getA();
callFunc(a,b);
This is perfectly fine, and I think most people know this.
But I have tried VC10 and GCC 4.7.2, and both evaluate b first (right to left), meaning b gets pushed onto the stack frame first, then a.
I am just wondering which C++ compiler I should try in order to make the code above evaluate a first, so that a gets pushed onto the stack before b.
Thanks

The parameter evaluation order depends substantially on the calling convention used to call the given function: if parameters are pushed onto the stack right to left, it is usually more convenient to evaluate the rightmost parameters first.
According to this table, the only calling convention available on IA-32 with left-to-right parameter order on the stack is fastcall on Borland, which, however, passes the first three integer/pointer parameters in registers. So you should write a function that takes more than three integers, mark it as fastcall and compile it with a Borland compiler; in that case the parameters beyond the first three will probably be evaluated in left-to-right order.
On other platforms you will probably find other calling conventions with left-to-right parameter passing (and therefore, probably, left-to-right parameter evaluation).
Notice that parameter passing order and parameter evaluation order are logically related, but if for some reason the compiler finds it better to evaluate some parameter before the others, nothing in the standard prevents it from doing so.
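If you just want to see what a particular compiler does, or you need a specific order regardless of compiler, a small test program makes both options visible. This is only an illustrative sketch: the bodies of getA, getB and callFunc below are made up, not taken from the question.
#include <iostream>

// Stand-in implementations so the snippet is self-contained.
int getA() { std::cout << "getA evaluated\n"; return 1; }
int getB() { std::cout << "getB evaluated\n"; return 2; }
void callFunc(int a, int b) { std::cout << a << ' ' << b << '\n'; }

int main()
{
    // Which "evaluated" line prints first here is up to the compiler.
    callFunc(getA(), getB());

    // If the order matters, sequence it yourself: separate statements are
    // executed top to bottom, so getA is guaranteed to run before getB.
    int a = getA();
    int b = getB();
    callFunc(a, b);
}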

Related

What does it mean to say "Function Calls are Resolved at Compile Time"

I am trying to understand the meaning of the saying "function calls are resolved at compile time". I know about the concept of polymorphism, and there I know that virtual functions are called (or resolved, whichever is the correct phrase; I don't know the difference, if any) at runtime.
I want to know what it means to say "compile time" or "runtime" in this (function call) context.
Below is an example and an explanation of my current understanding, which most probably is wrong.
const int n = 10; // STATEMENT 1

void func(int x)
{
    //..
}

int main()
{
    int p = 0; // this is dynamic initialization.
    func(p);   // STATEMENT 2
}
To my current understanding:
1. In statement 1, n is a constant expression and is known at compile time, so the compiler might replace all occurrences of n with 10. But it (the compiler) is not required to do so (replace all occurrences). Is my explanation of this statement correct?
2. I used to think that p is passed at runtime (whatever that means), but then I read someone say that all function calls are resolved at compile time, and I got confused. So I am not sure whether this function is called/resolved at runtime or at compile time. I think the reason I got confused is that a) I don't know the fundamental difference between compile time and runtime, and b) I don't know the difference between the phrase "the function is resolved" and "p is passed at runtime". So can someone clear up these concepts and provide the necessary links so that I can learn more about them?
3. Dynamic initialization happens at runtime and static initialization happens at compile time.
4. "Compile time" means hardcoded/embedded into the executable. But then I wrote int p = 0;, which is dynamic initialization and happens at runtime. But where does the integer literal 0 come from so that it can be used to initialize the variable p at runtime? Was 0 also hardcoded into the binary, and if so, why can't this statement happen at compile time (that is, why can't p be statically initialized)?
The question might be too broad, so I will appreciate (upvote) it if anyone can answer any of the questions.
As to the main question (and question 2) about the resolution of function calls: this means that, while compiling your statement func(p);, the compiler will immediately look for a previous declaration of such a function. It must find one unambiguous match, or stop compiling and complain to you. That's it!
If this resolution succeeds then yes, the call would normally occur at runtime, while executing your program. However, as mentioned, in truth you only get the deliberately loose promise that your program will behave as if this is the case.
So, to the question in point 4: no, "compile time" by itself does not necessarily imply that something will become embedded in the executable. If it does, for int p = 0; there will be an instruction to store a 0 somewhere. The value 0 comes from you, because you asked for it! Similarly, you asked for p to be a local variable, so the compiler will treat it as such. But if it can deduce that you don't actually use it, it will happily ignore it. You can see this for your own example at Compiler Explorer, where on the right is the result with optimisations enabled. You could say the value is (potentially) hardcoded, but only as part of an instruction to store it during execution.
However, while p is initialised at runtime, it's not the same as dynamic initialisation, which applies specifically to non-local variables. Even static initialisation is not actually required to happen at compile time (point 3), though it can be.
This relates back to point 1, where yes, your explanation is correct! It can be done at compile time, or as part of runtime static initialisation. You can observe in Compiler Explorer that GCC does the former: you don't see any instructions for storing n even if you use its value somewhere. In those cases, you will just see the value 10 being stored directly, as if initialised by a literal. If you initialise it with some non-constant expression, you will instead see instructions to store and retrieve that value.
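To make the distinction between points 1 and 3 concrete, here is a small sketch you can paste into Compiler Explorer; the helper runtime_value is my own invention, only there to force a non-constant initializer.
#include <iostream>

int runtime_value() { return 10; }   // not constexpr, so not a constant expression

const int n = 10;                // constant-initialized: the compiler may fold uses of n to 10
const int m = runtime_value();   // non-local with a non-constant initializer: dynamic
                                 // initialization, conceptually run before main() starts

int main()
{
    int p = 0;                   // local variable: initialized each time main runs,
                                 // typically by storing the immediate 0 at runtime
    std::cout << n + m + p << '\n';
}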
Finally, unless you plan to develop compilers or other software "close to the metal", you will be fine without knowing everything the compiler does! Not that curiosity is bad, of course. But on this topic I think it's more than sufficient to know about storage duration and the "static initialisation order problem".

How does not specifying an exact order of evaluation of function arguments help the C & C++ compilers generate optimized code?

#include <iostream>

int foo() {
    std::cout << "foo() is called\n";
    return 9;
}

int bar() {
    std::cout << "bar() is called\n";
    return 18;
}

int main() {
    std::cout << foo() << ' ' << bar() << ' ' << '\n';
}

// The above program's behaviour is unspecified.
// clang++ evaluates function arguments from left to right: http://melpon.org/wandbox/permlink/STnvMm1YVrrSRSsB
// g++ & MSVC++ evaluate function arguments from right to left,
// so either foo() or bar() can be called first depending upon the compiler.
The output of the above program is compiler-dependent. The order in which function arguments are evaluated is unspecified. The reason I've read for this is that it can result in highly optimized code. How does not specifying an exact order of evaluation of function arguments help the compiler generate optimized code?
AFAIK, the order of evaluation is strictly specified in languages such as Java, C#, D etc.
I think the whole premise of the question is wrong:
How does not specifying an exact order of evaluation of function arguments help the C & C++ compilers generate optimized code?
It is not about optimizing code (though it does allow that). It is about not penalizing compilers because the underlying hardware has certain ABI constraints.
Some systems depend on parameters being pushed onto the stack in reverse order while others depend on forward order. C++ runs on all kinds of systems with all kinds of constraints. If you enforce an order at the language level, you will require some systems to pay a penalty to enforce that order.
The first rule of C++ is "If you don't use it then you should not have to pay for it". So enforcing an order would be a violation of the prime directive of C++.
It doesn't. At least, it doesn't today. Maybe it did in the past.
A proposal for C++17 suggests defining left-right evaluation order for function calls, operator<< and so on.
As described in Section 7 of that paper, this proposal was tested by compiling the Windows NT kernel, and it actually led to a speed increase in some benchmarks. The authors' comment:
It is worth noting that these results are for the worst case scenario where the optimizers have not yet been updated to be aware of, and take advantage of the new evaluation rules and they are blindly forced to evaluate function calls from left to right.
suggests that there is further room for speed improvement.
The order of evaluation is related to the way arguments are passed. If the stack is used to pass the arguments, evaluating right to left helps performance, since this is the order in which arguments are pushed onto the stack.
For example, with the following code:
foo(bar(), baz());
Assuming the calling convention is "pass arguments on the stack", the C calling convention requires arguments to be pushed onto the stack starting from the last one, so that when the callee reads them it pops the first argument first and can support variadic functions. If the order of evaluation were left to right, the result of bar() would have to be saved in a temporary, then baz() called and its result pushed, followed by a push of the temporary. Right-to-left evaluation, however, allows the compiler to avoid the temporary.
If arguments are passed in registers, the order of evaluation is not overly important.
The original reason that the C and C++ standards didn't specify an order of evaluation for function arguments is to provide more optimization opportunities for the compiler. Unfortunately, this rationale was not backed up by extensive experimentation at the time these languages were initially designed, but it made sense.
This issue has been raised in the past few years. See this blog post by Herb Sutter and don't forget to go through the comments.
Proposal P0145R1 suggests that it's better to specify an order of evaluation for function arguments and for other operators. It says:
The order of expression evaluation, as it is currently specified in the standard, undermines advices, popular programming idioms, or the relative safety of standard library facilities. The traps aren't just for novices or the careless programmer. They affect all of us indiscriminately, even when we know the rules.
You can find more information about how this affects optimization opportunities in that document.
In the past few months, there has been a very extensive discussion about how this change in the language affects optimization, compatibility and portability. The thread begins here and continues here. You can find there numerous examples.

calling convention and evaluation order

I know that C++ doesn't specify the order in which parameters are passed to a function. But if we write the following code:
#include <stdio.h>

void __cdecl func(int a, int b, int c)
{
    printf("%d,%d,%d", a, b, c);
}

int main()
{
    int i = 10;
    func(++i, i, ++i);
}
Can we reliably say the output would be 12,11,11 since the __cdecl ensures that argument-passing order is right to left?
As per the Standard, there are two things you need to understand and differentiate:
1. C++ doesn't specify the order in which parameters are passed to a function (as you said yourself, that is true!).
2. C++ doesn't specify the order in which the function arguments are evaluated [expr.call].
Now, please note: __cdecl ensures only the first, not the second. Calling conventions decide only how the function's arguments will be passed, left-to-right or right-to-left; they can still be evaluated in ANY order!
Hope this clarifies your doubts regarding the calling conventions.
However, since these conventions are a Microsoft compiler extension to C++, your code is non-portable. In that case, you can check how the MSVC++ compiler evaluates function arguments and relax, IF you don't want to run the same code on other platforms!
func(++i, i, ++i);
Note that this particular code invokes undefined behavior, because i is incremented more than once without an intervening sequence point.
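If you need a deterministic result and portable code, do the sequencing yourself in separate statements before the call; a minimal sketch (here I pick a left-to-right order, so it prints 11,11,12):
#include <cstdio>

void func(int a, int b, int c)
{
    std::printf("%d,%d,%d\n", a, b, c);
}

int main()
{
    int i = 10;
    // Each full statement ends with a sequence point, so these increments
    // are ordered and well-defined on every conforming compiler.
    int a = ++i;   // i becomes 11
    int b = i;     // 11
    int c = ++i;   // i becomes 12
    func(a, b, c); // always prints 11,11,12
}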
No, you cannot assume that.
An optimizing compiler will inline short functions. All that __stdcall will guarantee is that the __stdcall version of the function will be generated, but this does not mean that the compiler cannot also inline it at the same time.
If you really want to be sure it's not inlined, you have to declare it in another compilation unit, although link-time optimizations could inline it even in this case.
Also, the order of parameters on the stack has nothing to do with the order in which they are evaluated. For example, for a function call fn(a, b, c) GCC usually won't do
push c
push b
push a
call fn
but rather
sub esp, 0xC
mov [esp+8], c
mov [esp+4], b
mov [esp], a
call fn
Note that in the second case it has no restrictions on the order.
It is possible for a particular C implementation to define what its compiler will do in cases the standard leaves open. For example, assigning ~0U to an int variable gives an implementation-defined result, but there is nothing in the C standard that wouldn't allow a compiler to evaluate the int as -1 (or -493, for that matter), nor anything that would forbid a particular compiler vendor from stating that their compiler will in fact set the variable to -1. Since __cdecl isn't defined in the C standard and is only applicable to certain compilers, the question of how its semantics are defined is up to those vendors; it will only be documented to the extent particular vendors document it.
You are changing the same variable more than once without an intervening sequence point (there is a sequence point only after all of the function arguments have been evaluated, just before the call), which causes undefined behaviour regardless of the calling convention.

Compiler Optimization with Parameters

Let's say you have some functions in some classes that are called together like this:
void myclass::render(int offset_x, int offset_y)
{
    otherClass.render(offset_x, offset_y);
}
This pattern will repeat for a while, possibly through 10+ classes, so my question is:
Are modern C++ compilers smart enough to recognise that wherever the program stores function parameters - from what Wikipedia tells me this varies with parameter size, but for a two-parameter function processor registers seem likely - those locations don't need to be overwritten with new values?
If not, I might need to look at implementing my own methods.
I think it's more likely that the compiler will make a larger-scale optimization. You'd have to examine the actual machine code produced, but for example the following trivial attempt:
#include <iostream>

class B {
public:
    void F( int x, int y ) {
        std::cout << x << ", " << y << std::endl;
    }
};

class A {
    B b;
public:
    void F( int x, int y ) {
        b.F( x, y );
    }
};

int main() {
    A a;
    a.F( 32, 64 );
}
causes the compiler (cl.exe from VS 2010, empty project, vanilla 'Release' configuration) to produce assembly that completely inlines the call tree; you basically get "push 40h, push 20h, call std::operator<<."
Abusing __declspec(noinline) causes cl.exe to realize that A::F just forwards to B::F, and the definition of A::F is nothing but "call B::F", without stack or register manipulation at all (so in that case it has performed the optimization you're asking about). But do note that my example is extremely contrived, so it says nothing about the compiler's ability to do this well in general, only that it can be done.
In your real-world scenario, you'll have to examine the disassembly yourself. In particular, the 'this' parameter needs to be accounted for (cl.exe usually passes it via the ECX register) -- if you do any manipulation of the class member variables that may impact the results.
Yes, it is. The compiler performs dataflow analysis before register allocation, keeping track of which data is where at which time. And it will see that the arg0 location contains the value that needs to be in the arg0 location in order to call the next function, and so it doesn't need to move the data around.
I'm not a specialist, but it looks a lot like the perfect forwarding problem that will be solved in the next standard (C++0x) by using rvalue-references.
Currently I'd say it depends on the compiler, but I guess that if the function and the parameters are simple enough then yes, the function will serve as a shortcut.
If this function is implemented directly in the class definition (thereby implicitly becoming a candidate for inlining) it might be inlined, making the call go directly to the wanted function instead of requiring two runtime calls.
In spite of your comment, I think that inlining is germane to this discussion. I don't believe that C++ compilers will do what you're asking (reuse parameters on the stack) UNLESS they also inline the method completely.
The reason is that if it's making a real function call it still has to put the return address onto the stack, so the previous call's parameters are no longer at the expected place on the stack. It in turn has to put the parameters onto the stack a second time.
However, I really wouldn't worry about that. Unless you're making a ridiculous number of function calls like this AND profiling shows that a large proportion of the time is spent on these calls, the overhead is probably extremely minimal and you shouldn't worry about it. For a function that small, mark it inline and let the compiler decide if it can inline it away completely.
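For example, a forwarding function defined in the class body is implicitly inline, so the compiler is free to collapse the chain entirely. The class and member names below are invented to make the sketch self-contained; they only mirror the shape of the code in the question.
#include <iostream>

class OtherClass {
public:
    void render(int offset_x, int offset_y) {
        std::cout << offset_x << ", " << offset_y << '\n';   // stand-in for the real work
    }
};

class MyClass {
    OtherClass otherClass;
public:
    // Defined inside the class, so implicitly inline: with optimization enabled
    // the compiler can collapse the forwarding call into a direct call (or a jump)
    // and leave the two arguments wherever they already live.
    void render(int offset_x, int offset_y) {
        otherClass.render(offset_x, offset_y);
    }
};

int main() {
    MyClass m;
    m.render(32, 64);
}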
If I understand the question correctly, you are asking "Are most compilers smart enough to inline a simple function like this?", and the answer to that question is yes. Note, however, the implicit this parameter which is part of your function (because your function is a member of a class), so it might not be completely inlineable if the call level is deep enough.
The problem with inlining is that the compiler will probably only be able to do this for a given compilation unit. The linker is probably less likely to be clever enough to inline from one compilation unit to another.
But given the totally trivial nature of the function, and that both functions have exactly the same arguments in the same order, the cost of the function call will probably be only one machine instruction, viz. an additional branch (or jump) to the true implementation. There is no need even to push the return address onto the stack.

In C++, do variadic functions (those with ... at the end of the parameter list) necessarily follow the __cdecl calling convention?

I know that __stdcall functions can't have ellipses, but I want to be sure there are no platforms that support the stdarg.h functions for calling conventions other than __cdecl or __stdcall.
The calling convention has to be one where the caller clears the arguments from the stack (because the callee doesn't know what will be passed).
That doesn't necessarily correspond to what Microsoft calls "__cdecl" though. Just for example, on a SPARC, it'll normally pass the arguments in registers, because that's how the SPARC is designed to work -- its registers basically act as a call stack that gets spilled to main memory if the calls get deep enough that they won't fit into the registers anymore.
Though I'm less certain about it, I'd expect roughly the same on IA64 (Itanium) -- it also has a huge register set (a couple hundred if memory serves). If I'm not mistaken, it's a bit more permissive about how you use the registers, but I'd expect it to be used similarly at least a lot of the time.
Why does this matter to you? The point of using stdarg.h and its macros is to hide differences in calling convention from your code, so it can work with variable arguments portably.
Edit, based on comments: Okay, now I understand what you're doing (at least enough to improve the answer). Given that you already (apparently) have code to handle the variations in the default ABI, things are simpler. That only leaves the question of whether variadic functions always use the "default ABI", whatever that happens to be for the platform at hand. With "stdcall" and "default" as the only options, I think the answer to that is yes. Just for example, on Windows, wsprintf and wprintf break the rule of thumb and use the cdecl calling convention instead of stdcall.
The most definitive way that you can determine this is to analyze the calling conventions. For variadic functions to work, your calling convention needs a couple of attributes:
The callee must be able to access the parameters that aren't part of the variable argument list from a fixed offset from the top of the stack. This requires that the compiler push the parameters onto the stack from right to left. (This includes such things as the first parameter to printf, the format specification. Also, the address of the variable argument list itself must also be derived from a known location.)
The caller must be responsible for removing the parameters off the stack once the function has returned, because only the compiler, while generating the code for the caller, knows how many parameters were pushed onto the stack in the first place. The variadic function itself does not have this information.
stdcall won't work because the callee is responsible for popping parameters off the stack. In the old 16-bit Windows days, pascal wouldn't work because it pushed parameters onto the stack from left to right.
Of course, as the other answers have alluded to, many platforms don't give you any choice in terms of calling convention, making this question irrelevant for those ones.
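To see why those two properties matter in practice, here is a minimal portable variadic function written with <cstdarg>; it is a generic sketch, not tied to any particular calling convention, and relies only on the named parameter being reachable by the callee as described above.
#include <cstdarg>
#include <cstdio>

// Sums 'count' ints passed after the named parameter.
int sum_ints(int count, ...)
{
    va_list args;
    va_start(args, count);          // the last named parameter anchors the variable list
    int total = 0;
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);
    va_end(args);
    return total;
}

int main()
{
    // Only the caller knows that three extra arguments were passed, which is
    // why a caller-clean convention (like cdecl) is the natural fit.
    std::printf("%d\n", sum_ints(3, 1, 2, 3));
}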
Consider the following function on an x86 system:
void __stdcall something(char *, ...);
The function declares itself as __stdcall, which is a callee-clean convention. But a variadic function cannot be callee-clean since the callee does not know how many parameters were passed, so it doesn’t know how many it should clean.
The Microsoft Visual Studio C/C++ compiler resolves this conflict by silently converting the calling convention to __cdecl, which is the only supported variadic calling convention for functions that do not take a hidden this parameter.
Why does this conversion take place silently rather than generating a warning or error?
My guess is that it’s to make the compiler options /Gr (set default calling convention to __fastcall) and /Gz (set default calling convention to __stdcall) less annoying.
Automatic conversion of variadic functions to __cdecl means that you can just add the /Gr or /Gz command line switch to your compiler options, and everything will still compile and run (just with the new calling convention).
Another way of looking at this is not by thinking of the compiler as converting variadic __stdcall to __cdecl but rather by simply saying “for variadic functions, __stdcall is caller-clean.”
AFAIK, the diversity of calling conventions is unique to DOS/Windows on x86. Most other platforms had compilers come with the OS and standardize the convention.
Do you mean "platforms supported by MSVC" or as a general rule? Even if you confine yourself to the platforms supported by MSVC, you still have situations like IA64 and AMD64 where there is only "one" calling convention, and that calling convention is called __stdcall, but it's certainly not the same __stdcall you get on x86.