Function argument evaluation order [duplicate] - c++

I'm confused about the order in which function arguments are evaluated when calling a C++ function. I have probably misinterpreted something, so please explain if that is the case.
As an example, the legendary book "Programming Windows" by Charles Petzold contains code like this:
// hdc = handle to device context
// x, y = coordinates of where to output text
char szBuffer[64];
TextOut(hdc, x, y, szBuffer, snprintf(szBuffer, 64, "My text goes here"));
Now, the last argument is
snprintf(szBuffer, 64, "My text goes here")
which returns the number of characters written to the char[] szBuffer. It also writes the text "My text goes here" to the char[] szBuffer.
The fourth argument is szBuffer, which contains the text to be written. However, szBuffer is filled in by the fifth argument, which suggests that the expression
// argument 5
snprintf(szBuffer, 64, "My text goes here")
evaluated before
// argument 4
szBuffer
Okay, fine. Is this always the case? Evaluation is always done from right to left? Looking at the default calling convention __cdecl:
The main characteristics of __cdecl calling convention are:
Arguments are passed from right to left, and placed on the stack.
Stack cleanup is performed by the caller.
Function name is decorated by prefixing it with an underscore character '_' .
(Source: Calling conventions demystified)
(Source: MSDN on __cdecl)
It says "Arguments are passed from right to left, and placed on the stack".
Does this mean that the rightmost/last argument in a function call is always evaluated first? Then the next to last, and so on? The same goes for the calling convention __stdcall, which also specifies a right-to-left argument passing order.
At the same time, I came across posts like this:
How are arguments evaluated in a function call?
In that post the answers say (and they're quoting the standard) that the order is unspecified.
Finally, when Charles Petzold writes
TextOut(hdc, x, y, szBuffer, snprintf(szBuffer, 64, "My text goes here"));
maybe it doesn't matter? Because even if
szBuffer
is evaluated before
snprintf(szBuffer, 64, "My text goes here")
the function TextOut is called with a char* (pointing to the first character in szBuffer), and since all arguments are evaluated before the TextOut function proceeds it doesn't matter in this particular case which gets evaluated first.

In this case it does not matter.
By passing szBuffer to a function that accepts a char * (or char const *) argument, the array decays to a pointer. The pointer value is independent of the actual data stored in the array, and the pointer value will be the same in both cases no matter whether the fourth or fifth argument to TextOut() gets fully evaluated first. Even if the fourth argument is fully evaluated first, it will evaluate as a pointer to data -- the pointed-to data is what gets changed, not the pointer itself.
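To make the pointer argument concrete, here is a minimal sketch. `fake_text_out` and `draw_label` are hypothetical stand-ins (not the Win32 `TextOut`), used only so the snippet compiles anywhere:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for TextOut: copies `length` characters from `text`.
std::string fake_text_out(const char* text, int length) {
    return std::string(text, static_cast<std::size_t>(length));
}

std::string draw_label() {
    char buffer[64];
    // `buffer` decays to the same address whichever argument is evaluated first,
    // and snprintf has finished writing before fake_text_out starts running.
    return fake_text_out(buffer,
                         std::snprintf(buffer, sizeof buffer, "My text goes here"));
}
```

Whichever order the compiler picks for the two arguments, `draw_label()` should return the full string, because the callee only reads the buffer after both arguments have been evaluated.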
To answer your posed question: the actual order of argument evaluation is unspecified. For example, in the statement f(g(), h()), a compliant compiler can execute g() and h() in any order. Further, in the statement f(g(h()), i()), the compiler can execute the three functions g, h, and i in any order with the constraint that h() gets executed before g() -- so it could execute h(), then i(), then g().
It just happens that in this specific case, evaluation order of arguments is wholly irrelevant.
(None of this behavior is dependent on calling convention, which only deals with how the arguments are communicated to the called function. The calling convention does not address in any way the order in which those arguments are evaluated.)
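The f(g(h()), i()) case can be sketched as follows. The function bodies and the `call_log` vector are made up for illustration. The only ordering the sketch relies on is that h runs before g and that f runs last.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical call log, only for this illustration.
std::vector<std::string> call_log;

int h() { call_log.push_back("h"); return 1; }
int g(int x) { call_log.push_back("g"); return x + 1; }
int i() { call_log.push_back("i"); return 10; }
int f(int a, int b) { call_log.push_back("f"); return a + b; }

// Position of a name in the log, or the log size if it is absent.
std::size_t position_of(const std::string& name) {
    for (std::size_t k = 0; k < call_log.size(); ++k) {
        if (call_log[k] == name) return k;
    }
    return call_log.size();
}

int run_demo() {
    call_log.clear();
    int result = f(g(h()), i());
    // Guaranteed: h before g, and f last. Where i lands is up to the compiler.
    assert(position_of("h") < position_of("g"));
    assert(position_of("f") == call_log.size() - 1);
    return result;
}
```

Printing `call_log` after `run_demo()` on different compilers may show `i` in different positions. Code that depends on one particular position is relying on unspecified behavior.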

I would agree that it depends on the calling convention, because the standard does not specify the order.
See also: Compilers and argument order of evaluation in C++
And I would also agree that it does not matter in this case, because the snprintf call is always evaluated before TextOut itself runs, so the buffer is already filled.

Related

How are the argc and argv values passed to main() set up?

I want to better understand what's going on under the hood with the command line arguments when a C or C++ program is launched. I know, of course, that argc and argv, when passed to main(), represent the argument count and argument vector, respectively.
What I'm trying to figure out is how the compiler knows to interpret int argc as the number of arguments passed from the command line. If I write a simple function that attempts to mimic main() (e.g. int testfunc(int argc, char* argv[])), and pass in a string, the compiler complains, "Expected 'int' but argument is of type char*" as I would expect. How is this interpreted differently when command line arguments are passed to main()?
In common C implementations, main is not the first routine called when your process starts. Usually, it is some special entry point like _start that is provided by the C library built into your program when you link it. The code at this special entry point examines the command line information that is passed to it (in some way outside of C, particular to the operating system) and constructs an argument list for main. After that and other work, it calls main.
You don't pass the argc value yourself (from the command line, for example); it is supplied by your environment (the runtime), just like the exact content of argv.[Note below]
To elaborate, C11, chapter §5.1.2.2.1, (indicators mine)
The value of argc shall be nonnegative.
argv[argc] shall be a null pointer.
If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. [Note start]If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.[Note end]
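As a rough illustration of those guarantees, the sketch below walks an argument vector the way a program might inside main. `join_args` is a hypothetical helper, not something the standard provides.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: joins argv[0] .. argv[argc-1] with single spaces.
std::string join_args(int argc, char* argv[]) {
    assert(argc >= 0);               // "The value of argc shall be nonnegative."
    assert(argv[argc] == nullptr);   // "argv[argc] shall be a null pointer."
    std::string out;
    for (int i = 0; i < argc; ++i) {
        if (i > 0) out += ' ';
        out += argv[i];
    }
    return out;
}
```

In a real program you would call it as `join_args(argc, argv)` from `int main(int argc, char* argv[])`. The values come from the startup code described above, not from anything you write at the call site.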

What is the name of this C++ functionality?

I was writing some C++ code and mistakenly omitted the name of a function WSASocket. However, my compiler did not raise an error and associated my SOCKET with the integer value 1 instead of a valid socket.
The code in question should have looked like this:
this->listener = WSASocket(address->ai_family, address->ai_socktype, address->ai_protocol, NULL, NULL, WSA_FLAG_OVERLAPPED);
But instead, it looked like this:
this->listener = (address->ai_family, address->ai_socktype, address->ai_protocol, NULL, NULL, WSA_FLAG_OVERLAPPED);
Coming from other languages, this looks like it may be some kind of anonymous type. What is the name of the feature, in the case it is really a feature?
What is its purpose?
It's difficult to search for it, when you don't know where to begin.
The comma operator† evaluates the left hand side, discards its value, and as a result yields the right hand side. WSA_FLAG_OVERLAPPED is 1, and that is the result of the expression; all the other values are discarded. No socket is ever created.
† Unless overloaded. Yes, it can be overloaded. No, you should not overload it. Step away from the keyboard, right now!
The comma operator is making sense of your code.
You are effectively setting this->listener = WSA_FLAG_OVERLAPPED; which just happens to be syntactically valid.
The compiler evaluates each operand in turn within the parentheses, and the result is the value of the final operand, which here is WSA_FLAG_OVERLAPPED.
The comma operator , is a sequence point in C++. The expression to the left of the comma is fully evaluated before the expression to the right is. The result is always the value to the right. When you've got an expression of the form (x1, x2, x3, ..., xn) the result of the expression is always xn.
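A small sketch of that behavior, using a made-up `trace` variable so that each operand has a visible side effect:

```cpp
#include <cassert>

// Returns the value of a parenthesized comma chain.
int last_of_chain() {
    int trace = 0;
    // Operands run left to right; only the rightmost value is kept.
    int result = (trace = 1, trace += 10, trace * 2);
    assert(trace == 11);  // both earlier operands did run
    return result;        // value of the last operand: 11 * 2
}
```

The first two operands are evaluated only for their side effects. In the WSASocket example the discarded operands have no side effects at all, which is why nothing visible happens.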

va_start as array

According to what I read about the va_arg macro, it retrieves the next argument in the argument list. Is there any way to choose the index of the argument I want to get, like an array index?
For example, I need to perform an operation that calls the va_arg macro at least 3 times, but I want those 3 calls to retrieve the same argument and not the next one in the list. One solution could be using a function and passing the argument, but I don't want that.
Also, if there are no other macros able to do this, how can I refer to the start of the argument array through a pointer? I know it's not portable, not type safe, etc. This is just for the sake of learning.
Here is example code showing how I want to implement it:
bool SQLBase::BindQuery (char* query, int NumArgs, ...)
{
    va_list argList;
    va_start(argList, NumArgs);
    SQLPrepare (hstmt, query, SQL_NTS);
    for (int x = 0; x < NumArgs; x++)
    {
        // va_arg appears three times in this single call
        SQLBindParameter (hstmt, (x+1), GetTypeParameter (va_arg(argList, SQLPOINTER*)), SQL_C_CHAR, SQL_CHAR, 10, 0, va_arg(argList, SQLPOINTER*), va_arg(argList, SQLLEN), &recvsize[x]);
    }
va_arg is called 3 times for the SQLBindParameter call, and I want the first 2 calls to refer to the same argument without advancing the position in the argument list.
First of all, calling va_arg multiple times in your function invocation is hairy, since you don't know in which order these calls happen. You need to do this beforehand, so your arguments are retrieved in the correct order.
Second, no: there is no array-style usage of va_list. This is because va_list doesn't know a thing about the arguments on the stack; you are supplying the type in your va_arg calls, and va_arg can then increase the (internal/conceptual) pointer contained in the va_list because it knows the size of that argument. Getting to the third argument would require you to supply the types of the first two.
If all the arguments are the same size (like "void*") you can always just make a loop that calls va_arg the appropriate number of times. This is "kind of" portable if you can be reasonably sure that your arguments are in fact the same size. I'm not too confident that doing this would be the best course of action, though -- the need to do it might indicate that a different setup would be more appropriate, like passing an array in the first place instead of using a variable argument function.
You can also just take the address of a function argument and assume they are on the stack in some order. This is horribly unportable since you need to know about calling conventions which can vary between compilers, and may even change based on compilation options. I would definitely advise to NOT do something like this.
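Both portable suggestions can be sketched like this, assuming every variadic argument is an int. `nth_int` and `sum_of_squares` are hypothetical names for illustration and have nothing to do with the ODBC code in the question.

```cpp
#include <cstdarg>

// Returns the index-th (0-based) of `count` int arguments by skipping over
// the earlier ones. Returns 0 when index is out of range (arbitrary choice).
int nth_int(int index, int count, ...) {
    if (index < 0 || index >= count) return 0;
    va_list args;
    va_start(args, count);
    int value = 0;
    for (int i = 0; i <= index; ++i) {
        value = va_arg(args, int);  // each call advances to the next argument
    }
    va_end(args);
    return value;
}

// Reads each argument exactly once into a local, then reuses the local.
int sum_of_squares(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        const int v = va_arg(args, int);  // single retrieval
        total += v * v;                   // used twice without touching args
    }
    va_end(args);
    return total;
}
```

The second function is the pattern that would apply to the question's loop: fetch each variadic argument into a named local before the call, then pass the local wherever it is needed.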

Is this undefined behavior in C/C++ (Part 2) [duplicate]

What does the rule about sequence points say about the following code?
int main(void) {
int i = 5;
printf("%d", ++i, i); /* Statement 1 */
}
There is just one %d. I am confused because I am getting 6 as output in compilers GCC, Turbo C++ and Visual C++. Is the behavior well defined or what?
This is related to my last question.
It's undefined because of 2 reasons:
The value of i is twice used without an intervening sequence point (the comma in argument lists is not the comma operator and does not introduce a sequence point).
You're calling a variadic function without a prototype in scope.
The number of arguments passed to printf() is not compatible with the format string.
The default output stream is usually line buffered. Without a '\n' there is no guarantee that the output will actually appear.
All arguments get evaluated when calling a function, even if they are not used, so, since the order of evaluation of function arguments is unspecified, you have UB again.
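One way to sidestep the problem is to finish modifying i in its own statement before the call. The sketch below formats into a buffer instead of printing so the result is easy to check. `format_after_increment` is a hypothetical name.

```cpp
#include <cstdio>
#include <string>

// Increments first, then formats: the modification is complete before the call.
std::string format_after_increment(int i) {
    ++i;  // separate full expression, so nothing else reads i while it changes
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%d", i);  // one %d, one argument
    return buffer;
}
```

With the increment sequenced on its own and the argument count matching the format string, the output no longer depends on how a particular compiler orders argument evaluation.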
I think it's well defined. The printf matches the first % placeholder to the first argument, which in this instance is a preincremented variable.
All arguments are evaluated. Order not defined. All implementations of C/C++ (that I know of) evaluate function arguments from right to left. Thus i is usually evaluated before ++i.
In printf, %d maps to the first argument. The rest are ignored.
So printing 6 is the correct behavior.
I believe that the right-to-left evaluation order is very old (dating back to the first C compilers). It certainly predates C++, and most implementations of C++ would have kept the same evaluation order because early C++ implementations simply translated into C.
There are some technical reasons for evaluating function arguments right-to-left. In stack architectures, arguments are typically pushed onto the stack. In C, you can call a function with more arguments than actually specified -- the extra arguments are simply ignored. If arguments are evaluated left-to-right, and pushed left-to-right, then the stack slot right under the stack pointer will hold the last argument, and there is no way for the function to get at the offset of any particular argument (because the actual number of arguments pushed depends on the caller).
In a right-to-left push order, the stack slot right under the stack pointer will always hold the first argument, and the next slot holds the second argument etc. Argument offsets will always be deterministic for the function (which may be written and compiled elsewhere into a library, separately from where it is called).
Now, right-to-left push order does not mandate right-to-left evaluation order, but in early compilers, memory was scarce. In right-to-left evaluation order, the same stack can be used in-place (essentially, after evaluating the argument -- which may be an expression or a function call! -- the return value is already at the right position on the stack). In left-to-right evaluation, the argument values must be stored separately and then pushed back to the stack in reverse order.
Would be interested to know the true history behind right-to-left evaluation though.
According to this documentation, any additional arguments passed to a format string shall be ignored. It also mentions for fprintf that the argument will be evaluated then ignored. I'm not sure if this is the case with printf.

Implementing a stack based virtual machine for a subset of C

Hello everyone, I'm currently implementing a simple programming language as a learning experience, but I'm in need of some advice. I'm designing my interpreter and I've run into a problem.
My language is a subset of C and I'm having a problem regarding the stack interpreter implementation. In the language the following will compile:
somefunc ()
{
1 + 2;
}
main ()
{
somefunc ();
}
Now this is alright but when "1+2" is computed the result is pushed onto a stack and then the function returns but there's still a number on the stack, and there shouldn't be. How can I get around this problem?
I've thought about saving a "state" of the stack before a function call and restoring the "state" after the function call. For example saving the number of elements on the stack, then execute the function code, return, and then pop from the stack until we have the same number of elements as before (or maybe +1 if the function returned something).
Any ideas? Thanks for any tips!
Great question! One of my hobbies is writing compilers for toy languages, so kudos for your excellent programming taste.
An expression statement is one where the code in the statement is simply an expression. This means anything of the form <expression> ;, which includes things like assignments and function calls, but not ifs, whiles, or returns. Any expression statement will have a left over value on the stack at the end, which you should discard.
1 + 2 is an expression statement, but so are these:
x = 5;
The assignment expression leaves the value 5 on the stack since the result of an assignment is the value of the left-hand operand. After the statement is finished you pop off the unused value 5.
printf("hello world!\n");
printf() returns the number of characters output. You will have this value left over on the stack, so pop it when the statement finishes.
Effectively every expression statement will leave a value on the stack unless the expression's type is void. In that case you either special-case void statements and don't pop anything afterwards, or push a pretend "void" value onto the stack so you can always pop a value.
You'll need a smarter parser. When you see an expression whose value isn't being used then you need to emit a POP.
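Here is a minimal sketch of that idea, with a made-up three-instruction bytecode (`Push`, `Add`, `Pop`). The names and layout are assumptions for illustration, not your VM's actual design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Push, Add, Pop };
struct Instr { Op op; int operand; };

// Emits code for the expression statement "lhs + rhs;".
// The trailing Pop discards the unused result.
std::vector<Instr> compile_add_statement(int lhs, int rhs) {
    return { {Op::Push, lhs}, {Op::Push, rhs}, {Op::Add, 0}, {Op::Pop, 0} };
}

// Runs the code on the given stack and returns the final stack depth.
std::size_t run(const std::vector<Instr>& code, std::vector<int>& stack) {
    for (const Instr& in : code) {
        switch (in.op) {
        case Op::Push:
            stack.push_back(in.operand);
            break;
        case Op::Add: {
            assert(stack.size() >= 2);
            int rhs = stack.back(); stack.pop_back();
            int lhs = stack.back(); stack.pop_back();
            stack.push_back(lhs + rhs);
            break;
        }
        case Op::Pop:
            assert(!stack.empty());
            stack.pop_back();
            break;
        }
    }
    return stack.size();
}
```

Because the code generator emits the Pop at the end of every expression statement, the stack is balanced when the function returns, and no save/restore of the stack depth around calls is needed.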
This is an important opportunity to learn about optimization. You have a function that does nothing but integer math, and the result of that math isn't even used in any way, shape, or form.
Having your compiler optimize the function away would avoid generating and executing a lot of bytecode for nothing!