Variable arguments weirdness - c++

Ok, so this piece of code works fine in the Debug build but not in the Release build.
int AddString(LPCTSTR lpszString, ...)
{
RArray<LPCTSTR> strings;
va_list args;
va_start(args, lpszString);
do
{
strings.Add(lpszString);
} while (lpszString = va_arg(args, LPCTSTR));
va_end(args);
// ... rest of code ...
}
It seems in Release, va_arg just returns an extra value containing rubbish. So if I pass on 3 parameters: I fetch 3 in Debug and miraculously 4 in Release... How is this possible? Using VS2010 btw.
(RArray is just a simple template class comparable to MFC's CArray, does not influence results)
Thanks!
Edit: I call it like this
AddString(_T("Hello, world!"), _T("Hallo, wereld!"), _T("Hallo, Welt!"));

You're doing it the wrong way and you're just lucky with the debug build.
Notice that va_arg does not determine either whether the retrieved argument
is the last argument passed to the function (or even if it is an element
past the end of that list). The function should be designed in such a way
that the amount of parameters can be inferred in some way by the values of
either the named parameters or the additional arguments already read.
Supply either the length of the list in an integer or pass a NULL at the end of the list.

You're not supplying the final NULL argument your function expects. You have to do this yourself at the point where you call AddString:
AddString(_T("Hello, world!"), _T("Hallo, wereld!"), _T("Hallo, Welt!"), NULL);
It is likely that the debug build zeros out some memory that the release build doesn't. This could explain why your code works in debug but not in release.
Also, you might want to convert the do {} while (...) loop into while (...) {}, to make sure your code doesn't malfunction if no optional arguments are given.
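A minimal sketch of the sentinel approach, assuming plain const char* strings and std::vector in place of the question's LPCTSTR and RArray. CollectStrings is a hypothetical name, not the asker's function. The loop tests the pointer before using it, so it also copes with a call that passes only the terminator:

```cpp
#include <cstdarg>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's AddString: collects every
// argument up to (but not including) a terminating null pointer.
std::vector<std::string> CollectStrings(const char* first, ...)
{
    std::vector<std::string> strings;
    va_list args;
    va_start(args, first);
    // Test before use, so a null first argument yields an empty result
    // and va_arg is never called past the terminator.
    for (const char* s = first; s != nullptr; s = va_arg(args, const char*))
    {
        strings.push_back(s);
    }
    va_end(args);
    return strings;
}
```

The caller has to end the list itself, for example CollectStrings("a", "b", static_cast<const char*>(nullptr)). The cast makes sure a pointer-sized null is actually passed through the ellipsis.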

va_arg does not offer any guarantee to return 0 after the last real argument.
If you're going to use the C style variable arguments, then you need to establish some way of determining the number of arguments, e.g. a count or a terminating zero.
For example, printf determines the arguments from the format specification argument.
In C++ you can often, instead, use chained calls, such as the operator<< calls used for standard iostreams.
The simple, basic idea is that the operator or function returns a reference to the object that it is called on, so that further operator or function calls can be appended.
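As a rough illustration of the chaining idea, here is a sketch with a hypothetical StringList class (the name and its members are made up for this example). Add returns a reference to the object it was called on, so no argument count or terminator is needed:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical collector: Add returns *this so that calls can be chained.
class StringList
{
public:
    StringList& Add(const std::string& s)
    {
        strings_.push_back(s);
        return *this;  // lets the caller append another .Add(...)
    }

    std::size_t Count() const { return strings_.size(); }
    const std::string& At(std::size_t i) const { return strings_[i]; }

private:
    std::vector<std::string> strings_;
};
```

A call then reads list.Add("Hello, world!").Add("Hallo, wereld!").Add("Hallo, Welt!"); and every argument is type-checked by the compiler.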
Cheers,

Related

Handling "incompatibly" overloaded names in Cppcheck

I'm stuck with a "conflict" between AnsiString's sprintf member function and Cppcheck's built-in sprintf knowledge.
In cases like this,
const char* name = "X";
int version = 1;
return AnsiString().sprintf("%s.H%02d", name, version); // <-- HERE
I'm getting this warning in the Cppcheck GUI
Id: wrongPrintfScanfArgNum
Summary: sprintf format string requires 0 parameters but 1 is given.
Message: sprintf format string requires 0 parameters but 1 is given.
which shows that Cppcheck is talking about the sprintf function, but I'm using a member function of the VCL class AnsiString with the same name.
To get rid of this false positive, I could use
an inline suppression: // cppcheck-suppress wrongPrintfScanfArgNum
an intermediate variable: AnsiString result; result.printf(...); return result;
the sprintf function, which means handling the buffer space manually
But all these options only work locally, and make the code harder to read/maintain.
How can I teach Cppcheck to differentiate between overloaded names?
Edits:
I wrote override, but meant overloading, I corrected that in the current text.
added literal initialization of variables, which is important for name
Interesting.
Yes I agree this should be reported in http://trac.cppcheck.net. Looks like bugs.
I can see 2 bugs.
The AST does not show proper type information for the 'AnsiString()' even when I add an AnsiString class.
The Library should not match sprintf in that code. It is clear that some method is called.
It really is a bug.[1] Cppcheck 1.75 is smart in checking format strings, but obviously only in some cases, one of them is the second parameter of every function called printf, so the problem has nothing to do with AnsiString::sprintf but with every alternate implementation.
[1]
#7726 (False positive: format string checked for every function called 'sprintf') – Cppcheck

Technically, how do variadic functions work? How does printf work?

I know I can use va_arg to write my own variadic functions, but how do variadic functions work under the hood, i.e. on the assembly instruction level?
E.g., how is it possible that printf takes a variable number of arguments?
* No rule without exception. There is no language C/C++, however, this question can be answered for both of them
* Note: Answer originally given to How can printf function can take variable parameters in number while output them?, but it seems it did not apply to the questioner
The C and C++ standards do not have any requirement on how it has to work. A complying compiler may well decide to emit chained lists, std::stack<boost::any> or even magical pony dust (as per @Xeo's comment) under the hood.
However, it is usually implemented as follows, even though transformations like inlining or passing arguments in the CPU registers may not leave anything of the discussed code.
Please also note that this answer specifically describes a downwards growing stack in the visuals below; also, this answer is a simplification just to demonstrate the scheme (please see https://en.wikipedia.org/wiki/Stack_frame).
How can a function be called with a non-fixed number of arguments
This is possible because the underlying machine architecture has a so-called "stack" for every thread. The stack is used to pass arguments to functions. For example, when you have:
foobar("%d%d%d", 3,2,1);
Then this compiles to an assembler code like this (exemplary and schematically, actual code might look different); note that the arguments are passed from right to left:
push 1
push 2
push 3
push "%d%d%d"
call foobar
Those push-operations fill up the stack:
                [] // empty stack
-------------------------------
push 1:         [1]
-------------------------------
push 2:         [1]
                [2]
-------------------------------
push 3:         [1]
                [2]
                [3] // there is now 1, 2, 3 in the stack
-------------------------------
push "%d%d%d":  [1]
                [2]
                [3]
                ["%d%d%d"]
-------------------------------
call foobar ... // foobar uses the same stack!
The bottom stack element is called the "Top of Stack", often abbreviated "TOS".
The foobar function would now access the stack, beginning at the TOS, i.e. the format string, which as you remember was pushed last. Imagine stack is your stack pointer, stack[0] is the value at the TOS, stack[1] is one above the TOS, and so forth:
format_string <- stack[0]
... and then parses the format-string. While parsing, it recognizes the %d-tokens, and for each, loads one more value from the stack:
format_string <- stack[0]
offset <- 1
while (parsing):
    token = tokenize_one_more(format_string)
    if (needs_integer (token)):
        value <- stack[offset]
        offset = offset + 1
    ...
This is of course a very incomplete pseudo-code that demonstrates how the function has to rely on the arguments passed to find out how much it has to load and remove from the stack.
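The same scheme can be sketched in portable C++ with the va_... macros instead of raw stack access. SumFormatted is a hypothetical helper written for this illustration: it only understands %d and simply adds up the integers it fetches, which is enough to show that the format string alone decides how many arguments get read:

```cpp
#include <cstdarg>

// Hypothetical helper: reads one int from the argument list for every "%d"
// found in the format string and returns the sum of those ints.
int SumFormatted(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    int sum = 0;
    for (const char* p = format; *p != '\0'; ++p)
    {
        if (p[0] == '%' && p[1] == 'd')
        {
            sum += va_arg(args, int);  // the format says an int comes next
            ++p;                       // skip the 'd'
        }
    }
    va_end(args);
    return sum;
}
```

If the format string names more %d tokens than arguments were passed, the function reads whatever happens to lie beyond them, which is undefined behavior.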
Security
This reliance on user-provided arguments is also one of the biggest security issues present (see https://cwe.mitre.org/top25/). Users may easily use a variadic function wrongly, either because they did not read the documentation, or forgot to adjust the format string or argument list, or because they are plain evil, or whatever. See also Format String Attack.
C Implementation
In C and C++, variadic functions are used together with the va_list interface. While the pushing onto the stack is intrinsic to those languages (in K&R C you could even forward-declare a function without stating its arguments, but still call it with any number and kind of arguments), reading from such an unknown argument list is interfaced through the va_...-macros and va_list-type, which basically abstract the low-level stack-frame access.
Variadic functions are defined by the standard, with very few explicit restrictions. Here is an example, lifted from cplusplus.com.
/* va_start example */
#include <stdio.h>  /* printf */
#include <stdarg.h> /* va_list, va_start, va_arg, va_end */

void PrintFloats (int n, ...)
{
    int i;
    double val;
    printf ("Printing floats:");
    va_list vl;
    va_start(vl,n);
    for (i=0;i<n;i++)
    {
        val=va_arg(vl,double);
        printf (" [%.2f]",val);
    }
    va_end(vl);
    printf ("\n");
}

int main ()
{
    PrintFloats (3,3.14159,2.71828,1.41421);
    return 0;
}
The assumptions are roughly as follows.
There must be (at least one) first, fixed, named argument. The ... actually does nothing, except tell the compiler to do the right thing.
The fixed argument(s) provide information about how many variadic arguments there are, by an unspecified mechanism.
From the fixed argument it is possible for the va_start macro to return an object that allows arguments to be retrieved. The type is va_list.
From the va_list object it is possible for va_arg to iterate over each variadic argument, and coerce its value into a compatible type.
Something weird might have happened in va_start so va_end makes things right again.
In the most usual stack-based situation, the va_list is merely a pointer to the arguments sitting on the stack, and va_arg increments the pointer, casts it and dereferences it to a value. Then va_start initialises that pointer by some simple arithmetic (and inside knowledge) and va_end does nothing. There is no strange assembly language, just some inside knowledge of where things lie on the stack. Read the macros in the standard headers to find out what that is.
Some compilers (MSVC) will require a specific calling sequence, whereby the caller will release the stack rather than the callee.
Functions like printf work exactly like this. The fixed argument is a format string, which allows the number of arguments to be calculated.
Functions like vsprintf pass the va_list object as a normal argument type.
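A small sketch of that last point, assuming a C++11 standard library (for std::vsnprintf). FormatString is a hypothetical wrapper that collects its own variadic arguments into a va_list and hands the list on as an ordinary parameter:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical wrapper: formats into a std::string by passing its va_list
// to vsnprintf, which accepts the list as a normal argument.
std::string FormatString(const char* format, ...)
{
    char buffer[256];  // fixed size for brevity; longer output is truncated
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    return std::string(buffer);
}
```

The bounded vsnprintf is used here rather than vsprintf so that an unexpectedly long result cannot overrun the buffer.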
If you need more or lower level detail, please add to the question.

Name variable Lua

I have the following code in Lua:
ABC:
test (X)
The test function is implemented in C++. My problem is this: I need to know the name of the variable passed as the parameter (in this case X). In C++ I only have access to the value of this variable, but I must know its name.
Help please
Functions are not passed variables; they are passed values. Variables are just locations that store values.
When you say X somewhere in your Lua code, that means to get the value from the variable X (note: it's actually more complicated than that, but I won't get into that here).
So when you say test(X), you're saying, "Get the value from the variable X and pass that value as the first parameter to the function test."
What it seems like you want to do is change the contents of X, right? You want to have the test function modify X in some way. Well, you can't really do that directly in Lua. Nor should you.
See, in Lua, you can return values from functions. And you can return multiple values. Even from C++ code, you can return multiple values. So whatever it is you wanted to store in X can just be returned:
X = test(X)
This way, the caller of the function decides what to do with the value, not the function itself. If the caller wants to modify the variable, that's fine. If the caller wants to stick it somewhere else, that's also fine. Your function should not care one way or the other.
Also, this allows the user to do things like test(5). Here, there is no variable; you just pass a value directly. That's one reason why functions cannot modify the "variable" that is passed; because it doesn't have to be a variable. Only values are passed, so the user could simply pass a literal value rather than one stored in a variable.
In short: you can't do it, and you shouldn't want to.
The correct answer is that Lua doesn't really support this, but there is the debug interface. See this question for the solution you're looking for. If you can't get a call to debug to work directly from C++, then wrap your function call with a Lua function that first extracts the debug results and then calls your C++ function.
If what you're after is a string representation of the argument, then you're kind of stuck in lua.
I'm thinking something like in C:
assert( x==y );
Which generates a nice message on failure. In C this is done through macros.
Something like this (untested and probably broken).
#define assert(X) if(!(X)) { printf("ASSERTION FAILED: %s\n", #X ); abort(); }
Here #X means the string form of the argument. In the example above that is "x==y". Note that this is subtly different from a variable name - it's just the string used in the parser when expanding the macro.
Unfortunately there's no such corresponding functionality in lua. For my lua testing libraries I end up passing the stringified version as part of the expression, so in lua my code looks something like this:
assert( x==y, "x==y")
There may be ways to make this work as assert("x==y") using some kind of string evaluation and closure mechanism, but it seemed too tricky to be worth doing to me.
EDIT:
While this doesn't appear to be possible in pure lua, there's a patched version that does seem to support macros: http://lua-users.org/wiki/LuaMacros . They even have an example of a nicer assert.

c++ va_arg typecast issue

All,
I am writing a small C++ app and have been stumped by this issue. Is there a way to raise (and later catch) an error when accessing an element of a va_list using va_arg if the element's type is not the expected one? E.g.:
count=va_arg(argp,int);
if (count <= 0 || count > 30)
{
reportParamError(); return;
}
Now, if I am passing a typedef instead of int, I get garbage value on MS compiler but 95% of time count gets value 0 on gcc (on 64 bit sles10 sys). Is there a way I can enforce some typechecking, so that I get an error that can be caught in a catch block?
Any ideas on this would be very helpful to me. Or is there a better way to do this. The function prototype is:-
void process(App_Context * pActx, ...)
The function is called as
process(pActx,3,type1,type2,type3);
It is essential for pActx to be passed as 1st parameter and hence cannot pass count as 1st parameter.
Update-1
Ok, this sounds strange but nargs does not seem to be part of va_list on sles10 gcc. I had to put in
#ifdef _WIN32
tempCount=va_arg(argp,int);
#endif
After using this, parameters following nargs do not get garbage values. However, this introduces compiler/platform based #ifdefs....Thanks Chris and Kristopher
If you know a count will always be passed as the second argument, then you could always change the signature to this:
void process(App_Context * pActx, int count, ...)
If that's not an option, then there is really no way to catch it. That's just how the variable-argument-list stuff works: there is no way for the callee to know what arguments are being passed, other than whatever information the caller passes.
If you look into how the va_arg macro and related macros are implemented, you may be able to figure out how to inspect all the stuff on the stack. However, this would not be portable, and it is not recommended except as a debugging aid.
You also might want to look into alternatives to variable-arguments, like function overloading, templates, or passing a vector or list of arguments.
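One possible shape for the "pass a list" alternative, assuming C++11 is available. App_Context and ParamType below are hypothetical stand-ins for the question's types, and the body only counts the entries. The list carries its own length and element type, so a wrongly typed argument is rejected at compile time instead of being misread at run time:

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-ins for the types used in the question.
struct App_Context { int processed; };
enum ParamType { type1, type2, type3 };

// Returns false when the number of entries is outside the accepted range,
// mirroring the question's check on count.
bool process(App_Context* pActx, std::initializer_list<ParamType> types)
{
    if (types.size() == 0 || types.size() > 30)
    {
        return false;
    }
    pActx->processed = 0;
    for (ParamType t : types)
    {
        (void)t;  // real per-type handling would go here
        ++pActx->processed;
    }
    return true;
}
```

The context stays the first parameter, and a call looks like process(pActx, {type1, type2, type3}) with no separate count to keep in sync.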
No, there is no way. varargs doesn't provide any way to check the types of parameters passed in. You must only read them with the correct type which means that you need another way of communicating type information.
You are likely to be better off avoiding varargs functionality unless you really need it. It's only really a C++ feature for the sake of legacy functions such as printf and friends.

Function pointers and unknown number of arguments in C++

I came across the following weird chunk of code. Imagine you have the following typedef:
typedef int (*MyFunctionPointer)(int param_1, int param_2);
And then, in a function, we are trying to run a function from a DLL in the following way:
LPCWSTR DllFileName; //Path to the dll stored here
LPCSTR _FunctionName; // (mangled) name of the function I want to test
MyFunctionPointer functionPointer;
HINSTANCE hInstLibrary = LoadLibrary( DllFileName );
FARPROC functionAddress = GetProcAddress( hInstLibrary, _FunctionName );
functionPointer = (MyFunctionPointer) functionAddress;
//The values are arbitrary
int a = 5;
int b = 10;
int result = 0;
result = functionPointer( a, b ); //Possible error?
The problem is that there isn't any way of knowing whether the function whose address we got with GetProcAddress takes two integer arguments. The DLL name is provided by the user at runtime, then the names of the exported functions are listed and the user selects the one to test (again, at runtime :S:S).
So, by doing the function call in the last line, aren't we opening the door to possible stack corruption? I know that this compiles, but what sort of run-time error is going to occur in the case that we are passing wrong arguments to the function we are pointing to?
There are three errors I can think of if the expected and used number or type of parameters and calling convention differ:
if the calling convention is different, wrong parameter values will be read
if the function actually expects more parameters than given, random values will be used as parameters (I'll let you imagine the consequences if pointers are involved)
in any case, the return address will be complete garbage, so random code with random data will be run as soon as the function returns.
In two words: Undefined behavior
I'm afraid there is no way to know - the programmer is required to know the prototype beforehand when getting the function pointer and using it.
If you don't know the prototype beforehand then I guess you need to implement some sort of protocol with the DLL where you can enumerate any function names and their parameters by calling known functions in the DLL. Of course, the DLL needs to be written to comply with this protocol.
If it's a __stdcall function and they've left the name mangling intact (both big ifs, but certainly possible nonetheless) the name will have @nn at the end, where nn is a number. That number is the number of bytes the function expects as arguments, and will clear off the stack before it returns.
So, if it's a major concern, you can look at the raw name of the function and check that the amount of data you're putting onto the stack matches the amount of data it's going to clear off the stack.
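A sketch of such a check, working purely on the exported name as a string. StdcallArgumentBytes is a hypothetical helper, and it assumes the name still carries the __stdcall decoration, in which the argument byte count follows an '@' (for example _Add@8):

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the byte count encoded after the last '@' in
// a decorated __stdcall name such as "_Add@8", or -1 if there is none.
int StdcallArgumentBytes(const std::string& decoratedName)
{
    const std::string::size_type at = decoratedName.rfind('@');
    if (at == std::string::npos || at + 1 == decoratedName.size())
    {
        return -1;  // no '@', or nothing after it
    }
    for (std::string::size_type i = at + 1; i < decoratedName.size(); ++i)
    {
        if (decoratedName[i] < '0' || decoratedName[i] > '9')
        {
            return -1;  // suffix is not a plain decimal number
        }
    }
    return std::atoi(decoratedName.c_str() + at + 1);
}
```

For the typedef in the question, the caller pushes two ints, so on 32-bit x86 the result would be compared against 2 * sizeof(int), i.e. 8.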
Note that this is still only a protection against Murphy, not Machiavelli. When you're creating a DLL, you can use an export file to change the names of functions. This is frequently used to strip off the name mangling -- but I'm pretty sure it would also let you rename a function from xxx@12 to xxx@16 (or whatever) to mislead the reader about the parameters it expects.
Edit: (primarily in reply to msalters's comment): it's true that you can't apply __stdcall to something like a member function, but you can certainly use it on things like global functions, whether they're written in C or C++.
For things like member functions, the exported name of the function will be mangled. In that case, you can use UnDecorateSymbolName to get its full signature. Using that is somewhat nontrivial, but not outrageously complex either.
I do not think so. It is a good question. The only provision is that you MUST know what the parameters are for the function pointer to work; if you don't, and blindly stuff the parameters and call it, it will crash or jump off into the woods, never to be seen again... It is up to the programmer to convey what the function expects and the types of its parameters. Luckily, you could disassemble it and work out the types of the parameters from how it uses the stack pointer (sp).
Using PE Explorer for instance, you can find out what functions are used and examine the disassembly dump...
Hope this helps,
Best regards,
Tom.
It will either crash in the DLL code (since it got passed corrupt data), or: I think Visual C++ adds code in debug builds to detect this type of problem. It will say something like: "The value of ESP was not saved across a function call", and will point to code near the call. It helps but isn't totally robust - I don't think it'll stop you passing in the wrong but same-sized argument (eg. int instead of a char* parameter on x86). As other answers say, you just have to know, really.
There is no general answer. The Standard mandates that certain exceptions be thrown in certain circumstances, but aside from that describes how a conforming program will be executed, and sometimes says that certain violations must result in a diagnostic. (There may be something more specific here or there, but I certainly don't remember one.)
What the code is doing there isn't according to the Standard, and since there is a cast the compiler is entitled to go ahead and do whatever stupid thing the programmer wants without complaint. This would therefore be an implementation issue.
You could check your implementation documentation, but it's probably not there either. You could experiment, or study how function calls are done on your implementation.
Unfortunately, the answer is very likely to be that it'll screw something up without being immediately obvious.
Generally, if you are calling LoadLibrary and GetProcAddress, you have documentation that tells you the prototype. Even more commonly, as with all of the Windows DLLs, you are provided a header file. While this will cause an error if wrong, it's usually very easy to observe and not the kind of error that will sneak into production.
Most C/C++ compilers have the caller set up the stack before the call, and readjust the stack pointer afterwards. If the called function does not use pointer or reference arguments, there will be no memory corruption, although the results will be worthless. And as rerun says, pointer/reference mistakes almost always show up with a modicum of testing.